Overview

The complete graduate-level reference for entropy, divergence, and mutual information in modern machine learning, rigorously developed from measure theory to contemporary estimators and algorithms.

- Measure-theoretic foundations: sigma-algebras, Radon-Nikodym, conditional expectation, change of measure.
- Core measures: entropy, cross-entropy, KL, mutual information; f-divergences and Rényi divergences with variational dualities (Fenchel, Donsker-Varadhan).
- Data processing and fundamental inequalities: log-sum, Pinsker, Csiszár-Kullback-Pinsker, Fano, Le Cam, Assouad; equality conditions and sufficiency.
- Gaussian tools: entropy power inequality, de Bruijn identity, Fisher information, I-MMSE, Gaussian extremality.
- Maximum entropy and exponential families: log-partition convexity, Bregman geometry, Pythagorean theorems.
- Fisher information and asymptotics: score, Cramér-Rao bounds, LAN, Bernstein-von Mises, asymptotic efficiency.
- Information geometry and natural gradients: Fisher-Rao metric, dual connections, mirror descent.
- Source coding and MDL: Kraft-McMillan, NML, universal coding, compression-generalization links.
- Generalization: PAC-Bayes bounds, mutual information bounds I(W;S), stability of SGD.
- Concentration via information: DV method, log-Sobolev and Poincaré inequalities, transportation T1/T2, hypercontractivity.
- Variational inference and divergence minimization: ELBO, alpha-divergences, EP, black-box VI with reparameterization.
- Estimating entropy and MI: plug-in, kNN, KDE, Kraskov, MINE, InfoNCE; minimax rates and consistency.
- Rate-distortion and information bottleneck: Blahut-Arimoto, optimal encoders, sufficiency-compression trade-offs.
- Contrastive representation learning under augmentations: alignment vs uniformity, identifiability, sample complexity.
- Generative modeling: VAEs, bits-back coding, beta-VAE, TCVAE; likelihood calibration and posterior collapse.
- Score matching and Stein methods: Fisher divergence, kernel Stein discrepancies; diffusion models as score-based SDEs with likelihood estimation.
- Optimal transport with entropic regularization: Kantorovich duality, Sinkhorn, Schrödinger bridges; OT vs f-divergence objectives.
- Distributed and federated learning under communication limits: quantization, gradient coding, lower bounds via information.
- Privacy and leakage: differential privacy, Rényi DP, moments accountant; accuracy-privacy trade-offs and inference risks.
- Active learning and Bayesian experimental design: expected information gain, submodularity, scalable estimators.

Full Product Details

Author: Yehuda Setnik
Publisher: Independently Published
Imprint: Independently Published
Dimensions: Width 21.60 cm, Height 2.00 cm, Length 27.90 cm
Weight: 0.866 kg
ISBN: 9798273722620
Pages: 374
Publication Date: 09 November 2025
Audience: General/trade, General
Format: Paperback
Publisher's Status: Active
Availability: Available to order. We have confirmation that this item is in stock with the supplier. It will be ordered in for you and dispatched immediately.
Countries Available: All regions