High-dimensional probability. An introduction with applications in data science. (English) Zbl 1430.60005

Cambridge Series in Statistical and Probabilistic Mathematics 47. Cambridge: Cambridge University Press (ISBN 978-1-108-41519-4/hbk; 978-1-108-23159-6/ebook). xiv, 284 p. (2018).
This is a clearly written textbook on probability in high dimensions that illustrates the utility of the material through modern data science applications. The prerequisites for reading the book are a rigorous course in probability theory, a good understanding of undergraduate linear algebra, and general familiarity with basic notions of functional analysis; knowledge of measure theory is not essential but would be helpful.
The author studies random vectors, random matrices, and random projections in \(\mathbb{R}^n\) for very large \(n\). Striking applications are presented in calculus, statistics, theoretical computer science, signal processing, and optimization. Concentration inequalities form the core of the book; the simplest result of this kind states that a point cloud generated by a standard Gaussian vector in \(\mathbb{R}^n\) concentrates around the sphere of radius \(\sqrt n\). The core covers classical results as well as modern developments such as the matrix Bernstein inequality. Powerful tools based on stochastic processes are developed, including Slepian’s, Sudakov’s, and Dudley’s inequalities, as well as generic chaining and bounds based on the Vapnik-Chervonenkis (VC) dimension. A broad range of modern applications is given, including covariance estimation, clustering, networks, semidefinite programming, coding, dimension reduction, matrix completion, statistical learning, compressed sensing, and sparse regression.
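The concentration of a standard Gaussian vector around the sphere of radius \(\sqrt n\) can be checked numerically. The following is a minimal sketch and is not taken from the book; the function names `gaussian_norms` and `summarize`, the sample count, and the seed are illustrative assumptions.

```python
import math
import random


def gaussian_norms(n, samples, seed=0):
    """Euclidean norms of `samples` independent standard Gaussian vectors in R^n."""
    rng = random.Random(seed)  # fixed seed so the experiment is reproducible
    return [
        math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)))
        for _ in range(samples)
    ]


def summarize(n, samples=200, seed=0):
    """Return (mean norm, largest deviation of any sampled norm from sqrt(n))."""
    norms = gaussian_norms(n, samples, seed)
    mean = sum(norms) / samples
    spread = max(abs(r - math.sqrt(n)) for r in norms)
    return mean, spread


if __name__ == "__main__":
    for n in (10, 100, 1000):
        mean, spread = summarize(n)
        # The deviation stays of constant order while sqrt(n) grows,
        # so the relative deviation shrinks as the dimension increases.
        print(f"n={n:5d}  sqrt(n)={math.sqrt(n):7.3f}  "
              f"mean norm={mean:7.3f}  max deviation={spread:6.3f}")
```

In such an experiment the absolute deviation of the norm from \(\sqrt n\) should remain bounded as \(n\) grows, which is what makes the relative deviation vanish in high dimensions.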
The proofs of the statements are concise and elegant. Interesting and useful exercises are integrated into the text; in most cases a hint is available at the end of the book. Each chapter concludes with a Notes section, which has pointers to other related texts.
The book is intended for doctoral and advanced master's students and beginning researchers in mathematics, statistics, electrical engineering, and computational biology. It can be used as a textbook for a basic second course in probability with a view toward data science applications. It can also serve as an excellent reference book for researchers in statistics and data science.

MSC:

60-01 Introductory exposition (textbooks, tutorial papers, etc.) pertaining to probability theory
62-01 Introductory exposition (textbooks, tutorial papers, etc.) pertaining to statistics