Abstract
We study conditions under which, given a dictionary $F=\{f_{1},\ldots,f_{M}\}$ and an i.i.d. sample $(X_{i},Y_{i})_{i=1}^{N}$, the empirical risk minimizer $\tilde{f}^{\mathrm{ERM}}$ in $\operatorname{span}(F)$ relative to the squared loss satisfies, with high probability,
\[R(\tilde{f}^{\mathrm{ERM}})\leq\inf_{f\in\operatorname{span}(F)}R(f)+r_{N}(M),\] where $R(\cdot)$ is the squared risk and $r_{N}(M)$ is of the order of $M/N$.
Among other results, we prove that a uniform small-ball estimate for functions in $\operatorname{span}(F)$ is enough to achieve that goal when the noise is independent of the design.
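For concreteness, the objects in this statement can be written out as follows. This is a sketch in standard notation rather than the paper's own formulation, and the constants $\kappa_{0}$ and $\beta_{0}$ are labels assumed here for illustration. The squared risk and the empirical risk minimizer are
\[R(f)=\mathbb{E}\big(Y-f(X)\big)^{2},\qquad\tilde{f}^{\mathrm{ERM}}\in\operatorname*{argmin}_{f\in\operatorname{span}(F)}\frac{1}{N}\sum_{i=1}^{N}\big(Y_{i}-f(X_{i})\big)^{2},\]
and a uniform small-ball estimate is usually understood as the existence of constants $\kappa_{0}>0$ and $\beta_{0}\in(0,1]$ such that
\[\mathbb{P}\big(|f(X)|\geq\kappa_{0}\|f\|_{L_{2}}\big)\geq\beta_{0}\quad\text{for every }f\in\operatorname{span}(F).\]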
Citation
Guillaume Lecué, Shahar Mendelson. "Performance of empirical risk minimization in linear aggregation." Bernoulli 22 (3) 1520-1534, August 2016. https://doi.org/10.3150/15-BEJ701
Information