Paper details
Volume 35 (2025), Number 2 - June 2025
The generalization error bound for a stochastic gradient descent family via a Gaussian approximation method
Hao Chen, Zhanfeng Mo, Zhouwang Yang
Abstract
Recent works have developed model-complexity-based and algorithm-based generalization error bounds to explain how stochastic gradient descent (SGD) methods help over-parameterized models generalize better. However, previous works are limited in their scope of analysis and fail to provide comprehensive explanations. In this paper, we propose a novel Gaussian approximation framework to establish generalization error bounds for the U-SGD family, a class of SGD methods with asymptotically unbiased and uniformly bounded gradient noise. We study the dynamics of U-SGD and show, both theoretically and numerically, that the limiting distribution of the model parameters tends to be Gaussian even when the original gradient noise is non-Gaussian. For the U-SGD family, we establish a desirable iteration-number-independent generalization error bound of order O((1 + √log(p√n))/√n), where n and p denote the sample size and the parameter dimension, respectively. Based on our analysis, we propose two general types of methods to help models generalize better, termed additive and multiplicative noise insertion. We show that these methods significantly reduce the dominant term of the generalization error bound.
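The following Python sketch is a rough illustration of the two noise-insertion schemes named above, applied to a single SGD update; the function name usgd_step, the noise scale sigma, and the exact form of each perturbation are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def usgd_step(theta, grad_fn, lr, noise="additive", sigma=0.01, rng=None):
    """One SGD update with inserted gradient noise (illustrative sketch).

    The additive scheme adds zero-mean Gaussian noise to the stochastic
    gradient; the multiplicative scheme rescales each coordinate by
    (1 + zero-mean noise). Both perturbations are unbiased in expectation.
    """
    rng = np.random.default_rng() if rng is None else rng
    g = grad_fn(theta)  # stochastic (mini-batch) gradient
    if noise == "additive":
        # additive insertion: perturb the gradient with zero-mean noise
        g = g + sigma * rng.standard_normal(g.shape)
    elif noise == "multiplicative":
        # multiplicative insertion: coordinate-wise rescaling by (1 + noise)
        g = g * (1.0 + sigma * rng.standard_normal(g.shape))
    return theta - lr * g

# Toy usage: minimize f(theta) = ||theta||^2 / 2, whose gradient is theta.
theta = np.ones(5)
for _ in range(100):
    theta = usgd_step(theta, grad_fn=lambda t: t, lr=0.1, noise="multiplicative")
```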
Keywords
stochastic gradient descent, Gaussian approximation, KL-divergence, generalization bound