Computer Scientists Prove Why Bigger Neural Networks Do Better | Quanta Magazine
Two researchers show that for neural networks to be able to remember better, they need far more parameters than previously thought.
Hasnain says:
“Bubeck and Sellke showed that smoothly fitting high-dimensional data points requires not just n parameters, but on the order of n × d parameters, where d is the dimension of the input (for example, 784 for a 784-pixel image). In other words, if you want a network to robustly memorize its training data, overparameterization is not just helpful; it’s mandatory. The proof relies on a curious fact about high-dimensional geometry: randomly distributed points on the surface of a sphere are almost all nearly the same large distance from each other (close to √2 times the radius, i.e., in nearly orthogonal directions). That large, uniform separation between points means that fitting them all with a single smooth function requires many extra parameters.”
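For context, the quantitative form of Bubeck and Sellke's result (paraphrasing their "law of robustness" up to constants and technical assumptions) says that any model with p parameters that fits n noisy d-dimensional training points below the noise level must have a Lipschitz constant of at least roughly

$$ \mathrm{Lip}(f) \;\gtrsim\; \sqrt{\frac{n d}{p}}, $$

so keeping the fit smooth (Lipschitz constant of order one) forces p ≳ n × d, roughly d parameters per data point rather than one.

The geometric fact can also be checked numerically. Below is a minimal Python sketch (my own illustration, not code from the article or the paper): it samples points uniformly on the unit sphere and shows that, as the dimension grows, pairwise distances concentrate tightly around √2 times the radius, so almost every pair of points is nearly the same large distance apart.

```python
import numpy as np

rng = np.random.default_rng(0)

def pairwise_distance_stats(n_points: int, dim: int) -> tuple[float, float]:
    """Mean and standard deviation of pairwise distances between points
    sampled uniformly on the unit sphere in `dim` dimensions."""
    # Normalizing Gaussian vectors gives a uniform distribution on the sphere.
    x = rng.standard_normal((n_points, dim))
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    # For unit vectors, ||u - v||^2 = 2 - 2<u, v>, so all pairwise distances
    # come straight from the Gram matrix of inner products.
    gram = x @ x.T
    dists = np.sqrt(np.clip(2.0 - 2.0 * gram, 0.0, None))
    upper = dists[np.triu_indices(n_points, k=1)]
    return float(upper.mean()), float(upper.std())

for dim in (2, 10, 100, 1000):
    mean, std = pairwise_distance_stats(200, dim)
    # As dim grows, the mean approaches sqrt(2) ~ 1.414 (unit radius) and the
    # spread collapses: random high-dimensional points are nearly equidistant.
    print(f"dim={dim:5d}  mean distance={mean:.3f}  std={std:.3f}")
```

That near-uniform, large separation is what makes it impossible for a small, smooth model to thread through all of the points at once.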
Posted on 2022-02-11T07:11:28+0000