Computer Scientists Prove Why Bigger Neural Networks Do Better | Quanta Magazine

Two researchers show that for neural networks to be able to remember better, they need far more parameters than previously thought.

Hasnain says:

“Bubeck and Sellke showed that smoothly fitting n high-dimensional data points requires not just n parameters but roughly n × d parameters, where d is the dimension of the input (for example, 784 for a 784-pixel image). In other words, if you want a network to robustly memorize its training data, overparameterization is not just helpful; it’s mandatory. The proof relies on a curious fact about high-dimensional geometry: points scattered randomly on the surface of a sphere are almost all nearly the same large distance apart, roughly √2 times the radius, so from the center they look nearly orthogonal to one another. That large, uniform separation means that threading a single smooth curve through all of them requires many extra parameters.”
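
A minimal numerical sketch of both claims, assuming NumPy; the n = 60,000 and d = 784 figures are illustrative MNIST-like numbers chosen here, not taken from the article. It prints the n × d parameter count, then samples random points on a high-dimensional unit sphere and checks that their pairwise distances concentrate around √2.

import numpy as np

# Back-of-the-envelope n * d count (illustrative numbers, not from the article):
# n = 60,000 training images of d = 784 pixels each.
n, d = 60_000, 784
print(f"n * d = {n * d:,} parameters")  # roughly 47 million

# Geometry fact cited above: points sampled uniformly on a unit sphere in
# dimension d are almost all nearly the same distance apart, concentrating
# around sqrt(2) (~1.41), i.e., nearly orthogonal directions.
rng = np.random.default_rng(0)
num_points = 500
x = rng.standard_normal((num_points, d))
x /= np.linalg.norm(x, axis=1, keepdims=True)  # project onto the unit sphere

# For unit vectors, ||u - v||^2 = 2 - 2 * <u, v>, so distances follow from the Gram matrix.
gram = x @ x.T
dists = np.sqrt(np.clip(2.0 - 2.0 * gram, 0.0, None))
pairwise = dists[np.triu_indices(num_points, k=1)]  # distinct pairs only
print(f"mean distance = {pairwise.mean():.3f}, std = {pairwise.std():.3f}, sqrt(2) = {2**0.5:.3f}")

Running this shows the mean pairwise distance sitting very close to 1.414 with a tiny standard deviation, which is the concentration the summary is gesturing at.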

Posted on 2022-02-11T07:11:28+0000