
The quantization breakthrough here is amazing: you can run a ~200B-parameter model on a single desktop machine.
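For intuition about the memory side of that claim, here is a minimal sketch of rounding fp32 weights to int8 and back, using plain per-row absmax quantization in NumPy; this is an illustration of the general idea, not necessarily the exact scheme used in the paper:

  import numpy as np

  def absmax_quantize(W):
      # One fp32 scale per row; int8 values live in [-127, 127].
      scale = np.abs(W).max(axis=1, keepdims=True) / 127.0
      q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
      return q, scale

  def dequantize(q, scale):
      return q.astype(np.float32) * scale

  W = np.random.randn(4096, 4096).astype(np.float32)
  q, scale = absmax_quantize(W)
  print(q.nbytes / W.nbytes)                    # 0.25: int8 needs a quarter of the fp32 memory
  print(np.abs(W - dequantize(q, scale)).max()) # per-row rounding error is at most scale / 2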

But there are a few really interesting and new insights:

With transformer models above 6.7B parameters, a "phase shift" (their language) occurs where features (a dimension that "offers some weak explanation for the label") are shared between layers, in that all the layers agree on which dimension to use for a given feature.

This is really important because these key features are where the "knowledge" of the neural network is concentrated. The attention layers are very sparse ("Almost all sequence dimensions have zero probability.").

But the fully connected layers are very dense. The author compares them to computer vision, where fully connected layers can be pruned of 95% of their weights without serious impact, while a transformer past this 6.7B-parameter point can only be pruned of about 5% of its weights.
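For a concrete picture of what "pruned of 95% of the weights" means, here is a hedged sketch of plain magnitude pruning; the 95%/5% figures are just the numbers quoted above, not measurements, and real pruning pipelines are more involved:

  import numpy as np

  def magnitude_prune(W, sparsity):
      # Zero out the `sparsity` fraction of weights with the smallest absolute value.
      threshold = np.quantile(np.abs(W), sparsity)
      return np.where(np.abs(W) < threshold, 0.0, W)

  W = np.random.randn(1024, 1024).astype(np.float32)
  W_cv  = magnitude_prune(W, 0.95)  # the tolerance claimed for CV fully connected layers
  W_llm = magnitude_prune(W, 0.05)  # the tolerance claimed for large transformers
  print((W_cv == 0).mean(), (W_llm == 0).mean())  # roughly 0.95 and 0.05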

And this is really interesting:

> Transformers become more stable. If you treat the outlier features separately, I believe you can probably run and even train transformers in less than 8-bit precision without degradation in performance.

The possibility of training networks with hundreds of billions of parameters in 8-bit (or less!) precision would be a real breakthrough.
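A rough sketch of the "treat the outlier features separately" idea from the quote above: route the few hidden dimensions with unusually large activations through a small full-precision matmul, and everything else through an int8 path. The threshold of 6.0 and the absmax scales here are illustrative assumptions, not necessarily the paper's exact recipe:

  import numpy as np

  def quantize_absmax(A, axis):
      scale = np.abs(A).max(axis=axis, keepdims=True) / 127.0 + 1e-12
      return np.clip(np.round(A / scale), -127, 127).astype(np.int8), scale

  def mixed_precision_matmul(X, W, threshold=6.0):
      # X @ W with outlier feature columns in float and the rest in int8.
      outlier = np.abs(X).max(axis=0) > threshold       # hidden dims carrying outlier features
      # Small dense matmul in full precision for the few outlier dimensions.
      out_fp = X[:, outlier] @ W[outlier, :]
      # int8 matmul (emulated here with an int32 accumulate) for everything else.
      Xq, sx = quantize_absmax(X[:, ~outlier], axis=1)  # per-row scales for activations
      Wq, sw = quantize_absmax(W[~outlier, :], axis=0)  # per-column scales for weights
      out_int8 = (Xq.astype(np.int32) @ Wq.astype(np.int32)) * sx * sw
      return out_fp + out_int8

  X = np.random.randn(8, 4096).astype(np.float32)
  X[:, :4] *= 20.0                                      # fake a few outlier feature dimensions
  W = np.random.randn(4096, 1024).astype(np.float32)
  print(np.abs(mixed_precision_matmul(X, W) - X @ W).max())  # error stays bounded because
                                                             # the outliers bypass the int8 path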




From a neuroscience perspective it would seem obvious that neural networks can work with less than 8 bits. According to a study from 2015 [1], synapses in the hippocampus can store about 4.7 bits of information (26 discrete connection strengths). While the real brain graph is very different from a transformer, I think this should still be achievable for other architectures, as it is most likely just a question of extra stabilization during training.
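The 4.7-bit figure is just the base-2 logarithm of the number of distinguishable states:

  import math
  print(math.log2(26))   # ≈ 4.7 bits of information for 26 discrete connection strengths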

[1] https://elifesciences.org/articles/10778


That paper showed a minimum of 26 states, not a maximum. Later papers have increased this estimate significantly.

[1], for example, increased the number 10-fold. Papers like [2] have pushed the estimated complexity per synapse much higher still (so much so that they don't even put a number on it).
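Taking that 10-fold figure at face value, the gain in bit terms is still modest, since information grows only logarithmically with the number of states:

  import math
  print(math.log2(26 * 10))   # ≈ 8.0 bits, up from ≈ 4.7 bits for 26 states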

[1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5247597/

[2] https://www.nature.com/articles/s41598-020-64874-9


Tbh I wouldn't be surprised if the sizes are not quantised at all. But if you look at the histogram in [1], most synapses fall into the low-number-of-states range. This is probably related to the aforementioned sparsity of certain neural network layers. Pruning outliers from the brain is really difficult from an evolutionary perspective, but the approach linked in this post, where you simply treat them differently from other parts, seems like a reasonable way to go for artificial neural networks.



