Achieving <80% on CIFAR10 any time after 2020 is an example of a failed toy model, not a successful one.
Almost any ML algorithm can be thrown at CIFAR10 and achieve ~60% accuracy; this ballpark of accuracy is really not sufficient to demonstrate viability, no matter how aesthetically interesting the approach might feel.
Hinton is doing basic science, not ML, here. Given who he is, trying to move the needle on traditional benchmarks would be a waste of his time and skills.
If he invents the next backpropagation, an army of grad students can turn his ideas into the future, like they have for the last 15 years.
He's posting incremental work towards rethinking the field. It's pretty interesting stuff.
I haven't found this to be the case, fwiw. There was a paper in 2016 that threw a wide range of ML algorithms at CIFAR10, and most landed in the ~40% range.
But "any ML algorithm" isn't the point. It's a new optimization technique and should be applied to models/architectures that make sense for the problems they're being used on.
For example, they could have used a pretrained featurizer and trained the two layer model on top of it, with both back prop and FF and compared.
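For anyone unfamiliar with what "FF" (Forward-Forward) training even looks like at the layer level: below is a minimal single-layer sketch, assuming the goodness function from Hinton's paper (sum of squared activations) and a logistic loss on goodness minus a threshold. The data, names, and hyperparameters are all illustrative stand-ins, not anything from the paper; the point is only that each layer is trained locally, with no gradient crossing layer boundaries.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (purely illustrative): "positive" samples share a
# common direction, "negative" samples are isotropic noise.
d_in, d_hid, n = 16, 32, 512
direction = rng.normal(size=d_in)
direction /= np.linalg.norm(direction)
pos = direction + 0.3 * rng.normal(size=(n, d_in))   # structured
neg = rng.normal(size=(n, d_in))                     # unstructured

W = 0.1 * rng.normal(size=(d_in, d_hid))
b = np.zeros(d_hid)
theta = float(d_hid)   # goodness threshold (illustrative choice)
lr = 0.03

def forward(x):
    return np.maximum(x @ W + b, 0.0)   # linear + ReLU

def goodness(h):
    return (h ** 2).sum(axis=1)         # sum of squared activations

# Local (layer-wise) training: push goodness above theta for positive data
# and below theta for negative data. No backprop through other layers.
for step in range(300):
    for x, sign in ((pos, 1.0), (neg, -1.0)):
        h = forward(x)
        g = goodness(h)
        z = np.clip(sign * (g - theta), -60.0, 60.0)
        p = 1.0 / (1.0 + np.exp(-z))    # P(sample classified "positive")
        dg = -sign * (1.0 - p)          # d(-log p) / dg
        dh = dg[:, None] * 2.0 * h      # chain rule through goodness;
                                        # ReLU mask is implicit since dh ∝ h
        W -= lr * (x.T @ dh) / n
        b -= lr * dh.mean(axis=0)

print(goodness(forward(pos)).mean(), goodness(forward(neg)).mean())
```

In a deep FF network this loop would run per layer, with each layer's (normalized) output feeding the next; the comparison proposed above would swap this local objective in for the usual end-to-end cross-entropy gradient on the same two-layer head.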
> For example, they could have used a pretrained featurizer and trained the two layer model on top of it, with both back prop and FF and compared.
That makes the assumption that weights/embeddings produced by a backprop-trained network are equally intelligible to a head trained by backprop and to one trained by this alternative method.