
That seems silly; it's not poisonous to talk about next-token prediction if 90% of the training compute is still spent on training via next-token prediction (as far as I am aware).





99% of evolution was spent on single-celled organisms. Intelligence only took 0.1% of evolution's training compute.

Are you making a claim about evolution here?

What you just said means absolutely nothing and has no bearing on this topic. It's nonsense. That is not how evolution works.

ok that's a fair point

I don't really think that it is. Evolution is a random search; training a neural network follows a gradient. The former depends on rare (and unexpected) events occurring; the latter is expected to converge in proportion to the volume of compute.

why do you think evolution is a random search? I thought evolutionary pressures, and mechanisms like epigenetics, make it something other than a random search.

Evolution is a highly parallel descent down the gradient. The gradient is provided by the environment (which includes lifeforms too), parallelism is achieved through reproduction, and descent is achieved through death.

The difference is that in machine learning the changes between iterations are themselves caused by the gradient; in evolution they are entirely random.

Evolution randomly generates changes, and if they offer a breeding advantage they'll become accepted. Machine learning directs the change towards a goal.

Machine learning is directed change, evolution is accepted change.
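
To make that distinction concrete, here's a minimal sketch (a made-up toy loss, nothing from the thread): the first loop's updates are computed from the gradient, while the second loop just proposes random mutations and keeps the ones that happen to help.

    import numpy as np

    rng = np.random.default_rng(0)
    loss = lambda w: np.sum((w - 3.0) ** 2)   # toy quadratic loss
    grad = lambda w: 2.0 * (w - 3.0)          # its exact gradient

    # Directed change: every step is dictated by the gradient.
    w = rng.normal(size=5)
    for _ in range(200):
        w -= 0.1 * grad(w)

    # Accepted change: random mutation, kept only if it improves fitness.
    v = rng.normal(size=5)
    for _ in range(2000):
        candidate = v + 0.1 * rng.normal(size=5)
        if loss(candidate) < loss(v):
            v = candidate

Both end up near the optimum at 3; the directed version just gets there in far fewer steps.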


It's more efficient, but the end result is basically the same, especially considering that even if there's no noise in the optimization algorithm, there is still noise in the gradient information. (Consider some magical mechanism for adjusting the behaviour of an animal after it's died before reproducing: there would be a lot of nudges one way or another for things like 'take a step to the right to dodge that boulder that fell on you'.)

> Machine learning is directed change, evolution is accepted change.

Either way, it rolls down the gradient. Evolution just measures the gradient implicitly, through parallel rejection sampling.
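
One way to make that literal is the evolution-strategies trick: evaluate a whole population of random mutations in parallel and average them, weighted by how each one scored. A rough sketch along those lines, reusing the same made-up toy loss as above (an illustration of the idea, not anything proposed in the thread):

    import numpy as np

    rng = np.random.default_rng(0)
    loss = lambda w: np.sum((w - 3.0) ** 2)    # same toy loss

    w = rng.normal(size=5)
    sigma, lr, population = 0.1, 0.05, 200
    for _ in range(300):
        eps = rng.normal(size=(population, w.size))      # a generation of random mutations
        scores = np.array([loss(w + sigma * e) for e in eps])
        scores -= scores.mean()                          # baseline, keeps the estimate stable
        # Mutations weighted by their scores and averaged over the population
        # approximate the gradient of the expected loss, without ever differentiating it.
        grad_est = (scores[:, None] * eps).mean(axis=0) / sigma
        w -= lr * grad_est                               # then descend that implicit gradient

No individual mutation is directed, but the population as a whole rolls down the gradient.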


Evolution also has no "goal" other than fitness for reproduction. Training a neural network is done intentionally with an expected end result.

There's still a loss function; it's just an implicit, natural one instead of an artificially imposed one (at least, until humans started doing selective breeding). The comparison isn't nonsense, but it's also not obvious that it's tremendously helpful (what parts and features of an LLM are analogous to what evolution figured out with single-celled organisms compared to multicellular life? I don't know if there's actually a correspondence there).


