> there is a lot of evidence that many of the concepts ARC-AGI is (allegedly) measuring are innate in humans

I'd argue that "innate" here still includes a brain structure/nervous system that evolved on 3.5 billion years' worth of data. Extensive pre-training of one kind or another currently seems the best way to achieve generality.


So all the billions spent finding tricks and architectures that perform well haven't resulted in any durable structures in contemporary LLMs?

Each new training from scratch is a perfect blank slate, and the only thing ensuring words come out is the size of the corpus?


> Each new training from scratch is a perfect blank slate [...]?

I don't think training runs are done entirely from scratch.

Most training runs in practice start from pretrained weights or distill an existing model: for example, taking a model pretrained on ImageNet or Common Crawl and fine-tuning it for a specific task.
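
A minimal sketch of what that looks like in code, assuming PyTorch/torchvision (the framework and the 10-class head are illustrative choices on my part, not something from the thread):

    import torch
    import torch.nn as nn
    from torchvision import models

    # Start from ImageNet-pretrained weights rather than random initialization.
    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

    # Freeze the pretrained backbone; only the new head gets trained.
    for param in model.parameters():
        param.requires_grad = False

    # Swap in a fresh classification head for a hypothetical 10-class task.
    model.fc = nn.Linear(model.fc.in_features, 10)

    # Optimize only the new head's parameters.
    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)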

But even when the weights are randomly initialized, the hyperparameters and architectural choices (skip connections, attention, ...) will have been copied from previous models and papers based on what performed well empirically, sometimes also shaped by attempts to transfer our own intuitions (like stacking convolutional layers as a rough approximation of our visual system), and possibly refined/mutated through grid search or neural architecture search on data.
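
To make the "durable structure" point concrete: a skip connection is only a few lines of code, but it encodes a prior that survived empirical selection and gets copied into each new architecture. A toy PyTorch sketch (names are illustrative, not from any specific model):

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.body = nn.Sequential(
                nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim)
            )

        def forward(self, x):
            # The skip connection: x + f(x). Gradients flow through the
            # identity path, which is what made very deep nets trainable.
            return x + self.body(x)

    print(ResidualBlock(64)(torch.randn(4, 64)).shape)  # torch.Size([4, 64])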


Sure, and LLMs are nothing of this sort. While they're an incredible feat of technology, they're just a building block for intelligence; an important one, I'd say.