
Opus seems to be much better at that, which is probably why it's so much more expensive. AI companies have to balance costs. I wonder whether the public has even seen the most powerful, full-fidelity models, or if they're too expensive to run.


Right, but this is also a core limitation of the transformer architecture. You only have very short-term memory (context) and very long-term memory (fixed parameters). Real minds have a lot more flexibility in how they store and connect pieces of information. I suspect that further progress towards something AGI-like might require more "layers" of knowledge than just those two.

When I read a book, for example, I do not keep all of it in my short-term working memory, but I also don't entirely forget what I read at the beginning by the time I get to the end: it's something in between. More layered forms of memory would probably allow us to return to smaller context windows.
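
Something like this toy sketch is what I have in mind: a small verbatim window of recent text plus a lossy running summary of whatever has scrolled out of it. The summarize step here is just a placeholder (naive truncation); a real system would presumably use a model or embeddings to do the compression.

```python
# Toy sketch of an "in between" memory layer: a small window of recent text
# kept verbatim, plus a running compressed summary of everything older.
from collections import deque


def summarize(old_summary: str, evicted_text: str, budget: int = 500) -> str:
    # Placeholder compression: a real system would distill this, not truncate it.
    return (old_summary + " " + evicted_text)[-budget:]


class LayeredMemory:
    def __init__(self, window: int = 3):
        self.recent = deque(maxlen=window)  # short-term: recent chunks, verbatim
        self.summary = ""                   # mid-term: lossy summary of older chunks

    def read(self, chunk: str) -> None:
        # Before the oldest chunk falls out of the window, fold it into the summary.
        if len(self.recent) == self.recent.maxlen:
            self.summary = summarize(self.summary, self.recent[0])
        self.recent.append(chunk)

    def prompt_context(self) -> str:
        # What you'd actually feed the model: compact summary plus recent detail.
        return "Earlier (summarized): " + self.summary + "\nRecent: " + " ".join(self.recent)


mem = LayeredMemory(window=3)
for page in ["page 1 ...", "page 2 ...", "page 3 ...", "page 4 ...", "page 5 ..."]:
    mem.read(page)
print(mem.prompt_context())
```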


I mean, we have context windows now so large they dwarf human short-term memory, right?

And as for reading a book, couldn't the model's training just be updated with the book?
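
Something like this is what I mean, roughly: continued pretraining (next-token prediction) on the book's text, so its content ends up in the weights rather than the context window. This is just a sketch assuming the Hugging Face transformers library and PyTorch; "gpt2" and "book.txt" are stand-ins for whatever model and text you'd actually use.

```python
# Rough sketch of "updating the model's training with the book": fine-tune
# the weights on the book's text with the standard causal LM objective.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")       # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

text = open("book.txt").read()                          # placeholder path to the book
ids = tokenizer(text, return_tensors="pt").input_ids[0]

block = 512                                             # train on fixed-size slices
model.train()
for start in range(0, ids.size(0) - block, block):
    chunk = ids[start:start + block].unsqueeze(0)
    loss = model(input_ids=chunk, labels=chunk).loss    # next-token prediction loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```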




