You shouldn't fit an entire database in the context anyway.
btw, 10M tokens is 78 times more context window than the newest GPT-4-turbo (128K). In a way, you don't need 78 GPT-4 API calls, only one batch call to Gemini 1.5.
I don't get this why is it people think that you need to put an entire database in the short-term memory of the AI to be useful? When you work with a DB are you memorizing the entire f*cking database, no, you know the summaries of it and how to access and use it.
People also seem to forget that the average is 1b words that are read by people in their entire LIFETIME, and at 10m, with nearly 100% recall thats pretty damn amazing, i'm pretty sure I don't have perfect recall of 10m words myself lol
You certainly don't need that much context for it to be useful, but it definitely opens up a LOT more possibilities without the compromises of implementing some type of RAG.
In addition, don't we want our AI to have superhuman capabilities? The ability to work on 10M+ tokens of context at a time could enable superhuman performance in many tasks. Why stop at 10M tokens? Imagine if AI could work on 1B tokens of context like you said?
btw, 10M tokens is 78 times more context window than the newest GPT-4-turbo (128K). In a way, you don't need 78 GPT-4 API calls, only one batch call to Gemini 1.5.