Are you going to upload 10M tokens to Gemini on every request? That's a lot of data to move around when the user is expecting a near-real-time response. It seems like it would still be better to fill the context with only the information relevant to the user's prompt, which is what plain RAG does.