
I don't disagree, but context limits are expanding rapidly. Gemini 2.5 Pro, which was used here, has a 1 million token context window, with 2 million coming soon. Cost will be a concern, but context size limits will not.


Totally agree. I mentioned it in another comment, but Gemini was a game changer in increasing the size of the projects I can feasibly have AI work on.

The only issue is that Gemini's effective context window isn't consistent (I've seen my experience corroborated here on HN a couple of times). Maybe if all 900k tokens were unique information it would stay useful out to 1 million, but whether my prompt carries 50k or 150k tokens of context, once the total context passes about 200k, response coherence and focus go out the window.


I'd love to see more innovation on increasing context size without blowing up RAM usage. Mistral Small 2503 24B and Gemma 3 27B both fit into 24 GB at Q4, but Mistral can only go up to about 32k context and Gemma about 12k before all VRAM is exhausted, even with flash attention and KV cache quantization.
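To make the VRAM pressure concrete, here's a rough back-of-the-envelope sketch of how the KV cache grows with context length. The layer/head/dimension numbers are illustrative assumptions, not the real Gemma 3 27B or Mistral Small specs:

    # Rough KV-cache size estimate: why long contexts eat VRAM.
    # Config numbers are illustrative assumptions, not real model specs.
    def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem):
        # 2x for the separate K and V tensors cached at every layer
        return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

    GiB = 1024 ** 3
    for ctx in (12_000, 32_000, 131_072):
        fp16 = kv_cache_bytes(48, 8, 128, ctx, 2)  # fp16/bf16 cache
        q8 = kv_cache_bytes(48, 8, 128, ctx, 1)    # ~8-bit quantized cache
        print(f"ctx={ctx:>7,}: fp16 {fp16/GiB:.2f} GiB, q8 {q8/GiB:.2f} GiB")

With these assumed numbers, the fp16 cache alone is ~2.25 GiB at 12k, ~6 GiB at 32k, and ~24 GiB at 128k, on top of the quantized weights. The cache scales linearly with context, so quantizing it to 8-bit roughly doubles how far you can go before hitting the same VRAM wall.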


What editor are you using with Gemini 2.5 Pro? I really don't like their VS Code extension.



