This is an automated archive made by the Lemmit Bot.

The original was posted on /r/artificial by /u/brainhack3r on 2024-02-21 20:38:39.


The Gemini release was really interesting in that they sort of buried the lede by not highlighting the ~99% recall accuracy across the full context window.

OpenAI's 128k context window falls down pretty quickly in practice and is really only 32k-64k if you care about the model actually using your context.

Ideally you would just fit all your data into the 10M-token context window, but as I understand it that would cost about $5 per request.

That’s going to get expensive quickly for a lot of applications.
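As a rough sanity check on that figure, here is a back-of-the-envelope sketch. The per-token price is an illustrative assumption (roughly $0.50 per 1M input tokens, which is what would make 10M tokens come out to about $5), not a published rate:

```python
# Back-of-the-envelope input cost for stuffing everything into the context window.
# The per-token price is an illustrative assumption, not a published rate.

PRICE_PER_MILLION_INPUT_TOKENS = 0.50  # USD, assumed for illustration


def prompt_cost(num_tokens: int, price_per_million: float = PRICE_PER_MILLION_INPUT_TOKENS) -> float:
    """Return the input-token cost in USD for a single request."""
    return num_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    for tokens in (128_000, 1_000_000, 10_000_000):
        print(f"{tokens:>12,} tokens -> ${prompt_cost(tokens):.2f} per request")
```

At those assumed prices, every single call that fills the 10M window costs dollars, not fractions of a cent, which is what makes this painful for high-volume applications.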

The question is how long this will be the case. If RAG is only about cost savings, I can see its use starting to fade over the next 1-2 years, with most people just pushing everything into the context window.
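To make the cost-savings argument concrete, here is a minimal sketch comparing per-request input cost for full-context stuffing versus RAG-style retrieval. All numbers (corpus size, chunk size, top-k, query size, price) are illustrative assumptions:

```python
# Rough per-request input-cost comparison: full-context stuffing vs. RAG-style retrieval.
# Every number below is an illustrative assumption.

PRICE_PER_MILLION_INPUT_TOKENS = 0.50  # USD, assumed

CORPUS_TOKENS = 10_000_000  # everything pushed into the context window
CHUNK_TOKENS = 1_000        # tokens per retrieved chunk
TOP_K = 8                   # chunks retrieved per query
QUERY_TOKENS = 500          # the question plus instructions


def cost(tokens: int) -> float:
    """Input-token cost in USD at the assumed price."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_INPUT_TOKENS


full_context = cost(CORPUS_TOKENS + QUERY_TOKENS)
rag = cost(TOP_K * CHUNK_TOKENS + QUERY_TOKENS)

print(f"Full context: ${full_context:.4f} per request")
print(f"RAG (top-{TOP_K}): ${rag:.4f} per request")
print(f"RAG is roughly {full_context / rag:,.0f}x cheaper per request")
```

Under those assumptions, retrieval is orders of magnitude cheaper per request, so unless long-context prices drop dramatically the economics still favor RAG for anything high-volume.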