good stuff
I enjoy your coverage of AI; it isn't too far out of date and is much improved.
And I was so enjoying your article until... you shoehorned RAG in! Are you an old database man who is desperately trying to keep traditional databases in their previously elevated position? Kind of reminds me of the news hacks when the internet came along and they tried to get Quark to do HTML! Oh how I laughed.
RAG is all right, but what is the point of having yet another massive database of structured data? With your beloved RAG, more and more content providers get added until we end up back where we started, except this time the answers come out in natural language. RAG is much more processor-intensive too.
OpenAI have a much more complicated system than plain RAG. RAG isn't used there; they have their own four-part modular data system, of which a highly advanced version of RAG (CRS) WITHOUT the structured databases is one part.
You mention that the downside of RAG is the purity of the data, but that's an old database man's argument that applies to any data. It also ignores the negative impacts of running what will effectively be DB #3: #1, the internet, a structured (of sorts) database, feeds #2, the AI model, where it becomes abstract data, which is then superseded by #3, another structured database containing the same data, albeit limited to only the major 'trusted' outlets of propaganda. And on top of that, the #3 step means lag, big lag. I'm sure this deal isn't API calls. It's their archive, and it will be hosted by OpenAI; anything else is too slow, and energy costs will rocket because RAG is a very CPU-intensive task.
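To spell out where that lag comes from, here's a rough toy sketch in Python. Everything in it is made up by me for illustration (the archive contents, the timings, the function names); it is not OpenAI's architecture or anything from your article. The point is just that the RAG path pays for a hop out to the third database plus a fatter prompt on every single query, while the plain snapshot path pays only for inference.

import time

# Toy comparison of the two answer paths. All names and timings are invented
# for illustration only.

ARCHIVE_DB = {  # "DB #3": a second structured copy of content the model was already trained on
    "moon landing": "Archive article: the landing happened in 1969...",
    "rag deal": "Archive article: publisher licenses its back catalogue...",
}

def model_only_answer(question: str) -> str:
    """Path 1: the snapshot model answers straight from its weights (the abstract data, #2)."""
    time.sleep(0.05)  # pretend inference cost
    return f"[model] answer to '{question}' from the training snapshot"

def rag_answer(question: str) -> str:
    """Path 2: same model, but with an extra hop through the archive (#3) first."""
    time.sleep(0.10)  # embed the query and search the archive (the extra hop)
    hits = [doc for key, doc in ARCHIVE_DB.items() if key in question.lower()]
    time.sleep(0.05)  # inference again, now with a longer prompt to chew through
    return f"[model+RAG] answer to '{question}' grounded in {len(hits)} archive chunk(s)"

for path in (model_only_answer, rag_answer):
    start = time.perf_counter()
    print(path("What was the rag deal about?"), f"({time.perf_counter() - start:.2f}s)")

Run it and the RAG path is roughly twice as slow per query in this toy setup; scale that across every user query and you can see where the extra compute and energy goes.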
And that's just the archive. What about the stream of content for the next 5 years! How can you integrate real-time content into a snapshot-model LLM? You can't. You can fudge it with RAG, but it will bite you in the arse for the reasons mentioned above.
RAG is useful, but it is at best a clumsy stopgap that fakes a real-time model. The only way to get LLMs or similar to work properly is for them not to be the snapshot databases we have now, but real-time models.
We have just finished a model specifically designed to estimate how long it will take for desktop/mobile devices to run real-time neural networks, based on the current trajectory of the tech and including random speed bursts to account for little discoveries along the way, like RAG. We were convinced it was wrong for about a month but could not disprove or disagree with its projections no matter how hard we tried.
Server-based real-time: 10-15 years. Commercial rollout: 15-20. Local versions using the same model: two decades. Two decades until we get the thing even to the level of the most stupid of humans. Even then, we will probably discover a whole load of new problems that stop us from ever getting there.
RAG is a waste of time. Spend the time and research on real-time, full-blown AI models - not crappy little Apple basic-OS-actions AI. RAG is XSLT: a hopeless dead end that offers initially interesting and powerful results until you use it in production and everybody hates it.