From RAGs to riches: A practical guide to making your local AI chatbot smarter

If you've been following enterprise adoption of AI, you've no doubt heard the term “RAG” tossed around. Short for retrieval augmented generation, the technology has been heralded by everyone from Nvidia's Jensen Huang to Intel's savior-in-chief Pat Gelsinger as the thing that's going to make AI models useful enough to warrant …

  1. Doctor Syntax Silver badge

    So we either have to explicitly tell it what document the answer's in, or tag the documents so it can find them, and then it tells us what we could have found out by reading the document ourselves.

    I'm not sure this helps look for stuff on a laptop without any fancy accelerators but with a big pile of downloaded historical source documents. Text search and a bit of reading and thinking are going to save the day, as ever.

    1. druck Silver badge
      Facepalm

      FFS you could cut out all the pages of crap about the AI, and just tell people to RTFM in the first place.

      1. jake Silver badge

        I've been suggesting that people RTFM for half a century or so. Somehow I don't think this particular approach to that problem will help any.

        1. b0llchit Silver badge
          Trollface

          They do not and will not RTFM. But they will read the AIFM(*).

          The AIFM is a real improvement because AI will present you with the manual to read. A perfect world is about to become. Begone the need for shouting at people who cannot and will not read that oh so very fine manual. AI will gently nudge all those people to read the fine print and will enjoy reading twice the amount without in(ter)ference.

          (*) Artificially Interesting Fucking Manual. The ultimate truth because it was presented by AI.

          1. Doctor Syntax Silver badge

            TAIFM will be based on TFM, but with variations whose veracity nobody will be able to vouch for.

            1. b0llchit Silver badge
              Coat

              TAIFM will come and take our job! Only AI will need to read TAIFM to make clear what TFM is rambling about, and it infuses TAIFM through the AInterface directly into your brAIn, which you no longer need because TAIFM can do perfectly well without you, thank you, and will do very much better without your in(ter)ference, cutting you out of the loop and out of your job completely.

              There, you see, RTFM can be replaced by TAIFM and will do much better than us meatbags.

              Now, if only I could get some Web3Crypto in there as well. Then I'd become rich beyond riches and I too could ask the shareholders and rich-club buddies to give me more Web3Crypto because I managed to combine Web3Crypto with AI and AI with Web3Crypto. That must be worth all the riches in the world!

            2. jake Silver badge
        2. Anonymous Coward
          Anonymous Coward

          The problem with you telling someone something, Jake the Joke, is that you're a cantankerous know-nothing.

          I can imagine you now. Advising some 24-year-old student about optimal backup routines. You think you're giving help.

          She wants to get away from the grim old man with nonce vibes; he stinks and his eyes are crawling all over her curves.

  2. chuckufarley Silver badge

    This RAG seems to have potential...

    ...but a lot of it seems to come from the UI in front of the model. Does it support whitelisting sites? Could there be limitations or side effects from unexpected or unsupported file formats? Do different RAG implementations have compatibility issues? Lastly, can it be used effectively outside of a chatbot setting?

  3. Grunchy Silver badge

    I haven’t even read any Big Data books yet!

    I’m way behind on this. I ordered some crappy images from DALL-E and asked ChatGPT something or other one time, but that’s as far as I’ve got.

    Meanwhile I never even cracked any of my Big Data resources yet, let alone explored any Machine Learning concepts.

    The problem with “fuzzy logic” is that the computer is incapable of giving you a definite result in response to a general query. You can ask for something to be generated, even with a fair degree of specificity, but the result is always a surprise.

    Most of the time, when I’m doing something on the computer, I have a specific end goal in mind. Admittedly, it’s rare and wonderful when the computer works as expected. But whenever the result is indefinite, it’s up to the operator, the human, to evaluate and vet the result before you let a computer “commit.” Recently I did my taxes on the computer, but I had to triple-check the accuracy of everything before I would let it Netfile the result.

    I don’t like artificial intelligence. You have to watch like a hawk every decision it tries to make. AI is liable to drive your expensive Tesla car underneath a tractor trailer because it doesn’t “recognize” the obstacle. Even worse, the AI has zero recognition of how dire the situation is when the human operator’s CPU gets chopped off; doesn’t that just plug back in again? NO?

    1. heyrick Silver badge

      Re: I haven’t even read any Big Data books yet!

      "but the result is always a surprise"

      This.

      I've had some good results from the online AI tools, but let's not ignore the hours of weird, wrong, sometimes nightmarish results it took to get there, and the fact that if you look too closely, the weird/wrong/nightmare is still there, barely visible, but still there.

  4. Anonymous Coward
    Anonymous Coward

    Getting there

    So an LLM is a parametric model that generates likely statements given some prompt(s), whereas the RAG database is a non-parametric model (tree, nearest neighbor, or other lookup algorithms). The user's input is used to look up text in the database; the retrieved passages are fed to the LLM as context to summarize, and those summaries are displayed with actual links to the database texts.
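
    In code, that whole pipeline is only a few lines. A minimal sketch in Python (numpy only); embed() is a stand-in for a real sentence-embedding model and llm() for whatever model you run locally, so both names are illustrative, not a real API:

      import numpy as np

      # embed() is a hypothetical stand-in for a real sentence-embedding model:
      # it hashes the text into a unit vector purely so the sketch runs end to
      # end. Retrieval quality obviously needs a real embedding model.
      def embed(text: str) -> np.ndarray:
          rng = np.random.default_rng(abs(hash(text)) % (2**32))
          v = rng.standard_normal(384)
          return v / np.linalg.norm(v)

      def llm(prompt: str) -> str:
          # Stub: wire this up to whatever model you run locally.
          return "(model output would go here)"

      # Non-parametric side: the documents plus a brute-force vector index.
      docs = ["backup runbook ...", "release notes ...", "FAQ entry ..."]
      index = np.stack([embed(d) for d in docs])

      def retrieve(query: str, k: int = 2):
          scores = index @ embed(query)  # cosine similarity (unit vectors)
          return [(int(i), docs[i]) for i in np.argsort(scores)[::-1][:k]]

      # Parametric side: the LLM is only asked to summarize what was retrieved.
      def answer(query: str) -> str:
          context = "\n".join(f"[{i}] {t}" for i, t in retrieve(query))
          return llm(f"Using only these passages:\n{context}\n\nQuestion: {query}")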

    Now I am wondering what it would take to create an LLM+ that would enable an associative memory to recall the "addresses" of the sources from which it learned. Those addresses need not be exact URLs or titles/authors, just the kind of word-vector indices which are used to allow lookups in databases (e.g., the databases used for RAG). This is not possible with current LLMs.

    Then an LLM+ could verify every statement it made and attribute a source, or simply reply "associative memory failure, but it's on the tip of my tongue", meaning it hadn't yet generated a good enough word-vector index to trigger the non-parametric model lookup.
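
    You can at least fake that "tip of my tongue" behaviour outside the model today, with a confidence threshold on the lookup. A sketch reusing the embed()/docs/index toys from above (the threshold value is made up):

      def attribute(statement: str, threshold: float = 0.6) -> str:
          scores = index @ embed(statement)
          best = int(np.argmax(scores))
          if scores[best] < threshold:
              # Nothing in the index is close enough to count as recall.
              return "associative memory failure, but it's on the tip of my tongue"
          return f"supported by source [{best}]: {docs[best]}"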

    I realize such an LLM might be boring, but at least it would be reliable and trustworthy.

    1. Doctor Syntax Silver badge

      Re: Getting there

      "Then an LLM+ could verify every statement it made"

      That requires the LLM to understand the meaning of each statement.

      If you make a statement, you intend it to have meaning: you attribute meanings to the individual words because they are symbols referring to things in the real world, or to abstract concepts based on that reality. You attach further meaning to the way in which those words are combined to arrive at your intended meaning.

      The only information the LLM has is the set of associations between strings of characters. Even calling the strings "tokens" is stretching things a bit too far. It would be nonsense to call them symbols in the way in which our minds treat them as symbols. They are assembled on the basis of statistical associations with the prompt. The result appears to have meaning to us because we parse the result and construct an apparent meaning based on that analysis and the meanings which we attribute to the words.

      But the statement will have no meaning to the LLM because the LLM does not deal in meanings. It cannot verify that which is not in its scope.

      1. Anonymous Coward
        Anonymous Coward

        Re: Getting there

        "That requires the LLM to understand the meaning of each statement."

        I would say not; it only requires that an LLM be able to make a statistical estimate of whether two statements are related.
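
        In practice that estimate is usually just an embedding comparison. A one-line sketch, again assuming some embed() that maps a sentence to a unit vector (as in the toy upthread) and a made-up threshold:

          def related(a: str, b: str, threshold: float = 0.7) -> bool:
              # Cosine similarity of two unit-length sentence embeddings.
              return float(embed(a) @ embed(b)) >= threshold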

  5. jake Silver badge

    Yeahbut ...

    ... what does it do for me?

    Seriously, even if I temporarily ignore the non-zero chance of "hallucinating"[0], what does this do that is useful? Why do I want it?

    [0] "Hallucinating" my left nut ... call it what it is. It's an automated bad results generator, NOT something I need or want running on my computer.

    1. Anonymous Coward
      Anonymous Coward

      Re: Yeahbut ...

      SHUT UP

      DROP DEAD YOU SLY OLD MONG JUST DROP DEAD

      JAKE SHOULD BE BANGED UP TICK TOCK

  6. Zack Mollusc

    Hallucination problem solved

    Well, that has cleared that up. AI has no intelligence or reasoning powers, so it will be fine to have it refer to a business's internal processes, procedures, and support docs because they are comprehensive, correct and free from any ambiguity.

  7. steelpillow Silver badge

    Progress of sorts

    On the face of it this is all a bit shit, as observed by others already. But it does change the paradigm from ML+BD to ML+BD+AndNinthly. It must have potential somewhere, if only as the next small step towards genuinely better things.

    For example if it can even FTFM (find the fucking manual) so that you can then read it, that will be progress where I work.

    1. AVR Bronze badge

      Re: Progress of sorts

      It can find the manual if the manual is correctly tagged, or possibly, if it's pointed at a great wodge of your internal documents, it'll come up with something like the right one. Or possibly it'll work from an out-of-date version and not tell you; there's no way of telling from the article, if I'm reading it correctly.

      1. steelpillow Silver badge

        Re: Progress of sorts

        But then, we do that all the time already. It would at least save us vast swathes of time by ploughing through the eternal SharePoint cock-ups for us.

        1. tfewster
          Facepalm

          Re: Progress of sorts

          Is the documentation on Teams, Confluence, ServiceNow, the intranet, various generations of SharePoint, shared drives, Wikis...? An internal search engine that could index those repositories and just return links to the results would be nice.
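
          The boring version of that is just an inverted index over (title, link) pairs. A toy sketch; the page list and URLs are made up, and a real crawler would have to talk to each repository's own API:

            from collections import defaultdict

            # (title, url) pairs harvested from whichever repositories you can reach.
            pages = [
                ("Backup runbook", "https://intranet.example/sharepoint/backup"),
                ("VPN setup guide", "https://intranet.example/confluence/vpn"),
            ]

            # Map each title word to the set of pages it appears in.
            index = defaultdict(set)
            for n, (title, _) in enumerate(pages):
                for word in title.lower().split():
                    index[word].add(n)

            def search(query: str) -> list[str]:
                # Return links only: no summaries, no hallucinations.
                hits = [index.get(w, set()) for w in query.lower().split()]
                return [pages[n][1] for n in sorted(set.intersection(*hits))] if hits else []

            print(search("backup runbook"))  # ['https://intranet.example/sharepoint/backup']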

  8. Bebu
    Windows

    Perhaps an oversimplification...

    seems to me here that these pre-trained RAG LLMs are interpreters, or virtual processors, that "execute" the data from the database to process input and emit output.

    The only difference between the silicon processors from NVIDIA, AMD or Intel and these RAG processors is the latter's slightly more demented design process. :)

    While Intel and co. might have to deal with Spectre design flaws, AI-designed machines might well harbour ghosts. :)

    When it doesn't work it's a DRAG (Defective Retrieval Augmented Generation) - use your own imagination for BRAG or RAGE.

    RAG = Really About Greed. :))

  9. jake Silver badge

    RAG = Rather Arrogant Gabbling

    1. Anonymous Coward
      Anonymous Coward

      Jake = Just Another Knobhead Eejit

  10. Cincinnataroo

    I'd be interested to see a technical assessment or two by people who've done a job with and without a RAG system. There's been enough time to work with both and form informed opinions.

    My guess is that with some jobs it can be an advantage, and with others not. It also depends on those involved, obviously.

    I haven't found that yet in my searches.

    1. Anonymous Coward
      Anonymous Coward

      I have

      RAG is really a dirty solution to an insurmountable problem in AI: lack of clock speed.

    It is a useful stopgap, but it goes against the purpose of machine learning somewhat. Now you have another traditional database which will just expand and expand.

    This is a fundamental problem with LLMs, and the only fix is real-time models. And the timespan for clock speeds to reach a level where we can run these real-time Progressive Neural Networks is decades at best.

      AI has reached the clock speed wall. Take the money and run.

  11. Anonymous Coward
    Anonymous Coward

    Tried ChatRTX

    I've not tried RAG in anywhere near this much detail. The closest I came was installing Nvidia's ChatRTX, which uses RAG to access external libraries.

    Then, to test it, I downloaded numerous texts about Ancient Egypt from Wikipedia and a few other sources. I then asked it questions, checking the results against the pages I gave it.

    The results were utter garbage. Not only was it often incorrect, but at times, it brought up information that contradicted what I had supplied.

    When I asked Nvidia about this, they said, "Give it more data". But unless companies are in the habit of duplicating and rewriting their documents, I'm not even sure what "more data" means in the context of RAG use cases.

    I'll work through this tutorial to see if I get better results. But I wonder if people are glossing over how inaccurate these systems can be. For example, would you trust a model like this with your tax calculations?
