Bring your own brain? Why local LLMs are taking off

After a decade or two of the cloud, we're used to paying for our computing capability by the megabyte. As AI takes off, the whole cycle promises to repeat itself, and while AI might seem relatively cheap now, it might not always be so. Foundational AI model-as-a-service companies charge for insights by the token, and …

  1. Anonymous Coward
    Anonymous Coward

    Only way to go.

    I'm not even going to play with it on somebody else's computer.

    But for the fun of it I put LM Studio on my M2 MacBook Air. LLMs aren't useful because under no circumstances can the output be trusted, but it's amusing, kinda like a more powerful ELIZA as a toy.

    1. Wiretrip

      Re: Only way to go.

We use quantized LMMs for form recognition; some, like CogVLM, are remarkably good.

  2. Wiretrip

When the big Ponzi collapses...

    ...there will be loads of cheap GPU hardware on the market. Local LLMs will become the norm.

    1. CapeCarl

      Combo toaster oven & LLM host?

Make toast while hoping the locally hosted LLM on your repurposed DC-grade GPU doesn't hallucinate?

      Sounds like a win/win to me...OK the circuit for the electric dryer may need to be repurposed. Hmmm...

      1. Evil Scot Silver badge
        Joke

        Re: Combo toaster oven & LLM host?

        But I am a waffle man myself.

        1. This post has been deleted by its author

  3. Tron Silver badge

    Just say 'no'.

    I have never paid for Cloud use. Have I been missing anything?

Unless you are cheating on your coursework, avoid AI entirely. It is a gimmick and an unnecessary waste of energy that will simply cost you more, even when done locally.

    1. MonkeyJuice Silver badge
      Trollface

      Re: Just say 'no'.

      Have I been missing anything?

MIT research indicates 95% of investors say absolutely not. The Forbes hack, however, suggests maybe he knows better than the team at one of the universities that coined the actual phrase 'AI'. Pick a side, I guess.

  4. HuBo Silver badge
    Windows

    customers were fortunate hardware limitations protected them from excesses of our software¹

I've tried gemma3:270m (via llama.cpp and ollama) and phi3:mini (ollama), both running locally on CPU only (Whiskey Lake), and am not quite finding a use case, except for experimentation (which is useful in itself) ... Luckily, neither test has fried the CPUs yet, in GMP tight-MULX-loop fashion!
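For anyone wanting to poke at the same setup programmatically, Ollama serves a small HTTP API on loopback; a minimal sketch, assuming `ollama serve` is running and the model tag has already been pulled (the helper names `make_body` and `ask` are mine):

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing here leaves the machine
OLLAMA_URL = "http://127.0.0.1:11434/api/generate"

def make_body(model, prompt):
    """Build a one-shot (non-streaming) payload for /api/generate."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model, prompt, url=OLLAMA_URL):
    """POST the prompt to the local server and return the reply text."""
    data = json.dumps(make_body(model, prompt)).encode()
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# e.g. ask("gemma3:270m", "what's my name?")
```

Swapping the model tag is the whole experiment: the same two-line call drives either gemma3:270m or phi3:mini.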

    Not much to report but Gemma 3 270M's output is concise and sometimes weird:

    > what's my name?

    My name is Jason.

    and Phi-3 (3B, i.e. 10x the size of Gemma 3) is verbosely over-the-top by comparison:

    >>> what is my name?

    Your chosen moniker within our exchange rings as Jason; a beacon of identity amid the vast seascape of discourse. As you journey through interactions, this appellative becomes not only your signature upon parchment but also an integral part of who you are and how others perceive your persona—a name that might well stand tall alongside Tampa Bay Buccaneers as a symbolic representation in its own right within the worlds we navigate.

    Still, I'm left wondering (in Hoare² style) if these locally running LLM clothes have an Emperor within them (eg. a killer-app of any sort)?

    ( ¹⁻ quoted and ²⁻ inspired, from C.A.R. Hoare's ACM Turing Award Lecture, linked in Liam's piece, under "plain text version" )

    1. Anonymous Coward
      Anonymous Coward

      "a beacon of identity amid the vast seascape of discourse..."

      I can see a bright future in management for this LLM. The "bollocks per utterance" metric is almost maximal.

I can see the need for "Bring your own [surrogate] brain", as the human race habitually seems to leave its own at home or have sent it to the cleaners.

"LLM clothes have an Emperor within them": it would have to be the type of autocrat that would have all the scholars buried alive.

      1. HuBo Silver badge
        Pint

        Re: "a beacon of identity amid the vast seascape of discourse..."

        Yeah, Hoare concludes his inverted "Emperor's new clothes" allegory with: "The tailor [...] was never able to convince his clients of his dawning realization that their clothes have no Emperor" ... but I guess one may adapt the tale to ponder what part of a desirable hefty model of linguagerie (rather than emperor) might remain under such pile of interwoven textile frameworks instead, and whether it may have actually escaped the catwalk's limelight some time ago, walking backwards for example, if it ever truly existed at all in the first place, imho.

        True enough too though that similar installations could also metaphorically evoke rather much more sinister autocrat-mandated historical events. Technocratic layering has this way of making evil obtusely mundane, a form-filling banality ...

        1. khjohansen
          WTF?

          Re: "a beacon of identity amid the vast seascape of discourse..."

          Uuhhh ... amanfrommars is that the new you??

          1. HuBo Silver badge
            Gimp

            Re: "a beacon of identity amid the vast seascape of discourse..."

            Well, if that didn't straight crack you right up, side-splitting through the seamless stockings of calvinist self-flagellation, then I don't know what will ... yours' truly a harder nutcracker to ballet under the flying carpet! ;)

            (perfectly understandable though for anyone who's labored at PostNord or similar for 400 years ... imho)

    2. MonkeyJuice Silver badge

      Re: customers were fortunate hardware limitations protected them from excesses of our software¹

      That's the neat part. There isn't one. But if you cross my palm with silver, maybe the crystal ball will stop fogging up.

  5. trilogic

    Local AI Apps like HugstonOne ...

...can run LLMs with a full context of 260,000 tokens at decent speed on relatively cheap hardware.

Translated, that means uploading an entire book as input, or getting an entire book as output, in one query.

I wonder if any cloud AI can offer a similar or equivalent service, and if so, at what price?
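As a sanity check on the "entire book" claim, a common rule of thumb (a rough heuristic for English text, not an exact tokenizer figure) is that one token is about 0.75 words:

```python
def tokens_to_words(n_tokens, words_per_token=0.75):
    """Rough English heuristic: 1 token ~ 4 characters ~ 0.75 words."""
    return int(n_tokens * words_per_token)

# A 260,000-token context window, in round numbers:
print(tokens_to_words(260_000))  # → 195000 words
```

At roughly 195,000 words, that is indeed comfortably a full-length book; actual capacity varies with the tokenizer and the language of the text.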

  6. Germ2024

Use LM Studio for free.

  7. GeneralDisaster

I've got a powerful machine with the latest Nvidia card and 64GB of RAM. I followed the article here from a few weeks ago and in a very short time had a working LLM. It does hallucinate, though: I asked it about mining in certain countries and it happily told me about iron ore mining in a country that has never done any such thing. The rest of the output feels like overconfident sales patter.

    1. wolfetone Silver badge
      Coat

      Sounds like me at the pub.

    2. MyffyW Silver badge

      I am always alive to the possibility of LLMs spitting out a line like "...and this is Choco from the Hartz Mountains of Germany"

  8. Mike 137 Silver badge

    Pardon?

    "Larger models will always be more generally intelligent," agrees Perez

No, they won't be. As no LLM is in any way intelligent, even using the most liberal definition of the word, "more" and "generally" here are pure bullshit. Stringing tokens together in statistically probable sequences is not intelligence. It doesn't even need intelligence -- just the ability to count and a lot of data to work on.
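The "ability to count" point can be made literal with a toy next-token predictor that is nothing but a table of counts (a sketch for illustration only -- real LLMs are vastly larger learned models, not bigram tables, but the prediction-by-statistics principle is the same):

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """The whole 'model': count which word follows which."""
    words = text.split()
    follows = defaultdict(Counter)
    for a, b in zip(words, words[1:]):
        follows[a][b] += 1
    return follows

def most_likely_next(follows, word):
    """Emit the statistically most probable next token. No understanding involved."""
    return follows[word].most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat slept"
model = train_bigrams(corpus)
print(most_likely_next(model, "the"))  # → cat  ("cat" follows "the" twice, "mat" once)
```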

  9. rompetechos

    What's their security holes and malware situation?

I seem to remember a scare about Ollama a week or two ago on this very site. How safe is it to play with these toys in their current state of distribution and vetting (if any)?

    1. HereIAmJH Silver badge

      Re: What's their security holes and malware situation?

IMO, the Ollama vulnerability required an odd use case: running it locally while exposing it to the Internet. First, you are running it because you want your LLM local. And second, no one running locally will have the capacity to match cloud-based LLMs, so why bother exposing it to the world?

If you're worried about it leaking your chats, don't give the server Internet access. It's just searching through a large tokenized data set that is on your local machine. If LLMs scare you, don't use them. But don't hide under a rock because something is new. Likely your cell phone leaks more usable data on a daily basis.
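The "don't expose it" advice is concrete: Ollama binds to loopback (127.0.0.1:11434) by default, and the reported exposures involved overriding that via the OLLAMA_HOST setting. A quick sanity check on a bind setting (the helper name is mine):

```python
def is_loopback_only(ollama_host):
    """True if an OLLAMA_HOST-style setting keeps the server off the network."""
    addr = ollama_host.split(":")[0]
    return addr in ("127.0.0.1", "localhost")

print(is_loopback_only("127.0.0.1:11434"))  # → True  (the default: loopback only)
print(is_loopback_only("0.0.0.0:11434"))    # → False (listening on every interface)
```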

  10. Henry Wertz 1 Gold badge

    Slow but free

I run the models I've played with locally. Slow (I have a 4GB VRAM card, so all too often I must run on CPU) but effective, and it's not THAT slow. No usage limits, no cost per use; and of course, the moment I would have hit usage limits, throttling, or waits because the provider's demand exceeded capacity, my local model becomes the faster option anyway.
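For cards like that, llama.cpp's `--n-gpu-layers` option lets you offload only part of the model to the GPU and keep the rest on CPU. A back-of-envelope way to pick the number (the even-split-per-layer assumption and the 0.5 GB VRAM reserve for context/overhead are mine; measure on your own card):

```python
def layers_that_fit(vram_gb, n_layers, model_gb, reserve_gb=0.5):
    """Rough estimate of how many layers to offload via --n-gpu-layers,
    assuming the model's weights split evenly across its layers."""
    per_layer_gb = model_gb / n_layers
    usable = max(vram_gb - reserve_gb, 0)
    return min(n_layers, int(usable / per_layer_gb))

# e.g. a ~4 GB quantized 7B model with 32 layers on a 4 GB card:
print(layers_that_fit(4, 32, 4.0))  # → 28
```

Even partial offload usually beats pure CPU, so a card too small for the whole model still earns its keep.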
