Buying a PC for local AI? These are the specs that actually matter

Ready to dive in and play with AI locally on your machine? Whether you want to see what all the fuss is about with all the new open models popping up, or you're interested in exploring how AI can be integrated into your apps or business before making a major commitment, this guide will help you get started. With all the …

  1. beast666 Silver badge

    AI doesn't exist except as a marketing deception.

    1. Version 1.0 Silver badge
      Facepalm

      I have always thought that these days, "If you don't use AI you are uninformed; if you do use AI you are misinformed." ...

      A view inspired by Mark Twain's original quote "If you don't read the newspaper you are uninformed, if you do read the newspaper you are misinformed."

    2. jake Silver badge

      In today's incarnation, AI is nothing more than an automated bullshit generator masquerading as a meaningful information system.

    3. BenMyers

      As described in the article, I have to agree with you completely.

  2. Paul Herber Silver badge

    What's the current burning question? Do you want any toast?

    1. Steve K
      Coat

      No…

      …I’m a waffle man

    2. jake Silver badge

      No thank you. I don't like burnt currants on my toast.

    3. Boris the Cockroach Silver badge

      Which led to the accident involving me, a waste disposal and a 14 pound lump hammer.

      Hold on, we still talking about LLMs that are claimed to be AI ?

      <<hides the rest of the incriminating evidence

  3. Henry Wertz 1 Gold badge

    can't stress vram enough

    Can't stress VRAM enough. Messing about with this stuff, more RAM bandwidth and TOPS might let it run faster, but not having enough VRAM keeps models from running at all (without just falling back to the CPU instead).

    I've found my GTX 1650 to be rather ineffective since many models need more than 4GB VRAM no matter how you slice it. (You can run highly quantised versions, but run the text ones that quantised and they get stupid and hallucinate... well, more than normal.) Image gen is right out.

    'Luckily', since the GTX 1650 doesn't have Tensor cores and whatever to make models run extra fast, and my Coffee Lake CPU is reasonable, it's 'only' about 10x the performance of just letting the CPU do it. Just playing, I don't care if a text response takes 5 or 10 seconds, or an image takes like 90 seconds instead of about 10 on the GPU (with the one set of settings that made that model 'small' enough to run on the GPU at all).
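
    A minimal sketch of that partial-offload approach, assuming llama-cpp-python and a quantised GGUF file already on disk (the model path and layer count below are placeholders): push only as many layers as fit into a small card's VRAM and leave the rest on the CPU.

    ```python
    # Hypothetical example: split a quantised model between a small GPU and the CPU.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/mistral-7b-instruct.Q4_K_M.gguf",  # placeholder path
        n_gpu_layers=12,  # offload only what fits in ~4GB of VRAM; 0 = CPU only
        n_ctx=2048,
    )

    out = llm("Explain VRAM vs system RAM in one sentence.", max_tokens=64)
    print(out["choices"][0]["text"])
    ```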

    1. Glenn Amspaugh

      Re: can't stress vram enough

      There's an Nvidia RTX 4060 Ti with 16 GB VRAM in the $500 (US) range, but it has a 128-bit memory bus so it's not the quickest card out there.

      https://www.amazon.com/MSI-GeForce-RTX-4060-Ti/dp/B0CCFZMHSM
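
      For a rough sense of why the 128-bit bus matters: peak memory bandwidth is roughly bus width in bits divided by 8, times the per-pin transfer rate. The clock figures below are approximate published specs, so treat the numbers as ballpark only.

      ```python
      # Back-of-the-envelope GPU memory bandwidth, which bounds tokens/second
      # once a model actually fits in VRAM.
      def bandwidth_gb_s(bus_width_bits: int, gbps_per_pin: float) -> float:
          return bus_width_bits / 8 * gbps_per_pin

      print(bandwidth_gb_s(128, 18.0))   # RTX 4060 Ti (16GB): ~288 GB/s
      print(bandwidth_gb_s(384, 19.5))   # RTX 3090: ~936 GB/s, hence "not the quickest"
      ```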

  4. jake Silver badge

    But ...

    ... what do you use it for?

    Or is it just an extremely compute-expensive toy with no actual meaning?

    1. Sorry that handle is already taken. Silver badge

      Re: But ...

      Haha funny pictures

      and

      Look at this bland and misleading but correctly spelled report I "wrote"

    2. Adair Silver badge

      Re: But ...

      But... seriously, I suppose it's just like any tool. In the right hands, and used for the purpose for which it is intended, great things (relatively) can be done. For the rest of us, probably not so much.

      I can see that 'AI' (whatever that misnomer actually means), properly tuned and applied in very specific and controlled circumstances can be used to great effect, e.g. medical analysis, etc. But, at the same time we're talking about a system that in uncontrolled situations effectively feeds on excrement, and we are surprised when shit is what it generates.

      'Garbage in; garbage out' - same as it ever was.

      1. Michael Wojcik Silver badge

        Re: But ...

        In the right hands, and used for the purpose for which it is intended, great things (relatively) can be done.

        I'm still waiting for an example. And I follow a fair bit of the research, as well as commentary from people like Zvi Mowshowitz. Many things which are impressive, in the context of machine-learning research, natural language processing, and so on, have indeed been done. Things which are in some sense "great", outside the context of research? I can't think of any.

        In fact, I have argued that LLMs and diffusion models are, at least in their common uses, in the long run counterproductive, as competitive cognitive technologies. They encourage shallow thinking and discourage understanding.

        There are tools which are not, in fact, particularly useful. People invent all sorts of things. Some of them are misses.

    3. Anonymous Coward
      Anonymous Coward

      Re: But ...

      Oh shut up you know nothing

      1. Michael Wojcik Silver badge

        Re: But ...

        We have a particularly trenchant and insightful AC with us today, don't we?

  5. Fazal Majid

    An M1 or M2 Mac Studio has far more unified RAM available to its GPU/NPU for running large models than even the $36,000 nVidia H100. Unfortunately, the Apple Silicon GPU is nowhere near as fast as nVidia's.

    Primate Labs, makers of Geekbench, have an ML/AI benchmark tool. The results are finally available on the general Geekbench browser (but still not searchable):

    https://browser.geekbench.com/ai/v1

  6. Charles E

    New Geekbench AI benchmarks

    Geekbench AI was just released and runs on most platforms. I'm not sure what to make of the stats yet, but at least it is a quantitative measurement of some sort or another.

    https://www.geekbench.com/ai/

    1. HereIAmJH Silver badge

      Re: New Geekbench AI benchmarks

      At a glance, my first thought was: why didn't they make it a portable app (AppImage, etc.) so installation isn't required? I could see putting it on a thumb drive to benchmark a variety of machines, but other than hardware upgrades, how often would you benchmark the same machine?

      An interesting note: with ONNX I can benchmark both of my GPUs; with OpenVINO it only sees the Intel UHD, and not the NVIDIA. I don't know enough about AI yet to know if these numbers are telling me anything useful, but on the basis that bigger numbers are better, even my old 1660 Ti beat the crap out of an i7. So repurposing an old server isn't going to make a good AI workload box without investing in a decent GPU. Maybe I'll have to look for a gaming desktop around Black Friday.
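
      If it helps anyone poking at the same thing, ONNX Runtime will tell you which execution providers (and therefore which devices) it can actually see on a box. A minimal check, assuming the onnxruntime package is installed:

      ```python
      import onnxruntime as ort

      # Which backends this ONNX Runtime build can use on this machine
      print(ort.get_available_providers())
      # e.g. ['CUDAExecutionProvider', 'OpenVINOExecutionProvider', 'CPUExecutionProvider']
      print(ort.get_device())  # 'GPU' or 'CPU' for the default session device
      ```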

  7. Kevin McMurtrie Silver badge

    If the "70 billion parameter" LLM is llama3.1:70b, it runs fine on 128GB of DDR5 RAM without a GPU. Not fast, but fine. It can reply faster than you can search the Internet.

    I managed to get llama3.1:405b running with 128GB plus a sacrificial Gen5 NVMe stick for swap. It takes it two days to complete a response so it's not at all usable. DDR5 motherboards for "desktop" computers require unbuffered memory that currently maxes out at 192 GB. Apple's M2 chips hit the same 192GB limit even if you have the wealth for more. Only a "server" motherboard taking buffered DDR5 can reach the 256GB needed. The two types of DDR5 are not interchangeable, of course.
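
    The arithmetic behind those limits is just parameter count times bytes per weight, plus headroom for the KV cache and runtime. A rough sketch (the 1.2x overhead factor is an assumption, not a measured figure):

    ```python
    # Rough memory needed just to hold the weights at a given quantisation.
    def weights_gb(params_billion: float, bits_per_weight: int, overhead: float = 1.2) -> float:
        return params_billion * bits_per_weight / 8 * overhead

    for params in (70, 405):
        for bits in (4, 8, 16):
            print(f"{params}B @ {bits}-bit: ~{weights_gb(params, bits):.0f} GB")
    # 70B at 4-bit is ~42 GB (comfortable in 128 GB of system RAM);
    # 405B at 4-bit is ~240 GB, which is why it spills onto the NVMe swap above.
    ```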

    All of this compute power doesn't even cover training LLMs or pre-loading them with a lot of context. That $$$$$$ is not in my range even if it was a hobby.

  8. neurochrome
    Happy

    Hobby-level hardware for ML tyre-kicking

    Image gen - maximise amount of vram even if it's slower chippery. Best value/compatibility I could find was Nvidia's 3060 with 12GB. Doesn't need to be the Ti version either. Along with that, system RAM is important - doesn't need to be super fast, but the more the better.

    For Stable Diffusion SDXL using Forge webui on Win10 with 64GB and the 3060/12 I found system RAM usage hovering in the low 30s. Flux1.dev also fine, but system RAM usage regularly into the mid 40s. SDXL training of TIs and LoRAs is doable, but model fine-tuning is not. It looks like Flux1.dev LoRA training *might* be possible - I see a lot of effort on Github to fit the training into 12GB gfx cards.
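
    For anyone who'd rather script it than use a webui, a minimal sketch with Hugging Face diffusers, assuming the same 12GB class of card: fp16 weights plus model CPU offload (which leans on system RAM, as noted above) is what keeps SDXL inside the card.

    ```python
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
        variant="fp16",
    )
    pipe.enable_model_cpu_offload()  # park idle submodels in system RAM instead of VRAM

    image = pipe("a robot kicking the tyres on a graphics card",
                 num_inference_steps=25).images[0]
    image.save("test.png")
    ```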

    Laptop with Nvidia 2060/6GB + 16GB system RAM: SDXL image gen with Forge is doable (around 1min for a 1MP image). Scaling up to larger images does become a bit drawn out, and I had to add a fan base under the machine. Flux1.dev was painfully slow with plenty of crashes.

    For LLMs, the same laptop works OK with Kobold and a variety of 7B models at usable speed (bit less than reading speed), but it does max the laptop out. Interesting to try out, but not terribly practical.

    1. Glenn Amspaugh

      Re: Hobby-level hardware for ML tyre-kicking

      There's an Nvidia RTX 4060 Ti with 16GB in the $500 range (US vendors)

  9. xyz Silver badge

    Adenoid heaven...

    <nasal whine>

    Well what you really want is your vRam connected to your gpu flops and delivering in excess of 40gb/s blind speed processing.

    </nasal whine>

    Another reason not to go down the pub.

  10. Bitsminer Silver badge

    This space is evolving quickly...

    The CPU / GPU demands will drop for a while as the software gets faster and models improve in quality while staying at about the same size.

    Then it will all change as "40 TOPS" minimum becomes "60" then "90"....

    By coincidence I had to buy a new laptop last year. 8core+SMT. 32GB of RAM was only a few hundred more. 1TB was only a hundred more. Faster video was only a hundred more.

    Turns out it runs ollama or llamafile quite readily, but only on the CPU. It's a laptop so the thermal regulator cuts in pretty quick and the cores slow down to 1.5GHz or less. The 4-bit models work pretty well in spite of this.

    My advice is to buy a very new CPU; the old ones don't have the AVX512 variants and that makes a big difference. If you do try to recycle old hardware you will need to fit in a (supported) 16GB video card.
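
    A quick way to check whether a CPU has those instructions before recycling it. This is a Linux-specific sketch; on other platforms the py-cpuinfo package reports the same flags.

    ```python
    # Look for AVX2 / AVX-512 support in /proc/cpuinfo.
    def cpu_flags() -> set:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
        return set()

    flags = cpu_flags()
    print("AVX2:   ", "avx2" in flags)
    print("AVX-512:", any(flag.startswith("avx512") for flag in flags))
    ```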

    1. BenMyers

      Re: This space is evolving quickly...

      Which makes and models of video cards come with 16GB?

  11. BenMyers

    This article, though very thorough, makes no mention of whether multi-core CPUs have a positive effect on the time it takes to run an AI model.

    1. Bitsminer Silver badge

      They do give a substantial improvement.
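
      If you want to measure it yourself, here is a rough sketch assuming llama-cpp-python and a small quantised GGUF model on disk (the path is a placeholder). Generation tends to be memory-bandwidth bound, so the gain from extra cores flattens out fairly quickly.

      ```python
      import time
      from llama_cpp import Llama

      for threads in (1, 4, 8, 16):
          llm = Llama(model_path="models/mistral-7b-instruct.Q4_K_M.gguf",  # placeholder
                      n_threads=threads, n_ctx=512, verbose=False)
          start = time.time()
          llm("Write one sentence about toast.", max_tokens=64)
          print(f"{threads:2d} threads: {time.time() - start:.1f} s")
      ```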

  12. StargateSg7 Bronze badge

    Right now, I'm running an old AMD S9170 GPU card with 2.6 TeraFLOPS at 64-bits that when converted to 16-bit Integers gives me 10.5 Tera-Compare Operations Per Second throughput. YES! I mean Tera-Compare operations and NOT convolution or numeric processing operations! I am talking about BITWISE COMPARE operations!

    One thing I have done for the current modality of statistics-based and CNN (i.e. typical 3x3 or 5x5 convolution kernels-based operations)-based LLM's and Stable Diffusion models is to convert those models over to bitwise-operations which is much less taxing on a GPU. By using bitwise-compares (i.e. a nested if-then-else) which lets me use only 9 compare operations and two range-clipping/rounding operations and one copy or move operation in order to evaluate a token, it means I only need to do twelve 16-bit operations per token which gives me a performance of about 850 MILLION 16-bit UNICODE character tokens evaluated per second which is GREAT for most natural language-oriented operations. If you convert that to words and sentences, it means I can get an average of 20 Million Words Searched, Indexed, Sorted and Re-Ordered for final output per second which is a heck of an improvement on a GPU that normally only does 13 tokens (words) per second when using a full convolution operation or statistical analysis process for all inputs and outputs.

    That is an equivalent LLM output of around 60,000 pages of fully-evaluated words that would/could match all the requests in a typical 2000 word input prompt/end-user LLM query.

    Convert that same 60,000 pages of LLM output to Stable Diffusion-style image processing and it means I could output 3 frames of 8192 by 4320 pixels of 64-Bits per RGBA pixel video frames every second (i.e. HDR aka High Dynamic Range colour DCI-8K resolution video!) --- AND --- If I go down to a mere DCI-4K resolution at 32-bits per RGBA pixel, I could get 24 frames per second output in real-time which is the typical Hollywood production film frame-rate at 4096 pixels by 2160 pixel video!

    Simply by taking advantage of bitwise compare operations within ANY of the major GPU's means you can get ENORMOUS performance increases in LLM and Stable Diffusion output!

    V

    1. bigphil9009

      All those "brains" and still can't use an apostrophe correctly.

  13. harrys Bronze badge

    The more "boring" ML stuff happening on non LLM type data is where the real stuff is happening and gonna change peoples everyday lives again

    Why again .... ML all started with going from paper ledgers to spreadsheets/databases yonks ago and jeez.... did that change the world or what !

    Though i suppose u could argue that the abacus was a ML device too :)

    The more things change the more they stay the same, allbeit, speeded up, faster and faster and faster to ad nauseum
