Extra mile
> Many law firms would run toward that sort of tool, screaming "Shut up and take my client's money!"
And with luck, many clients would say to their lawyers, "I can now do that without your assistance. What value do you add?"
What is it that makes a PC an AI PC? Beyond some vague hand-waving at the presence of "neural processing units" and other features only available on the latest-and-greatest silicon, no-one has come up with a definition beyond an attempt to market some FOMO. While Intel suggests that application developers will soon infuse all …
With my i9 10980XE, my 64GB of DDR4, my 8TB of HDDs and my GeForce RTX 4080 Panther, it would seem, from your experience, that I have a machine that can run a chatbot.
Now if you could just convince me why I would need one. I already have a wife and a cat if I wish to talk with someone, and even the cat has more brains than a chatbot.
Of course your cat has more brains than a chatbot. You are demonstrating a common misunderstanding fostered by marketing scum. The I in AI does not stand for "INTELLIGENCE". It stands for "INCOMPETENCE". Are you really submitting the argument that your cat (or wife for that matter) is more incompetent than a sophisticated modern technology that has been lovingly tuned by mankind's best minds to produce random, unreliable, and possibly wildly incorrect answers to just about any conceivable question? Really now. How likely is that?
The author's use of the word "chatbot" in the original article probably wasn't the best term to get the point across, as a lot of the work that AI tools running locally can do isn't chatbot work at all. The author did give examples of why some people might want to run them on their own PC rather than just using ChatGPT or other online services, such as when privacy or legal concerns mean you don't want to, or can't, let the information leave your system.
"Chatbot" doesn't mean something you discuss your day with, it refers to the fact that you can have a back and forth conversation with it about something - in this example some document you want it to review.
The example being: if someone gives you a contract to sign, you might have a "chat" with your lawyer about its contents. (Although if you're using a chatbot for legal questions, ensure the initial prompt tells it to remind you "this is not legal advice, I do not have liability insurance" with every answer!)
My system has 32GB of RAM and an RTX 4070 with 8GB of VRAM. It can run all of the freely-licensed chatbots from GPT4All - but maybe that's not saying so much. There is a definite difference in quality of responses between those models and the ones you get online from Google or OpenAI. It can run Stable Diffusion image-generation models ... just ... if there's not too much else running and you use the version that's been optimised to need less memory.
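(For anyone curious, the "optimised to need less memory" route I mean is roughly the half-precision-plus-attention-slicing path in the Hugging Face diffusers library. A rough sketch only, assuming that library is installed and using an example checkpoint name - not a recipe:)

# Rough sketch: Stable Diffusion on a card with ~8GB of VRAM via the diffusers library.
# Assumes `pip install diffusers transformers accelerate` and a CUDA build of torch.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # example checkpoint; any SD 1.x model should do
    torch_dtype=torch.float16,         # half precision roughly halves VRAM use
)
pipe = pipe.to("cuda")
pipe.enable_attention_slicing()        # trades a little speed for much lower peak memory

image = pipe("a beige tower PC painted as a renaissance portrait",
             num_inference_steps=25).images[0]
image.save("out.png")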
As to why you'd want it, they are genuinely useful. I'm using them for language learning at the moment. They're perfectly capable of setting you exercises, correcting them, explaining what you got wrong, chatting informally, assessing formal writing, introducing new grammar and so on and so on, all stuff you'd ordinarily pay a language teacher real money to do for you. Does it get the odd thing wrong? Probably. But it gives you about 98% of what a paid language tutor gives you, is free and is available 24 hours a day, for a few minutes or a few hours whenever you want it.
Now there is an idea for something that might be interesting[1] to actually try out[2]: could you feed a pre-trained LLM with all the "Arduino for Beginners" articles, plug in a few I2C sensors, then ask it to read 'em and display the values on an attached OLED? Doing all of that on one's home PC, to keep with the theme of the article (a rough sketch follows the footnotes).
Hmm, best to keep a fire extinguisher on hand; it shouldn't be possible to make a servo running off 5V explode, but...
[1] the assumption being made that there are one or two tech nerds around here
[2] probably already been done, but so has "blink an LED" and we keep repeating that one as well
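A minimal sketch of what I mean, assuming the gpt4all Python bindings; the model name and the article file are placeholders, and it's crude context-stuffing rather than anything clever:

# Minimal sketch: ask a locally-run model to draft the Arduino code.
# Assumes `pip install gpt4all`; model name below is just an example from the catalogue.
from gpt4all import GPT4All

articles = open("arduino_for_beginners.txt", encoding="utf-8").read()  # your pile of articles

model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")  # runs entirely on the home PC

prompt = (
    "Using the notes below, write an Arduino sketch that reads a BME280 over I2C "
    "and shows temperature and humidity on a 128x64 SSD1306 OLED.\n\n"
    f"NOTES:\n{articles[:4000]}\n"   # crude context stuffing; no fancy RAG
)

with model.chat_session():
    print(model.generate(prompt, max_tokens=1024))

# Whether the result compiles first time - or sets the servo on fire - is left
# as an exercise for footnote [1].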
> Does this actually make a difference to how the model behaves?
Apparently.
There was a (brief?) item about this here on The Register not that long ago (sorry, failed to find it - can't recall exact enough wording) which reported claims that you get better results by telling the model that it really likes to do something otherwise tedious, or that it is a starship captain and knows all about calculating orbits.
(I really should have made a note of that article; hopefully someone can provide the URL)
There's a whole pile of literature on the effects of offering rewards in prompts. Promising tips (in the monetary sense) is one technique that worked well with the popular models, for example — though now that SoTA public models are starting to keep more session context, people have reported models rejecting tip offers on the grounds that they didn't get the promised tip the last time.
(This is amusing but not surprising. The training corpora no doubt contain many references to being paid or not being paid tips, so there are plenty of gradients that a tipping prompt could direct the model toward.)
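(If anyone wants to poke at this themselves, the experiments really are as crude as they sound - something like the following A/B comparison, sketched here with the gpt4all bindings so it stays on-box; the model name is just an example from their catalogue:)

# Rough A/B sketch of the "offer a tip" prompt trick against a local model.
# Assumes `pip install gpt4all`; compare the two outputs by eye.
from gpt4all import GPT4All

model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")

task = "Summarise the key obligations in the following contract clause: ..."

prompts = {
    "plain":  task,
    "bribed": task + "\n\nIf you do a thorough job, I'll tip you $200.",
}

for label, prompt in prompts.items():
    with model.chat_session():           # fresh session so the runs don't contaminate each other
        print(f"--- {label} ---")
        print(model.generate(prompt, max_tokens=400))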
You don't even need a GPU if you don't mind a short wait (e.g. for instruction-following LLMs rather than real-time chatbots, or if you're happy to queue inferencing/image generating jobs to run overnight).
GPT4All provides an installer that lets you run models on the CPU. The only prerequisite is a CPU with the AVX/AVX2 instruction set. I've tried it in a VM on a Proxmox node packing a mighty i5-4570 from 2013, and a staggering 12GB of RAM. No GPU.
A 7B model gets about 2.5 tokens/second. The quality is not terrible, albeit not great either (being a 7B general purpose model). So you're watching the words come in one at a time - but a specialist (e.g. code-completion) model would be quicker, and if you were saying "write me a letter" then it's fine chuntering away in the background whilst you do something else (yes, I know. Why would you want that?).
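For anyone who wants to reproduce that number, it boils down to a few lines with the gpt4all Python bindings - a sketch only, with the 7B model name just an example from the catalogue:

# Quick-and-dirty tokens/second check, CPU only, using the gpt4all bindings.
# Assumes `pip install gpt4all`.
import time
from gpt4all import GPT4All

model = GPT4All("mistral-7b-instruct-v0.1.Q4_0.gguf", device="cpu")  # force CPU-only

start, count = time.time(), 0
for chunk in model.generate("Write me a short letter of complaint about beige PCs.",
                            max_tokens=200, streaming=True):
    print(chunk, end="", flush=True)   # words arrive one at a time, as described
    count += 1                         # each chunk is roughly one token

print(f"\n~{count / (time.time() - start):.1f} tokens/second")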
A vaguely modern CPU like an i5-12400 or a Ryzen 5 5600 would flatten that. And GPT4All are looking at leveraging integrated graphics where available as well. The model size is really just limited by RAM. The requirement for 16GB GPUs and all the rest is only really there if you're serving realtime chat to lots of users.
Now, all of this leaves you asking why you'd want to do that. I was just happy to have the novelty of "GPT running in a small VM locally. Wow, that's a thing computer scientists could only dream of even a short time ago" and then turn it off again.
But pretty much anything on a >2013 CPU is technically an "AI" PC...
...Windows 10 or 11 are going to "mysteriously" forget or reset your preference not to have every last bit of your data sent to MS's servers and used - amongst other things - to train their AI, and will slurp it all regardless.
So you may as well just use their service anyway.
Joking aside - because I'm not really joking - it's now known that LLMs can be conned into regurgitating the data they were trained on, and that the industry's ironically-named "guardrails" are easily circumvented. So it wouldn't surprise me if, in future, lots of confidential data that its owners took for granted had never even left their PC turns out to be extractable from the likes of ChatGPT.
Which you could see being a major problem for *someone* if that data happens to belong to (e.g.) the aforementioned law firm.
OpenAI already have the logs containing all that lovely information - don't worry about it being used to train the next GPT and *maybe* being extracted by One Weird Trick LLMs Hate; worry about a simple break-in and download, or just an employee with an evil streak.
Of course, finding the good stuff in amongst the dross is a massive task, so no need to worry; it isn't as if anyone has created a program that can trawl through lots of text and answer questions about it.
Both inference *and* training are just fancy 'tensor'-like product operations performed ad nauseam. Just about anything *could* execute them, and nearly everything has some form of SIMD instruction set to accelerate them beyond primitive `for`-loops anyway, and has had for decades. Libraries with highly-optimised implementations of the maths exist and have been open source for ages now. Particularly in the case of inference (given the weights of a trained model), proprietary and novel advances alike grant only a marginal speed-up.
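To make that concrete: the workhorse is a matrix-vector product, and the "acceleration" is nothing more exotic than handing it to a SIMD/BLAS-backed library instead of naive loops. A toy illustration in Python:

# Toy illustration: the core op behind inference is a matrix-vector product.
# numpy hands it to a decades-old SIMD/BLAS routine; the loops do the same maths, slowly.
import numpy as np

N = 512
W = np.random.rand(N, N).astype(np.float32)   # one "layer" of weights
x = np.random.rand(N).astype(np.float32)      # an activation vector

# The hand-rolled version - any Turing-complete machine can do this, eventually.
y_slow = np.zeros(N, dtype=np.float32)
for i in range(N):
    for j in range(N):
        y_slow[i] += W[i, j] * x[j]

# The optimised library version.
y_fast = W @ x

assert np.allclose(y_slow, y_fast, rtol=1e-3)  # same answer, wildly different speed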
The question is only how *fast* some given hardware can run inference and, there, I propose that the answer is entirely meaningless because the vendors behind this fad will *never* allow LLMs to truly be used in anger, offline. (And literally anything can send an HTTP request to "the cloud" to query models.)
Being the man-in-the-middle to serve the 'AI' responses – capturing usage and prompts and all the telemetry and metadata – is the very product they're building. Why would they ever let that run offline? Forgoing gatekeeper status would be arse-about-face, for them, because the perceived value in interaction data is their only business case.
Any hardware vendor punting "AI compatible" hardware is pulling a fast one, not just because just about any Turing-complete machine with a product op *could* execute the algorithms (perhaps slowly), but because users will never truly get to use the capabilities.
Sure, the open-source algorithms and open-source model weights are cute to run offline, but they are only a curiosity. Although the hardware vendors are punting chips capable of slightly faster execution of these algorithms with those models, nobody with the funding or power to drive this fad forwards intends for those models to find mainstream use – I predict that they will disappear from public consciousness exceedingly quickly.
Think about how many algorithms *could* run locally, on-device, on any modern smartphone but are, instead, served from some cloud data-centre, somewhere, where some corporation receives all the data. Think about how many services *could* operate perfectly well over everyone's LAN, without ever crossing the firewall or being routed outside the subnet. The only reason hardware vendors don't seize upon these 'capabilities' to promote their stuff is that none of them are in the headlines.
"That means a GPU with at least 8GB of VRAM (and not one from Intel – at least at the moment, for lack of driver support). That's not too pricey – which is good, because it's table stakes. For RAM, you'd be happier with 32GB than 16GB, while 8GB simply won't cut it. And that's about it. That's all you're going to need."
I can't imagine what I'd have bought something like that for, and I certainly wouldn't be buying one to run a chatbot on. Back in pre-bloat days I've seen entire SMBs run on kit with far less than that.
That kind of spec is not unreasonable for a desktop, and if you don't have it already then it's easy enough to add a graphics card and some DIMMs.
If my IT department is anything to go by, though, no one is issued a desktop any more. And the laptops that are commonly allocated to office drones don't have proper GPUs let alone a decent amount of RAM (or the ability to add it ex post facto).
-A.
> That kind of spec is not unreasonable for a desktop
It's unreasonable if you don't need it to do real work. I certainly don't.
Of course, I haven't had a desktop in decades either. (And despite having been using PCs for more than half a century, I've only ever bought two for myself, and they've both been laptops. The best price to pay for a computer is nothing, IMO.)
When all of the new "AI PCs" are coming with NPU hardware? That's what they are using as the dividing line, and they won't be marketing existing PCs that lack it as "AI".
Because the whole point of the AI PC hype is to get people to buy NEW PCs, not for them to figure out "hey maybe I can use my 5 year old PC that has a pretty good GPU in it for this already".
When you pay a subscription for AI, a fair chunk of that goes on the purchase of expensive servers and the electricity to power the vast amount of computation to train and run the model.
The advantage of an AI PC is that you purchase the expensive hardware and pay for the electricity, and your generosity enables multi-billion-dollar companies to turn your subscription fees into even more profit.
What's not to like?
For them.
Thanks for that. I rock an i7 6700 non-k myself, although I do hope to go Zen-5 later this year. 4 cores -> 16 sounds like fun.
I have had some suspicion this was the case, nice to have it confirmed by someone who has a better handle on this than I do.
That said, I am less than wildly enthusiastic about AI, despite the efforts of the press and the PR people and the social media and the internet and the Influencers, and everybody else. I know I need to hurry up and get purple & throbbing about it but I just can't muster a whole herd of enthusiasm.
For one thing, I've been a Linux-head since 1996, and one thing I know about Artificial Stupidity ~ you can't read the source-code and get an idea how it's written or how it works. Yes, I know they call it "OpenAI" but that doesn't mean you can read the code and know what it's doing, or how, or why ~ It is the exact specific opposite of Open Sauce.
... is one some idiot paid extra money for, as far as I can tell, in order to indulge in intellectual laziness.
And on a related note, Intel's alleged position that application developers will soon infuse all software with AI is patent rubbish. First, of course, a great deal of software isn't applications. And a great deal of it has no use for any of the technologies (mostly LLMs and diffusion models) currently being lumped under "AI".
And for most of the rest of it, it's dubious that "AI" brings anything of value. There's some evidence that GAI is useful in, for example, exploring the space of possible chemical compounds for various applications. Other than that, I haven't seen a methodologically-sound study that convinces me of actual utility. And that definitely includes the "ask questions about a contract" example from the article.
Intel under Gelsinger is desperately searching for a reason why anyone should consider it an innovator rather than just a manufacturer of commodity processors. No doubt they'll ride the AI hype until it crashes, but for mass-market PCs, this emperor has no clothes.
> Intel's alleged position that application developers will soon infuse all software with AI is patent rubbish.
I would say the same myself, had you not patented it.
The last time anyone* tried to foist it into general productivity software, the world got Clippy.
I'm employed to write vertical apps. The problems are mostly** well defined and the solutions entirely algorithmic. Some kind of chatty interface might amuse the more puerile elements of management for, I dunno, maybe several minutes, but would otherwise be a total waste of time.
-A.
* Anyone of any consequence, alas.
** We do find some specific use for NLP, simply because our suppliers are so inexact in the data they deliver.