This is fantastic, more like this please. It's very different from the paywalled and AI-generated content in many other places...
Everything you need to know to start fine-tuning LLMs in the privacy of your home
Large language models (LLMs) are remarkably effective at generating text and regurgitating information, but they're ultimately limited by the corpus of data they were trained on. If, for example, you ask a generic pre-trained model about a process or procedure specific to your business, at best it'll refuse, and at worst it'll …
COMMENTS
-
Sunday 10th November 2024 21:09 GMT Michael Hoffmann
I have an embarrassingly stupid question:
How do I easily get a list/overview of all ElReg Hands-On articles? I can choose all the various categories they fall under, I can search, but it'd be nice if I could click on the nice red HANDS ON that precedes every one of them and get the lot. I've looked at various menus, but it's not a separate entity.
Am I just that dense on a Monday morning? Icon because it's a question. That involves obtaining information. About technology.
-
Sunday 10th November 2024 21:10 GMT Anonymous Coward
Nice, and a couple of questions
Great writeup! The only additional thing I could think of, to help further, is a before-and-after comparison giving an idea of the reward to be expected for the effort (5 long pages per GPU worth of intense pain, struggle, and suffering ... for those with appropriate GPUs).
I wonder also if NPUs might be used, one day, to perform such fine-tuning or if it will remain the domain of discrete GPUs (eg. because of the "heavy" computational burden).
And, as an overall question, if one was to endeavor to adapt a pre-trained, plus-sized, richly sonorous, built-for-comfort model of language to a particular purpose, would the sequence of steps be to first test the model with engineered prompts, then fine-tune and re-test, and then bolt on some RAG and re-test, or is any sequence okay (equally advisable)?
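Concretely, the loop I have in mind is sketched below (Python, leaning on the ollama client purely as a stand-in; the model names, questions, and scoring are all made up for illustration, not pulled from the article):

    # Sketch of the staged workflow: baseline -> fine-tuned -> fine-tuned + RAG.
    # Assumes the ollama Python client (pip install ollama) and locally pulled models;
    # model names and the scoring function are placeholders, not recommendations.
    import ollama

    EVAL_SET = [
        ("What is our returns procedure for damaged goods?", "30-day"),
        ("Which team owns the nightly billing batch job?", "finance-ops"),
    ]

    def ask(model, question, context=""):
        """One-shot query; 'context' mimics the RAG step by prepending retrieved text."""
        prompt = f"{context}\n\n{question}" if context else question
        reply = ollama.chat(model=model, messages=[{"role": "user", "content": prompt}])
        return reply["message"]["content"]

    def score(model, retrieve=None):
        """Crude substring scoring -- just enough to compare the stages."""
        hits = 0
        for question, expected in EVAL_SET:
            context = retrieve(question) if retrieve else ""
            hits += expected.lower() in ask(model, question, context).lower()
        return hits / len(EVAL_SET)

    print("base:     ", score("llama3.1"))        # 1. engineered prompts only
    print("tuned:    ", score("llama3.1-tuned"))  # 2. after fine-tuning (hypothetical name)
    print("tuned+RAG:", score("llama3.1-tuned",   # 3. retrieval bolted on
                              retrieve=lambda q: "<retrieved docs for: " + q + ">"))

The point being: the same evaluation set at every stage, so each step's contribution is visible on its own.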
Still, as stated above by TheMajectic, a great, very detailed and useful writeup (IMHO)!
-
Sunday 10th November 2024 23:10 GMT Sorry that handle is already taken.
Re: Out of curiosity ...
I still want to know what Mark Pesce allegedly discovered...
https://www.theregister.com/2024/05/23/ai_untested_unstable/
https://www.theregister.com/2024/07/10/vendors_response_to_my_llmcrasher/
-
Sunday 10th November 2024 23:30 GMT that one in the corner
Re: Out of curiosity ...
> What does spending all that time in front of the computer actually get me?
You? Nothing much. You're settled and well situated with plenty of useful work to keep you occupied.
Someone half your age or younger, a bit fleeter on their feet? A few quick gigs flogging "AI to Suit Your Business" to any local companies with their own websites that can be convinced they require a chatbot to Join The Revolution. Being fleet of foot is needed so young laddo can skip town and find some more mugs elsewhere before the annual returns are in and the "clients" find out exactly how much those LLMs have helped the business.
An unrepentant Moist von Lipwig would have had a field day with tuned LLMs; it is barely even dishonest, not like waiting for the Cabbage Bank share certificates to become dry to the touch.
-
Monday 11th November 2024 21:58 GMT Kevin McMurtrie
How important is the GPU?
How does the performance compare to not using a GPU? The higher-precision LLMs seem to want absolutely unaffordable amounts of GPU RAM, GPU drivers are usually a nightmare, and power consumption is non-trivial. I've been running ollama on an ordinary AMD CPU with 128GB of DDR5 RAM and a fast M.2 card for swap. It's not fast enough to be a server, but it's not bad either. Even llama3.2-vision:90b can spew out defensive nonsense reasonably fast.
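If anyone wants to sanity-check their own CPU-only numbers, a rough throughput measurement takes a few lines (assuming the ollama Python client; the model and prompt are just what I happened to have around):

    # Rough CPU throughput check: stream a reply and count chunks per second.
    # Each streamed chunk is roughly one token -- good enough for a ballpark figure.
    import time
    import ollama

    start = time.perf_counter()
    chunks = 0
    for chunk in ollama.chat(
        model="llama3.2-vision:90b",
        messages=[{"role": "user", "content": "Summarise Hamlet in 200 words."}],
        stream=True,
    ):
        chunks += 1
    elapsed = time.perf_counter() - start
    print(f"{chunks} chunks in {elapsed:.1f}s (~{chunks / elapsed:.1f}/s)")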
-
Monday 18th November 2024 22:22 GMT Anonymous Coward
Re: How important is the GPU?
The lowest-spec machine I tried running an LLM on in CPU mode was a 10880H, and you are glossing over the five-minute wait for it to start outputting the first half of the first word... Even on a 14900K it was, in my opinion, unusable.
My boss used to have a VR gaming rig in the office with a 4090 in it. I say used to, because now one of our clients has an experimental private LLM server. This can spit out 500 words in about 3 seconds. (If they OK a production server, it's getting a lower-spec GPU.)
I don't know what the requirements are for useful amounts of training, but I would imagine it would see a massive speed improvement from GPU acceleration.
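For a crude sense of the gap, timing the same matrix multiply on CPU and GPU is revealing; the sizes below are arbitrary, PyTorch is my assumption, and this gauges raw compute only, not an actual fine-tuning run:

    # Toy CPU-vs-GPU comparison on the kind of matmul that dominates training.
    import time
    import torch

    def time_matmul(device, n=4096, reps=10):
        a = torch.randn(n, n, device=device)
        b = torch.randn(n, n, device=device)
        if device == "cuda":
            torch.cuda.synchronize()  # make sure setup work has finished
        start = time.perf_counter()
        for _ in range(reps):
            _ = a @ b
        if device == "cuda":
            torch.cuda.synchronize()  # wait for queued GPU work before stopping the clock
        return time.perf_counter() - start

    print(f"cpu:  {time_matmul('cpu'):.2f}s")
    if torch.cuda.is_available():
        print(f"cuda: {time_matmul('cuda'):.2f}s")
    else:
        print("no CUDA device visible")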
-
Saturday 16th November 2024 20:10 GMT Camilla Smythe
It can't be that stupid, you must be prompting it wrong
For everything else, we recommend using your previous meatsacks' output so it will be better at regurgitating an answer that is more likely to match the prompt. Then you can sack the meatsacks. Good luck if you intend to start supporting a new product.