Hugging Face puts the squeeze on Nvidia's software ambitions
Hugging Face this week announced HUGS, its answer to Nvidia's Inference Microservices (NIMs), which the AI repo claims will let customers deploy and run LLMs and other models on a much wider variety of hardware. Like Nvidia's previously announced NIMs, Hugging Face Generative AI Services (HUGS) are essentially just containerized …
COMMENTS
-
Speaking as something of a veteran of the computer industry - I've been in IT for nearly 40 years - I've seen some pretty impressive jargon over the years. It's reasonably comprehensible to IT people, but to anyone outside the game, this article has achieved Peak Jargon!
-
Friday 25th October 2024 14:07 GMT Anonymous Coward
It's perfectly comprehensible if you're working in this space, and it's a pretty big announcement. NVIDIA NIMs are becoming very popular with corporates who want to provide GenAI to their own teams for <reasons> but rightly worry about costs spinning out of control and data security. Running your own models can be a pain, and there are so many dependencies that managing updates becomes a pain in the arse, hence paying someone else to handle that donkey work for you.
One of the big advantages here is that you're not as tied into the NVIDIA ecosystem, which potentially opens things up to other GPU providers in the future and makes models easier to deploy on non-NVIDIA hardware.
-
This post has been deleted by its author
-
Saturday 26th October 2024 08:39 GMT abend0c4
To be fair, this isn't really a new phenomenon. It's become almost inevitable that when you visit the website of any currently fashionable project you get some marketing blurb that doesn't actually explain what the software does or why you might need it - and links that promise documentation and tutorials but in reality deliver mostly impenetrable self-referential jargon and a sample configuration file for a now-deprecated version. I suppose I'm just nostalgic for the old days when there were technical writers and they even got paid. But if you're expecting people to use this stuff, it ultimately has to be rather more accessible.
-
Friday 25th October 2024 18:07 GMT Mike007
I use the ollama docker container. Just run it and you have the relevant APIs on an HTTP port (a minimal example of calling that API is sketched below).
The issue I have is that this is the easiest part of the whole process. What is going to use this AI? An in-house application written by someone who declared running a readily available preconfigured docker container to be too difficult and had to sign up to a subscription service for this?
If a vendor is providing you with software that has functionality to plug in to an AI model, they should probably have an "install and enable AI features" button on the settings page... You know, on the screen they should already have for configuring the functionality.
Am I missing something, or are their customers just not able to use Google?
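As a minimal sketch of the "run the container, hit the HTTP port" workflow described above, assuming Ollama's default port (11434) and its /api/generate endpoint - the model name and prompt here are placeholders, so check the project's docs before relying on the exact payload shape:

    # Minimal sketch: call a locally running Ollama container over HTTP.
    # Assumes the container is already up (e.g. via docker) and listening on
    # the default port 11434, and that the named model has been pulled.
    import requests

    OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama port

    payload = {
        "model": "llama3",   # placeholder model name; use whatever you've pulled
        "prompt": "Summarise what a NIM is in one sentence.",
        "stream": False,     # ask for one JSON response instead of a stream
    }

    resp = requests.post(OLLAMA_URL, json=payload, timeout=120)
    resp.raise_for_status()
    print(resp.json()["response"])  # generated text is in the "response" field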
-
Saturday 26th October 2024 17:43 GMT tekHedd
...and it's always LLMs...
I particularly love how the solution to everything is "throw an LLM at it". It's nice to know there is more than one option in the space of "quickly get 85% of the way to a solution by throwing an LLM at it, giving investors the idea that the remaining 15% will be easy."
-
Sunday 27th October 2024 16:16 GMT pip25
Those prices don't seem too competitive
I can rent an H100 SXM on-demand for $2.99 per hour on RunPod, or $1.75 if I allow my workload to be interrupted. (This is probably not the cheapest provider either, just one that comes with easy-to-set-up Docker templates.) That $2.50 to $6.74 offer from DigitalOcean is rather uninspiring in comparison.
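To put rough numbers on that comparison, a back-of-the-envelope sketch - the 730 hours per month and always-on usage are assumptions, not figures from either provider:

    # Rough monthly cost for one always-on GPU at the quoted hourly rates.
    # 730 hours/month (24 * 365 / 12) and constant utilisation are assumptions.
    HOURS_PER_MONTH = 730

    rates = {
        "RunPod H100 SXM (on-demand)": 2.99,
        "RunPod H100 SXM (interruptible)": 1.75,
        "DigitalOcean (quoted low end)": 2.50,
        "DigitalOcean (quoted high end)": 6.74,
    }

    for name, per_hour in rates.items():
        print(f"{name}: ${per_hour * HOURS_PER_MONTH:,.2f}/month")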