Re: Its always the same
Not really. There are already translation layers to enable CUDA code to work on AMD GPUs...
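Those translation layers work because most of the CUDA runtime API has a one-to-one HIP equivalent, so porting is largely a renaming exercise. As a toy illustration (this is a naive string rewrite, not how the real hipify tools work, and the mapping table below covers only a handful of calls):

```python
# Toy sketch of source-level CUDA-to-HIP translation: most CUDA runtime
# calls map one-to-one onto HIP calls, so translation is mostly renaming.
# This table is a tiny illustrative subset, not the real tool's mapping.
CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cuda_runtime.h": "hip/hip_runtime.h",
}

def hipify(source: str) -> str:
    """Naively rewrite CUDA API names to their HIP equivalents."""
    for cuda_name, hip_name in CUDA_TO_HIP.items():
        source = source.replace(cuda_name, hip_name)
    return source

cuda_snippet = "#include <cuda_runtime.h>\nfloat *d; cudaMalloc(&d, 1024); cudaFree(d);"
print(hipify(cuda_snippet))
```

The real tools (hipify-perl, hipify-clang) do this with a proper parser and a far larger mapping, but the principle is the same: the APIs are close enough that a mechanical rewrite gets you most of the way.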
The advantage NVIDIA has over AMD is that they saturate universities with their kit and insist their software is used with it if the universities want support...which unleashes loads of brainwashed people into the real world who only know CUDA.
Once AI moves from being largely academic to being largely commercial, there will be a large shift. It's already happening to a certain degree.
There are AI libraries out there where AMD absolutely mops the floor with NVIDIA when compared directly...the problem with CUDA is that a direct comparison cannot be made. If you compare NVIDIA on CUDA to AMD on something else, the performance gap isn't that massive. It is significant, but not significant enough to justify the cost.
Pretty soon, raw performance won't really matter all that much: beyond a certain point, faster results stop being useful and accuracy becomes the thing that matters.
At the moment, if you want accuracy, you need to run larger models, which is easier on AMD hardware because it ships with more VRAM than comparable NVIDIA hardware...NVIDIA kit might produce faster results, but AMD kit right now can produce more accurate results.
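The VRAM ceiling is easy to see with back-of-envelope arithmetic: just the model weights need roughly parameter count times bytes per weight, set by the quantization level. A minimal sketch (the ~10% overhead fudge factor is my assumption, and this ignores KV cache and activations, which need more on top):

```python
# Rough VRAM estimate for model weights alone:
# bytes ≈ parameter count × (bits per weight / 8).
# The 10% overhead factor is an assumed fudge, not a measured figure;
# KV cache and activation memory are ignored entirely.
def weights_vram_gb(params_billions: float, bits_per_weight: int,
                    overhead: float = 0.10) -> float:
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total * (1 + overhead) / 1e9  # decimal GB

for params, bits in [(8, 16), (8, 4), (70, 16), (70, 4)]:
    print(f"{params}B @ {bits}-bit ≈ {weights_vram_gb(params, bits):.1f} GB")
```

By this estimate, a 4-bit 70B model's weights land just under 40 GB: comfortable on a 48GB card, impossible on a 24GB one regardless of how fast the card is.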
I'm not a fanboy of either side, because I think both could do a lot better than they are and both have their inherent problems...but I think the AI race will eventually be won by AMD. Their kit has always had more raw compute available, and when solutions start moving to non-proprietary backends we'll see that extra compute finally making contact with the tarmac. At that point, there will be a whole back catalogue of used AMD kit going for dirt cheap.
Right now I have an AMD card in my dedicated LLM box, purely because it allows me to run larger models; the NVIDIA card is in my desktop.
Is it slower? Yes. Is it slower to the point that it makes a difference to my workflow? No, not at all...with either card, it produces results faster than I can read them...so it came down to whether I could run larger models. AMD cards can do that; NVIDIA cards, currently, cannot.
The next generation of NVIDIA cards will likely have loads more VRAM to try to combat this, but historically AMD has always leapfrogged NVIDIA on VRAM capacity, so I don't expect NVIDIA to push their VRAM above AMD's...if the flagship NVIDIA card comes with 32GB of VRAM, the AMD flagship will have 48GB...purely because NVIDIA can't help itself when it comes to cutting corners and min/maxing profit.
Right now, if the technical complexity isn't a factor for you, the choice you have is:
1) Run slightly older, smaller models as fast as possible...NVIDIA.
2) Run the latest, larger models...AMD.