From 8 to 17 billion
I wonder if they switched from 32-bit precision to 16-bit precision?
Because I doubt that they doubled the amount of RAM.
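For the back-of-envelope arithmetic (a rough sketch, nothing from Cisco): weight memory is roughly parameter count times bytes per parameter, so halving the precision would roughly offset doubling the parameters.

```python
# Rough weight-memory arithmetic (back-of-envelope only; ignores
# activations, KV cache, and optimizer state).
GIB = 1024 ** 3

def weight_gib(params: float, bytes_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return params * bytes_per_param / GIB

# 8B parameters at FP32 (4 bytes/param) vs BF16 (2 bytes/param),
# and a hypothetical 17B model at BF16.
print(f"8B  @ FP32: {weight_gib(8e9, 4):.1f} GiB")   # ~29.8 GiB
print(f"8B  @ BF16: {weight_gib(8e9, 2):.1f} GiB")   # ~14.9 GiB
print(f"17B @ BF16: {weight_gib(17e9, 2):.1f} GiB")  # ~31.7 GiB
```

So an FP32 8B model and a BF16 17B model would indeed land in roughly the same memory footprint, which is the premise of the joke.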
Cisco is working on a new AI model that will more than double the number of parameters used to train its current flagship Foundation-Sec-8B. The networking giant's current hero model is Foundation-Sec-8B, an eight-billion-parameter effort that it uses in several products, and suggests for use on tasks such as automated …
https://huggingface.co/fdtn-ai says the 8B model and its underlying Llama 3.1 base were already trained at BF16. I doubt they've gone lower for the new model.
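If you want to check the dtype claim yourself, here's a minimal sketch using the transformers library, assuming the model lives at fdtn-ai/Foundation-Sec-8B and that its config.json records a torch_dtype (some model configs omit the field):

```python
# Minimal dtype check against the Hugging Face config; assumes the
# fdtn-ai/Foundation-Sec-8B repo is public and its config records
# the training dtype.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("fdtn-ai/Foundation-Sec-8B")
print(config.torch_dtype)  # expected: torch.bfloat16, per the model card
```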
I'd guess the new one is a Llama 4 variant, which is also BF16 but has a mixture-of-experts architecture. If Cisco only activated one expert, which is perhaps/probably sensible for a specialised task(?), it'd be 17B parameters active. Or just possibly it has multiple experts and they're underselling it a bit.
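To make the MoE guess concrete (purely illustrative numbers, not Cisco's actual config): in a mixture-of-experts model only the routed experts' weights are active per token, so the "active" parameter count can sit far below the total. A sketch loosely modelled on a Llama-4-Scout-style layout (17B active), with made-up component sizes:

```python
# Illustrative MoE parameter arithmetic; all component sizes below are
# hypothetical, chosen only so the totals come out in the right ballpark.
def moe_params(shared: float, per_expert: float,
               n_experts: int, active_experts: int) -> tuple[float, float]:
    """Return (total, active-per-token) parameter counts."""
    total = shared + per_expert * n_experts
    active = shared + per_expert * active_experts
    return total, active

# e.g. ~11B shared (attention, embeddings, shared layers) plus
# 16 routed experts of ~6B each, with 1 expert routed per token:
total, active = moe_params(shared=11e9, per_expert=6e9,
                           n_experts=16, active_experts=1)
print(f"total:  {total / 1e9:.0f}B")   # ~107B
print(f"active: {active / 1e9:.0f}B")  # ~17B
```

Which is how a model can be marketed as "17B" while the weights on disk are several times larger.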
Wow, kind of impressed with that; it may actually get me to take a deeper look at it.
Again, I have to state that I do not believe LLMs are ready for the big time wrt, well, doing almost anything. But a specifically trained model such as this, one that can give advice on potential solutions, is a better assistant than something like ChatGPT (IMHO).
I was thinking from the headline that just implementing an existing security model correctly would be a giant leap forward.... but then reading the subhead I realized that this "model" actually referred to more raw AI sewerage† to further contaminate the pool.
† alias liquid shit.
Raj Chopra quoted thusly: "Cisco develops LLMs because it believes organizations need a mix of generic security data and info describing their own affairs to effectively use AI in their defenses."
Of course, this LLM/AI model could NEVER be used for ATTACK!!!!! No doubt the good folk at Fort Meade are lapping this up!!!