All tease and no strip
Hmmm ... I'd say the press releases on Baidu's Kunlun M100+ inference chips are rather thin-&-light on details (internal architecture, networking, TDP, performance, lithography, chip photo, die shot, supporting software stack ...). And if the 2-trillion-parameter (2T) ERNIE 5.0 is out now, it must have been trained on some other hardware, maybe Ascend 910s (as found in CloudMatrix 384), much as Huawei's 1T-parameter PanGu-Σ was trained ... or some pre-sanctions Nvidia kit? More questions than answers here ...
As for questions, they also "announced" (TFA link) Famou: "the World's First Commercially Available Self-evolving Agent", one "able to quickly abstract complex problems and iterate automatically as conditions change" ... what!!?? Given the boldness of those claims, details and examples on their part would be welcome.
Speaking of trillion-parameter models, though: Argonne's Ian Foster (and team) seems to think an "AI-native Scientific Discovery Platform (SDP) that connects models to tools, data, HPC, and robotics", built on "science-tuned foundation models" of that size, could be worthwhile. I have my doubts. But he'll present the concept next Friday at SC25 (and there's an arXiv paper on this from last year) ... might be worth a gander (or not?).