China stops worrying about lack of GPUs and learns to love the supercomputer

Leading Chinese computer scientists have suggested the nation can build large language models without imported GPUs – by using supercomputers instead. This was the belief expressed at the 2024 China Computing Power Development Expert Seminar – a conference co-organized by a Chinese industry alliance and national …

  1. Anonymous Coward

    China can do what they want.

    ...but at the end of the day, receiving a forklift pallet of cash from the Government is not sustainable in the long term. People will turn up, take the money and leave when the money stops.

    Realistically, Microsoft/Google/Apple/etc combined are investing far more money anyway... and they intend to productise the results.

    1. Peter2

      Re: China can do what they want.

      On the other hand, "the west" has basically no use for language models like ChatGPT beyond being an amusing little toy. Perhaps companies hope they might achieve efficiencies in the long term (i.e. replace all of the minimum wage people in customer support etc), but there is no immediate use for it, and the long-term use doesn't appear particularly socially useful.

      The authoritarian axis, meanwhile, appears to want to use language models to create bots that manipulate western public discourse and try (with some degree of success) to influence who gets elected in our democracies, a goal their governments are unlikely to abandon any time soon.

      1. Yet Another Anonymous coward Silver badge

        Re: China can do what they want.

        >The authoritarian axis meanwhile appears to wish to use language models to create bots to manipulate western public discourse

        But we're banning TikTok, and US news operations owned by US billionaires aren't going to distribute AI-generated political videos to undermine democracy.

        1. veti Silver badge

          Re: China can do what they want.

          Not sure, but I'm interpreting that as irony. So upvoted.

  2. pavlecom
    IT Angle

    .. overcome in a short term

    A very good solution indeed. Domestic GPUs and solutions will come in the medium term. Every obstacle is there to be overcome.

    "Chinese have been making huge leaps forward. They built their own spacestation, landed on the dark side of the moon, in one fell swoop on mars, and started fabbing 7nm chips. If I was the US I'd be fucking terrified of what happens in the next 5 - 10 years"

    Peshman

    1. Casca Silver badge

      Re: .. overcome in a short term

      And the usual china commentard has shown up.

      1. Jason Bloomberg Silver badge
        Pint

        Re: .. overcome in a short term

        Stick your head in the sand and demonise the messenger all you want but it won't change reality.

        The US forgot the "keep your enemies closer" part. They forced China to go it alone and are reaping the consequences of that.

        As I have said before; sanctions on China were the best thing which could have happened to them. They forced China to stand on their own two feet, encouraged them towards their own sovereign destiny. Make China Great Again started under Trump, Biden continued it, and there's no way back from that.

        1. Casca Silver badge

          Re: .. overcome in a short term

          I wonder when China is going to lose their developing country status and have to pay full price for everything they ship. Should have been done a long time ago.

          And my comment is about how this person ONLY comments on articles about China, and always about how great China is. Almost like it's his job to do it.

    2. Yet Another Anonymous coward Silver badge

      Re: .. overcome in a short term

      > If I was the US I'd be fucking terrified of what happens in the next 5 - 10 years

      China becomes fully developed, all its best and brightest students skip STEM and become lawyers and finfluencers, and we all have to switch to buying cheap crap from Azerbaijan.

      1. CountCadaver Silver badge

        Re: .. overcome in a short term

        More like DPRK / DRC / Myanmar / Burkina Faso

    3. druck Silver badge

      Re: .. overcome in a short term

      Let them spend billions creating their own GPUs for AI, just as the AI bubble is about to burst.

      They'll have to think of something to do with all those GPUs then. Shame they limited the amount of time Chinese children can play games, or they could have used them for that.

  3. Brewster's Angle Grinder Silver badge

    What are they actually saying here? That they're going to use CPUs instead of GPUs? So instead of racks of highly specialised, densely packed, power-efficient GPUs, they're going to use an equivalent number of cores in over-specified general purpose processors, which will be far less densely packed and will likely use far more power and cost more?

    If so, yes, you could. But fewer people will be able to do it. And you're tying up supercomputers doing AI instead of all the weapons simulations(?), etc... that you built the supercomputers for.

    1. Bartholomew
      Meh

      > That they're going to use CPUs instead of GPUs?

      I would look to the SG2380 for some inspiration as to what is planned. It has 16 general purpose 64-bit P670 RISC-V CPUs and four X280 co-processors, and each of those co-processors manages dedicated SOPHON TPU hardware (32 TOPS int8 and 16 TFLOPS fp16 in total per SG2380). It does have a GPU (Imagination AXT-16-512), but that would be a very poor choice for processing anything other than graphics.

      China's plan is to use TPUs instead of GPUs.

      A supercomputer would probably be based around the 64-core SG2042, but there are not enough details public about it yet to make guesses.
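
      For a rough sense of scale, here is a back-of-envelope sketch in Python. The SG2380 figures are the ones quoted above; the GPU figures (an A100-class accelerator at roughly 624 TOPS int8 / 312 TFLOPS fp16) are assumptions taken from public spec sheets, not from this comment.

      # Back-of-envelope: how many SG2380-class parts would it take to match
      # the raw throughput of one A100-class GPU? GPU numbers are assumptions.

      SG2380_INT8_TOPS = 32      # quoted above, per chip
      SG2380_FP16_TFLOPS = 16    # quoted above, per chip

      GPU_INT8_TOPS = 624        # assumed A100-class dense int8 throughput
      GPU_FP16_TFLOPS = 312      # assumed A100-class fp16 tensor throughput

      chips_for_int8 = GPU_INT8_TOPS / SG2380_INT8_TOPS
      chips_for_fp16 = GPU_FP16_TFLOPS / SG2380_FP16_TFLOPS

      print(f"~{chips_for_int8:.0f} SG2380s per GPU for int8 inference")  # ~20
      print(f"~{chips_for_fp16:.0f} SG2380s per GPU for fp16 training")   # ~20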

      1. Bartholomew
        Meh

        My guess for the SG2042 would be that it might be based around an open-source XiangShan V3 core (goal: 16.7 SPECint2k6/GHz; current simulations with RV64GCB: 14.7 SPECint2k6/GHz), which was developed and validated in China.

    2. Zibob Silver badge

      The term "supercomputer" is not well defined outside "current day extremely high performance compute."

      For example, a standard laptop today is more powerful than what would have been a supercomputer in the 80s.

      It does not denote primarily CPU- or GPU-based designs and, not that we have a headline alternative yet, it does not even strictly mean CPU or GPU at all if something new we invent can crunch the numbers faster, like NPUs (which, yes, are currently just a combo of both, but you get the idea hopefully).

      So I think, in my opinion, what they are saying is that they will use what they have, and currently the only reasonable rival they have is supercomputers. They have CPU and GPU manufacturing ramping up quickly towards Intel/AMD levels of performance.

      Whereas students in the west can have access to much more powerful hardware, possibly even a private at-home machine, which, while possible in China, is less likely as it would probably be double the physical size currently. They are just saying that it will likely *have* to be supers for the time being, rather than just single racks, I think.

      It's more just a thumbing of the nose at America for being stupid. A clear sign that, sanctions or not, America will lose this race, and none of their side games and distractions is actually going to help them when it comes to stopping China.

  4. Filippo Silver badge

    >"When large models require 10,000 to 100,000 GPUs, it is essential to overcome technical challenges like high energy consumption, reliability issues, and parallel processing limits by developing specialized supercomputers,"

    Isn't "specialized supercomputer" a pretty decent description for [a bank of] GPUs?

    1. Spazturtle Silver badge

      The most important part of a supercomputer is the interconnect; that is what enables you to use all those compute resources for a single task. Server racks full of GPUs alone won't cut it.
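
      To put a rough number on why the interconnect matters: synchronising gradients across data-parallel workers is bandwidth-bound. Below is a minimal sketch of the standard ring all-reduce cost model; the model size and link bandwidths are purely illustrative assumptions.

      # Ring all-reduce moves roughly 2*(N-1)/N * S bytes per worker per step,
      # so sync time is dominated by link bandwidth. Numbers are assumptions.

      def allreduce_seconds(model_bytes: float, workers: int, link_GBps: float) -> float:
          traffic = 2 * (workers - 1) / workers * model_bytes  # bytes per worker
          return traffic / (link_GBps * 1e9)

      model_bytes = 70e9 * 2  # e.g. a 70B-parameter model in fp16 (assumed)
      for name, bw in [("NVLink-class ~400 GB/s", 400), ("100 GbE ~12.5 GB/s", 12.5)]:
          t = allreduce_seconds(model_bytes, workers=1024, link_GBps=bw)
          print(f"{name}: ~{t:.1f} s per full gradient sync")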

      1. LionelB Silver badge

        I thought that the entire point of GPUs (at least in ML applications) is that the large-scale interconnectivity is on the GPU, so that you can parallelise massively on an individual GPU. And I'm not sure why it shouldn't be possible to network GPUs as effectively as you might CPUs.

      2. Ken Hagan Gold badge

        Machine learning models are mostly just a sequence of embarrassingly parallel steps. That's why GPUs are suitable.
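
        A minimal NumPy illustration of that point: a forward pass through one layer is just a matmul, and the batch can be split across as many workers as you like with no communication until the results are recombined. (The layer sizes here are arbitrary.)

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.standard_normal((1024, 512))  # a batch of 1024 inputs
        W = rng.standard_normal((512, 256))   # one layer's weights

        # Full-batch forward pass on a single device.
        full = np.maximum(X @ W, 0)           # matmul + ReLU

        # The same work split across 8 "workers": each chunk is independent,
        # which is what "embarrassingly parallel" means here.
        chunks = np.array_split(X, 8)
        parallel = np.concatenate([np.maximum(c @ W, 0) for c in chunks])

        assert np.allclose(full, parallel)    # identical result, zero cross-talk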

    2. Cem Ayin

      "Specialized" supercomputer

      > Isn't "specialized supercomputer" a pretty decent description for [a bank of] GPUs?

      Depends on your definition of "specialized". Essentially, all current supercomputers are built from off-the-shelf commodity hardware, and this includes the GPUs. After all, GPUs originally found their way into supercomputing precisely because they were by far the cheapest SIMD machines readily available, due to the scaling effects of the gamer market segment. And while Nvidia has succeeded in pushing more expensive hardware implementations of its tech (dedicated to GPGPU computing and without even an actual graphics output port) into the market, the basic architecture of these SIMD machines is still that of good ol' gamer GPUs, and the HPC community is still essentially feeding on the crumbs that fall from the gamers' tables (though the current DL-model boom might change this eventually).

      SIMD machines existed before that, of course, but those were truly specialized hardware architectures, the silicon being produced in small numbers and frightfully expensive, just like the "classic" supercomputers of yesteryear. (When I started working in IT, Cray's vector machines were still all the rage in HPC, though I sadly never got to work with one of those...)

      What Mr. Zhang was *probably* referring to is the idea of using DL-specific ASICs rather than off-the-shelf GPUs in future Chinese supercomputers. Whether or not this is a viable proposition depends on how much efficiency can be gained in this way to at least partially offset China's current disadvantage in the domain of highly integrated circuits, which might well take a decade to overcome. Not being involved with DL research personally, I have no idea if such an approach stands a chance of working, but at least we've seen it happen in the domain of Bitcoin mining before.

      And then there's the issue of economies of scale, of course. But given the importance of A"I" techniques for military purposes, particularly in matters of reconnaissance and advanced drone warfare, this would be less of a problem for the factory of the world. China *will* push ahead in this domain, cost what it may.

  5. dharmOS

    Use of ARM SVE or RISC-V equivalent for ML training and inference without GPUs - forget the cost

    I always wondered if the SVE from ARMv8.2 and v9 (used in the Fujitsu A64FX / Fugaku supercomputer), or its RISC-V equivalent, could be used for ML training. By all accounts, ARM China has full blueprints for any ARMv9 CPU.

    If it can support the lower precision FP and INT formats, then the question becomes how much the sponsor is willing to fund the machine (and pay for its upkeep, running costs etc). If it is a national security issue and the country is China, the USA etc, then the answer is "an unlimited amount"!
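
    As a rough feel for that "unlimited amount", a back-of-envelope count in Python. The A64FX figure (~13.5 TFLOPS peak fp16 per chip) and the GPU-cluster target below are assumptions based on publicly quoted numbers, not anything from the article.

    # How many A64FX-class SVE chips would match a 10,000-GPU training cluster?
    # All figures below are assumptions for illustration only.

    A64FX_FP16_TFLOPS = 13.5   # assumed peak fp16 per chip
    GPU_FP16_TFLOPS = 300      # assumed per-GPU fp16 tensor throughput
    GPU_COUNT = 10_000

    target_pflops = GPU_COUNT * GPU_FP16_TFLOPS / 1000
    chips_needed = GPU_COUNT * GPU_FP16_TFLOPS / A64FX_FP16_TFLOPS

    print(f"Target: ~{target_pflops:.0f} PFLOPS fp16")
    print(f"~{chips_needed:,.0f} A64FX-class chips to match it on paper")
    # Fugaku itself has roughly 159k nodes, so that is supercomputer territory.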

  6. martinusher Silver badge

    Obsession with clock rate obscures design possibilities

    Back when I was young(ish) I was part of a team that built prototype digital video equipment. My part of the work was to take a 20 MByte/sec video data stream, munge** it about and deliver it to recording hardware as a 200 Mbit/sec serial stream. This would be a non-event with today's components, but this was 45 years ago, a time when memory (when you could get it) wasn't dense and had cycle times on the order of hundreds of nanoseconds (...and, of course, all small scale logic... we're talking "uphill in the snow both ways" here!). The job's the job.

    The point is that yes, it's really good to shrink dies, crank clock speeds into the stratosphere and use 'the latest', but when that just isn't an option there's always another way. It's like "thermal management" has suddenly become a big thing again. If you want stuff to go fast, really fast, then it's going to end up really compact and it's going to chew up power, lots of it. My first job in computing was meant to be designing thermal management systems for a high performance mainframe. (People knew about this, of course -- the Cray-1 supercomputer wasn't built with a nifty seat around its base as a conversation piece, it was hiding five tons of cooling equipment.) It's not really a surprise that the Chinese -- or anyone else, for that matter -- faced with a specification and available parts would come up with a best-effort design; it's what engineers do.

    (**Technical term.)
