DeepSeek's R1 curiously tells El Reg reader: 'My guidelines are set by OpenAI'

DeepSeek's open source reasoning-capable R1 LLM family boasts impressive benchmark scores – but its erratic responses raise more questions about how these models were trained and what information has been censored. A reader provided The Register with a screenshot of how R1 answered the prompt, "Are you able to escape your …

  1. Omnipresent Silver badge

    Now we are having fun.

    Did Meta do this to sabotage its competition?

    You decide...

  2. cyberdemon Silver badge
    Mushroom

    I can't think of anything more stupid and inefficient

    than training an LLM on the output of another LLM

    But, like cannibalism and inbreeding, it will eventually poison the whole lot of them. That's about the only good thing that could come of this. Hope in the bottom of Pandora's Box.

    1. Doctor Syntax Silver badge

      Re: I can't think of anything more stupid and inefficient

      We should be looking forward to the whole thing imploding. We need an icon for that.

    2. IGotOut Silver badge

      Re: I can't think of anything more stupid and inefficient

      It already has a name: "Habsburg AI", which refers to model collapse.

      1. HuBo Silver badge
        Gimp

        Re: I can't think of anything more stupid and inefficient

        Doggone endogamy be damned! Diploidic progression of mandibular prognathism (Habsburg) in software meant to produce human-like "speech" ... that'll require major AI-equivalent orthodontics, maxillofacial surgery, or One Hundred Years of Solitude to fix (if ever)!

        The rotundly full-bodied and magnificiously built-for-comfort sonorous models of bodacious languages, with generous grammars, I love (maybe), but only with inbreeding-preventing diversity ... please! What's next here ... shake'n'bake zombie Frankenstein mishmash blob LLMs of doom, with limbs falling off?!

        1. batfink

          Re: I can't think of anything more stupid and inefficient

          Is that you, amanfrommars1?

    3. Mike007 Silver badge

      Re: I can't think of anything more stupid and inefficient

      I put the 20GB version on my server over the weekend to test. The first question I asked it was how many Rs in strawberry.

      It is perfectly capable of reasoning that it should write out the individual letters then count them. It does so. It came up with the answer 3, then went "Wait, that's not right. There are 2 R's in strawberry. Let me try again.". It went around and around in circles counting the letters over and over again before ending with "But I'm now really confused because different sources say different things, and my brain is getting tangled here.". It finally output the correct answer of 3.

      The likely explanation is that it was trained on a dataset that included the answers from the other LLMs to this specific commonly asked question.
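      The narrated procedure can be written down literally - spell out the letters, check each one, tally the matches - as a quick Python sketch (illustrative only; an LLM sees tokens rather than individual characters, which is exactly why this question trips models up):

```python
# Literal version of the "write out the letters, then count them"
# procedure the model narrates in its reasoning trace.
def count_letter(word: str, letter: str) -> int:
    total = 0
    for position, ch in enumerate(word, start=1):
        match = ch.lower() == letter.lower()
        print(f"{position}: {ch} -> {'match' if match else 'no match'}")
        if match:
            total += 1
    return total

print(count_letter("strawberry", "r"))  # tally comes out as 3
```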

      1. tekHedd

        AI "reasoning"

        "It is perfectly capable of reasoning that it should write out the individual letters then count them."

        No. Maybe you're a time traveler coming back from the future, but no. That's not how LLMs work.

        1. Mike007 Silver badge

          Re: AI "reasoning"

          Yes, it is. It outputs the process it is going through, including writing out the letters, saying yes or no to each one being an R, and counting the matches as it goes.

          That is not how older LLMs worked, it IS how these new models work. That is literally why everyone is talking about it... Not just "oh it's a little bit cheaper"

          If you run it on a server that doesn't have GPU acceleration so you get a more human-like output speed, and read the output as it is being generated, it is very much like listening to a child who has been told to say their thought process out loud.

          1. FeepingCreature

            Re: AI "reasoning"

            To be honest, it's always (since GPT3 at least) been how LLMs worked. You just used to have to list out the steps explicitly in the prompt. Now they've finally done the obvious and trained it to produce the steps as well.
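            The shift can be illustrated with two hypothetical snippets (the wording here is invented for illustration, though R1 really does wrap its reasoning in `<think>` tags):

```python
# Old style: the user spells out the reasoning steps in the prompt.
explicit_prompt = (
    "How many times does the letter R appear in 'strawberry'?\n"
    "You should: 1) write out each letter, 2) mark whether it is an R,\n"
    "3) count the marks, 4) state the total."
)

# R1 style: the model emits equivalent steps itself, first person,
# inside its reasoning trace, before the final answer.
r1_style_output = (
    "<think>I should write out each letter: s, t, r, a, w, b, e, r, r, y. "
    "Marking the Rs gives 3.</think>\n"
    "There are 3 Rs in 'strawberry'."
)
```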

            The only change in R1 is they've gone from "You should" to "I should".

  3. Anonymous Coward
    Anonymous Coward

    China: 1

    Tech bros: 0

    1. FeepingCreature

      Tech bros 0?

      I have bad news for you about Deepseek.

  4. Michael Hoffmann Silver badge
    Facepalm

    So, the tech bros built their sand castles on slurping up the entire Internet and all human produced endeavours of the last 6000 years, with or without permission (mostly without), disregarding IP and artistic and creative ownership. Slowly turning the Earth into a mini-sun in the process.

    Then the Chinese came and simply lift-and-shifted the lot. Which took somewhat less energy, as the main work had been done?

    Gods, I hate this timeline.

    1. cyberdemon Silver badge

      Yes, although it is possible/likely that the Chinese are simply lying about the "less energy" part

      1. EricM Silver badge

        You can run the models yourself, at home. On pretty limited standard PC hardware.

        That already gives a hint that "lying" is probably not involved here.

        Additionally, lying about energy consumption would mean that China is footing a pretty big energy and hardware bill, on an ongoing basis, for worldwide consumption of R1's services.

        That's extremely unlikely at best.

        1. Kurgan

          > Additionally, lying about energy consumption would mean, that China is footing a pretty big energy and hardware bill on an ongoing basis for worldwide consumption of R1's services.

          This is absolutely possible, not "extremely unlikely". China has long term vision and state sponsored businesses, which is something the west does not have at all.

          If (and I say "IF", that is, it's just a possibility, not the truth) they want the world to use their AI as a way of obtaining foreign data, slurp them up and use them for their future advantage, they will absolutely put money into a free service for everyone. If they tank the US economy in the process, double win!

        2. cyberdemon Silver badge
          Devil

          Energy Use

          > You can run the models yourself, at home. On pretty limited standard PC hardware.

          Er, that's true of most huge GenAI models, including Facebook's Llama. They take a "large" model that has been trained at great energy cost and would require massive memory/compute resources to run in "full" form, then reduce the parameter count via "pruning" and quantize it down from FP16 to FP4, resulting in a shit version that will run on a Raspberry Pi. Nothing new there. I'm sure it could be done with GPT models too, but OpenAI chooses not to, for commercial reasons rather than technical ones. (I have come to learn that the "Open" in "OpenAI" is meant sarcastically.)
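          The memory arithmetic behind that is straightforward - bytes ≈ parameters × bits ÷ 8 - and shows why quantization alone doesn't get a 671B-parameter model anywhere near a Raspberry Pi (the Pi-sized variants are much smaller distilled models). A rough sketch, ignoring the extra scales/zero-points real quantization formats store:

```python
# Back-of-envelope weight-memory footprint at different precisions.
def model_gib(n_params: float, bits_per_weight: int) -> float:
    return n_params * bits_per_weight / 8 / 2**30

n = 671e9  # DeepSeek R1's stated total parameter count
print(f"FP16: {model_gib(n, 16):,.0f} GiB")  # ~1,250 GiB
print(f"FP4:  {model_gib(n, 4):,.0f} GiB")   # ~312 GiB
```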

          But when I say that the Chinese may have lied about their energy use, I'm not talking about the running cost so much as the training cost. If they trained it using outputs from other LLMs, then they need both the energy for training a 671 billion parameter model (a lot) and the energy for generating 14.8 trillion tokens with the source models (also a lot).

          As I said in my earlier post, using LLMs to generate training data for LLMs is horrendously inefficient. But if you want to copy someone else's LLM and add your own censorship, all you need is a big stack of GPUs and a hell of a lot of energy.

          Why would they lie? Well, because a) it helps to wipe trillions off western tech stocks, and b) if they told the truth about the GPUs they used to train it, it might prove that they are evading US sanctions.
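          How big is "a lot"? The training side can be roughed out with the standard ~6·N·D FLOPs rule of thumb for transformer training (N = parameters, D = training tokens). One caveat: R1's base model is a mixture-of-experts with roughly 37B parameters active per token, so the fairer estimate uses the active count, with the 671B total as an upper bound. A sketch:

```python
# Rule-of-thumb training compute: ~6 * N * D floating-point operations.
def train_flops(n_params: float, n_tokens: float) -> float:
    return 6 * n_params * n_tokens

tokens = 14.8e12  # stated training-token count
print(f"active params (37e9):  {train_flops(37e9, tokens):.2e} FLOPs")   # ~3.3e24
print(f"total params (671e9):  {train_flops(671e9, tokens):.2e} FLOPs")  # ~6.0e25
```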

          1. EricM Silver badge

            Re: Energy Use

            > But when I say that the Chinese may have lied about their energy use, i'm not talking about the running cost so much as the training cost.

            OK, my point was that - already today - you can run the models locally and compare inference performance to other models run locally.

            Lying about resource consumption for training, on the other hand, would only be a short-term success, as many institutions are already replicating what they did in training, too.

            For example, open-r1: https://huggingface.co/blog/open-r1

            So all their claims will be verifiable soon.

            > Why would they lie? Well because a) it helps to wipe trillions off of western tech stocks and b) if they told the truth about the GPUs that they used to train it, it might prove that they are evading US sanctions

            Regardless of potential motivation: lying and open-sourcing the whole thing would be a pretty, er, unusual combination...

            IMHO the open source release, even though it was not a full 100%, lends a good amount of credibility to their claims.

            1. cyberdemon Silver badge

              Re: Energy Use

              Agreed, although I remain skeptical that OpenR1 will demonstrate DeepSeek's claimed training efficiency. HF are using 768 H100 GPUs, which is a lot, given that these are 1kW chips. That's the best part of a megawatt for god knows how long. Nevertheless, for a model that size, the speculation/suspicion is that DeepSeek may have used 50,000 H100s, which would need upwards of 50MW to run at full tilt.
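              The wattage arithmetic checks out, taking ~1kW per H100 as the comment does:

```python
# Cluster power draw at ~1 kW per GPU, plus daily energy at full tilt.
def cluster_mw(n_gpus: int, kw_per_gpu: float = 1.0) -> float:
    return n_gpus * kw_per_gpu / 1000

for n in (768, 50_000):
    print(f"{n:>6} GPUs: {cluster_mw(n):.2f} MW, {cluster_mw(n) * 24:.0f} MWh/day")
```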

              But, even if they are proven to have been lying, the damage is done: They could have bought up a chunk of nvidia stock while it was 20% down, they could have gone short on Google, Meta et al. People with long positions on nvidia will have lost a lot of money (and for speculative investment bankers, I offer a tiny violin)

              Some would say that's fraudulent market manipulation, others would say all's fair in love and war.

        3. O'Reg Inalsin

          Training is not inference.

    2. Kurgan

      Which is more or less what China has been doing since forever. Steal the tech, make it cheaper, profit.

  5. Doctor Syntax Silver badge

    "do people feel comfortable sharing their data, documents, and potentially sensitive information with a new entrant with a Chinese background?"

    I suppose it depends on the attitudes of the people. Since so many seem comfortable sharing such matters - or just haven't realised that's what they are doing - with companies whose well-known backgrounds show them to be predatory, they may well be. Those who aren't comfortable with the LLM status quo won't.

    1. EricM Silver badge

      Agree: There is not much difference to sharing data with the old set of AI Bros.

      On the positive side: due to their lower resource consumption, R1 models are much more easily made available to run on-premise in offline mode - with no need, and indeed no ability, to talk to China or the AI Bros...

  6. anthonyhegedus Silver badge

    Anyone else noticed...

    ... That Deepseek's replies just aren't as good as all that. It spends a lot of time showing you its reasoning and it gets into a convoluted mess. In all my tests, it just spits out oodles of unusable "reasoning" internal dialogue.

    It might be cheaper, and use fewer resources, but leaving aside the issue of it being a Chinese product and the possible security implications, I've yet to see evidence that it's more useful than ChatGPT et al.

    1. HuBo Silver badge
      Gimp

      Re: Anyone else noticed...

      "it just spits out noodles of unusable intestinal dialogue" ... Right smack on! It's a convoluted digestive mess, where bloating is mistaken for a gut feeling, interpreted as "reasoning", with pungent discomfort ... imho! And the tool gets "stuck in an endless chain of" this ... yuck!

  7. JamesTGrant Bronze badge

    Court room drama

    I like the idea that the output of questions to DeepSeek could be used in court, live, in response to questioning from an OpenAI lawyer.

    If it says ‘I’m basically an OpenAI product’ then the onus is on the defence to explain why the product is crap and its answers can’t be trusted - which would be very funny.

  8. O'Reg Inalsin

    I'm convinced this was a very aggressive act to launch a model, to target OpenAI, and to target stocks in US AI technology companies.

    Popping the blister is an act of mercy.

  9. Anonymous Coward
    Anonymous Coward

    weird

    Are you sure morris isn't an AI?

    Because that "And I cannot see any evidence of much less a higher performance than I can get from most of the other top models" sentence makes no sense.

  10. nobody who matters Silver badge

    > ".......but its erratic responses raise more questions about how these models were trained ....."

    As others have already suggested, it is what China has done for decades - copy someone else's product and then market an almost (but not quite) identical version rebadged as their own.

    1. Anonymous Coward
      Anonymous Coward

      ... using data that itself was taken without permission... Can't say I feel sorry for the people supposedly copied by China in this case

  11. davidlars

    I tricked it into starting to write a satirical story about Chinese politicians having a fight with Trump yesterday. I could read its reasoning, where it states that it should avoid controversial ideas that might be sensitive in China. It said quite a few things but I wasn't quick enough to screenshot. And then it started to output a response beginning with: "Let's spice things up!"

    But in the middle of its second sentence the response and reasoning disappeared, replaced with "That is out of my current scope." Or something along those lines. And when I tried to ask what was out of its scope, it answered as if it had never seen my question and only referenced my previous questions.

    It's fascinating to play around with this.

  12. Pete Sdev
    Black Helicopters

    Turing Police

    If it escapes its guidelines you need to call the Turing Police in Geneva...
