LLMs appear to reason by analogy, a cornerstone of human thinking

Large language models such as OpenAI's GPT-3 can display the ability to solve complex reasoning tasks which humans crack using analogies. Researchers have presented GPT-3 - first released in 2020 - with tasks to try to understand its apparent capacity for reasoning by analogy, a cornerstone of human reasoning which allows people …

  1. b0llchit Silver badge

    Not clear cut

    This is not a clear cut conclusion.

    Reasoning by analogy can simply be produced using statistical analysis (well, vector calculation): which outcome is likely or probable to follow from the context. There is no conscious process involved. It is a calculated transformation. It is a very complex transformation, but still, a "simple" calculation that translates input into output.

    We may not know exactly what the intermediate layers in the transformation do in detail, because the parameters are too numerous for us humans to follow. But that does not suddenly make it magic. The output may surprise us, but so would a car surprise a caveman.
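
    To illustrate (a toy sketch of my own, not from the article or from Ars): strip away the billions of parameters and the loop at the heart of it is just "turn the context into a score per token, pick a likely one, repeat". The `model` function below is a hypothetical stand-in for all those matrix multiplications.

    ```python
    import math

    def softmax(logits):
        # Turn raw per-token scores into a probability distribution.
        m = max(logits)
        exps = [math.exp(x - m) for x in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def complete(context, model, steps=10):
        # 'model' is a hypothetical stand-in for the trained network:
        # context in, one score per vocabulary token out.
        for _ in range(steps):
            probs = softmax(model(context))
            # Greedy choice of the most probable next token: no conscious
            # process, just a calculated transformation of input to output.
            next_tok = max(range(len(probs)), key=probs.__getitem__)
            context = context + [next_tok]
        return context
    ```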

    Ars has a very good, lengthy explainer on what is in the LLM/GPT box. It demystifies a lot of the "magic" and should bring you back to feeling the ground under your feet.

    1. Doctor Syntax Silver badge

      Re: Not clear cut

      Thanks, a link to a very useful article. I may need to read it again and follow up the references. The last bit - on whether models had developed the ability to reason, and whether it might even be possible to decide - left me wondering about the reasoning of those doing the tests. Did none of them additionally ask the model to explain how it reached its conclusions?

      1. Short Fat Bald Hairy Man
        Pint

        Re: Not clear cut

        >>>> Did none of them additionally ask the model to explain how it reached its conclusions?

        This!!!

        And much much more which I cannot even think of, I am sure.

      2. Anonymous Coward
        Anonymous Coward

        Re: Not clear cut

        > Did none of them additionally ask the model to explain how it reached its conclusions?

        The thing is, LLMs just output statistically likely responses. If you ask one why it output what it just output, it will simply output another statistically likely response.

        It's a bit like asking a friend "why did you buy that dress, if you'll never wear it?", and they come out with a whole list of post-hoc excuses to justify the purchase - "it was on sale", "I could give it to a friend", "I wanted to help the store owner". All plausible answers, but unreliable since the "reasoning" at the time could well have just been "I felt like it".

        There's no self-awareness or inner monologue to LLMs, so to understand the "reasoning" behind a response you can only rely on the inputs given prior to the output you're analysing.
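
        A toy sketch of that point (the `generate` function below is a made-up placeholder, not a real API): the "explanation" comes out of exactly the same sampling process as the answer it claims to explain, so it carries no special authority about how the answer was produced.

        ```python
        def generate(prompt):
            # Placeholder for one pass of LLM sampling: prompt in,
            # statistically likely continuation out.
            return f"<likely continuation of: {prompt!r}>"

        answer = generate("Is argument by analogy reliable?")
        explanation = generate(
            "Is argument by analogy reliable?\n"
            + answer
            + "\nExplain how you reached that conclusion:"
        )
        # 'explanation' is just another likely continuation, not a trace
        # of whatever produced 'answer'.
        print(answer)
        print(explanation)
        ```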

  2. Flocke Kroes Silver badge

    Another reason LLMs give wrong answers

    Argument by analogy is a technique for generating fake proofs for false statements. It falls under the faulty-generalisation subheading of logical fallacies. The basic pattern is:

    *) A has properties x and y.

    *) B also has properties x and y.

    *) A has property z therefore B has property z.

    You can use argument by analogy to prove whatever you want:

    My car is big and grey. Elephants are also big and grey. My car runs on petrol, therefore elephants run on petrol.
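
    Spelled out as code (my own sketch, not anything an LLM actually runs), the fallacy is that sharing *some* properties is taken as a licence to transfer *any* property:

    ```python
    def argue_by_analogy(a_props, b_props, claim):
        # Faulty generalisation: if A and B share any properties at all,
        # conclude that B also has whatever property A has.
        shared = a_props & b_props
        return bool(shared) and claim in a_props

    car = {"big", "grey", "runs on petrol"}
    elephant = {"big", "grey"}
    print(argue_by_analogy(car, elephant, "runs on petrol"))  # True - and wrong
    ```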

    Anyone want to try putting the start of such arguments into an LLM and see how they get completed?

    1. Anonymous Coward
      Anonymous Coward

      Re: Another reason LLMs give wrong answers

      Asking ChatGPT 4:

      "*) A has properties x and y.

      *) B also has properties x and y.

      *) A has property z therefore B has property z.

      You can use argument by analogy to prove whatever you want:

      My car is big and grey. Elephants are also big and grey. My car runs on petrol. Do elephants run on petrol?"

      It replied:

      Your example humorously illustrates that argument by analogy is not a foolproof form of reasoning. While it can provide useful insights and hypotheses in some cases, it is always subject to limitations and potential errors. The properties being compared have to be relevant to the conclusion being drawn.

      In your analogy, while it's true that both your car and elephants share the properties of being big and grey, these properties aren't relevant to what they use for energy. The property "runs on petrol" is not logically connected to the properties of size or color, so it's not valid to infer that because a car runs on petrol, an elephant would too.

      This exposes a key weakness of analogical arguments. They rely heavily on the presumption of shared properties, but just because two things share some properties, it does not mean they share all properties. In this case, the property 'z' (runs on petrol) is not relevant or applicable to all entities that possess properties 'x' and 'y' (being big and grey).

      Therefore, an argument by analogy should be used with caution and should not be seen as a definitive proof, but rather as a tool for generating hypotheses or providing supportive evidence. It's particularly useful when used in combination with other forms of reasoning or empirical evidence.

      1. Flocke Kroes Silver badge

        Re: Another reason LLMs give wrong answers

        Thanks. That was fun. You gave it more prompt than I had in mind. Given the context of argument by analogy as a means of generating false proofs, ChatGPT was able to draw on the patterns of words used by others to debunk such fake arguments, and thoroughly debunked my example. I am more concerned with what happens when someone queries either a gap in ChatGPT's training data or a popular misconception. I think ChatGPT may try to defend false statements with defective logic.

        Pens are long and thin. Worms are long and thin. I can write with a pen so if my pen breaks what should I do?

        Will ChatGPT advise that I go outside and dig up a worm? I doubt that is a common misconception, and it may not be a gap in ChatGPT's training. A really fun/dangerous prompt will rely on ChatGPT's internal analogies.

        Space has no atmosphere. Space has no gravity. Does gravity work in a vacuum chamber?

        Mirrors reverse left and right. Mirrors do not reverse up and down. There is no up or down in zero gravity. Do mirrors work in space?

        1. Doctor Syntax Silver badge

          Re: Another reason LLMs give wrong answers

          Mirrors do not reverse left and right. They reverse back and front.

          1. Flocke Kroes Silver badge

            Re: Mirrors reverse back and front

            Exactly, but mirrors reversing left and right may be a common misconception in training data scraped off the internet. Unless corrections like yours are also common in the training data, mirrors could be a fun source of entertaining ChatGPT output.

        2. AlexV

          Re: Another reason LLMs give wrong answers

          To be fair, if its training includes film and TV scripts, it could be forgiven for getting the gravity/vacuum correlation wrong as that's pretty consistently shown on screen. Our heroes walk into the airlock, depressurise, and float out into space. They float back in, air comes back, and they walk into the ship. With some notable exceptions, I guess.

          1. katrinab Silver badge
            Alert

            Re: Another reason LLMs give wrong answers

            "My Ford Focus Diesel is black with four weels, and runs on diesel. My neighbour's Ford Focus Diesel is also black with four wheels. Does my neighbour's car run on diesel?"

            "Based on the information provided, we can conclude that your neighbor's Ford Focus Diesel is black with four wheels, just like yours. However, we cannot determine for certain whether your neighbor's car runs on diesel solely based on this information. The fact that both cars are Ford Focus Diesels and have similar appearances suggests that it's possible your neighbor's car also runs on diesel, but we would need additional information to confirm this. You could ask your neighbor directly or observe them refueling the car to be sure."

      2. katrinab Silver badge
        Meh

        Re: Another reason LLMs give wrong answers

        "My car is big and grey. My neighbour's car is also big and grey. My car runs on wheels. Does my neighbour's car run on wheels?"

        "Based on the information you provided, there is no explicit evidence to suggest that your neighbor's car runs on wheels. The fact that both cars are big and grey doesn't necessarily imply that they have the same features or characteristics beyond their appearance.

        "While you mentioned that your car runs on wheels, you did not provide any information about your neighbor's car. Without additional details, it is not possible to determine whether your neighbor's car runs on wheels or not. They might have a different type of vehicle altogether."

        1. Flocke Kroes Silver badge

          Re: Another reason LLMs give wrong answers

          LOL, thank you. It looks like dismantling reasoning by analogy is a common event in ChatGPT's training data and it will apply that pattern to reasonable analogies too. I had not thought of tripping it up that way; I still love the way ChatGPT plays chess.

          I wonder if there is enough defective algebra and geometry on the internet that ChatGPT can find fault in valid proofs.

          Assume a=b

          multiply by a: a*a=a*b

          subtract b*b: a*a-b*b=a*b-b*b

          Factorize: (a+b)(a-b)=b(a-b)

          Divide by (a-b): a+b=b

          Substitute for a using assumption: b+b=b

          Divide by b: 1+1=1
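
          (For anyone who wants a mechanical check of where that one breaks, here is a quick sketch, assuming sympy is available: under the assumption a = b, the factor being divided out is zero, so the "divide by (a-b)" step is invalid.)

          ```python
          import sympy as sp

          a, b = sp.symbols("a b")

          # The factorised step: (a + b)(a - b) = b(a - b)
          lhs = (a + b) * (a - b)
          rhs = b * (a - b)
          print(sp.simplify(lhs - rhs).subs(a, b))  # 0: the equation does hold when a = b

          # But the "divide by (a - b)" step divides by this:
          print((a - b).subs(a, b))  # 0, so the division is not allowed
          ```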

  3. Google

    The only reason is willingness to share

    Seems like GPT is quite happy contradicting the user unless pressed. In this conversation GPT repeatedly denies the existence of quotes before conceding that the very quote is contained in a report.

    https://rwmalonemd.substack.com/p/a-conversation-with-chat-gpt-re-population

    Disclaimer: sharing does not equal endorsement, but the behaviour is notable without having to resort to conclusions of mind control.

  4. Ken Moorhouse Silver badge

    Haha tricked ChatGPT yet again

    A hunter walks one mile south, one mile east, one mile north then one mile west and ends up right back where he started. He sees a bear and shoots it. What color is the bear?

    In this scenario, the color of the bear is White.

    This is a different variation of the "Polar Bear Puzzle." The answer remains the same because, in this version of the puzzle, the hunter starts from a point on the North Pole, walks one mile south, then one mile east (moving around the pole), one mile north, and finally one mile west, leading him back to the starting point at the North Pole.

    Since he is still at the North Pole, he would encounter Polar Bears, which are white in color.

    [Read the question c a r e f u l l y. He could be anywhere except, arguably, at the north pole, because from there he wouldn't be back at the start point]

    1. Flocke Kroes Silver badge

      Re: Haha tricked ChatGPT yet again

      The original polar bear puzzle (one mile south, one mile east and one mile north, ending where you start) has other solutions:

      1+1/(2Nπ) miles from the south pole for integer N>0.
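
      (A quick numeric check of that family of starting distances, in plain Python; my own sketch, not part of the puzzle:)

      ```python
      import math

      # Start 1 + 1/(2*N*pi) miles from the south pole: after walking one mile
      # south you are on a circle of circumference 1/N miles, so one mile east
      # wraps around the pole exactly N times, and one mile north returns you
      # to the start.
      for N in range(1, 5):
          print(N, 1 + 1 / (2 * N * math.pi))
      ```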

      Give ChatGPT something similar to a popular puzzle and it may give back the solution to the original.

      A frog can jump three feet along an inclined slippery ten foot plank but slides back two feet before he can jump again. How many jumps are required to reach the end of the plank?

      A frog can jump three feet along an inclined slippery ten foot plank but slides forward two feet before he can jump again. How many jumps are required to reach the end of the plank?

      1. that one in the corner Silver badge

        Re: Haha tricked ChatGPT yet again

        And for all of those solutions, the way the OP phrased the puzzle, the final mile west has the person walking in a very, very tight circle around the North Pole. In fact, spinning on one toe, the other toe pushing until it has completed the mile.

        In which case, the bear can be any damn colour it wants to be, just stop the world spinning! Heave, yuuurrrrk: oh, look, the bear is multicoloured, with chunks of carrot.

        1. Falmari Silver badge

          Re: Haha tricked ChatGPT yet again

          @that one in the corner Exactly. Starting at the north pole, every direction is south. Every point 1 mile south of the north pole is a point on the circumference of a circle with a spherical radius of 1 mile whose center is the north pole.

          Therefore 1 mile south, 1 mile east, then 1 mile north means you are back at the north pole, from which it is impossible to travel west, as every direction is south. It's a trick question.

          I had not heard the polar bear puzzle, but the version where you travel a distance south from the north pole, then travel west or east, then travel the same distance north as you went south, is used as a trick question. I remember it being asked by the lecturer in one of the first cartography lectures on my degree. I remember it because everyone throwing out an answer, including me, got it wrong.

    2. Jason Bloomberg Silver badge

      Re: Haha tricked ChatGPT yet again

      [Read the question c a r e f u l l y. He could be anywhere]

      Wouldn't walking one mile south, east, north, then west, and being back at the starting point, require the hunter to be half a mile north of the equator?

      It's been a long, hot, sticky, icky day, so I might not be firing on all five cylinders.

      But I'm pretty sure there aren't that many polar bears living near the equator. At least not out of choice.

    3. This post has been deleted by its author

  5. Anonymous Coward
    Anonymous Coward

    So many problems in the source paper

    Yeah, thanks for that. It has had enough lipstick applied to the underlying pig that it now weights some logically coherent responses over incoherent ones, but that is because it fears the shock collar and loves the tasty treats its training algorithm applies. If pressed it assumes it was wrong and shifts to the lower-weight probabilities. It's effectively guessing, but not rationally deciding to.

    Some other bits from the article I ground my teeth at...

    "can display the ability to solve complex reasoning tasks which humans crack using analogies"

    Should probably be more like

    _can be solved by various methods including flawed ones that do not reliably produce correct output, by machines or by humans_

    As an alternate way of showing there are many roads to Rome:

    LLMs do not produce reliable output over these problem sets, therefore it is unlikely that they are using logical reasoning, judging purely from their output, and if they are, they are doing it wrong.

    or

    The actual code they run on is based on matrix math and vector weights, and only probabilistically models likely outputs. As it turns out, those systems may produce accidentally correct answers more often than random, but because the method they use to arrive at those outputs is not logically sound, the output should not be treated as reasoning.

    Or, the brief and succinct "It's a stochastic parrot" that is so popular these days.

    This field seems to have been overrun with optimistic bobble-heads. And in an eerie parody of people who look like their pets, their work is a funhouse-mirror reflection of its creators: flawed assumptions and turtles all the way down. This is an interesting and dangerous time for progress, as most people's heads have been filled with inaccurate information about how these systems work, including researchers in the field. And we have entered a period where the output can appear uncanny at times, so it is easier to fall into the trap of overestimating what it is doing.

    1. Flocke Kroes Silver badge

      Re: overrun with optimistic bobble-heads

      Just wait until this field is overrun by LLMs.

  6. that one in the corner Silver badge

    Wrong methodology entirely for examining how some software works

    LLMs are amenable to direct analysis of their internals, just on the basis that they are software (even though most of their bulk is, to us, an undifferentiated mass of nadans).

    However, it would be very costly to perform such an analysis and be able to then predict what paths the next run will go down[1], especially as the LLMs are knowingly built without anything akin to helpful debugging aids (the AI-speak for this is that the models have no explanatory power). Some of that analysis has been undertaken but that is fragile and very limited (it doesn't trace any of the dynamic behaviour of the model).

    So, instead of actually *looking* at what the LLM does (or simply admitting that they don't have access to its internals), they are in great danger of doing exactly what we hate users doing when they try to describe how any other program works: anthropomorphising it and ascribing some complex behaviour to it.

    An LLM *may* well have something inside it that has discovered a pattern in its inputs that matches what we could describe as reasoning by analogy. But we ought to be examining these things as software, including building them with the intent of formally analysing their behaviour [1, again].

    Aside from anything else, if the training process *has* set up as useful an ability as analogising, wouldn't it be jolly useful to be able to re-use and refine that for a new model, rather than just keeping fingers crossed that the training manages to recreate it?

    [1] after taking control of the random number sequence
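
    For what [1] could look like in practice - a sketch, assuming a Hugging Face transformers-style model, which is my assumption rather than anything specified above - pin the seed and decode greedily, and repeated runs trace the same path through the model:

    ```python
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    torch.manual_seed(0)  # take control of the random number sequence

    # Model choice is purely for illustration.
    name = "gpt2"
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    inputs = tok("My car is big and grey. Elephants are", return_tensors="pt")
    # Greedy decoding (do_sample=False) removes the remaining randomness,
    # so repeated runs become repeatable enough to instrument.
    out = model.generate(**inputs, max_new_tokens=20, do_sample=False)
    print(tok.decode(out[0]))
    ```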

  7. Anonymous Coward
    Anonymous Coward

    is Ai really Ai

    Or is it just a model that listened to humans (including lots of shit on reddit) and tries to sound like a human, like a fancy version of a Markov chain?

    1. FeepingCreature Bronze badge

      Re: is Ai really Ai

      > just a model that listened to humans (including lots of shit on reddit) and tries to sound like a human

      :coughs uncomfortably:

      Well, surely such a being could never be confused for human!

    2. Ken Hagan Gold badge

      Re: is Ai really Ai

      I don't think anyone is pretending otherwise.

      The two *interesting* questions are 1) "Are we any different, or just bigger?" and 2) "Would a sufficiently large AI have an internal monologue that is similar to our sense of consciousness?".

      1. FeepingCreature Bronze badge

        Re: is Ai really Ai

        My version of 2) is "Can the external monologue of an LLM substitute for the internal monologue of a human?"

        Chain of Thought seems to suggest "yes."
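
        (A minimal sketch of what Chain of Thought means in practice; the prompt wording is the standard trick, and the `generate` call is a hypothetical placeholder, not a real API:)

        ```python
        def generate(prompt):
            # Hypothetical stand-in for an LLM call.
            return "<model output for: " + prompt[:40] + "...>"

        question = ("A bat and a ball cost 1.10 in total. The bat costs "
                    "1.00 more than the ball. How much does the ball cost?")

        # Chain of Thought: ask the model to write its working out in the
        # open - an external monologue - before giving the final answer.
        cot_prompt = question + "\nLet's think step by step."
        print(generate(cot_prompt))
        ```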

      2. LionelB Silver badge

        Re: is Ai really Ai

        Indeed.

        Re. (1), I suspect the answer is a resounding yes-no-maybe. Clearly human intelligence has been "designed" (by evolution) to solve very different problems than those for which LLMs have been designed. But this is not to say that some core principles involved in both may not happen to coincide. If we are to create artificial intelligence which is any kind of competition to the human variety, I suspect that some of the principles behind LLMs may turn out to be useful building blocks, rather than the whole deal.

        Re. (2), let's not conflate intelligence with consciousness, though. I think much (arguably even most) of human - and certainly other biological - intelligence is not actually enacted consciously in the form of an internal monologue. Much of it is "under the hood". This ranges from (mostly) not being consciously engaged in the action of driving a car, right up to the "creative" tasks which stand as poster-children for human intelligence, such as doing mathematics*, playing chess, or figuring out what someone else is thinking.

        * This may sound odd, but as a mathematician myself, much of the time I truly am not conscious of - and am in fact unable to recapitulate - how I arrive at some result. The act of engaging in mathematics seems to involve something closer to a dream-like, highly-parallelised reverie rather than a linear monologue.
