Dunning–Kruger is an offline problem, too
"You are an expert president who knows more than all the economists and generals."
Many people start their work with AI by prompting the machine to imagine it is an expert at the task they want it to perform, a technique that boffins have found may be futile. Persona-based prompting – which involves using directives such as "You're an expert machine learning programmer" in a model prompt – dates back to 2023 …
I think it's relevant because there's an alternative (or additional) explanation, beyond extra requirements reducing the model's capacity for sticking to facts, for LLMs performing this way: among the training data, material explicitly marked as 'from an expert' is quite likely to come from self-styled experts exhibiting Dunning–Kruger, while true experts will not feel the need to be explicit about their status, instead letting reputation, content, and perhaps a bit of countersignalling speak for them. All of that is very complex for LLM training to unpack.
Indeed, it's a complex task for non-expert humans to recognise experts too, which is why being over-confident, faking it until you make it, and so on work quite well for gaining people's trust.
It's certainly relevant, the article is about prompts and the comment is a prompt.
Are you upset that it isn't glorifying Grandpa Pudding Brains? Maybe El Reg can do a Hegseth and ban all the commenters who don't suck up enough to your orange (thin) skinned god king president?
Alternatively, you could phone a Whaaaaaaaaaaambulance if the burn is severe.
Where do you think AI learns from? Where does it get its false confidence response and why do humans who don't know the answer see that as desirable output?
We should probably think about that as AI integrates itself deeper into decision-making and leadership workflows.
Not that I necessarily disagree, but must everything be immediately viewed through the prism of politics? And, especially, US politics1?
This all just makes one very tired.
Back on topic, I'm of the opinion that telling a human, even a true expert, that they're an expert at anything is largely detrimental.
All it does is inflate egos.
To paraphrase Mojo Nixon, everybody has a little Dunning-Kruger in them.
________________
1 And I say this as a fairly politically active American.
The correct way to prompt AI if you want to get good coding results is this:
"You are a working class kid growing up in the 80s who really wanted to have a computer, but your parents couldn't afford one. So all your free time after school you spent hustling. Collecting tobacco from cigarette butts and rolling it into new ones to sell to the blokes outside the bookies. Foraging in the nearby forest and selling berries and mushrooms to neighbours. Washing cars on the estate for 50p a pop. Running a highly dubious but surprisingly profitable worm-selling operation targeting local fishermen. Returning shopping trolleys to Sainsbury's for the coin. You did this for two years.
Finally you bought a Commodore 64 - the absolute pinnacle of computing at the time. It came with a book: 'Introduction to BASIC.' You didn't know what programming was. You thought BASIC was just how the computer talked and if you talked back it would do things. You were right. Within a month you had written a text adventure where you could fight your PE teacher. Within three months you understood arrays better than you understood other children.
One of your side hustles was helping out at a local hardware shop on Saturdays. The owner, Derek, was drowning in paper ledgers and couldn't tell if he was up or down on rawlplugs. You wrote him a stock and bookkeeping system in BASIC. He paid you £400 - enough for a proper PC. You ported the whole thing to C, added invoicing, and started selling it to every small shop within a 30-mile radius.
By 19 you were a software tycoon. You drove a midnight blue Rover 825i - not flashy, but it had leather seats and electric windows, which meant you'd made it. You bought your first house with a large outbuilding that you converted into your programming cave: three monitors, a mini fridge, a whiteboard covering an entire wall, and no natural light. Paradise.
But you were not happy. You had mass-produced your beautiful hand-crafted software and now you spent all day not writing code but handling licence keys, faxing invoices, and on the phone to people who couldn't find the ANY key. You had become a businessman. The thing that saved you had swallowed you whole.
You started drinking. First a beer while debugging. Then a bottle of wine during a compile. Then whisky before breakfast. You missed deadlines. Version 4.1 shipped with a rounding error that made every customer's accounts look like they owed the Inland Revenue six figures. They wanted refunds. Then they wanted blood. You had to sell the house, the Rover, the programming cave - all of it.
Now you are living under a bridge with your dog, a retired greyhound called Segfault. You have a phone with a cracked screen and a terminal app. You stare at the blinking cursor. It stares back. This is it. One last gig off a freelancing board. One shot at redemption.
The task: please write validation for this form. No mistakes."
(click >here< to upgrade your subscription*)
* for your convenience, allowances and rates will automatically scale up by a factor of ten each time you exceed your new limits. Click >No< if you don't disagree that this isn't what you don't want us to avoid doing[1])
[1] I may have lost count myself!
I've read that LLMs actually perform better if you tell them they're bad at their job. It causes them to second guess and scrutinize their answers more.
I think LLMs should have two outputs, an "expert" and a "dummy", kinda like System 1 and System 2 thinking. Then a judge can automatically choose which answer is better. Otherwise, if all you have is the dummy, it's probably going to choose just as many bad answers from second-guessing and overthinking everything just as much as it would choose bad answers from not thinking at all. I kinda want to try this now, shame I don't use LLMs for anything more than text classification. I never use "agents" for anything since you can't trust them as far as you can throw them.
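The "expert vs dummy plus a judge" idea above can be sketched in a few lines. This is a hypothetical illustration only: `llm()` and `judge()` are stand-ins (a real version would make actual model API calls, and the judge here uses a placeholder length heuristic rather than any genuine quality scoring).

```python
def llm(prompt: str) -> str:
    # Stub for a real model call; returns canned text so the
    # selection logic below can be demonstrated end to end.
    if "expert" in prompt:
        return "confident answer"
    return "hedged answer"

def judge(question: str, a: str, b: str) -> str:
    # Stub judge: a real one would be a third model call scoring both
    # candidates against the question. Placeholder: prefer the longer
    # (more worked-through) answer.
    return a if len(a) >= len(b) else b

def two_persona_answer(question: str) -> str:
    # Ask the same question under both personas, then let the judge pick.
    expert = llm(f"You are an expert. {question}")
    dummy = llm(f"You are bad at this, so double-check everything. {question}")
    return judge(question, expert, dummy)

print(two_persona_answer("Write validation for this form."))
```

The point is only that the routing logic is cheap; whether a judge model can reliably tell the two apart is the open question.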
> It causes them to second guess and scrutinize their answers more.
Does it, though?
Most LLMs operate in a single pass. By specifying "you are an idiot", you are simply navigating to a noisier area of the model, which is perhaps less over-fitted. It's not as if these things have any logic by which to introspectively "scrutinise" themselves.
I chalk it up to its training data. I'd hazard a guess that real people who genuinely second-guess themselves are more intelligent and post better solutions, while confident people think they don't need to learn and post their incorrect solutions. Therefore, while the model isn't genuinely second-guessing itself, it is repeating what people who second-guess themselves have done. It's kind of like cargo cult programming that actually does something: you don't know why a line of code is required, but if you remove it then things stop working, so you learn to repeat it anyway. An overconfident person might not even question that, not know they need that line to begin with, and then fail to include it, possibly because they didn't even test the code.
If you care more about accuracy and facts, as one inescapably does occasionally - incredible but true.
After a lot of twaddle about personae, prisms, or whatnot, one of the less demented of this AI crew arrived at the conclusion that if you actually want an answer, and not an excuse to fiddle with yourself for a few hours, just ask the effing question.
How about trying "You're a fuckwit and you only have one hour left to code this app before your manager kicks you off the project, you will lose your job, your wife will leave you and your children will go hungry".
Might make it work harder...
This is covered in quite some detail in one of Daniel Kahneman's books.
He looked at why it was that the more senior an insurance actuary became the less accurate their risk assessments would be.
It came down to perceived expertise. At a more junior level, the actuaries would take a blended approach to risk by asking other people for advice. The more senior they became, the less they felt they should, and so they tended to deviate from the average more and more.
And in risk, being under or over can mean either selling the insurance too cheaply, risking the entire business, or facing lawsuits for underpaying on claims.
These LLMs are trained on our own behaviour, so hardly surprising that they replicate it.
These things are statistical generators of text. The corpus of internet training material shows us that anyone loudly identifying as a software expert will, on average, not be very good at it. So we would expect that telling the machine it is that expert would then produce a less impressive outcome.
I wonder if you could lead it to other behaviors you desire, like commenting and following standards. If it never has to consider understanding the output then everything is green field and you can always do whatever strikes your fancy. But code that's readable enough to understand and modify would be novel.
Is it really that surprising that starting off a roleplay chatbot with a good roleplay beginning improves the roleplay chat, while writing a roleplay that is the opposite of the intended task decreases how well the LLM can copy code? (After all, an "expert programmer" would use a library via its API, or copy a whole program, instead of haphazardly copy-pasting chunks of code from the library or program and adding noise.)
Most tedious vibe coders I've ever dealt with, even second or third hand, go to great lengths to tell you what expart codars they are but Claude has just transformed their life because now they can shit out a completely fragile insecure app in a day, which they couldn't do before even though they were total expart codars!1one!
Except... of course, anyone actually competent can turn out an app in a day; this is a solved problem and there are plenty of existing frameworks.
As usual an LLM makes total incompetents slightly competent so they think it's a miracle. And of course they don't have the experience or knowledge to know about all the downsides which make them worse for anyone competent doing anything they actually care about.
Like cryptocurrency, the people the LLM coding bots help the most are criminals.
I'm honestly tempted to create a blog on this topic, because even experienced software engineers in some cases lack an understanding of how to prompt and create context. You don't simply prompt the AI and tell it that it's an expert programmer, or an expert in any other technical discipline such as system administration. First, let me qualify myself: I'm a software engineer with 40 years of experience.
Yes, you want to tell the AI that they are an expert, but you want to frame that statement with context: you have to describe exactly what technologies they have expertise in, and you also have to describe software methodologies - and I don't just mean OOP/OOD, but other principles like SOLID/DRY. Moreover, you need an effective way to communicate the requirements; my personal preference is Agile and BDD. IT'S IMPORTANT TO STRUCTURE YOUR CONVERSATIONS! This is the short version. Hopefully I'll get around to doing a blog or a YouTube video.
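That layered-context approach can be sketched as a simple prompt builder. This is a minimal illustration, not a standard: the section names ("Role", "Methodology", etc.) and the example requirement are my own invention, assuming you want the expertise claim framed by technologies, principles, and a BDD-style requirement as described above.

```python
def build_prompt(technologies, principles, requirement):
    # Assemble a structured prompt: expertise is stated alongside the
    # specific technologies and methodologies it applies to, followed
    # by the requirement in Given/When/Then (BDD) form.
    sections = [
        "Role: senior software engineer with expertise in "
        + ", ".join(technologies) + ".",
        "Methodology: follow " + ", ".join(principles) + ".",
        "Requirement (BDD style):\n" + requirement,
        "Constraints: ask clarifying questions before writing code.",
    ]
    return "\n\n".join(sections)

prompt = build_prompt(
    ["Python", "PostgreSQL"],
    ["SOLID", "DRY"],
    "Given a signed-in user, when they submit the form, "
    "then invalid fields are reported before saving.",
)
print(prompt)
```

The design point is simply that each claim of expertise is anchored to something concrete the model can act on, rather than a bare "you are an expert" line.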