We tried it
And then tried a subset of an infinite bunch of monkeys hammering at an infinite number of keyboards.
The monkeys made better code
ChatGPT, OpenAI's fabulating chatbot, produces wrong answers to software programming questions more than half the time, according to a study from Purdue University. That said, the bot was convincing enough to fool a third of participants. The Purdue team analyzed ChatGPT’s answers to 517 Stack Overflow questions to assess the …
Shuffling is another matter: once a paragraph is found in its contexts, it is rewritten anew. However, nobody knows, and cannot say a priori, whether anything in the paragraph was more or less true. Google, for example, solves the problem through popularity: a lot of people manually deciding how truthful the paragraph is.
The new paradigm does not allow the use of "popularity", but there is a solution: personalization, the creation of individual AIs. What I called lexical clones 20 years ago: you deal with individuals you trust, instead of strangers.
Tried it with Java using a simple serializable class for holding a bit of text ... failed
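For reference, the sort of trivial class I mean - a minimal sketch rather than the exact code I fed it (the names are illustrative):

import java.io.Serializable;

// A trivial serializable holder for a bit of text, roughly as described above.
public class TextHolder implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String text;

    public TextHolder(String text) {
        this.text = text;
    }

    public String getText() {
        return text;
    }
}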
Tried it with some 6502 assembly code... illegal command.
If I've got to spend ages tuning the input to ChatGPT in order to get decent code out... I may as well save myself the bother and write the code myself...
It would be more convenient to construct a robot that programs with no programmers involved at all. Right now GitLab and OpenAI's ChatGPT act as intermediaries between the programmer and his customer on the one hand, and the computer on the other, even though that doesn't make any sense: the customer could well do without the programmer. OpenAI got its funding, and soon there will be no more programmers.
"Even when the answer has a glaring error, the paper stated, two out of the 12 participants still marked the response preferred. The paper attributes this to ChatGPT's pleasant, authoritative style."
Clearly a bright future as a tech / dev manager.
And *that's* how AI will take over the world, by co-opting middle management. The Skynet model is waaay too much effort.
I've typically gone to ChatGPT for some of the more obscure technical problems that I struggle to find meaningful answers to on Google (nowadays, it seems like Google gives a few pages of mostly unique answers, then just starts repeating itself). I've asked things like how a particular daemon config needs to be written to accomplish X when the documentation doesn't give enough detail, or for a quick script I don't feel like writing, like a batch file to loop through a list of subdirectories and create a separate ZIP file of each one (I do more bash than batch).
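To illustrate the sort of task I mean - sketched here in Java rather than batch, purely to show the shape of it; the class name and paths are made up:

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Sketch: for each immediate subdirectory of a root folder, create a
// separate ZIP file containing that subdirectory's files.
public class ZipEachSubdir {
    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");

        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(root, Files::isDirectory)) {
            for (Path dir : dirs) {
                zipDirectory(dir, root.resolve(dir.getFileName() + ".zip"));
            }
        }
    }

    private static void zipDirectory(Path dir, Path zipFile) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(dir)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zipFile))) {
            for (Path file : files) {
                // Store entries relative to the subdirectory being zipped.
                zos.putNextEntry(new ZipEntry(dir.relativize(file).toString()));
                Files.copy(file, zos);
                zos.closeEntry();
            }
        }
    }
}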
In most cases, I'd say the answer ChatGPT provides is at least mostly correct. At a minimum, it usually leads me in the right direction to solving the problem. I think therein lies the difficulty it will have in being the big job-replacement tool for many technical-type roles: if you don't understand the nuances of the concept you are dealing with, you probably can't figure out how to fix the little things that are wrong. Your best bet is just asking again and seeing if it can fix it. However, that often leads you down a rabbit hole of frustration.
For example, I was asking questions about using some PowerShell commands to do something. It kept giving me commands (which were valid) with parameters that were not. I had to keep correcting it by saying, "Command X doesn't support the Y parameter". It would apologize, then continue to give answers that simply did not work. I had similar results when asking it how to do some routing/firewall config for a switch. It kept giving me directives that didn't exist for my model or firmware version, even when I told it what I had.
That's why I see ChatGPT as simply another helpful tool, but not something that you should expect will give you exactly what you want or need.
In my experience, when prompted in subjects I know well, ChatGPT produces somewhat to very wrong answers most of the time.
Whether right, inaccurate or completely wrong, it produces equally confident language.
It is therefore extremely likely to produce confidently wrong answers in subjects that I do not know well, or at all.
I cannot tell whether the text it is spewing is approximately correct, dangerously wrong, or merely somewhat inaccurate.
Therefore it is clearly worse than useless.
When it comes to writing code, GPT being 'mostly correct' is a good start because you can generally see where it goes wrong; the same applies to SO answers.
What happens if a complete novice asks for a recipe and the result includes 'Use raw chicken as a salad garnish', or for a simple brake-bleeding procedure that omits to explain why even a small amount of air behind the piston is really dangerous?
A reasonably good knowledge of the subject should be in place before asking GPT for assistance; otherwise the answers will eventually kill someone.
> when they found ChatGPT’s answer to be insightful. The way ChatGPT confidently conveys insightful information (even when the information is incorrect) gains user trust, which causes them to prefer the incorrect answer.
Really stressing the key word here: INCORRECT.
This isn't exactly new behaviour from SO participants; the site has been going downhill for a while.
If anything good is going to come from this work, the very best outcome would be to give SO a damn good wake-up call: your traffic is going down because you are as crap as ChatGPT, but at least the LLM is polite about it! Not that management can actually do anything to change SO answers (and, more importantly wrt politeness, the comments) for the better, can they?
Google usually plops out a few SO results, and SO can be useful... It gave me a nice solution about a year ago for when you don't know the incoming JSON model when you're model binding (roughly the idea sketched below).
The actual answer (from a Russian bloke) was only about 4 years old, but every other solution I looked at presumed you knew the incoming model and went on in full Nazi fashion like you were a cretin if you didn't model bind by default.
The model design given by the sender was, quote, "nearly right", at which I went ape... Bloody open source people. I wasted a week over that, so thanks, Russian bloke and SO.
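This isn't the answer from SO, just a rough illustration of the general idea in Java with Jackson: when you don't know the incoming model, read the payload into a generic tree and inspect whatever turns up, instead of binding to a fixed class (the payload and names are made up):

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.Iterator;
import java.util.Map;

// Rough illustration: handling JSON whose shape isn't known up front by
// reading it into Jackson's generic tree model instead of a fixed POJO.
public class UnknownModelExample {
    public static void main(String[] args) throws Exception {
        String payload = "{\"name\":\"widget\",\"count\":3,\"tags\":[\"a\",\"b\"]}";

        ObjectMapper mapper = new ObjectMapper();
        JsonNode root = mapper.readTree(payload);

        // Walk whatever fields happen to be present.
        Iterator<Map.Entry<String, JsonNode>> fields = root.fields();
        while (fields.hasNext()) {
            Map.Entry<String, JsonNode> field = fields.next();
            System.out.println(field.getKey() + " -> " + field.getValue());
        }

        // Or probe defensively for a field, with a default if it's missing.
        int count = root.path("count").asInt(0);
        System.out.println("count = " + count);
    }
}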
"but every other solution I looked at presumed you knew the incoming model and went on in full Nazi fashion like you were a cretin if you didn't model bind by default."
That is the behaviour that has been dragging SO down in recent years IMO. And a lot of it feels like "I can shove my oar in here and tell everyone that I participate in The Community by giving answers on SO" without any regard for whether that is actually useful to anyone.
There is still plenty of good stuff on SO (as you found) - the older stuff is still present!
I know it’s fashionable to hate on SO, but I think it’s a great resource. Particularly the debates in the comments under the top answers. Even if my question isn’t directly answered it almost always sends me looking in the right direction.
(Of course you have to understand the answer and test that it actually works, yadda yadda.)
It scraped from stackoverflow!
The problem is it does so without understanding, so it can combine correct and incorrect information in a nice wordy way that I guess people like? I suppose if you're knowledgeable enough to know what bits it got wrong you'll be fine. Though if you are that knowledgeable you probably don't need its programming help...
I've used ChatGPT a few times recently, as I've started doing some PowerShell, which isn't my forte. It took about 10 iterations to get a script that worked: testing and feeding the errors back in, tweaking the requirements and generally fiddling about. That was about half a morning, to get a script that would have taken me days to research and write on my own.
I've also fed snippets from SO into it and had the bot comment it fully, so I can understand what's going on.
Great tool; use with caution.
> I've also fed snippets from SO into it and had the bot comment it fully, so I can understand what's going on.
No doubt it was very confident and convincing when it made its comments.
Which is where the risk lies.
> use with caution
Hopefully you were sufficiently doubtful and took the time to verify what it said - and that the SO snippets were actually working and useful ones in the first place. But will everyone be so diligent?
This is it working exactly as intended: the original purpose of these LLMs was only to generate text that is stylistically indistinguishable from their training data. They do not and cannot care whether what they're producing is factually correct. It's purely a side effect that they ever produce factually correct answers.
> Our analysis shows that 52 percent of ChatGPT answers are incorrect and 77 percent are verbose
Without knowing how that compares to the human-supplied answers, it is impossible to form a rational opinion.
My personal experience of seeking help on forums is that many responses are wrong. Many more answer an entirely different question, while the rest are passive-aggressive replies, arguing that the question is wrong, showing off, or disagreeing with what others have posted.
Okay, this shouldn't be a surprise at all if you dig into how GPT and similar are trained.
First, you get an AI that can categorise files (e.g., “this is an image of two kittens playing scrabble”, “this is C source code for a quicksort”, etc.). Then you get another AI that tries to generate files. When the second one can produce documents that the first accepts as matching the description, you are ready to release.
Now, I'm sure you can already see the problem here. The first AI has no knowledge of (a) whether the original descriptions were correct (the kittens may actually have been playing ludo, the C code could have had a bug), or (b) why those descriptions were correct. All it did was build an enormous, opaque network that gave the correct description for its inputs. And the second one had no knowledge either; it just kept making shit up until the first AI accepted it.
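A toy caricature of that loop, with a deliberately dumb "critic" standing in for the first AI - the names and acceptance test are made up, and real training is vastly more involved, but the point stands: acceptance says nothing about truth.

import java.util.Random;

// Toy caricature: a "generator" keeps making things up until a "critic"
// accepts them. Neither side knows whether the accepted output is true,
// only that it passes the critic's test.
public class BullshitLoop {
    // Stand-in critic: accepts anything that merely *looks* like the target
    // (here, simply anything of the same length).
    static boolean criticAccepts(String candidate, String description) {
        return candidate.length() == description.length();
    }

    public static void main(String[] args) {
        String description = "correct answer";
        Random rng = new Random(42);
        String candidate;
        do {
            // Make something up: a random string of lowercase letters.
            StringBuilder sb = new StringBuilder();
            int len = 5 + rng.nextInt(20);
            for (int i = 0; i < len; i++) {
                sb.append((char) ('a' + rng.nextInt(26)));
            }
            candidate = sb.toString();
        } while (!criticAccepts(candidate, description));

        // Accepted by the critic, yet meaningless.
        System.out.println("Released: " + candidate);
    }
}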
Doesn’t that sound like that guy* you hired who could do a great interview, but didn’t know anything about anything once you put them to work?
Basically, we have created machines that can bullshit better than a human. I will leave it to you to decide what level of concern that should attract...
__
* yes, it’s almost always a guy. My interviewing experience taught me that female candidates are almost always honest in interviews, males are about 50/50 between truthful and braggarts.
Arguably, the worst one can be is "no better than a coin-flip".
If ChatGPT answers 99% wrong, it's doing really well - you just need to invert the result.
Obviously we're not talking binary options here; there are many more ways to answer wrong than to answer right - but then it's not a fair question to ask of a coin.
I would tend to agree with the main points of the article.
In my own experience, the bots tend to produce material that seems plausible to anyone who does not know the first thing about the subject at hand.
Out of curiosity I tried to make ChatGPT generate pf.conf (OpenBSD firewall config) to spec, and well, the results are available at https://bsdly.blogspot.com/2023/06/i-asked-chatgpt-to-write-pfconf-to-spec.html or trackerless https://nxdomain.no/~peter/chatgpt_writes_pf.conf.html
TL;DR: the bot produces superficially plausible (to the ignorant) *bullshit*.
I’m surprised the “error rate” is so low.
So far every MS Bing AI response to a technical search query I've made via Edge has been 100 percent wrong and/or inaccurate (I'm now looking at ways, including group policy settings, to disable Bing AI).
The laugh is, once you've fought your way through the obstructive AI b*llocks, an authoritative answer can usually be found within the first page of web search results; if not, switch to Google and repeat the search…