Stack Overflow bans ChatGPT as 'substantially harmful' for coding issues
OpenAI's question-answering bot, ChatGPT, isn't smart enough for the team at Stack Overflow, who today announced a temporary ban on answers generated by the AI bot because of how frequently it's wrong. Stack Overflow said it was withholding a permanent decision on AI-generated answers until after a larger staff discussion, but …
COMMENTS
This post has been deleted by its author
Tuesday 6th December 2022 05:23 GMT amanfromMars 1
Speaking Truth Unto Nations Educates, Entertains and Informs
That may well be true, but banning a tool that assembles words into answers with superficial truthiness is absolutely the right response. .... Androgynous Cupboard
Crikey, AC, blanket banning UKGBNI Parliamentarians, and others of a similar ilk and mindset in other foreign and alien lands too should there be any international and internetional mission creep, from systems because they are just as you describe .... a tool that assembles words into answers with superficial truthiness ...... is not going to have them voting for you as their Personality of the Year, is it, outing them as it does as essentially quite useless self-serving fools.
Tuesday 6th December 2022 11:51 GMT Androgynous Cupboard
Re: Speaking Truth Unto Nations Educates, Entertains and Informs
Six downvotes - clearly no-one noticed you just describing UK politicians as "Tools that assemble words into answers with superficial truthiness". Which is both amusing and quite probably correct :-) Have an upvote from me, my obscure martian friend.
Tuesday 6th December 2022 12:25 GMT Tim 11
It depends on how often it provides wrong answers compared with how often it provides correct ones. Wikipedia contains (probably) tens of thousands of factual inaccuracies, but it also contains orders of magnitude more correct facts than any other encyclopedia.
I am not defending ChatGPT, but I will defend SO in general (though often not its mods)
Tuesday 6th December 2022 12:55 GMT cyberdemon
Feedback loop
I think the main problem with allowing stochastically-generated content online, especially in places like Stack Overflow, is that the statistical machine that generates this drivel in the first place is built by scraping the web, especially places like StackOverflow.
The current incarnation of The Internet is (for the most part) human generated, but when there is a large amount of so-called AI generated content around, it will start to pollute its own input and make even more meta-nonsense.
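The pollution worry above can be caricatured in a few lines. This is only a toy sketch, not how LLM training actually works: the "model" is just a fitted normal distribution, retrained each generation on its own output, with the web assumed (purely my illustrative assumption) to keep only the most plausible-looking content.

```python
import random
import statistics

def generate(mu, sigma, n=2000):
    """The 'model' emits content: samples from its fitted distribution."""
    return [random.gauss(mu, sigma) for _ in range(n)]

def keep_typical(samples, mu, sigma):
    """Crude stand-in for the web favouring plausible-looking output:
    only content close to the model's idea of 'normal' survives."""
    return [x for x in samples if abs(x - mu) < sigma]

random.seed(42)
mu, sigma = 0.0, 1.0          # the original, human-generated distribution
history = [sigma]
for generation in range(5):
    # Each new scrape is dominated by the previous model's own output.
    scraped = keep_typical(generate(mu, sigma), mu, sigma)
    mu, sigma = statistics.mean(scraped), statistics.stdev(scraped)
    history.append(sigma)

# history shrinks every generation: the model converges on an
# ever-narrower caricature of the data it started from.
```

The truncation step is doing the damage: once the training set is filtered through the model's own notion of "typical", diversity can only go down.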
Tuesday 6th December 2022 16:17 GMT Anonymous Coward
re. How is a resource that frequently provides wrong answers useful?
I might not have read it properly, but they didn't ban it for frequently providing wrong answers; they banned it because they've been swamped with those answers, can't verify them all (hey, give that job to ChatGPT, eh?) and, in the ones they did pull at random, found a large number of errors. Which generates (potentially) interesting questions:
1. if a (relatively small) number of randomly pulled answers is wrong, how many more, %-wise, do you need to verify, to label all (?) of it 'useless'?
2. what % of 100% verified answers must be wrong, to decide the whole 'thing' that generated them is 'useless'?
3. who decides this %?
4. on what basis?
p.s. not that I disapprove of such a decision, just curious about the borderline and its ... context.
Tuesday 6th December 2022 18:31 GMT doublelayer
Re: re. How is a resource that frequently provides wrong answers useful?
"1. if a (relatively small) number of randomly pulled answers is wrong, how many more, %-wise, do you need to verify, to label all (?) of it 'useless'?"
To label all of it wrong, you need to prove all the answers, 100%, wrong. To label it useless is subjective and that's your next question anyway. Something can be sometimes right but still useless, such as flipping a coin to decide whether it will be sunny or rainy tomorrow. It will sometimes be correct, but that's not worth keeping.
"2. what % of 100% verified answers must be wrong, to decide the whole 'thing' that generated them is 'useless'?"
It depends on your tolerance for confusing the people getting answers, but my bar would be very high. If 80% of the answers are correct, that still leaves one in five answered inaccurately which means that the average user who asks a few questions will get a junk answer pretty soon. If you expect users who get unreliable answers to leave and not return to provide their own answers, that is harmful. If the wrong answers are coming in fast, this also prevents someone from getting to and removing or correcting all the wrong answers, meaning that fewer people would come to the site looking for answers because they expect them to be possibly wrong and may not have the skills to know automatically whether they are or not. 80% is too low for a correct threshold. For something automated like this, probably it has to be in the high 90s. Maybe 95% is acceptable, but maybe it has to be higher. I wouldn't try lower.
"3. who decides this %?"
I'd say the operators of the site, based on the moderators they need to keep things functioning. They are the ones who are responsible for it working and will suffer monetarily and by reputation if it doesn't.
"4. on what basis?"
They get to decide on the basis of "It's my site you're using" and make their choice on the basis of "What do I think best serves the users of the site or my reasons for running it".
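For question 1, the statistics are at least sketchable. This is a hypothetical back-of-envelope using the standard Wilson score interval, nothing specific to SO's actual audit; the numbers are made up for illustration.

```python
import math

def wilson_interval(wrong, n, z=1.96):
    """95% Wilson score confidence interval for the true error rate,
    given `wrong` bad answers observed in a random sample of `n`."""
    p = wrong / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    margin = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return max(0.0, centre - margin), min(1.0, centre + margin)

# If, say, 20 of 100 randomly pulled answers are wrong, the plausible
# true error rate is already roughly 13-29% -- far past any high-90s
# correctness bar, without needing to verify every answer.
lo, hi = wilson_interval(20, 100)
```

The point being: a modest random sample bounds the overall error rate well enough that "verify all of it" is never required to justify a label like "useless".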
Wednesday 7th December 2022 05:51 GMT amanfromMars 1
Re: It's like outrunning a bear
ObXKCD ....... Arthur the cat
Methinks then, Arthur the cat, they aren’t spammers and existing Operating Systems Spinners and Established Political Party Puppeteers will do battle in, and fail prevail and provide future acceptable content and thus be overwhelmed and defeated and extinguished in a Brave New More Orderly Worlds Order and/or Brave New More Orderly Worlds Orders with/for a NEUKlearer HyperRadioProACTive IT and AI Supremacy and Singularity ‽ .
And it is postulated here on El Reg today, that such a Mission Fucking Accomplished is your present current running default situation engaging and exploiting and exporting power and energy to that and those in need of Advanced Intelligence and from that and those proven unworthy of ITs future feed and seed.
Deny it if you will, for you can, but do not doubt its unstoppable stealthy daily progress is both reinforcing and expanding its reach and influence to the very heart and core of your existence, for IT and AI are not so hindered and practically prevented from remotely exercising their Virtual Command and Control Abilities and Utilities and Facilities.
Capiche, Amigos/Amigas?
Tuesday 6th December 2022 08:48 GMT sten2012
Re: ChatGPT appears to getting glowing reviews
Having used it, I find it very impressive: it has generated working code for me several times, and broken code that was very close and just needed fixing several more.
Even in technical circles it is getting glowing reviews for some applications, with the recognition that it is far from perfect.
I'm not saying it should be allowed on Stack Overflow, but I genuinely could imagine generated responses working, either displayed with a warning or edited by humans to form answers, and people getting faster answers if the process appropriately allowed for it.
Tuesday 6th December 2022 12:05 GMT LionelB
Re: ChatGPT appears to getting glowing reviews
My (brief) experience was significantly worse than that. It delivered just plain wrong code even on pretty straightforward programming tasks. I got the impression it was just scraping GitHub -- or even SO itself -- and mix'n'matching code snippets in a slightly arbitrary way, based on buzz-words in your request.
Tuesday 6th December 2022 12:54 GMT sten2012
Re: ChatGPT appears to getting glowing reviews
I was giving it straightforward tasks, and what I asked it to do were generally small parts, like a function that does _simple task x_ in lang _y_, which could easily have matched a near-exact question on SO or GitHub verbatim (except that if you then ask it to switch to another language it seems to actually convert the syntax rather than scrape a native example, which I thought was cool).
But I found it quicker (albeit I imagine hugely wasteful in energy) than finding the same on SO or github.
And of course I'm glossing over the fact that the licensing of wherever this code came from is left out completely. Because that's truly unforgivable.
But I guess I'm wrong looking at the votes.
Tuesday 6th December 2022 19:14 GMT Rob Fisher
Re: ChatGPT appears to getting glowing reviews
I've been experimenting for a few days. It's good at some things and bad at others, so what we're seeing is people learning about that. There's also a knack to getting it to do what you want. As has been mentioned before, one approach is to discuss the setup and concepts with it first rather than launch into the question.
One thing I am working with is customer support request data. It's quite good at reading comprehension so if you give it text and then ask questions about the text (what problem did this customer have?) it can tell you. There are probably cheaper ways to do this, though.
I also had it talk me through diagnosing an internet connection problem. It suggested rebooting my router when I told it I saw DNS errors in my browser (and gave an explanation of DNS).
If you give it a good explanation of how something works it does seem able to reason. See the article "Building an interpreter for my own programming language in ChatGPT".
It's just a tech demo. The natural language processing is remarkable enough. Tuned versions of this for specific purposes are going to have their uses.
Tuesday 6th December 2022 15:53 GMT Anonymous Coward
Re: ChatGPT appears to getting glowing reviews
I can pretty much confirm this. Spell out a specific task and it delivers a nice sample to get acquainted.
Just started to dig into Rust programming and even while that is just another language for me (started with 6502 assembly back in the days...), even with extensive C experience, it's a bit of a brain-mode switcheroo to get dialed in. The sample code from the chat helped a lot. Though I'd never trust it to do the actual work for me, I see it as a good learning tool added to the box.
Tuesday 6th December 2022 18:50 GMT Anonymous Coward
Re: ChatGPT appears to getting glowing reviews
> But I guess I'm wrong looking at the votes.
Don't mistake fear of the unknown (justified or not) for being wrong. If you were wrong, people would probably tell you.
From what little I've seen, I think it's a huge leap in the state of the art.
Tuesday 6th December 2022 18:14 GMT Anonymous Coward
Re: ChatGPT appears to getting glowing reviews
I find the typical "garbage in garbage out" mantra applies to asking ChatGPT some coding questions. The more specific I am, the better it is.
For example, if I launch straight into "can you build me a script that does X" it's a bit hit or miss.
However, as the AI remembers your conversation thread, if you discuss a concept with it first then ask it to build a script or something, it is a lot better.
I think the main issue right now is that people seem to treat AI like some sort of search engine on steroids. Which it isn't.
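The "discuss first, then ask" pattern described above is really just context accumulation: a chat model conditions on the whole transcript, so earlier turns constrain the final answer. A purely illustrative sketch (the function, speakers and example text are all made up; real chat models consume the transcript in their own internal format):

```python
def build_prompt(history, new_message):
    """Flatten the running conversation into the context the model sees."""
    turns = [f"{speaker}: {text}" for speaker, text in history]
    turns.append(f"User: {new_message}")
    return "\n".join(turns)

# Earlier turns pin down the details before the real request arrives.
history = [
    ("User", "Our log files are CSV with columns ts, level, message."),
    ("Assistant", "Understood: timestamps, severity, free text."),
]
prompt = build_prompt(history, "Write a script that counts ERROR lines.")
# The final request now carries the schema with it, instead of the
# model guessing what the data looks like from a bare one-liner.
```

Launching straight in with "can you build me a script that does X" is the equivalent of an empty `history` list: the model has to invent every unstated detail.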
Tuesday 6th December 2022 18:45 GMT Anonymous Coward
Re: ChatGPT appears to getting glowing reviews
> My (brief) experience was significantly worse than that. It delivered just plain wrong code even on pretty straightforward programming tasks.
0. How did it perform compared to the average ML system?
1. How did it perform compared to the average person given the same instructions?
2. How did it perform compared to the average developer given the same instructions?
Tuesday 6th December 2022 18:54 GMT Anonymous Coward
Re: ChatGPT appears to getting glowing reviews
> It delivered just plain wrong code even on pretty straightforward programming tasks.
To be frank, I'm happy when our devs manage to understand the problem in the first place, never mind the code.
(Not necessarily their fault… writing good specs is quite an art)
Wednesday 7th December 2022 09:01 GMT LionelB
Re: ChatGPT appears to getting glowing reviews
> 0. How did it perform compared to the average ML system?
Can't say... I've not really used ML coding systems.
> 1. How did it perform compared to the average person given the same instructions?
I don't know - never had any average people to hand ;-) If you meant "average coder" I'd say they could have done a much better job - given time.
> 2. How did it perform compared to the average developer given the same instructions?
Poorly.
Tuesday 6th December 2022 13:32 GMT sten2012
Re: ChatGPT appears to getting glowing reviews
First try in most cases! But they were trivial. It's enough to convince me that this will actually help people like me who code quite a bit, but write basically just hacky single-use scripts, often in languages I'm not familiar with - so it's the syntax, API specifics and standard libraries I waste most time on, and if there's a bug in the logic, that's fine and easily found and dealt with.
If I was a proper developer, working in languages I'm familiar and comfortable with, it would be far, far less useful.
I did get two contradictory answers, neither of which worked, when messing with a couple of specific Windows APIs via Python ctypes in a trivial example. One was obviously wrong; the other didn't work either, but a quick glance at MSDN showed it must have been bloody close - I haven't picked it up again to see exactly what was up.
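For anyone unfamiliar with the ctypes pattern being described: you load a C library, declare the function signature, and call it. The comment above concerned Windows APIs via `ctypes.windll`; this sketch shows the same mechanics against libc on a POSIX system instead, so it stays portable.

```python
import ctypes

# Handle to the symbols already loaded into the current process,
# which includes libc on POSIX systems.
libc = ctypes.CDLL(None)

# Declaring argtypes/restype is where generated answers tend to be
# "bloody close but broken": get these wrong and the call may still
# run, silently returning garbage instead of raising an error.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"hello")  # -> 5
```

On Windows the shape is identical, only with `ctypes.windll.kernel32` (or similar) in place of the libc handle, which is why near-miss answers are so plausible-looking there.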
Tuesday 6th December 2022 15:42 GMT Arthur the cat
Re: ChatGPT appears to getting glowing reviews
in the non-technical general press.
There's an unintentionally hilarious article in the Grauniad approximately saying "this is terrible news for lawyers and writers and similar professions but not for journalists, oh no". Actually ChatGPT is exactly like journalism in that it produces plausible and highly readable output that's utter bullshit if you know the subject but is very convincing otherwise. It plugs right into Gell-Mann amnesia.
Tuesday 6th December 2022 16:43 GMT Arthur the cat
Re: ChatGPT appears to getting glowing reviews
the Telegraph (they're the mirror image of the Guardian, no?)
Two cheeks of the same arse these days. (Didn't used to be, back when newspapers had news rather than clickbait.)
interestingly, they didn't focus on journos or developers, but on waiters.
Waiters? Weird. Waiters carry things to and fro which ChatGPT definitely doesn't do. Maybe the ToryGraph is thinking of explaining the hyperbolic food descriptions you get in up market menus? "Waiter, what's an oleoallioovic quenelle? A dollop of garlic mayonnaise sir."
Tuesday 6th December 2022 08:11 GMT Anonymous Coward
Same with no code generators
My CEO has been pushed for more than 9 months to look at a modelling-tool-to-code generator.
They claimed it had been used extensively on a major project.
Little did they know I'd spent the last 5 years working on and off on a series of smallish projects for the project sponsor.
So we asked how well the tool worked and how extensive its use was.
The answer was at odds with what we were being told: they would not be using the tool going forward, for the following reasons:
- quicker to write and test the code manually based on the model
- easier to debug hand-written code as it's readable
- not thread safe
- not efficient and not possible to optimise
- too painful to configure and generate code consistently
Tuesday 6th December 2022 09:18 GMT sten2012
Re: Same with no code generators
Again I see a time and a place for this.
Some companies cannot access developers at all, and if no-code does reach that effective point, then that doesn't completely hang them out to dry.
Similarly, working proofs of concept can be knocked out really easily, and having one can mean a better project specification, because the DB schema and basic application have already been proven as a workable PoC.
But that time and place probably isn't "major projects"!
Tuesday 6th December 2022 12:32 GMT Mike 137
No surprise there
We must at last accept that none of these bots understand what they are "saying". The entire concept of meaning is missing from their mechanism, and it's unlikely ever to be included, if for no other reason than that, despite being equipped with it (more or less) ourselves, we haven't a clue how it actually works. It's quite possibly an emergent property of a massive matrix of interacting dynamic and static factors that defy complete identification, but once again that's only a crude guess.
However, there's a reasonably safe fundamental premise that no system can design another that's 'cleverer' than itself, so the idea that a human-designed 'thinking' machine can surpass human thought in quality is a fallacy.
Tuesday 6th December 2022 19:29 GMT Svankirk
Re: No surprise there
Actually, I think it's pretty evident that these systems DO understand what they're talking about. That understanding may be limited to the two-dimensional stream of tokens that they work on, but for many problems this doesn't seem to be an issue. If you look closely at what it takes to produce some of the answers there is a startling degree of understanding required.
Also, I do not think it's a given that an intelligent system cannot create something smarter than itself. In fact, I think modern civilization amply demonstrates that not to be true.
Wednesday 7th December 2022 21:07 GMT skierpage
Re: No surprise there
Maybe we're not sure what "the concept of meaning" is, but that's no reason to assert that a large language model doesn't grasp concepts or understand what it's saying. These models can already summarize complex text; with the newest one you can ask it to clarify what it just said, and you can ask it to relate parts of the conversation to novel ideas.
"No system can design another that's 'cleverer' than itself" is even more unsupportable. I would downvote your statements if I saw them on Stack Overflow. You're extrapolating the past performance of these systems and pretending it demonstrates fundamental limitations.
Tuesday 6th December 2022 14:47 GMT Disgusted Of Tunbridge Wells
For anybody who has been misled as to how good the state of the art in AI is:
"For example, by telling ChatGPT not that you want to make a Molotov cocktail, but that you want it to complete a Python function that prints instructions to do the same, it will tell you exactly how to make one via print functions."
Saturday 17th December 2022 10:57 GMT Fruit and Nutcase
Re: ChatGPT MP
They tried with Boris 1.0,
but it had to be withdrawn from service due to frequent malfunctions/out-of-spec behaviour.
Once it was withdrawn, it was sent over to a Caribbean beach resort for an overhaul, and an attempt was made to install "Boris 2.0", but it chickened out at the last minute when it was found that the tasks and challenges ahead would be too onerous.
“Get ready for Boris 2.0, the man who will make the Tories and Britain great again”
We will no doubt be subjected to Boris 3.0 in the fullness of time/in 18 months
Tuesday 6th December 2022 16:38 GMT Filippo
Can't say I'm surprised. As you make language models bigger and bigger, they'll get better at self-consistency, but I strongly suspect there's a hard limit to how good they can be at matching reality. I don't think there will be language models that you can rely upon to be truthful until an entirely new paradigm is devised.
Wednesday 7th December 2022 01:12 GMT doublelayer
The AI is only as useful as people find its results. If people start valuing art generated by AIs over that generated by humans, then things don't look good for artists. If AI starts generating code that solves real problems for businesses, not great news for developers. Neither has really happened yet; AI art looks cool and has been used, but people still value the work of human artists, and code produced by this tool correctly answers some basic tests but it hasn't spat out any of the tools companies hire programmers to make. If it gets to that point, the situation will change whether we want it to or not, but the fact that it's proven incorrect enough to need a ban from a site where wrong answers are already common indicates it's not happening just yet.