IBM compiles dataset to teach software how software is made: 14m code samples, half of which actually work • The Register Forums

Wednesday 12th May 2021 07:14 GMT vtcodger

Work Product?

...half of which actaully work

Is the reg's spell checking perchance done by one of CodeNet's work products?

Wednesday 12th May 2021 07:26 GMT Pete 2

correct, secure, fast - choose one.

> About half of the samples work as expected (hopefully the authors did not expect it to fail?)

Functionality is nice, but to do it securely is better. If this IBM data can be used to re-write code so that it is hardened against hacks, then it might have some use.

And best of all is if the code can be made to work efficiently and without bloat.

Wednesday 12th May 2021 08:46 GMT don't you hate it when you lose your account

Full disclosure

Is the Horizon Post Office code in there? And will it be used for ethics training?

Wednesday 12th May 2021 08:51 GMT Whitter

AIs are very context sensitive

So they intend to build an AI that might help me code one of two specific tasks?

Wednesday 12th May 2021 09:21 GMT RobLang

Re: AIs are very context sensitive

Not necessarily. It depends on the encoding of the training set. If you encode such that the neural network learns only the structure of the test then it will do just as you say. However, if you encode the training data such that it learns the grammatical forms of function and process then it should be able to spot good and bad programming without the context of the tasks that IBM set.
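
As a rough illustration of that distinction (a sketch only; nothing here is IBM's or CodeNet's actual encoding): Python's standard `ast` module can reduce a program to its sequence of syntax-node types, discarding identifiers and literal values. The function name `structural_tokens` and the sample snippets are made up for this example.

```python
import ast
from typing import List


def structural_tokens(source: str) -> List[str]:
    """Encode source as AST node type names, dropping names and literal values."""
    tree = ast.parse(source)
    # ast.walk visits every node in the tree in a deterministic order
    return [type(node).__name__ for node in ast.walk(tree)]


# Two snippets with different names and constants but the same shape
snippet_a = (
    "def total(xs):\n"
    "    s = 0\n"
    "    for x in xs:\n"
    "        s += x\n"
    "    return s\n"
)
snippet_b = (
    "def count_up(items):\n"
    "    acc = 1\n"
    "    for item in items:\n"
    "        acc += item\n"
    "    return acc\n"
)
# A snippet that solves a similar task with a different structure
snippet_c = "def total(xs):\n    return sum(xs)\n"

same_shape = structural_tokens(snippet_a) == structural_tokens(snippet_b)
different_shape = structural_tokens(snippet_a) != structural_tokens(snippet_c)
```

Under this encoding the first two snippets map to the same sequence even though no identifier matches, while the third does not. A model fed such sequences sees grammatical form rather than the wording of any one task.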

Wednesday 12th May 2021 09:17 GMT steelpillow

Flying a Kite

All IBM need to do now is to specify CodeNet in terms of inputs and expected outputs, and feed it back to itself.

Wednesday 12th May 2021 10:44 GMT Howard Sway

it was collected from entries submitted to two programming contests

I find the number of entries to these contests rather staggering, but this is the ultimate form of design by committee. If choosing only on the basis of correct output from input, the system is going to label no end of bad, misguided and downright risky techniques and solutions as "good".

And I don't want that anywhere near my software, either the stuff I write or run.

I'm still sceptical about many of the efforts in this area, as to me good programming is much more about clean design than simply coding to produce expected output, and good work is much better identified by someone who appreciates this. At the very least I'd hope they trained it exclusively on examples which skilled people agree are good and demonstrate best practices.
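
The worry about grading purely on input and output can be sketched as follows. This is a hypothetical checker, not how the contests or CodeNet actually label submissions; `add_safe`, `add_risky`, `passes` and `CASES` are invented names. Both solutions produce the expected output, yet one of them runs `eval` on its input.

```python
def add_safe(line: str) -> int:
    """Parse two whitespace-separated integers and add them."""
    a, b = line.split()
    return int(a) + int(b)


def add_risky(line: str) -> int:
    """Same answers on well-formed input, but eval executes arbitrary expressions."""
    return eval(line.replace(" ", "+"))


def passes(solution, cases) -> bool:
    """Judge a solution only by whether its output matches the expected output."""
    return all(solution(inp) == expected for inp, expected in cases)


# Hypothetical test cases: (input line, expected sum)
CASES = [("1 2", 3), ("10 5", 15), ("0 0", 0)]

safe_ok = passes(add_safe, CASES)
risky_ok = passes(add_risky, CASES)
```

A judge that only compares outputs marks both as accepted, so nothing in the label tells a model that the second technique is one to avoid.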

Wednesday 12th May 2021 11:00 GMT trevorde

What IBM did next

Use this AI to automatically write their offshored code, thereby making more offshored staff redundant. Code quality is still poor but customers do not notice. IBM announces record profits. Ginni gets a new helicopter.

Wednesday 12th May 2021 18:49 GMT yetanotheraoc

Re: What IBM did next

You nailed it. They are training the AI to copy/paste code, so the code quality will be exactly the same as the current outsourced solutions. When the code doesn't work as expected (you can read that two ways), they will train the AI to ask questions on Stack Overflow.

Wednesday 12th May 2021 13:36 GMT picturethis

Skeptical...

It seems like this effort is akin to converting the ladder logic of a PLC (programmable logic controller) into another programming language, or vice versa. I don't quite see the utility of this. If they're just trying to convert X number of inputs to Y number of outputs, then it is just a state machine. State machines can be very elegant (but those are usually quite obfuscated) or can be very inelegant (and usually easier to understand). Computers are very good at predictable behaviour (even without AI). Granted, most correctly written software, excluding AI, can be distilled down to gigantic (predictable) state machines. Lucky for us humans.

To extend this thought, this is what FPGA tools already do: take Verilog / VHDL and turn it into a set of bits that define a huuuuge state machine that runs in the FPGA logic gates. Again, this has been done.

Taking examples from human programming for examples of (good) security just seems..... wrong. We (humans) aren't very good at that.

And lastly, why would computer (AI) generated language (designed by humans) to run on a computer be desirable? Once it's generated, humans are going to review and comment on the correctness, after the AI has already generated it based on learned examples (from potentially billions of input examples - both good and bad)?

We go from:

problem -> human -> (programming) language source -> preprocessed -> compiled machine code of choice

And with AI:

problem -> AI -> (programming) language source -> preprocessed -> compiled machine code of choice

Why not just:

problem -> AI -> compiled machine code?

We exist to serve our AI overlords.

Wednesday 12th May 2021 14:23 GMT blah@blag.com

Countdown to ...

Skynet? We should have a Skynet Clock similar to the Doomsday Clock. I'm actually only half joking.

Wednesday 12th May 2021 17:44 GMT Anonymous Coward

Python

Looking at the first two pie charts in the paper, the Python and Wrong Answer pie slices are almost identical.

Coincidence? I think not.

Thursday 13th May 2021 06:08 GMT spireite

This is not new to IBM......

I've been witnessing their code quality for years, they've just announced it, after 10 years of trusting it ......
