Sounds awful
I can't begin to imagine how bad the code is that comes out of this kind of iterative slop machine.
Open source developer Geoff Huntley wrote a script that sometimes makes him nauseous. That's because it uses agentic AI and coding assistants to create high-quality software at such tiny cost, he worries it will upend his profession. Here's the script: while :; do cat PROMPT.md | claude-code ; done Huntley describes the …
Playing Devil's Advocate here - surely that's the same argument as tutting at the younger generation for not speaking "the Queen's English"? Language changes naturally over time, and its main purpose is to convey information between two or more entities. If kids can communicate with each other perfectly happily by using terms that the older generation are completely unfamiliar with, then their language has still served its purpose - many older people might also call this complete lack of adherence to grammar and vocabulary "slop".
So it is with this - conventional coding has grown, changed, adopted many rules & conventions, changed those rules & conventions, realised its mistakes & improved (in some cases!) - it's a continual evolution, and this is really no different.
If I need light in the room, I flick a switch and on comes a light. Now, I happen to know an awful lot about all the stages of how that light gets created, but none of that knowledge is necessary to flip that switch and put on the lights. If creating an app really does progress to the equivalent of a completely non-technical user flipping a switch and getting an app that does what they want, then that may make a lot of people very worried, but actually isn't that kind of outcome the whole point of IT?
Ultimately it's just a tool to achieve an outcome - there will be bumps and bruises along the way, in the same way that light bulbs sometimes blow, and electrical appliances sometimes catch fire and do an awful lot of damage - but it's all just another step along the evolutionary path.
Maybe some of us are afraid of the change, but life goes on...
I don't think you're entirely wrong, and in a way that makes me sad. The days of software as craft are probably waning, as they were always going to. But it seems a shame if the technology to do it is this.
In a way it is a step on the same road that took us from Assembler to C and from C to Visual Basic and from Visual Basic to Javascript - at each step you can get by knowing less about the layer below. I wonder if there is anyone out there programming web UIs today who understands everything that happens at every level when somebody uses them, from the browser script to the client OS to the client hardware to the server to the server OS to the server hardware and back. I certainly don't and I like understanding things! Once we get to the OS and below I only have the roughest idea.
I agree - it IS sad - I was commenting only the other day about it being great that a bunch of students rebuilt a replica ENIAC out of not much more than cereal boxes - that kind of thing is needed if future developers are going to truly understand their subjects from the bottom up
Unfortunately, while it is sad, it's the same journey as taken by many other crafts - try finding a blacksmith at his forge, or a master carpenter who hand makes absolutely everything - they still exist, but they aren't needed by society any more as anything more than a curio
There's no right or wrong, it's just another step on the journey.
Unfortunately this happened a long time ago, when development turned into a factory line, quantity trumped quality, and we all became beta testers
It lost the art and creation factor. I always saw it as the cleanest form of creation: there was art in the structure and patterns, and in many ways that is far more important than the compiled product, because that structure determines how easily the product can be further enhanced.
The first AI LLMs we are using today equate to the Wright Brothers' first viable flights. No one envisioned 400+ passenger airliners at the time. AI will improve through use, and iterative self-checks will improve output over time.
Tutting over obvious errors in today's AI code is short-sighted and reactive. People who refuse to evolve will be proven wrong over time.
LLMs are not Artificial Intelligence, and never will be.
Genuine AI will very probably come about eventually, but is not likely to be an LLM.
We really do NEED to stop applying the term AI to things which are quite clearly no such thing. Like Tesla's 'Autopilot' and 'Full Self Driving', it is a term that encourages the bulk of the population to expect (and assume) that these things can be relied upon to do things which in reality are things which they are incapable of doing.
Not wanting to be that guy, but in an academic sense all of those things fit under the umbrella term of Artificial Intelligence. It includes stuff like
- Machine Learning
- NLP
- Robotics
- Computer Vision
- Gen AI
There’s also the big theoretical sci fi vision of conscious AI as a living entity, but everyone agreed to refer to that as AGI a while back now and I don’t think too many people are confused as to the distinction anymore tbh
We really do NEED to stop applying the term AI to things which are quite clearly no such thing.
That ship has sailed. People (salesdroids mainly) were calling any kind of software "AI" for years before LLMs arrived and did ... what they do ...
So the chances of putting genie back in bottle now are nil
You got more chance of restoring "Hacker" to its original meaning
"LLMs are not Artificial Intelligence, and never will be"
They certainly aren't AGI. What they ARE very good at is finding specific solutions in a constrained space / for a well-defined problem. That's why LLMs are fantastic at maths and coding, and suck at simple research that a 10-year-old could do. In any case, a good software developer was always first and foremost someone who understood the problem and could come up with a solution - the code always came later. Any idiot can use Claude code to come up with a fantastically coded app, which, likely as not, is a great solution for a completely different problem than the one that needed to be solved.
"LLMs are fantastic at maths and coding"
I think there are quite a few commentards on El Reg who might strongly disagree with your use of the word fantastic there. It may be a useful starting point, but that is all.
In the end, any general LLM is only going to come up with an answer that is already in its repository of data. Anything different that it concocts will come about from the way it constructs the plausible word/number/code salad that it serves up, and is as likely to be garbage as it is to be useful. If used by "any idiot" to write code, that idiot is unlikely to know enough about coding to be able to see whether the code it spews out is good or bad, and equally unlikely to know how to rework it to make it usable. If they do know enough to spot the shortcomings, they could probably have written it correctly in the first place, without needing any input from the LLM.
> equate to the Wright Brothers first viable flights
Being learned in such things, I wish to point out that the Wright brothers contributed next to nothing to the history of heavier than air human flight.
All the necessary knowledge was there since about the 1860s and it was just waiting for contributing technologies to be refined, most notable amongst them the necessity of increasing the specific power output of existing engines. The Wright workshop made no contribution to any of that. They just happened to be in the right place at the right time to become part of an emerging nation's lore.
But there is a massive difference between not understanding every layer in the chain and not understanding even your own layer at all. You as a developer/coder at least have an understanding of the interfaces you're dealing with and what their requirements are. You understand what functions your bit of code is using or implementing. With today's "vibe coding" slop, these "coders" don't understand a single thing about what their program is doing. They don't know what it talks to, they don't know what it does, they don't know what it doesn't do. They just know that the end result they're expecting is happening. There's no care for efficiency, there's no care for standards, there's no care for potential bugs or backdoors. Oh, shiny window appear, good times.
When you flick a light switch, a series of deterministic steps takes place to ensure the light comes on. That process has been refined over many years, is tried, tested and works. Sure, some electrical systems may have quirks, but usually these are shortcuts or bodges where the person implementing them knows it isn't *ideal* but has to get the job done. My problem with using an analogy like this to AI-based code is that it is not deterministic - it will come up with any number of different approaches each time you ask it, and you'll never get to the "best" solution unless you are already proficient enough in the output it produces to be able to tell what that looks like.

As we gradually moved from machine code to assembly to C to other languages and frameworks, I accept the increasing layers of abstraction have meant none of us now are proficient at every layer of the stack. But the crucial point is that thought and experience and understanding went into creating those abstractions. Just blindly asking AI to "make code do thing" and accepting whatever it gives you means we're going to raise a generation of "engineers" who understand nothing at all.

At the sight of the first bug, what are they going to do? Ask the AI to fix it? Good luck with that, because the likelihood is, it will try and refactor the entire repo to fix each individual bug, no doubt generating more bugs in the process. The maintainability, extensibility and overall long-term stability of anything built this way seems highly doubtful.

It would be, to use your original analogy, like flicking a light switch and sometimes getting a light come on and other times starting a fire in the upstairs bathroom; and so you flick the switch a few more times to see whether that will put out the fire, but that then causes the garage door to open and the fridge to turn off. But hey, the light came on.
Now - yes, 100% - I agree we aren't there.
But models like this WILL evolve and improve - remember for starters of course that LLMs being non-deterministic isn't strictly true - they have been seeded that way to give the illusion of creativity, but there's no particular reason that a model couldn't be trained on only "good" programming practices - in fact I'd fall off my chair if there weren't a hundred different organisations already trying exactly that.
Given that the vast majority of code that's still out there at the moment has been crafted to the best of our collective ability, and given that daily news stories prove that much of it is as full of holes as the proverbial Swiss cheese, a tool that could iteratively figure out the best way to achieve a result, whilst avoiding all known pitfalls, and adding to "collective" knowledge as it goes, sounds at least as good a way forward to me as the path we would be on without it.
And for sure - "ask AI to fix it", you say with scorn; I say why not? Take this code, figure out where it's broken, and come up with a solution - rinse, repeat, until it's not broken anymore. Even then take it a stage further: "do this without using these dependencies", "make this run more efficiently" etc etc
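Mechanically that's just the article's one-liner with a stop condition bolted on. A minimal sketch, assuming 'make test' stands in for "not broken anymore" and FIXME.md holds the fix-it prompt (both are stand-ins, not anything Huntley published):

tries=0
until make test || (( ++tries > 50 )); do   # give up eventually rather than drain the account
  cat FIXME.md | claude-code
done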
For a real-world example, just look back at El Reg only yesterday
So why not give it a try and see where it leads?
You can't make a [COMPLEX] nondeterministic system deterministic through incremental re-writes. What you say is like mixing a pot of coffee and tea and trying to separate it again by pouring it through a paper filter repeatedly. Wrong way to use a tool, and sorry, but a failure in logic at the outset. While tools like Claude aren't going away and will gain both utility and function, their nature will make them MORE complex, not less. And no addition to their function will make them deterministic.
Ralph may eventually ralph up the buggy 0.5 version of code for a deterministic tool before the heat death of the universe, but only if the script has a watchdog timer to kill it off before your startup drains an unacceptable amount of its bank account.
We have a new class of tool. We learn how to use it, just like CNC routers turned craft woodworking into another programming task, or we check out and try our hand at another trade. Or those of us that can retire dust off grandpa's old workshop and become artisanal craftsmen. Engineering is what engineers do, and as the ground shifts we need to rebuild for the future. We need to build better tools, and we may need to build a new and parallel set of RELIABLE tools in the process. We definitely need to adapt how we train people in the discipline. People in school on the CS track are going to be flushing time and money down the drain on a degree that isn't worth the paper it's printed on unless we use a prod to herd the academic side into teaching the new tools along with software engineering principles, and not asking Claude to write a slop curriculum on slop coding to separate community college students from more of the time and money they are working three jobs to pay for.
We need to pay down three decades of technical debt in how we manage our industry, our trade, and our livelihoods. Or we lose it all and end up on the wrong side of the Starbucks counter. I for one wince whenever I hear Venti, or someone order a medium coffee by calling it a tall or a grande.
I'm not suggesting that you can - I'm suggesting that with the right training, either Claude (with hooks) or a Claude-like equivalent can start to generate code that follows "good" coding practices, that gives a reliable (and where necessary repeatable) output, and can iterate through huge numbers of solutions looking for one which gives the required output, far faster than a developer could do the same - and that eventually such iterations could ensure that said code is highly secure, highly efficient and fully commented.
As for the link above for the Reg article yesterday, there is also nothing to stop that solution being checked and ranked separately by a completely different automated agent to ensure an independent test of the outcome.
It's early days yet, but these potential outcomes shouldn't be treated with the scorn and contempt that they are currently receiving - is this kind of evolution so different from the industrial revolution, or Henry Ford creating the moving assembly line? Probably not, but I guess only time will tell...
Does it really need to be trained to write good code?
In a multi-agentic system, one agent could come up with a plan, a second agent write the code, and a third agent check that it follows the correct guidelines. No additional training other than the three agents being fed with different instructions: Planner, Coder, and Reviewer.
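A minimal sketch of that split, assuming the same pipe-a-prompt CLI as the article's one-liner; PLANNER.md, CODER.md and REVIEWER.md are hypothetical prompt files, not anything from the article:

while :; do
  cat PLANNER.md | claude-code > plan.md       # agent 1: write a plan
  cat plan.md CODER.md | claude-code           # agent 2: implement it in the working tree
  cat REVIEWER.md | claude-code > review.md    # agent 3: review against the guidelines
  grep -q APPROVED review.md && break          # stop once the reviewer signs off
done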
This is just trying to hide the problem inside a homunculus - at some point, one of them has to not be shit at programming. Coding agents are almost usable at very very small scale on well-known solutions to problems, but as soon as your code hits a critical mass - really probably no more than 1000-2000 highly information-packed lines of code (i.e. not J2EE) - it can no longer see the wood for the trees. The initial 'gains' you see rapidly become losses, as you are forced to increasingly micromanage the system, which will start reinventing the wheel within the same codebase, and start 'optimizing' or otherwise hallucinating the need to take entire modules around the back of the woodshed.
It is an averaging machine. When you have an empty canvas, on average, things start out looking good. As soon as you hit the point of information saturation, it actively tries to turn your code into the billion lines of mediocrity it has been trained on. In order to maintain the 'potential efficiency'- you require a superhuman with laser like focus, who can sit through the monotony of hitting the try-again button, and ensure their commits are clean, and don't e.g. strip all the comments, drool random shit all over the file, or reformat everything making it impossible to sensibly use git blame anymore. The problem is, like doomscrolling instagram, this cycle makes you actively stupid and you are suddenly no longer qualified for the role.
I don't have anything against coding agents, _in the lab_, or even used with care for very specific cases - but this current iteration is overhyped and dangerously being positioned as 'how it's done now' to people who don't know any better. Nothing good will come of this. Maybe in 5 years, but only after enough damage has been done that legislation is in place so the AI companies are financially motivated to give a toss. This is assuming we still have access to fresh water, and that nobody asks too many awkward questions in the revenue meetings.
The statement "You can't make it, a [COMPLEX] nondeterministic system deterministic through incremental re-writes." sounds similar to the claim of Intelligent Design proponents that complex biological structures can't arise through evolution. As pointed out in "The Blind Watchmaker" (Dawkins), highly complex biological structures *can* arise through a mindless process, given (my paraphrase here, I'm not using all of Dawkin's language because I don't want to bother to look it up to make sure I get it right) a fitness function (survival), a selector function (reproduction at higher rates), and enough time. Determinism needs to be part of the specification, perhaps. And one need not expect two successful attempts to meet the specification to be exactly the same code, merely to correctly perform the same function.
A surprisingly simple process, seemingly wasteful, yet very powerful given enough repetition over enough time.
Two points, one "micro" and one "macro":
1) Micro point: The Latin word behind "cognition" has the root "cog"; Romans thought about thinking as "revolving something around (in your mind)". What is one aspect of human thought other than turning something over and over until a light bulb pops on (Eureka, I've got it) when your thinking eventually produces an idea that seems to fit your constraints?
2) Macro point: The process of evolution creates highly complex systems, given enough time. Forcing an LLM to evaluate its own output repeatedly until enough adaptation occurs to satisfy some measure of "fitness" seems a lot like evolution? Instead of using survival and reproduction as the fitness and selector functions, the loop-driven LLM development process determines fitness and selection by comparison to a specification. Evolution works.
This was done as an inspired hack. How much better can it become if it is done deliberately? If this technique proves to be useful and can be improved, it suggests a way forward for programmers that has already been mentioned by others: Programmers become the people who craft the specifications that are used to measure the success of the iterations done by the LLM. As pointed out, the "human in the loop" gets removed from the inner loop, replaced by the LLM evaluating itself against the specification. There is still a "human in the loop", and the outer loop may still be a loop, but it is a loop of refining the specification, not the code.
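In loop form, with the test suite as the fitness function and git as the selector (a sketch; 'make test' is a stand-in for whatever executable spec you have):

until make test; do
  cat PROMPT.md | claude-code               # mutation: generate a new variant
  if make test; then
    git commit -am "fit: passes the spec"   # selection: the fit variant survives
  else
    git checkout -- .                       # selection: the unfit variant is discarded
  fi
done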
"Forcing an LLM to evaluate its own output repeatedly until enough adaption occurs to satisfy some measure of "fitness" seems a lot like evolution?"
I am afraid that sounds to me rather more like awarding attributes to the LLM that it does not possess - the ability to think, to evaluate, to learn, to understand - things that would constitute actual intelligence, but which in reality are qualities which LLMs do not have (and are unlikely to have in the foreseeable future).
CNC programming is a good point, because even with the most complex machines, the person working out what to tell the CNC machine still needs to know about and understand the concept of rotary materials shaping, and can often "sketch out" test designs on a hand lathe to get an idea of what is needed and what physics need to be taken into account. You can't just say "lathe me a thrimble" unless you've already told the CNC machine what a thrimble is and taken account of things like avoiding attempting to drill through the drill mount.
@ParelezVousFranglais:
Given that the vast majority of code that's still out there at the moment has been crafted to the best of our collective ability...
The vast majority of the code out there is not "crafted to the best of our collective ability".
The beancounter/managerial I-don't-care-just-make-it-work-we-gotta-ship-it-now drum has been beating for decades.
> there's no particular reason that a model couldn't be trained on only "good" programming practices
If that were even remotely possible, don't you think people would have done it already? Who'd knowingly and wantonly train their models on "bad" programming practices such that those practices are regurgitated as positive examples to their users?
The corpus of information used to train models is also being flooded with bad output from current models. Can that be filtered out later to enable higher-quality models to be trained? Probably. Will it be expensive? Also probably.
> As we gradually moved from machine code to assembly to C to other languages and frameworks, I accept the increasing layers of abstraction have meant none of us now are proficient at every layer of the stack.
Major difference between accepting that we aren't proficient at every level in the stack[1] versus using LLMs for the coding task: the languages & frameworks are explainable and can be meaningfully investigated.
If you write something in a script that drives a framework that is coded in a domain-specific language that is compiled[2] to C that is compiled to assembler that is assembled into a binary then, at each stage, you can[3] investigate the intermediate results and glean how changes in your input trickle down and affect the final result. There are even tools created to *encourage* you to learn how it all works, using the speed of modern PCs to show you "in real time" the assembler from your C++[4]... Failing that, there are other people who know the bits you don't and can be - persuaded - to describe them to you (bits of coloured paper may be involved).
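For instance, with the GNU toolchain you can dump every intermediate stage of one C file and eyeball how an input change trickles through:

gcc -E hello.c -o hello.i      # preprocessed source
gcc -S hello.i -o hello.s      # the assembler the compiler chose
gcc -c hello.s -o hello.o      # object code
gcc hello.o -o hello           # linked binary
objdump -d hello | less        # and the final machine code, disassembled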
But the LLM process is totally opaque, by its very nature. Even if you manage to turn down the "temperature" on the model so that it starts spitting out identical code for multiple runs of one prompt, you have no idea what it is going to do as you change the prompt to try and improve the final output: will it just change one line or go haring off down a totally new avenue? And if you try the later prompt in a brand new session, without the chance that the LLM was *really* being fed the "conversation so far", how does that compare? I.e. will somebody else, given your "final prompt, the one that cracked it", be able to replicate what happened? And next week, when the LLM has been tweaked again and you can no longer actually run the same generator you did last week?[5]
[1] although the "none" is going a bit far - some people do remember what they studied about compilers etc, although thinking about doping requirements to control electron transport in MOS junctions can be considered going off-topic when you are choosing which blur to use in your image processing script....
[2] "transpiled" if you really, really feel the need
[3] assuming nobody is deliberately trying to stop you doing so, that is
[4] other languages are available etc; e.g. you can probably find someone with a UI set up to do the same thing with Lua input and the VM's bytecode
[5] ok, this last is an artifact of the way LLMs are being presented and controlled rather than because they are actually LLMs: you can fall foul of the same effect if you use a third-party Web-hosted compiler (or higher-level framework); except that the LLMs are generally so large that you have little choice, certainly not compared to the way that every PC in use can easily host a compiler or two
"purpose is to convey information between two or more entities"
Obviously separated by space. When nearby, any perturbation of historical syntax or grammatical norms is unconsciously taken into account by those entities. When the parties are separated by space and cultures, misunderstanding will inevitably arise from deviations from those norms.
Once separation in time is added, any ephemeral changes in language might completely obscure the original meaning from the writer's or speaker's descendants.
As it is, when you are within ear-clipping distance of some linguacidal† youth you often have to refrain both from "WTF is that supposed to mean" and a swift biff to their lugholes.
† overjoyed that is not my coinage.
I think the problem with that analogy is the light switch will continue to work, but will form part of ever more complex building jobs that don't even realise they have the light switch built in, use up terabytes of memory to flick the switch, and you end up horrified in 4 years' time that your light switch inexplicably doesn't work anymore when a part of AWS-US-East-1 has an outage.
".....language changes naturally over time, and it's main purpose is to convey information between two or more entities - if kids can communicate with each other perfectly happily by using terms that the older generation are completely unfamiliar with, then their language has still served it's purpose"
That is very true.
However, you have to make the distinction between changes that improve the ability to communicate and those which hamper it - the difference between evolution of a language and its deterioration.
I think any changes are purely subjective based on the entities using the language.
When a command gets added or a syntax changes in a programming language, often you get the same arguments about whether it's an improvement or not, but I'm not sure you could ever say that a language "deteriorates".
Even if you could, if that was relevant to the masses, then we'd likely all be speaking Sanskrit, Mandarin or Latin as some of the most "precise" languages to have existed, but here we are happily communicating in (technically far inferior) English... ツ
"I'm not sure you could ever say that a language "deteriorates".
How else do you describe it when a word starts being used for a totally different meaning from its original, or the scope of meaning is widened to the extent that it covers a wide range of situations rather than the original narrow and precise definition?
The term 'Artificial Intelligence' is a prime example - a few years ago we all had a very specific view of where AI started, but in the last decade or so that definition has been widened by some to encompass entities (such as the current LLMs) which are very clearly not intelligent at all. Some people have even put forward definitions of such intelligence which would put a simple pocket calculator within the classification of 'AI'. How exactly can that be considered as anything other than deterioration of language?
There is a long list of words and phrases which in recent years have had their meaning widened or changed, or become so over used in inappropriate situations that they have become either meaningless or now make it difficult to pin down what the speaker actually meant. There really isn't any way to view that as anything other than deterioration of the language.
"However, you have to make the distinction between changes that improve the ability to communicate and those which hamper it - the difference between evolution of a language and its deterioration."
Except hampering the ability to communicate is the point of street slang for kids - they can understand it but their parents can't - and that's why they change the language. When they grow up, those words become the norm and their kids come up with new ones. That's evolution, not deterioration.
Software isn't language.
Language slowly changes, but that's where the analogy ends.
The purpose of language is to convey meaning between peers. Languages change to better convey the meaning between peers, to better exclude those who are not peers, or to widen the range of peers.
LLM vibe code vomit nearly always doesn't even compile or interpret. So it doesn't even become software.
With careful pummelling, it can sometimes spew out something that compiles, but it has never ever been found to actually work.
I would place a decent wager on all language evolving.
English has always evolved. Look at Germanic, Greek, Latin, Nordic and more lately French influences. Every day, every one experiences new things. Language evolves to convey the meaning of these new experiences.
Groups evolve their own technical jargon (lawyers, medical, IT, different social groups, etc). It molds the group and is molded by the members. It unites and creates the insiders and others. (See Latin in the middle ages among the clergy).
If language didn't evolve the whole world would probably be speaking a single world language.
I offer my heartfelt contrafibularities to Mr Johnson on his new "dictionary". It's a nice historical record. But words change; hence, new editions.
It reminds me of the story of the Space Shuttle's main engine.
It was designed 'top-down' rather than out of known components. When the turbine blades began to show cracks, they had a problem: because the components hadn't been designed and tested separately, no-one knew just how they behaved in various situations. Fixing the problem became a lot more difficult and expensive.
This is going to be the problem with Claude's code. When, not if, there are problems, no-one is going to understand the components.
Isn't this a lot like early cars? Unreliable, kind of expensive, not comfortable compared to today... but compared to a horse they had advantages in many circumstances. They never did replace horses: here in the western US there are still places horses and mules go that no other transport can (OK, helicopters maybe, but that is WAY more expensive).

For a while, as I understand it, driving a Model T you had to be some kind of mechanic to deal with issues. I still recall when in high school cars needed frequent tuneups. The first manual transmissions did not have synchros - even Duesenbergs lacked them! Now how much do you have to know about how a car works in order to drive it? Is that really a problem?

I own a plug-in hybrid and a diesel F250 pickup with stick shift, manual transfer case, and manual hubs. I love them both because they are both great at very different missions. Won't programming be the same? AI may replace the boring grunt work and leave the interesting stuff to us. For those of you familiar with piston engine aviation, how about manual control of turbocharger wastegates and mixture vs FADEC?
Digital cameras also come to mind. I don't miss the darkroom or days of delay for commercial processing. Hilariously, photo SW now has filters to add the anomalies of film to the look! Even my phone has a pretty good camera. I can scan documents into a PDF and send it from my phone! No FAX machine needed! Every major paradigm shift has its pros and cons and never completely eliminates the old ways, it just makes them unnecessary 80% of the time. Is this change all good? Likely not: doom scrolling! Online fraud! Etc! But the sky is not falling, or maybe it is, and always has been.
“The Babel fish is small, yellow and leech-like, and probably the oddest thing in the Universe. It feeds on brainwave energy received not from its own carrier but from those around it. It absorbs all unconscious mental frequencies from this brainwave energy to nourish itself with. It then excretes into the mind of its carrier a telepathic matrix formed by combining the conscious thought frequencies with the nerve signals picked up from the speech centres of the brain which has supplied them. The practical upshot of all this is that if you stick a Babel fish in your ear you can instantly understand anything said to you in any form of language. The speech patterns you actually hear decode the brainwave matrix which has been fed into your mind by your Babel fish.
Now it is such a bizarrely improbable coincidence that anything so mindbogglingly useful could have evolved purely by chance that some thinkers have chosen to see it as a final and clinching proof of the non-existence of God.
The argument goes something like this: "I refuse to prove that I exist," says God, "for proof denies faith, and without faith I am nothing."
"But," says Man, "the Babel fish is a dead giveaway isn't it? It could not have evolved by chance. It proves you exist, and therefore, by your own arguments, you don't. QED."
"Oh dear," says God, "I hadn't thought of that," and promptly vanishes in a puff of logic.
"Oh, that was easy," says Man, and for an encore goes on to prove that black is white and gets killed on the next zebra crossing.”
-DAdams
I have recently started to use $ cat <source-file> | less as a workaround to get legible text in less. Somebody thought it a great idea to add syntax highlighting to less, including illegible dark blue text on a black background. Using the pipe prevents this syntax highlighting.
If I weren't so lazy, I'd look up the documentation to work out how to disable this illegibility, or even submit a bug report. But the problem, although irritating, isn't sufficiently so to propel me to doing anything about it. Damn!
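For what it's worth, that highlighting usually comes from a LESSOPEN input preprocessor (lesspipe or similar), which is applied to named files but typically not to stdin - hence the pipe working around it. If that's the culprit, GNU less can be told to skip it directly:

less --no-lessopen source-file     # ignore the LESSOPEN preprocessor (also: -L)
LESSOPEN= less source-file         # or clear the variable for one invocation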
I'm sure there are zero copyright issues with such an approach. Genius!
Let's see it in action with some dusty deck COBOL code that's been resistant to previous efforts to modernize. Oh, but he'd say "sorry I need FULL specs and documentation, this doesn't count because those aren't present".
It is obvious to anyone who has ever written a program more complex than "Hello, World" that the most difficult part of programming is figuring out exactly what it is you want it to do. Even a fresh out of college CS grad with zero real world experience could code something arbitrarily complex, if you gave him fully complete and exact specs in English detailing exactly what is needed. Any algorithms that are needed are already developed and provided. Any and all if/then/else decision trees are graphed/flowcharted. Everything you need is right in front of you.
I'm willing to concede that in such a case AI could probably toss together something as good or better than the CS grad (or a 25 year experienced coder for that matter) in a single day versus months or years depending on how complex the specs were. But that's a fantasy situation, it never happens in the real world.
Perhaps instead of learning how to code, the CS grads of tomorrow will learn to write extremely careful and good specs.
And honestly it seems to me that figuring out exactly what you want something to do is a far more useful input/career than just being able to speak C++ natively.
"it seems.to.me that figuring put exactly what you want.something to do os a far more useful input/career than just being able to speak C++ natively."
Figuring out what the problem really is and how to solve it are part of the development loop. One never really knows what the specs are until the job is finished. But if afterwards an AI can quickly and cheaply generate a *better* solution than a load of meatbags fighting their way through the development fog ....
As for cloning the functionality of existing software .... bring on the IP lawyers
Fast creation is fun and exciting for start-ups, like a lot of things that everyone else then assumes they should also adopt, but most software is legacy. Even if Claude can create a copy, can it also maintain it or are we going to need to get humans involved in that? Will it even be human maintainable, or are you creating code that will need you to subscribe to Claude forever, even when Anthropic start charging enough to break even?
I keep hearing that my job will be replaced with this technology, but when I look at the tech in action and its outputs, it seems pretty weak.
The real "grim meat hook reality" comes from the line where they point out that with a decent copy of your marketing material, website, and documentation, the AI slop fire hose can converge on a cheaper, faster, bug filled mess of a knockoff of your companies code. The hilarious thing is they then talk as if companies goodwill and branding could save them. No, not with the playbook they have been running since well before the pandemic. We tend to hate even the ones that our paychecks are coming from.
What they pointed at is that they will Temuify every established profitable software company out of existence. Since the same people that buy fast fashion and fast food through Uber Eats and Door Dash are in the drivers seat, they will do what they know and buy the cheap knockoff if it looks enough like the name brand.
Your job may not be replaced, and you still might just not want to do it anymore.
I have already turned down offers for a position as an underpaid whipping boy expected to fix and maintain 100% uptime for management's vibe-coded slop, and to be punished for their mistakes. No, not interested, and doubly so when "generously" offered it at 40k less than I was making before.
What we should be doing is organizing our own engineering guild and mercilessly crushing the companies destroying our industry and livelihoods along with many of our places of employment. There is no moat, so we either build a trade organization looking out for OUR collective interests or we drown in the rising tide.
" The hilarious thing is they then talk as if companies goodwill and branding could save them. No, not with the playbook they have been running since well before the pandemic."
Even before that with the constant demands to subscribe to software. If a knock off can be made cheaply and sold for cash up front people will buy it to avoid having MS/Adobe et al constantly feeding off their credit cards every month for the term of their natural lives...
So, after humans have done all the heavy lifting of soliciting requirements, developing code, testing, fixing, scaling, integrating, making secure, iterating to implement additional requirements because people are really bad at identifying requirements until they can see the thing and say "no, that's not what I meant!", thoroughly documenting the whole pile and then productizing the whole thing...
...an AI can reproduce that for $10 an hour (it doesn't say how many hours).
A photocopier can plagiarize a book for a lot less than $10 an hour, and that doesn't impress me either.
Yeah, the Ralph had to run continuously for 3 months to develop 'cursed', a new programming language (TFA link, or this one), which at $10/hr is around $24k -- not really cheap for many a tinkerer -- and it is not quite considered a finished work iiuc ...
More complicated reverse engineering fuzzing would likely take longer and be more expensive imho. In its 'Real-World Results' section though, the Claude 'Ralph Wiggum Plugin' (TFA link) notes "One $50k contract completed for $297 in API costs" suggesting one's mileage might vary some in this racket.
If the code is unmaintainable because it's a "Cathedral of complexity" with hundreds of 2nd level dependencies, obscure patterns and a pile of layers just for the sake of it OR because it's written by a bot does it really make a difference?
After a few years of modifications it will need a full rewrite anyways so - assuming those LLMs can actually write code that works - it's going to be a matter of cost (development + maintenance)
Second level dependencies? I'm afraid these days it's turtles all the way down until you get to cout().
Which is why https://github.com/MichalStrehovsky/sizegame lists sizes of "hello world" programs up to 11MB (Haskell) compared to 2kB (assembler) (and the latter could be done in two dozen bytes in MSDOS and bare assembler).
...but perhaps we could get stuff that the commercial outfits will never give us. Distributed versions of current stuff like social media that do not have centralised, censored, advertising-driven feeds, and alt networks that can function safely in dictatorships like North Korea, China, Russia and Trumpistan. Search engines that work like Google did when it began.
It's not perfect, but it may initiate a new revolution in online services that is crowd-sourced, bottom-up rather than top-down, and distributed, rather than centralised.
GAFA may have given us the tool to replace GAFA with something better.
Similar hype moments in history:
Scientist: I am way ahead of Turing, MY machine can decipher the German texts instantly and with only two requirements!
Churchill: And what are those?
Scientist: An identical enigma machine and a list of the daily configuration settings.
Churchill: Don’t call us, we’ll call you. NEXT!
...Are melted into air, into thin air | And like the baseless fabric of this vision... Yea, all which it inherit, shall dissolve... Leave not a rack behind... such stuff | As dreams are made on... I am vexed. Bear with my weakness. My old brain is troubled. (The Tempest)
Tomorrow and tomorrow and tomorrow | Creeps in this petty pace from day to day | To the last syllable of recorded time... And then is heard no more. It is a tale | Told by an idiot, full of sound and fury, Signifying nothing. (Macbeth)
One can always find something apposite in Shakespeare?
I hadn't encountered standups it this context until after COVID – I have never used any form of video conferencing – so the word only had a prior meaning redolent of promiscuity and cramped, steamed up telephone boxes where admittedly some agility was a requisite.
The first time I heard it used in a post COVID office I was rather surprised until the penny dropped… if anyone had mentioned scrums I would have been off pdq… besides I didn't fancy any of that lot. ;)
Companies have goodwill? I haven't seen any of that in a while. Mostly just corporates drinking the en$hitification flavor-aid and burning their companies to the ground in search of a .5% bump to quarterly reports.
Temu isn't ambitious in its ideals, or its execution, but it is absolutely _voracious_ in its appetite. This means the ambition is in the audacity to burn the software industry to the ground by fully automating the knockoff software process and doing so on an industrial scale. And it's not coming, it's already here.
The best brands crumble after a decade of poor products, and most abandoned goodwill to strip mine their own companies. Played the game of how much blood (aka short term revenue) can we take before the body dies. How many workers can we cull before the ones that are left begin to collapse on the job, or literally set themselves on fire instead of suffer the agony and humiliation? Before they literally can't do the workload that two or three people did before? How much garbage they can ship before the customer revolts? How much they can raise the renewal? How about some micropayments?
So they in their own hubris sowed the wind and then carefully planted the seeds of their own destruction beneath their feet as well. Destroy their brands and then double down by pouring money into the companies and tools that would automate themselves out of existence along with their workers and their dwindling market share.
Though to be fair I expect this fake AI bubble to burst before all that happens, and for the delivery of newer fake AI to be heralded to arrive right after cheap, clean, and nearly unlimited fusion energy.
Isn't this how the cheapest remote contractors work? It's an infinite loop of telling them to fix bugs, and they write more random code?
I've seen contractor projects balloon to millions of lines of very enterprisey code without it ever compiling. It's the sign that it's time to leave where you're working.
Apparently, OpenAI's Codex has an internal agent loop like that, which iterates over code production (an inner Ralph) (spotted by Benj Edwards). This raises a couple of questions, namely: doesn't claude-code have a similar inner loop, and would that not make the external bash loop somewhat redundant? And is updating of the prompt needed (in the outer loop) to get an eventually satisfying output (Huntley seems to both suggest so, and not, simultaneously)?
But overall it's interesting that they're iterating over AI (so-called) tool application to refine outputs towards a (hopefully) convergent quality output (solution). It's a technique used at much finer scales (floating point ops) to solve PDEs approximated by linear algebraic matrix-vector systems for example (Richardson, Jacobi, Gauss-Seidel, SOR, ...) as well as optimization and inverse modeling (Newton method, Levenberg-Marquardt, ...) among others. It made me think of Sandia's iterative solution of PDEs using near-memory neuromorphic compute with proximal-only interactions, and that Richardson and Jacobi could be well-suited to that, even without spiking (i.e. plain-jane non-neuromorphic in-memory compute approach).
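For reference, the simplest of those, Richardson iteration, refines a guess x_k toward the solution of A x = b via

x_{k+1} = x_k + ω (b − A x_k)

stopping once the residual b − A x_k is small enough. Squint and the Ralph loop has the same shape, with the test suite standing in for the residual check.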
Quite stimulating imho (and with the crunch of puzzling bits) ...
There are people who are apparently working to make that happen, only slightly more sophisticated:
Both this and Gas Town have one thing in common:
The developers are so far gone that you can't even understand the blog post or X thread which is supposed to clearly introduce and describe the concept, let alone the code repeatedly spewed out in a loop by the LLM.
Perhaps it's what overuse of LLMs does to your brain.
From Gas Town post:
> Gas Town solves the MAKER problem (20-disc Hanoi towers) trivially with a million-step wisp...
Sorry, we are celebrating something that can solve a 20-disc Towers of Hanoi in about 30 hours of decidedly non-trivial compute? The trivial coding problem that was used as the second example of a simple recursive function (just after Fibonacci numbers)? And it takes something as complicated as this "Gas Town" to do that?
The Wonders of Science!
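For scale: the textbook recursion being celebrated fits in a few lines of shell and enumerates all 2^20 − 1 = 1,048,575 moves in seconds-to-minutes on any laptop, not 30 hours:

hanoi() {                       # hanoi N FROM TO VIA
  local n=$1 from=$2 to=$3 via=$4
  (( n == 0 )) && return
  hanoi $((n-1)) "$from" "$via" "$to"
  echo "move disc $n: $from -> $to"
  hanoi $((n-1)) "$via" "$to" "$from"
}
hanoi 20 A C B | wc -l          # 1048575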
So a LLM can clone commercial products if provided with resources including source code, specs, and product documentation. And enough tokens.
That is cool, it really is.
On the other hand ...
"Can you build X for us. We've already prepared the complete spec and product documentation. And by the way here is the source code as well."
... doesn't sound like anything I've ever heard for a project brief.
The reality is that clients don't know what they want. And that's not their fault.
They know that they have a problem that they need solved - and by admitting that they are usually already ahead of their competition. But it's not always clear what the real problem is that needs to be solved - the symptoms that cause the pain are known, but getting to the root of the problem is a different matter.
And only then can we start to look at finding potential solutions. Within existing constraints - which are also usually not fully understood.
This is how building software in the real world works. It's messy and it's complicated, and that doesn't make for a short juicy blog post.
But at the center of the LLM software development approach seems to be the question "how to deliver X with the least amount of effort".
And that's just not how I understand the job of delivering working software that lasts.
The question is rather "How can we deliver the most lasting value with the least amount of resources".
And lasting means looking at 10-20 years, at least. All of the software I delivered 10 years ago is still in use (and actively maintained), and a lot of the software I delivered 20 years ago is still in use (a lot less actively maintained). As far as I know, none of the software I wrote 30 years ago is still in use. But for some industries 30 years is nothing.
We can only reliably deliver working software over decades by making very careful choices about the tech stack, and how it might unravel over time. That's where the "least amount of resources" comes in.
Building our tech stack on a LLM coding agent that goes through major changes frequently and belongs to a 3rd party organisation with a shaky business model seems ... quite adventurous.
So when this recursion produces the code that most accurately replicates the original application, well, two things: will it accurately reproduce the bugs that the original code has? And who is responsible for the copyright infringements in the code when it reinvents the same algorithms and methods? The programmer, the AI (treat it as an individual), or the creator of the AI?
"Huntley has documented how he used Ralph to create an a tax app for the ZX Spectrum, and later reverse-engineered and cloned an Atlassian product."
Yeah. Right.
So we're talking about something which is entirely useless and will never be used and... a clone of something which is entirely useless and nobody wants to use.
Odd choice of project, unless you desperately don't want anyone to probe further.
-A.
Well... it's an interesting idea, but the Z-80 post is mostly "for subscribers only", as is the other link to posts on ghuntley.com.
This seems a bit like stealth marketing for someone's personal blog.
The way he talks, he sounds like a teenage edgelord on Gaia Online circa 2004 bragging about the dark magic he learned and is totally going to curse the bullies at school with.
This sounds like the code equivalent of copying a quote into your paper and then using the thesaurus to avoid plagiarism allegations. Cool, you ended up with something disgusting to look at that conveys most of the same meaning as a thing a person already made. Congrats.
Claude can modify PROMPT.md, as it's just a file in the project directory. You thought self-modifying code was bad...
I've not tried this but if we're trying silly LLM tricks...?
PROMPT.md: "Build a wonderful project. You may modify PROMPT.md to improve the project, as well as building the project."
Repeat ad nauseam which would probably be around 1 repeat.
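If anyone does try it, snapshotting the prompt each pass at least lets you watch it mutate (a sketch on top of the article's loop; the numbered copies are just for diffing):

i=0
while :; do
  cp PROMPT.md "PROMPT.$i.md"            # snapshot before the model edits it
  cat PROMPT.md | claude-code
  diff "PROMPT.$i.md" PROMPT.md          # how did it rewrite its own brief?
  i=$((i+1))
done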
Does this mythical $10/hour include the actual cost of the processing power utilized to generate the code, and is the code as bug-free as human-generated and reviewed code? The article seems to leave a lot out of the analysis.
Right now a lot of AI use is very inexpensive as they work to get everyone to embed it into their process. How long do we expect this free for all model to continue when the data center bills come due?
Anthropic and Open AI are keeping very quiet about the real costs of inference - the compute needed to run something like Ralph - and given their cash burn rate it is likely to be much more than they are currently charging users. If the real cost is closer to $100/hour it doesn't have the same cost advantage over humans.
At some point we either start paying the real costs for generative AI or these companies collapse, taking hundreds of billions of dollars with them.
Or, faced with paying the real costs, those using these things throw up their arms in horror, and stop using them rather than pay an amount not justified by the benefits (assuming they still think there are any?)......
........and these AI companies collapse etc, etc.