* Posts by FeepingCreature

494 publicly visible posts • joined 31 Oct 2018

Trust the AI, says new coding manifesto by Kim and Yegge

FeepingCreature

Re: Any sections in the book talking about how much it costs to Vibe code?

I have never seen a convincing argument that AI companies lose money on API access. It would be extremely strange to run a loss leader in a competitive marketplace with no moat or lock-in.

UK Home Office dangles £1.3M prize for algorithm that guesses your age

FeepingCreature

How about instead

> If children are wrongly judged to be adults, it can lead them to be placed in unsupervised adult accommodation or detention centers, potentially exposing them to abuse.

How about instead of trying to split the world into those who need protection and those who can be abused, you make sure that no abuse takes place in detention centers, and ensure that both adults and minors have access to all the resources and protection they need to be safe?

Unlike most of Musk's other ventures, Starship keeps it together for Flight Test 10

FeepingCreature

Re: "Did Not Affect the Mission"

The test was all the not-blowing-up that it did before that, all the way to a soft touchdown on the ocean right on target. Which is the entirety of its future mission profile when coming down on catch arms, so the explosion after that was more of a bonus lightshow.

Two scrubs, one Starship: Third time lucky for SpaceX?

FeepingCreature

Re: Well, the answer was 'No'

"Orbit ready" just needs the reentry burn to work. The only question for orbit is "if we shoot it up, will it come down where we want?" And it nailed that. Sure, singed and partially converted into liquid metal, but it came down on the dot. The FAA don't care how toasted it is so long as it stays in its corridor.

FeepingCreature

Re: Well, the answer was 'No'

Once again, that's genuinely completely normal and expected. Rockets don't handle sideways impacts well.

Toasty, sure, I grant you. Should be noted that they intentionally tried to toast it and also that it still worked fine, engines and all.

FeepingCreature

Re: Well, the answer was 'No'

And with all of that it still managed a soft touchdown! Glowing debris stuck to the hull and the ship is unbothered enough to show off a bit. Feels like the steel hull is really pulling its weight.

FeepingCreature

Third time lucky!

I think we can call this the most successful launch in the Starship program. Finally feels like we're making progress again.

Boffins detail new algorithms to losslessly boost AI perf by up to 2.8x

FeepingCreature

Re: Recurse!

You could absolutely do that. But optimally you want to run the draft model on separate hardware (cpu/gpu) anyways so you can keep the big model maximally busy. At that point, so long as the draft model runs faster than the big model, there's no point to speeding it up further, because the big model is what limits your throughput anyway.

FeepingCreature

Re: Confused

The trick is that LLMs are a lot faster if they can evaluate multiple queries in parallel. So we use the small model to predict the next ten tokens, then on the *assumption* that those are all going to be correct we feed all ten prefixes into the big model to get ten next tokens.

Optimally, what happens is the big LLM happens to output at each step the token that the small model has guessed. In that case we just got ten tokens for the price of one. But usually, the big model calculates at least one token differently from the small one; in that case we just throw away all the tokens after it and restart from that point.

So you say "Hello " and the small model says "workdays", you complete:

- Hello

- Hello w

- Hello wo

- Hello wor

- Hello work

- Hello workd

and so on, which you can do efficiently in parallel.

The true (big) completions are: 'Hello [w]', 'Hello w[o]', 'Hello wo[r]', 'Hello wor[l]', 'Hello work[d]'.

Now you just go through: "w was right, cool. o was right, cool. r was right, cool. k was wrong, it was l instead, so I got 4 tokens for the price of one" and start over from "Hello worl".

Firefox is fine. The people running it are not

FeepingCreature

Re: As a Transhumanist

Update: yep, now I have a web-based music library and player with cue/tta support. Maybe two hours.

FeepingCreature

Re: As a Transhumanist

AI makes things up, sure, granted. That's getting rarer over time, and in domains like programming you can just try its code and see if it works. I think this is mostly an inability of the trained persona to cope correctly with uncertainty rather than a technological limitation.

And... "It's helpful, maybe, but only if you don't care that much about the results, and already know quite a lot about what you're trying to do."

Yeah but that's a *ton* of things! Like, I'm trying Navidrome as a web-based music app rn. It's... nice, except no cue/tta support, which is essential for my music lib. I know how to do it better, I even have a detailed plan in mind, but I'd have to write my own frontend and it'd be a whole thing and normally I couldn't be arsed. But now, as soon as I get off work, I'll grab Sonnet 4, explain the nitty-gritty of what I'm thinking of to it, and see what it comes up with, and I firmly expect to have something usable - even with a pretty web frontend - by the *same evening.* And this works for everything! I can turn an idle thought into working code in half an hour, or a day for something genuinely practical.

Is it as good as the hype says? No. Is it as good as *it* thinks? No. But it's useful. Real and practically valuable. All it takes is trying to figure out how to make it work rather than how to make it fail. And, like every skill, experience.

(Also, the power thing is straight up invented from whole cloth. It's nonsense. Yes datacenters use a lot of power, welcome to operating at scale.)

FeepingCreature

Re: As a Transhumanist

I am a member of several groups in the acronym :) They're not mutually exclusive.

Well, inasmuch as say "transhumanism" is "a group" anyway.

FeepingCreature

Re: As a Transhumanist

Well, I don't think it's a faction, and I don't think it's pushing AI. It may be "a contributing factor in AI being pushed everywhere", but that's a very generic claim. In particular, you're blaming people who are saying "pause AI" for AI accelerating - which is especially funny because e/acc, the *actual* "accelerate ai, maximum hype" faction, isn't even in the acronym.

(Also, AI is of course artificial, and intelligent, and clearly does work, I use it daily. I used it half an hour ago to add a feature to a website. I was like "Claude, add this feature" and it did. Maybe concretify this? Like, what in particular that you think should work doesn't?)

edit: This is all anyways a distraction from the simple fact that the people pushing AI aren't TESCREAL in the first place!

FeepingCreature

Re: As a Transhumanist

I think the individual members of TESCREAL exist - that's why I said I am some of them - but "TESCREAL", as a single unified phenomenon, does not exist. The acronym is in fact an attempt to create a natural outgroup out of everybody the people who use the term dislike. Maybe it could be argued there's a "milieu", and I agree that some of those groups are "adjacent". It's the attempt to draw a line around them and say "everything in here is its own, common thing" that I object to.

Saying "TESCREAL did this" is like saying "the left did this".

FeepingCreature

Re: As a Transhumanist

Sure, but it's just a random interjection. I'm a doomer, and from where I'm standing the weird current AI hype *isn't* TESCREAL. At most it's people vibing off TESCREAL, trying to ride hype to sell products. Remember that R at least are *against* further AI capability development - I mean, part of the problem with TESCREAL is that it contains groups that are literally politically opposed, because it has no coherent concept. The EAs aren't even mainly interested in AI! It's really just a label of "people I don't like", and it still doesn't fit here.

I mean, personally I think programmers are sleeping on AI, and my recommendation to fix this is for them to get an OpenRouter account and maybe Aider, and experiment with the APIs at cost. The people selling "agents" and "integrated solutions" rn are not on my side, do not represent my interest, and are afaict not in my ingroup.

Basically, I'm saying "inasmuch as TESCREAL is a thing, which I disagree with, I know TESCREAL, I'm in TESCREAL, and I'm telling you it's not us doing this."

FeepingCreature
FAIL

As a Transhumanist

Good article, but did you have to bring the wholly made-up TESCREAL "let me put a label on everyone I personally don't like and pretend it's an organized group" bullshit into it?

No, the Mozilla management are not TESCREAL. Even if we grant that's a thing (it's not) they wouldn't be in it. If anything, TESCREAL (which broadly rounds off to "nerds") are part of Firefox's natural and longstanding userbase. As a T, S, C, R and L, I assure you that absolutely none of my interests are represented by Firefox management taking the last competing browser engine not under Google's control and slowly driving it into a ditch.

Like, what the hell is the argument there? "The human condition shall be overcome, thus we must sell ads"? "Malaria nets for Africa, thus axe the Rust team"?? "Humanity should colonize the Cosmos, which demands integrated VPN"???

Google’s Gemini refuses to play Chess against the mighty Atari 2600 after realizing it can't match ancient console

FeepingCreature

gpt-3.5-turbo-instruct in pgn format, I dare you

GPT 3.5 Turbo Instruct. Available on the OpenAI homepage and API. Plays chess in PGN format. Measured at 1700 Elo. Yes, we know the models suck, so why not try the one model that is known not to suck?

Microsoft Copilot joins ChatGPT at the feet of the mighty Atari 2600 Video Chess

FeepingCreature

gpt-3.5-turbo-instruct is the only LLM that was ever good at chess

Nobody knows why, but it seems likely that some chess games snuck their way into the training corpus.

See https://blog.mathieuacher.com/GPTsChessEloRatingLegalMoves/

gpt-3.5-turbo-instruct is still available: https://platform.openai.com/docs/models/gpt-3.5-turbo?snapshot=gpt-3.5-turbo-instruct

As Microchess is estimated at 1200 Elo and Turbo Instruct at 1750 Elo, I suspect that would be a better fight. Make sure to use PGN text.
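A minimal sketch of what "use PGN text" means in practice. The prompt-formatting helper is my own illustration, and the commented-out completions call assumes the current `openai` Python client plus an `OPENAI_API_KEY` in the environment:

```python
# Prompt gpt-3.5-turbo-instruct with a PGN game-in-progress so it completes
# the next move. The PGN headers and numbered movetext are what trigger its
# chess ability; chat-style "let's play chess" prompts work far worse.

def pgn_prompt(moves):
    """Format a list of SAN moves as PGN movetext, ending where the model
    should continue."""
    out = ['[Event "Casual Game"]', '[Result "*"]', ""]
    text = ""
    for i, move in enumerate(moves):
        if i % 2 == 0:
            text += f"{i // 2 + 1}. "
        text += move + " "
    # After an even number of half-moves, append the next move number so the
    # model completes White's move; otherwise it completes Black's.
    if len(moves) % 2 == 0:
        text += f"{len(moves) // 2 + 1}."
    out.append(text)
    return "\n".join(out)

print(pgn_prompt(["e4", "e5", "Nf3"]))

# Untested network part (sketch, not a definitive client):
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.completions.create(
#       model="gpt-3.5-turbo-instruct", prompt=pgn_prompt(["e4", "e5", "Nf3"]),
#       max_tokens=8, temperature=0, stop=["\n"])
#   # Take the first SAN token of resp.choices[0].text as the model's move.
```

Feed the engine's reply back into the move list and repeat, and you have the match the article should have run.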

LLMs can hoover up data from books, judge rules

FeepingCreature

Re: ...in the US

>>> I think there is a point that might need emphasis: ai is capable of producing an exact copy of any input text.

I think there is a point here that needs emphasis: this is straight up false, lol. Typical LLMs are several orders of magnitude too small for this.

FeepingCreature

Re: ...in the US

Producing a recollection for profit is not a violation of copyright. A recollection is not a reproduction, and a summary does not violate copyright.

No, "being allowed to remember and talk about the work" is not a right that needs to be granted to the reader, profit or otherwise. AIs are a "blender of inputs" just like brains are, that is to say not at all.

FeepingCreature

> The argument was always that it was the scraping of pirated works and the failure to make payment or acknowledgement of the source that was the principal issue.

I assure you I've seen the argument that the problem was training at all many times.

The idea that paying the authors or artists for a single copy of the work would mollify all complaints has been advanced, but has always seemed ridiculous to me. The AI companies should of course do this, but just as obviously it will not quiet a single complaint.

FeepingCreature

Re: LLMs can hoover up data from books

They of course do this already- that's what "thumbs up", RLHF, and stuff like Deepseek R1 and the ChatGPT o1 series are all about.

The idea that LLMs are damaged by training on their own output is based on a single very bad study.

AI coding tools are like that helpful but untrustworthy friend, devs say

FeepingCreature

Re: 76% ... won't ship AI suggested code without human review

I use LLMs *heavily,* I love them, they make my day so much better.

But the notion of shipping LLM code without reviewing it is utterly bonkers. What are those people doing?!

Forked-off Xlibre tells Wayland display protocol to DEI in a fire

FeepingCreature

Re: Code talks

I sort of feel like, if you feel discriminated against by one sentence in the README, then you are probably the sort of person that sentence intends to keep out.

Please tell us Reg: Why are AI PC sales slower than expected?

FeepingCreature

Yeah but these people get GPUs. The built-in NPU thingy is desperately underpowered for that usecase.

It's really a solution in search of a problem.

Tesla FSD ignores school bus lights and hits 'child' dummy in staged demo

FeepingCreature

Re: "Elon sets priorities, and he's never made safety a priority."

First-stage rocket reuse in the dozens and low-orbit satellite constellation with laser cross-links not enough for you? If you totally ignored every word out of his mouth and judged him by what he'd actually achieved, he'd be one of the preeminent innovators of our age. Elon Musk only looks bad if you judge him by what he promises. Which to be fair, *is* a problem, but don't undercount what he gets done because of that.

Some signs of AI model collapse begin to reveal themselves

FeepingCreature

The model collapse paper is rubbish

They used a hilariously undersized model. The effect shrinks as the model gets bigger.

(Though it is quite amusing that this false information is now retold and retold again as fact. Did an AI write this article...?)

Research reimagines LLMs as tireless tools of torture

FeepingCreature
WTF?

What the hell are you talking about

To begin with, I didn't create shit. But second, I'm just as free to not use social media as I am to not play slots. Don't give up your agency so easily. Every one of us is here (and on Twitter, and on Facebook...) voluntarily and intentionally.

FeepingCreature

Re: I am getting the feeling

I just think there's a difference between "I'm gonna take advantage of this even though it harms people" (commonplace) and "I'm going to take advantage of this in order to harm people" (very rare).

FeepingCreature

Re: I am getting the feeling

You're right, I don't believe it. I think it smacks of just-world bias: that evil can only occur because evil men will it. I think nature richly demonstrates that evil can arise by sheer coincidence, systems simply following their incentive gradient. Social media lies latent in the shape of human reward learning, as do slot machines. Humans merely brought about what was always there, waiting; not out of great evil but an absence of unusual dedication to good.

AMD puts Intel in rear view mirror with Threadripper Pro 9000 high-end desktop chips

FeepingCreature

Re: 350W!!

I'm fully on board with using a computer to heat, but portable AC units are actually amazing. :) I have one on my balcony that I've converted to dual-hose operation, so it doesn't waste half its energy cooling fresh outdoors air. Two thirds of the year it does nothing, but in the summer months it's definitely a lifesaver.

FeepingCreature

Re: 350W!!

Liquid cooling with a big radiator stationed on your balcony :)

FeepingCreature

Re: 350W!!

I have a ~350W GPU and I can confirm that after a few hours of load, my room is noticeably warmer.

FeepingCreature

Re: Performance with compatibility

PyTorch mostly runs on both. I've had pretty good experiences; that is to say, things that are not explicitly written for NVidia (i.e. direct shader code, nvcc calls etc.) will mostly just work now. ROCm is a function-for-function reimplementation of CUDA in the first place; I assume they just didn't want to fight a lawsuit over it.
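To illustrate "will mostly just work": ROCm builds of PyTorch expose the same `torch.cuda` API as CUDA builds, so ordinary device-agnostic code runs unchanged on either vendor (this sketch falls back to CPU if neither is present):

```python
# On a ROCm build of PyTorch, torch.cuda.is_available() reports True for AMD
# GPUs and "cuda" tensors live on the AMD card; no code changes needed.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(64, 64, device=device)
b = torch.randn(64, 64, device=device)
c = a @ b  # dispatched to cuBLAS, rocBLAS, or the CPU backend as appropriate
print(c.shape, "on", device)
```

It's the hand-written kernels and nvcc build steps, not this kind of code, that tie a project to NVidia.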

FeepingCreature

192 threads at 5.4Ghz boost

I know it won't run at 5.4GHz across the chip because of heat, but in theory, if you hooked a monster of a water cooler up to it and ran AVX512BW FMA at 128 int8 ops per cycle, 192 threads at 5.4GHz would give you about 132 TOPS on CPU alone, or 33 TFLOPS if you used float32. And that's with zero dedicated matrix ops.

If AMD added dedicated matmul accelerator units, their CPUs could compete with their GPUs, and arguably they should. No need for ROCm, just use the CPU backend.
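The back-of-envelope math above checks out under its own assumptions (treat the figures as ceilings: sustained all-core 5.4GHz under AVX-512 load is generous, and SMT threads share execution units rather than doubling throughput):

```python
# Back-of-envelope peak throughput for the hypothetical all-core 5.4GHz case.
threads = 192
clock_hz = 5.4e9
int8_ops_per_cycle = 128  # assumed per-thread int8 FMA throughput
fp32_ops_per_cycle = 32   # same vector width at 4x the element size

int8_tops = threads * clock_hz * int8_ops_per_cycle / 1e12
fp32_tflops = threads * clock_hz * fp32_ops_per_cycle / 1e12

print(f"{int8_tops:.1f} TOPS int8, {fp32_tflops:.1f} TFLOPS fp32")
# -> 132.7 TOPS int8, 33.2 TFLOPS fp32
```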

Next week's SpaceX Starship test still needs FAA authorization

FeepingCreature

I don't normally nitpick, but it's Gwynne Shotwell*... "Gwen Shockwell" is a bit far :)

ChatGPT burns tens of millions of Softbank dollars listening to you thanking it

FeepingCreature

IMO, anything that can emulate emotions to the degree that an LLM can ought to be treated well.

SpaceX's 'Days Since Starship Exploded' counter made it to 48. It's back to zero again now

FeepingCreature

Re: Range Safety

"Ship FTS is safed" is what they usually say at that point. Not clear why they'd safe it given the ship was tumbling at that point.

AMD looks to undercut Nvidia, win gamers' hearts with RX 9070 series

FeepingCreature

Yeah at a factor of ten margin.

It's clear why they do it. It's much less clear why there's no serious competition. I'd like the relevant authorities to take a good look at the respective CEOs' mails and phone calls.

Hey programmers – is AI making us dumber?

FeepingCreature

It's making us smarter.

I disagree. It's not by learning simple skills that we get better at complex skills.

Knowing how to call an API has nothing at all to do with the ability to model a multithreaded interaction, or design a scalable architecture for a program, or choose an appropriate database, or debug a stack corruption, or profile and optimize an algorithm by precaching some auxiliary data. That is why AI will make programmers smarter, because it cuts away the dross and lets the learning focus on the hard parts.

What's changed is it's no longer as easy to tell the skill level of a programmer by looking at their outputs. Naive neophytes can now produce programs that used to be reserved to bloodied journeymen. But that doesn't mean that the actual journeymen are dumber than before.

(Also, the main thing you learnt with StackOverflow was to recognize when StackOverflow was smoking crack and you should find another source. This skill is alive and well in the age of LLMs.)

As somebody who's been writing code both with and without LLMs, in my experience the ability to call them on their bullshit makes them much more useful for development. All these newcomers writing code with AIs without understanding it will learn this skill through pain, as we all have.

DeepSeek's R1 curiously tells El Reg reader: 'My guidelines are set by OpenAI'

FeepingCreature

Tech bros 0?

I have bad news for you about Deepseek.

FeepingCreature

Re: AI "reasoning"

To be honest, it's always (since GPT3 at least) been how LLMs worked. You just used to have to list out the steps explicitly in the prompt. Now they've finally done the obvious and trained it to produce the steps as well.

The only change in R1 is they've gone from "You should" to "I should".

Blue Origin postpones New Glenn's maiden flight to January 12

FeepingCreature

As a SpaceX fan

As a SpaceX fan, good luck to the Blue Origin team tomorrow! I'll be cheering you on. Healthy competition can only be good for access to space.

FeepingCreature

SpaceX's payload fairing is its primary limiting factor, at 6k cubic feet vs 16k for New Glenn. So if the launch goes well, Bezos will have the orbital rocket with the "largest demonstrated-operational payload-to-orbit capacity" (note exact phrasing)... for one day, as Starship (35k cubic feet) will also deploy some satellite simulators on the next launch.

You're right with regard to mass, as Falcon Heavy has New Glenn beat with 64 tons vs 45 tons to LEO. But it's not easy to actually stuff that mass in the F9 fairing.

Open source maintainers are drowning in junk bug reports written by AI

FeepingCreature

Re: AI can understand code

It can understand code! It's just better at understanding it while it's writing it than while it's reading it. As, to be honest, are we all. There's even a famous aphorism about it!

Current LLMs are just undertrained at the specific skill of debugging.

FeepingCreature

Re: AI can understand code

It produces a statistical average of the data it has been trained on that is functionally indistinguishable from spotty understanding.

Given that your brain (as well as any other mathematical system) is encodable as an integer (Turing et al) there is in fact a number that understands your dog.

FeepingCreature

AI can understand code

I've gotten AI to write thousands of lines of working and well-tested code for me. To say that AI cannot understand code is just nonsense. I'll readily agree that it has a hard time debugging though, and should certainly not be taken at face value. IMO the big weakness is it gets confused easily, but it cannot tell when it gets confused, so it always behaves like it has a handle on what's going on, even when it doesn't. But sometimes it does! It's not that it doesn't know, it's that it doesn't know when it doesn't know.

That said, my main question with spam bug reports is "who benefits from this"? It's confusing; I just don't understand what the payoff is there. I'd guess it has to be well-meaning users trying to learn bugfixing with AI help?

FeepingCreature

But if someone's at 90% capacity, then the 10% extra workload is only 10% responsible for pushing them in the red. This is just the swiss-cheese thing all over again.

Mr Intel leaving Intel is not a great sign... for Intel

FeepingCreature

AMD not shoot foot challenge (difficulty impossible)

Yeah, it sure would be great if there were top-of-the-line AMD consumer cards with good ROCm support that could serve as a sales pitch for their datacenter offerings. Not, uh, one card that wasn't actually supported until a year in and that didn't stop crashing the system until two.

But I'm sure they'll come up with something better any day-- what's that you say? They're leaving the top segment to NVidia entirely? Committed to no releases in 2025? Well okay then! I guess they don't have to win if they don't want to.
