* Posts by flipflap

2 publicly visible posts • joined 25 Sep 2023

AI safety guardrails easily thwarted, security study finds

flipflap

Nasty in, nasty out

LLMs are the canonical definition of SISO. To my dumb mind the solution feels simple - don't feed the poor thing the Common Crawl. Train it on content that isn't awful.

Feeding it every internet word - including the utterly depraved, awful, and worse - and then trying to convince it to be nice with 'alignment' feels like a strategy that can never work.

AWS, for instance, has trained its copilot on code released under licences most enterprises would be willing to accept. So it has a massively reduced copyright risk compared to GitHub Copilot.

I suspect the problem is that it's easier to try to align than it is to filter a trillion tokens. Which is commercially short-sighted in the worst way - everybody else gets to pay for the damage caused by that thinking.

Microsoft hiring a nuclear power program manager, because AI needs lots of 'leccy

flipflap

BSOD...

This will bring an entirely new meaning to blue screen of death - in fact, dear Register - let's have a competition to vote for the best new moniker:

I shall start - Bright Scorch of Destruction.