Open Source Quality Institutes
https://www.tbray.org/ongoing/When/202x/2024/04/01/OSQI
get the government to pay
It's been about a week since the shock discovery of a hidden and truly sophisticated backdoor in the xz software library that ordinarily is used by countless systems. An infected machine would have allowed someone with knowledge of the backdoor to gain remote control over the box via its SSH daemon. Though the dependency – …
>"because there’s no business case for paying people to take care of it."
Government can change this by requiring quality-assessed Open Source: vendors wanting to do business with government will need to ensure their Open Source is of a suitable quality, and thus will pay.
In some respects this has similarities to Open Systems in the 1980s, where governments did fund the creation of testing labs and got industry buy-in, only for the government to backtrack, muttering "market forces" and other things. Which over the decades has directly led us to the current sorry state of affairs.
Hype? Well, this one was found before it got disastrously far.
But how do we know no other attempts have been made undetected and successfully so?
My view is that if one is being totally serious about system security then at this point in time Linux is in deep trouble. Many would reject that view, but then again many would not be able to confirm that their business’s entire dependency tree genuinely originates from known fully trusted sources. Most of it relies on “well everyone else is using it, so it must be ok?”.
As ever, it depends on your requirements, but right now no one really knows what security threshold OSS reaches, and there is no way of measuring it.
Proprietary systems make the news much more often and for all the wrong reasons, usually AFTER a big incident. Look at MITRE CVEs and FOSS is not even a blip in comparison to proprietary systems.
This was caught in time because, FOSS being what it is, others were able to review the code and took notice. I say it's more secure, IMHO.
Cough heartbleed cough. Able to is not the same as has been.
The only reason this particular attack was noticed was that the attacker messed up, not because anyone was reading the code. And the place where the attack was lodged was extremely unlikely to be reviewed anyway. Who ever reads build scripts carefully, or even understands them?
Unfortunately, what's now been shown is that the measure of security of OSS systems is no better than an article of faith. You say it's more secure. Fine, but with the greatest possible respect, I have zero data on your intentions. It is now evident that none of us has any data on the intentions of the myriad devs, and their future successors, behind the source code packages that vast chunks of the world presently assume are good.
We don't have any data on the devs of proprietary software either, but someone does, to some extent.
> But how do we know no other attempts have been made undetected and successfully so?
Like maybe the old neglected OpenSSH issue that caused so much uproar. The TLAs are not filled with morons, so it is highly likely that a backdoor they insert would be indistinguishable from what others may consider to be a genuine mistake in a highly complex piece of code. So when a CVE comes out for a key library where it looks like someone has "missed out handling of ABC in library XYZ" just ask yourself whether a smart enough individual could have crafted that situation.
If you're this sophisticated (i.e. a nation state) then you can just get your programmers hired by the closed source software vendors too. They may not be as underfunded, but I suspect time pressures, project deadlines and apathetic staff can hide just as many careless check-ins as underfunded open source software does.
Thing is, had this been closed source, would the excessive CPU usage have just been put down to 'bad code'? Surely it was being able to work out what was going on that was in large part responsible for its discovery.
> As ever, it depends on your requirements, but right now no one really knows what security threshold OSS reaches, and there is no way of measuring it.
That's as true of closed source software as it is OSS.
The reality is we have a software industry, not an engineering discipline. (Because 'programming is easy' and you can 'just pick it up').
>If you're this sophisticated (ie a nation state) then you can just get your programmers hired by the closed source software vendors too.
The difference is that the hiring company can ask its government to do its own background check. For all intents and purposes that likely means a US corp asking the US Gov, and for critical roles within the corp the US Gov is probably motivated to help. It's a lot harder for a nation state to get past that boundary than it is to socially engineer its way into a one-man OSS repo.
It’s a lot harder to confer a gov security clearance on all the key people in an internationally diverse OSS project.
One could argue that the way RHEL is headed it’s becoming more proprietary and more American. Maybe this is a reason why SystemD (definitely a RedHat thing) is endeavouring to supplant lots of existing stuff; it’s a code base largely under RedHat’s control.
Proprietary software with this attack within it would indeed have been harder to independently investigate.
> What can be done to protect open source devs from next xz backdoor drama?
1) Further strengthening the open-source motto of "It will be done when it's done". Users need to be patient. Anyone applying pressure should possibly raise suspicion going forwards.
2) Reducing dependencies. Developers should not be afraid of writing new wheels that actually fit the problem. Too much crap dragged in (especially from PIP/NPM/crates.io/CPAN) is just bad engineering. Open-source software has a real dependency issue.
3) systemd is bad engineering. Pulling everything into a mega-system, sprawling / entwining dependencies was predicted to be a bad idea. Ultimately this ended up being the case via the hooks into even well engineered software (OpenSSH).
> Do multi-billion-dollar corporations that feed off free work done by others need to step up and help here?
No. God no!
People keep saying "stop using so many dependencies" and never offer a real alternative, just "well why aren't you writing them all yourself?" which is logic so laughable even a 5 year old would question it. No, how about you write them all for me? Why don't you write my command argument parser? Why don't you write my serializer? Oh, and make sure they aren't buggy, ideally I'd like you to spend maybe 5 years on each of them to make sure they're flawless and have the features I want.
This cargo cult behavior is just because NPM made such a mess of things. One language's bad ecosystem is being used as an excuse to rant at everyone who wants to reuse code without any solutions, and it's pathetic and childish.
*Real* solutions I see are experiments where people turn libraries into WASM modules that are sandboxed by default. Ideally, all of a library's logic would be sealed in a sandbox and only its I/O would be of any consequence, which is far simpler to constrain. I think there's even a Rust project that's doing this, I forget the name though.
People really need to stop this "stop reusing so much code" fallacy; it makes no sense and nobody will ever listen to it.
Code reuse is an essential part of programmer productivity. Back in the 20th century, there was actually considered to be a “software crisis”, in that software projects were too often running over time and over budget, and consuming the efforts of way too many programmers.
Today that doesn’t happen—not in the Open Source world, anyway. The huge cornucopia of ready-to-use toolkits for common languages like Python and the like gives you a massive running start on any new project. That and the common base of a *nix OS like Linux, with its ready-made stock of tools you can use to create your code, test it, build it, and deploy it. And profile it and maintain it.
> Code reuse is an essential part of programmer productivity.
True. But I do despair when devs choose the lazy option of a dependency rather than implementing something themselves when they should be easily capable of it. I accept it's always a balance, but when someone is using Python and imports "numpy" and "struct" just to extract a 32-bit entity from a bytearray, I have to roll my eyes.
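For illustration (my sketch, not the poster's code), the standard library's struct module, or even a plain int.from_bytes, does the job with no numpy in sight:

    import struct

    buf = bytearray(b"\x78\x56\x34\x12" + b"rest of the buffer")

    # Standard-library struct: little-endian unsigned 32-bit int at offset 0.
    (value,) = struct.unpack_from("<I", buf, 0)

    # Or with no import at all:
    same = int.from_bytes(buf[:4], "little")

    assert value == same == 0x12345678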
I still have my three copies of the Camel book (each from/for a different employer). Took me a long time to figure out that Larry Wall is talking about hackers, definitely not software engineers. Hubris gets you things like systemd. And a whole lot of other tools that have contributed a lot to productivity up front, only to take it all back with payday-loan-class interest.
Knowing your limits helps you avoid creating seductive messes.
Today's "programmers" aren't that anymore. They just throw canned libraries or "frameworks" at a problem, without actually knowing why, or whether there would be a better option for solving the problem with less unknown "hubris".
To sort a small array or list, a quick hand-rolled bubble sort might do just fine, without the fluff of pulling in a whole qsort library (see the sketch below). Or throwing simple data into extensive SQL tables with vulnerable SQL statements/queries, where a simple, small random-access data file would suffice.
Reusing code is fine, but one needs to know what to reuse, why and how.
But for far too many folks these days, who only know how to use a hammer, everything starts to look like a nail...
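As a trivial illustration of the hand-rolled option mentioned above (my sketch, not the poster's code):

    def bubble_sort(items: list) -> list:
        """Plain bubble sort: fine for a handful of items, no library needed."""
        items = list(items)                      # work on a copy
        for n in range(len(items) - 1, 0, -1):   # shrink the unsorted tail each pass
            swapped = False
            for i in range(n):
                if items[i] > items[i + 1]:
                    items[i], items[i + 1] = items[i + 1], items[i]
                    swapped = True
            if not swapped:                      # already sorted: stop early
                break
        return items

    print(bubble_sort([5, 1, 4, 2, 8]))   # [1, 2, 4, 5, 8]

For a handful of elements it's plenty; for anything bigger the built-in sorted() is still right there, also without a dependency.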
Yes code reuse is more efficient than writing everything yourself, but in some cases it has gotten way out of control. I recall when I first tried to learn node I created the most basic app with just a few dependencies. But when I dug into it I found that there were actually thousands of dependencies dragged in with just this simple app that did almost nothing.
There is a difference between a solution using, say, two key dependencies that are crucial to functionality, versus a load of shite (200+ micro-dependencies) dragged in from crates.io to do trivial things, right?
It tends to be a difficult concept to grasp for many Rust fans but sometimes that wheel really *does* need reinventing. Sometimes the crates.io dependencies really are not appropriate.
People using dependency-aggregation languages (e.g. Rust/JS/Python) are not dumber... just often bad engineers. For example, loads of physicists are waaaaay smarter than me. But their code is shit.
For an example? Yes: the Emscripten toolchain. 99% of the functionality is written in homogeneous C++ with zero dependencies (aka upstreamed into the Clang compiler's WASM backend). That last remaining 1% pulls in Python and node.js and 80+ dependencies for some of the most trivial crap.
Look through the issues list and take a wild guess where most of the build / deployment issues creep in. It is all just legacy technical debt now.
Again, very smart guys... but from a Mozilla background, they are "web developers". This is synonymous with poor engineering, and that mindset has shown through a little. Their build system doesn't even support a proper deployment for OS packagers.
Can you give an example where using a generic library from crates.io/NPM, etc has been better than a bespoke tailored solution?
A correct cost analysis will look at future maintenance cost of external software, as well as future technical debt burden once it is discontinued upstream.
Luckily, I am thankful that the GPL license prevents a lot of terrible dependencies from polluting many codebases.
I guess we work in very different sectors of the software engineering industry.
In addition to carefully considering whether to use a dependency or re-implement yourself (which has to be done on a case-by-case basis, as a self-implementation could be even more buggy), it would be useful if we had a wider choice of dependency interfaces and sandboxes to choose from. Security-critical apps like ssh, which can compromise a whole enterprise, should be able to trade off performance against safety by selecting a library interface which offers more protection even though it is very slow.
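As a rough illustration of that performance-for-safety trade (my own sketch, not a real ssh or libsystemd interface): run the library call in a throwaway child process, so a compromised decoder never executes inside the caller's address space. A genuinely hardened interface would add seccomp filters, namespaces or a WASM runtime on top of the process boundary.

    import lzma
    from multiprocessing import Pipe, Process

    def _worker(conn, payload: bytes) -> None:
        try:
            conn.send(lzma.decompress(payload))      # the "untrusted" library call
        except Exception as exc:                     # report failures to the parent
            conn.send(exc)
        finally:
            conn.close()

    def isolated_decompress(payload: bytes, timeout: float = 5.0) -> bytes:
        parent, child = Pipe()
        proc = Process(target=_worker, args=(child, payload))
        proc.start()                                 # slow: one process per call, by design
        try:
            if not parent.poll(timeout):
                raise TimeoutError("decompression hung")
            result = parent.recv()
        finally:
            proc.terminate()
            proc.join()
        if isinstance(result, Exception):
            raise result
        return result

    if __name__ == "__main__":
        blob = lzma.compress(b"hello, world")
        print(isolated_decompress(blob))             # b'hello, world'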
Yes, I say that--but mostly only to get attention.
What I actually say is, "carefully evaluate the attack surface of all code brought into the project". If someone takes that seriously, they will very often find it is *much* cheaper to write their own from scratch that does *only* what they actually need in the code. Or maybe a fork, trim, and fix.
The end result is inevitably--far fewer dependencies.
> systemd is bad engineering.
> ... predicted to be a bad idea.
Indeed ...
In spite of all the evidence/arguments previously presented against it by those who knew better, this episode came dangerously close to proving the point very painfully, with grave consequences.
A point that is still being ignored by most of the people involved in/with Linux.
This systemd crap-fest will continue to be firmly pushed forward by the usual suspects until it is fully integrated into the Linux eco-system as the de-facto init.
i.e. finally, an MS registry for Linux, with all that such a thing entails.
If this process is not stopped, Linux as we know/knew it and Debian (the Trojan horse used to accomplish the feat) will definitely cease to exist.
And the well known, 30+ year old EEE will have taken its latest victim.
The writing has been on the wall for decades now, but the problem seems to be that most of the people involved are decidedly illiterate.
You're mostly wrong there. Redhat caused all of this because, as the M$ of commercial Linux, they had scale, money, and mindshare. So all of the distros built by downstreaming one of Redhat's code bases, or worse selling themselves as compatible, were just along for the ride. Debian fell in line due to bad internal politics and shortsightedness, precisely because Redhat was the "too big to fail" mindshare leader and Debian leadership was susceptible to the huge influence they had.
So now almost the entire Linux world has been dragged down into the quagmire. But the fact that the Systemd team is intentionally making the problem worse is bigger than the actual issues with Systemd itself.
But to state it clearly: the Systemd ecosystem is a bigger problem than systemd itself. If you require an init system that does everything Systemd does, in a way that is transparently compatible with Systemd, you have mandated using Systemd, in all its poorly architected, sprawling horror. Because if you build an init replacement that meets the core functional requirements, you still have to install Systemd and deal with all of its problems, unless you rewrite a new interface for every other service the Systemd team has infested by making it systemd-dependent/integrated.
So you can't just build a fixed init. You have to rebuild most of Linux, or ask ever so politely for the time of the other devs to rewrite THEIR code bases to support both Systemd and your new init. That is a huge amount of work, which they will understandably politely decline. And even with resources, the systemd team will continue their mad dash to add new sprawl to the mess they have made, so catching up with the head is virtually impossible.
So Redhat, as the upstream of most of the main Linux distros, has the leverage to stop it, but won't: they created the mess, they would have to admit they were wrong, and the Systemd sprawl increased their monopoly control of the commercial Linux world. They get to play king of the hill with all of the other distros, which are either dependent on their code as a downstream or would die on the vine unless they were broadly compatible with "Big Red".
A more likely escape hatch is business supporting and shifting to the BSDs, and hopefully learning from their mistakes. But it's easier to move to three functioning members of one family than the dysfunctional mess of companies and egos on the Linux side. And at least under BSD the init is freestanding and not linked to most of the other OS processes. So if you needed to implement a fully recursive state- and dependency-checking PID 1 replacement, you could, without breaking or rewriting half the OS and pissing off the internet.
Red Hat weren't even keen on systemd. Poettering had to persuade them (and everybody else) that it was the right solution to a whole host of problems that were not being adequately addressed anywhere else. That's why it has become popular.
Yes, there are people who disagree, and make a different choice. After all, Open Source is all about choice.
I'm not an expert but I believe this is a religious argument which can never be settled.
I personally dislike the "init through scripts" paradigm since I don't believe in scripting. Scripts become so bloated and complex after a while that it's much easier and readable to implement it in code. As for the SystemD architecture some parts of it may have been poorly implemented but those can be corrected.
> The ones complaining are, shall we say, most charitably described as “armchair programmers”.
These days I'm definitely an armchair programmer. But there have been times when I've ended up delving into the systemd code; let's be charitable here and just say that the developers are very naive about a lot of the areas they've ended up dabbling in.
I don't have an issue with having a single service manager rather than several different ones.
Personally I don't like binary config files or log files, but that's a small matter. I'd prefer to do a lot of these things from scripts, but again - not a major issue.
Perhaps if the developers of systemd had followed the normal Unix/Linux guideline of do one thing and do it well then they wouldn't keep having all the issues they have got. But rather than following decades of good practice they've branched out in all sorts of different directions and no one can be an expert at everything. So they've coded up areas where they don't have enough real world experience and have just said, hey this works on my laptop so it must be fine.
They've also not followed good practice with compartmentalizing things, and this potentially opens the system up to security weaknesses unnecessarily; more experienced developers probably wouldn't have made the same design mistakes.
In this case we have an issue where a distribution trusts systemd and is therefore including it in other bits of SW, and since systemd has a huge number of dependencies its inclusion exposes other code to weaknesses far away from the core of the application.
Would you care to mention one or two?
I ask because I have looked at the systemd code myself, specifically the sd_notify system where the backdoor was able to get in. The code is complex because it has to deal with a lot of cases I hadn't thought of, like operating from within a virtual machine, for example.
For example where they need to interact with system firmware environments and have assumed how enumeration is likely to be implemented. It's improved over the years, but they'd clearly not researched the field widely enough. But then why would the designers of a service manager be experts in such things as ACPI?
Are you talking about ACPI and UEFI? Which are complete cans of worms and never quite implemented according to any actual specs anyway?
Is this src/libsystemd/sd-device/device-enumerator.c the kind of code you're talking about? You think, at about 1200 lines, it is maybe too simple? If you have "researched the field widely enough", could you point out one or two things that could be done better?
"Linux PIDs, for example, start from 1, there being no PID 0."
PID 0 is owned by the kernel, specifically the process that keeps an eye on memory.
Traditionally, the init was the next process called during boot, so it defaulted to PID 1. Later, as the kernel grew more complex and had to call a few other processes that required PIDs, technically the init might have received PID 2, 3 or 4 (or whatever), and indeed I worked on early systems that did this. Thankfully, in order to preserve sanity within the system, wise heads decided to reserve PID 1 for the init.
The Linux kernel has its own initialization process; what we think of as "init" is just there to set the system up for humans. One can change the "init" called by the kernel as PID 1 to whatever you like at the kernel command line, using init=/path/to/valid/program as a kernel boot parameter. Try using bash. The more adventurous among us might try EMACS or vi instead of bash.
You didn't ask me an original question. I simply pointed out that a point you made was incorrect, with a little background should anyone be interested in what is, to be fair, a rather esoteric subject. And then, instead of acknowledging that you made a mistake, as any sensible chimpanzee would do, you waffled along at random, trying to make yourself look intelligent.
You failed. Miserably.
Do you have any more words of wisdom with which to attempt to impress the crowd?
> Don’t understand what you mean. Linux PIDs, for example, start from 1, there being no PID 0.
I thought we'd been discussing device enumeration, so what do PIDs have to do with it?
As for PID 0, Unix kernels used to have a PID 0, the argument being that it is necessary to hand-assemble a proc structure before you can run fork to create others. They'd make PID 0, then they could fork & exec init so that it would work like any normal userland process. As others have pointed out, Linux uses 0 as a PPID.
I had said "Unix kernels"; for reference, see the McKusick book on 4.4BSD, page 504.
I can't share the code I have in front of me; it's from a proprietary version of Unix.
As for the Linux kernel, I don't know my way around the code that well, but there is an init_task of type struct task_struct which has init_task.pid set to 0 and init_task.comm set to "swapper", just like the legacy Unix kernels, only there it's a proc structure. Take a look at init/init_task.c.
But this wasn't my comment about naivety, where I was referring to earlier versions of systemd assuming that device enumeration starts at 1 when some systems start at 0. Please don't get me started on systems which don't start at either zero or one.
It may not look like it, but that's the task_struct for PID 1. You will notice there is no fork call anywhere in there. The execution of the specified command for the init process happens in init/main.c, where it is done via run_init_process, which in turn uses kernel_execve, which is just the in-kernel entry point for the execve system call.
Since there is no creation of a new process, that means it must be running in the already-created process context, which is that init_task.
So you see, there is no such process as "pid 0".
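A quick way to look at this from userland on a running Linux box (my own illustration, not from either poster): PID 1's stat entry records a parent PID of 0, yet there is no /proc/0 entry, because that context never exists as a userspace process.

    from pathlib import Path

    stat = Path("/proc/1/stat").read_text()
    # Format: pid (comm) state ppid ...  (comm may contain spaces, so split after the ')')
    fields = stat.rsplit(")", 1)[1].split()
    print("PID 1 state:", fields[0], "PPID:", fields[1])   # PPID prints as 0
    print("/proc/0 exists?", Path("/proc/0").exists())     # False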
I've been saying for years now: systemd is the Windows-ification of Linux. It's the opposite of the traditional UNIX ethos of using small, discrete tools to accomplish generic tasks. This has been wildly successful because it encourages evolution in a manageable manner and at a manageable pace.
The logic for having them be one set of people is that the developers know what the system needs to have on it, since it's running their code, so they can make changes faster. The logic for having them be separate groups is that those who specialize in administration know things developers don't know which turn out to be important to the health of the system. Both arguments have some good points, but taken to their extremes they don't produce the obviously better results their adherents hope for.
Take this as an example. Sysadmins wouldn't, just by being sysadmins, recognize this vulnerability in XZ. The people who spotted it were programmers reading its code, not admins installing the thing. Alleging that having a sysadmin run the server, instead of combining that role with a developer, would prevent this class of problems is overoptimism.
The Second Principle of Agile Truth being:
...
Working software over comprehensive documentation
...
Is, IMO, much to blame. Overdocumenting exists, but what about ever documenting the proper thing, like >HOW< things work?
Agile was barking at some form of BEFORE THE CODE documentation: proposals, Product Development Authorizations,
...
But not the description of how and why this particular piece of text does its magic, the kind of documents intended for the Developer/Admin/Sysop who will have to use it and fix it when the original author is long retired.
Anybody have an Ouija keyboard?
Not that Javadoc didn't make the same error; but at least "the intentions were pure".
Eiffel was a good thought; but, like many others, sacrificed on the altar of Speed At All Cost, so game software is more realistic and everything else less safe.
Now let's talk about log4j...
'Stable', 'Testing', 'Unstable', but add one to be used by anyone deploying software as a commercial service, or at a 'critical infrastructure' level, e.g. 'ISO_Registered'.
IOW, establish a level of software 'authenticity' that meets certain required 'paid for' standards. And make sure that some of that 'paid for' goes back to the original developer/s, although they would not be responsible for maintaining the software at 'ISO' level (or whatever it's called).
There's certainly no 'free' solution to this problem, and no solution that's going to be anywhere near perfect, but there will be some solution that covers most of the bases most of the time to a level that provides for transparent and 'trustworthy' software, from top to bottom of the stack, where that level of assurance and traceability is needed.
"And make sure that some of that 'paid for' goes back to the original developer/s, although they would not be responsible for maintaining the software at 'ISO' level (or whatever it's called)."
If they're not responsible for doing that, I'm not sure who will be. Maybe you can get enough people to pay them so they agree to be responsible for it, but if you can't, then it likely won't get done. A separate group of people certifying software is not likely to have the knowledge necessary to actually know whether the code has been compromised. This is especially true if they have to certify every package on a Linux system, as they would have to if they're going to catch things like this. That is a lot of packages.
Yes, we're talking about a 'big problem', and big problems usually require big, and costly, solutions. But if the problem really is a big one then the cost is usually deemed to be worth paying.
We're talking international cooperation here. Always easy to establish and maintain!
According to Akamai "XZ Utils Backdoor — Everything You Need to Know, and What You Can Do":
The backdoor is quite complex. For starters, you won’t find it in the xz GitHub repository (which is currently disabled, but that’s besides the point). In what seems like an attempt to avoid detection, instead of pushing parts of the backdoor to the public git repository, the malicious maintainer only included it in source code tarball releases. This caused parts of the backdoor to remain relatively hidden, while still being used during the build process of dependent projects.
So the victims were not compiling from the original Github open source version, but from some tarballs instead? This surprised me - maybe it's wrong to call it "open source"? Not compiling from the published open source was one of the security mistakes made, and it seems like one that could be straightforwardly addressed going forward.
"So the victims were not compiling from the original Github open source version, but from some tarballs instead? This surprised me - maybe it's wrong to call it "open source"?"
Open source refers to the licensing. Both copies used the same license. It was definitely open source.
The distinction is just where people who needed the code went to get it. They had multiple options, and the exploit was added to a subset of them. Someone who chose to get the code by cloning the repository evidently would have missed it, whereas someone who used the alternative would end up picking it up. That's a problem already, because we'd probably want to make sure that the two sources are in sync, but it doesn't stop the project from being open source or indicate that the people getting the source were necessarily doing something obviously wrong.
I do this myself with some projects. You can clone my git repo and get a copy, or you can download copies of the source from a different site. If you make sure you're on the same release with both, the operative files will be identical. The git repo has more files in it because the source archives just contain the code and build scripts, not irrelevant things like the .gitignore file.
> The git repo has more files in it because the source archives just contain the code and build scripts, not irrelevant things like the .gitignore file.
Which, unfortunately, means the consumer of the code can't automatically check the tarball really does match the repository in all respects. But the alternative - require all packages to be built from their git repository - means there will be a lot more complexity in build scripts, so it may still be possible to hide hacks using the same tricks used in this case (extremely opaque m4 macros which react to changing a few bytes in an obscure binary "test file").
> But the alternative - require all packages to be built from their git repository - means there will be a lot more complexity in build scripts so it may still be possible to hide hacks...
The tarball was used specifically because the source changes were strange enough to have drawn attention when viewing the diff with git. Git itself provides security by allowing granular viewing of changes. So adding the requirement that source comes directly from a singular git source doesn't seem too much to ask - especially for anything going into a kernel build.
Also it's not only deliberate maintainer hacking that could use the tar file strategy - a compromised account would offer the opportunity to quickly replace a tar file with a compromised version with a very close timestamp, and no one might notice.
Of course much worse behavior - e.g., downloading docker tars of doubtful origin and running them, is widespread. But not for kernel compiles.
- Privileged maintainers' identity should be established, preferably through in-person meetings. Intelligence agencies or government institutions could be asked to aid in this.
- Complexity brings a bigger attack surface and should be reduced. Likewise dependencies.
- More scrutiny of binary and text BLOBs added to the source code (a simple automated check is sketched after this list).
- If the other maintainers don't understand every part of the checked-in code, it should be rejected or explained.
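On the BLOB point, a hedged sketch of one way to automate that scrutiny (my example, not the poster's): ask git which files in a commit range it treats as binary, since git diff --numstat prints "-" for the line counts of binary content.

    import subprocess

    def binary_files_changed(repo: str, rev_range: str) -> list[str]:
        """List paths in the given range whose changes git considers binary."""
        out = subprocess.run(
            ["git", "-C", repo, "diff", "--numstat", rev_range],
            check=True, capture_output=True, text=True,
        ).stdout
        flagged = []
        for line in out.splitlines():
            added, deleted, path = line.split("\t", 2)
            if added == "-" and deleted == "-":   # git's marker for binary changes
                flagged.append(path)
        return flagged

    if __name__ == "__main__":
        # Hypothetical repo path and range; in CI this might gate a merge request.
        for path in binary_files_changed(".", "origin/main...HEAD"):
            print("binary change needs human review:", path)

It might not have stopped the xz attack on its own (the doctored test files looked like ordinary test data), but it at least forces a human to look at every blob.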
Despite the article and several others on the subject being framed as something close to an apocalyptic event, which to be fair this could have been if nobody had noticed, this incident demonstrates both the weaknesses and strengths of the open source model.
The solution to this problem isn't to stop using open source software, it is to make sure that all the important bits are resourced properly, but unfortunately if the solution to that problem was easy it would have happened years ago.
It's the permanent problem for anything where there is no or minimal revenue stream. It's a real demonstration of "there's no such thing as a free lunch".
Given that a major part of the problem is personal trust, a big component of "resourced properly" is that required to establish that everyone involved is trustworthy. As we've seen, keen, enthusiastic and productive (things that we usually associate with "resources") are not in themselves enough. Someone else has to say "this person is OK", and someone has to be the root of all that trust. It's exactly like certificates. There has to be a root Certificate Authority, and if you don't trust it then the certificates issued in its name are worthless. So it goes with developers.
So, who is that going to be that root of trust? A person? An organisation? Establishing that a person can be trusted is not really something that falls within the software development world, it's more a Human Resources / Security Department thing. Does the EFF have a HR and Security Department? Probably not. Does Linus Torvalds? No (with the greatest respect to his talents and capabilities).
The advantage that companies like Microsoft and Apple have is that they do have HR and probably something resembling a security department (especially if they're having anything to do with government(s)). And because each and every one of their customers is paying money, that is resourced. That doesn't absolutely mean that their developers are fully trustable, or that they adequately vet software they borrow from the OSS world. But it's better than nothing (and if government security standards are involved, it's probably quite thorough).
Another Thing Missing From the Debate
This episode saw the attempted introduction of an explicitly coded backdoor into pretty much all Linuxes, that would have given the perpetrator access to a lot of Linux boxes all over the world.
Global access being the presumed end goal, it's important to recognise that this can be achieved in other ways. It doesn't need an explicitly coded back door. The same level of access can be attained via simple coding errors, or slightly flawed code in the right piece of source code, or indeed a CPU flaw. These have been happening all the time.
The only difference between the two is that the originator of the deliberate backdoor gets pilloried, or lynched or something, whilst the developer who simply made a mistake is, well, everyone nods with understanding whilst being grateful it wasn't them. For example, nobody (including me) thinks bad things of Robin Seggelmann or Stephen N Henson, the pair who (according to the Wikipedia article) between them made the mistake that led to Heartbleed and failed to notice the bug. However, it's entirely possible that someone else did find the Heartbleed bug and was carefully using it for years before the bug was (re)discovered and publicised. One person's innocent mistake is easily repurposed as someone else's global backdoor, with the same potential impact.
So, how does one tell the difference between an innocent mistake and a deliberate mistake? One cannot. It starts taking government levels of investigatory power to be confident that the person who made the mistake hasn't got a private deal lined up in the background. Absent that, one can design review processes that make it difficult for a lone actor to succeed, and more elaborate ones to ensure that two cannot collude to get a "mistake" through to production, and so on (depending on one's level of paranoia). But these take more and more resources, and they're no use unless there is some independent audit of the activity under the process.
Thing is, with OSS and long-distance-physically-never-met teams of devs cooperating, it's potentially quite possible for quite a few devs to collude. After all, to the rest of the team the only thing that distinguishes one dev from another is probably their email address; those are not hard to get.
Wise words, Bazza. Well said, sir.
> So, how does one tell the difference between an innocent mistake, and a deliberate mistake? One cannot.
And that leads us to what to do instead. This backdoor was going to open an ssh connection.
Outgoing firewalls? Would that have helped here? It's not a new idea, but seems to have fallen by the wayside in recent years. Perhaps time to bring it back?
I think that there's no technological solution (other than what I'll term "far out" concepts such as an AI review-o-mat, advances in formal specification / automated testing, etc).
What would have helped here? It really does come down to either positively identifying trustworthy volunteers, or review processes that significantly hinder bad actors. Given that "security vetting" is near enough impossible for OSS to do all by itself, OSS is then reduced to enhancing the review process. There's already a shortage of willing volunteers, and it's even harder to find people to do review work (people prefer coding to review).
The danger is that some large corporate concern will step in and "take ownership" of the problem. If one does step in, they'll be wanting more ownership. RedHat has already demonstrated a willingness to do precisely that.
Yet Another Aspect
The other thing to remember is that "developer identity" is often not much more than a name and an email address. Such an identity is readily transferable, or stolen.
For a purely hypothetical example, the only reason we have to believe it really is Linus Torvalds authorising Linux releases is because we're confident that his IT credentials are not compromised. If someone did learn his password, they'd be able to be "Linus Torvalds" for the purpose of sneaking dodgy code into the kernel.
Of course, if Linus was hacked we'd soon learn all about it via the media, and the damage would be repaired. But would that happen for everyone? What if a well-placed soon-to-retire dev passed away, and a bad actor who'd got their login credentials was simply waiting for that to happen? We'd be fully dependent on the family knowing what the dev's importance was in the OSS world, and knowing who to contact.
Companies solve a lot of this by having much closer relationships with their employees - pay, pension, HR, management, office location, teams, parties, family members knowing where the salary comes from, control of their company IT, etc.
Everyone seems to just assume that upgrading a piece of code is normal, so they just ask themselves "what kind of bureaucracy can be introduced to stop this from happening?". To me, that's the wrong question.
The first question is "Why are you changing that library?" Was it to fix a critical bug? Was it to add another component?
The next question is "What kind of test strategy are you using to verify that the library functionality hasn't changed?"
The third question is "If you're adding new functionality then why does it need to be in this existing library?"
This practice of constantly changing code, often for trivial or cosmetic reasons, is commonplace these days, as is the practice of so integrating the build process that it's not easy to isolate functional modules for testing. It seems that it's common practice to just build something, and if it compiles and seems to work then it gets released without any further testing. This is asking for trouble and certainly explains why, for example, Windows is so unstable. Since Microsoft started to 'embrace' open source by taking over GitHub I've noticed this kind of squirt-and-pray methodology all over the place -- it's not that the code is 'complicated', it's that we've failed to make it simple. Errors will creep in and it's a crapshoot whether we notice them before they cause any damage.
Obviously, what the big monopolies would like would be to add keys to everything with them as the keyholders. I daresay that everyone will be duly surprised by this in due course.
I'm not saying that closed source is the solution, but one advantage of closed source is that generally you know exactly who checked what changes in, which means it is easier to point the finger of blame. Open source is a wonderful process as long as those contributing can be trusted, and by and large that is the case. But if one bad apple manages to sneak under the radar, it could be very bad news for a lot of people, with minimal risk to the contributor, who can often hide behind anonymity.
Based on the past behavior of large corporations I am not sure why one would entertain the idea that companies with closed source software would be forthcoming about vulnerabilities in a more timely or forthright manner than the open source community.
The idea that a company would just naturally want to disclose and fix software (or product) issues to minimize damage to their customers is not borne out by the historical record of some other large corporations that have knowingly put their customers directly in harm's way.
Boeing and their MCAS software come to mind. Yes, it was an errant reading from the single AOA sensor feeding the system that was the base of the problem, but had a community looked at the design prior to installation it is more than likely it would have been highlighted as a horrible design.
Reaching further back into the history of corporations knowingly ignoring deadly safety issues to save money, one can examine GM's faulty ignitions, which they continued to install for many years AFTER they realized that they would randomly turn the vehicles off, disabling the power brakes and steering and even locking the steering wheel. The executives literally determined it was legally less expensive to hide the problem and pay out on the few families that figured out their loved ones had crashed and been injured or killed because of a defect in their car than it was to spend under a dollar to redesign the ignition so it wouldn't do this. They saw redesigning it as an admission of guilt, and therefore massive liability, so best to pretend you just didn't know what was going on. Many people were killed who didn't have to be, had GM just fixed the problem.
So my only point is: I trust many eyes looking openly at something to more quickly and efficiently point out potential problems than relying on people within a closed source environment, who always have the option of hiding behind the veil of closed source and denying that there is a problem, because they have a vested interest in choosing between the truth and a lie based on which is least costly.
I’m just assuming that tarballs are just a convenience for easier download (1 file instead of 2,000), and should at any time be equal to some specific version of the source code repository.
So an easy fix would be to run a script every night that extracts the tarball and does a diff between the tarball and the repository. And sends the diff to a developer or two who can take action.
Or forget about that altogether and regenerate the tarball from the repository every hour to prevent this from happening. The script must already exist; the only change would be to run it not only after any accepted change, but also once every hour.
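A hedged sketch of that nightly check (my code; the repo path, tag and tarball names are placeholders): export the tagged tree with git archive, unpack the published tarball next to it, and diff the two.

    import io
    import subprocess
    import tarfile
    import tempfile
    from pathlib import Path

    def diff_release(repo: str, tag: str, tarball: str, prefix: str) -> str:
        """Return a recursive diff between the git tag's tree and the tarball's contents."""
        with tempfile.TemporaryDirectory() as tmp:
            git_dir = Path(tmp, "from-git")
            tar_dir = Path(tmp, "from-tarball")
            git_dir.mkdir()
            tar_dir.mkdir()

            # Export the tagged tree (no .git metadata), using the same top-level
            # directory name that release tarballs normally carry.
            archive = subprocess.run(
                ["git", "-C", repo, "archive", f"--prefix={prefix}/", tag],
                check=True, capture_output=True,
            ).stdout
            with tarfile.open(fileobj=io.BytesIO(archive)) as tf:
                tf.extractall(git_dir)

            # Unpack the published release tarball.
            with tarfile.open(tarball) as tf:
                tf.extractall(tar_dir)

            # Anything present only in the tarball (say, a doctored m4 macro) or
            # differing between the two shows up here. Legitimately generated files
            # (e.g. a pre-built configure script) would need an agreed allow-list.
            result = subprocess.run(
                ["diff", "-ru", str(git_dir), str(tar_dir)],
                capture_output=True, text=True,
            )
            return result.stdout

    if __name__ == "__main__":
        report = diff_release("xz", "v5.6.1", "xz-5.6.1.tar.gz", "xz-5.6.1")  # hypothetical names
        if report:
            print(report)  # in the poster's scheme, this goes to a developer or two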