
It does sound rather...
...more like a compiler issue to me from what was said in the article: Optimising-out bounds- and/or null-checking code!!!
Not impressed with the kernel devs' responses anyway.
A recently published attack exploiting newer versions of the Linux kernel is getting plenty of notice because it works even when security enhancements are running and the bug is virtually impossible to detect in source code reviews. The exploit code was released Friday by Brad Spengler of grsecurity, a developer of …
Really the bigger issue here is the SELinux vulnerability, as that does exist on all current distributions using SELinux out there right now, and that particular vulnerability likely goes back several years. No vendor yet has mentioned how long exactly the systems have been vulnerable, but both Fedora 10 and 11 are known to be vulnerable. The vulnerability allows anyone to exploit the large class of null pointer dereference bugs in the kernel, which would not be possible with a regular kernel.
-Brad
Come on, El Reg, I really expect better from you.
The following: "Although the code correctly checks to make sure the tun variable doesn't point to NULL, the compiler removes the lines responsible for that inspection during optimization routines." is completely false.
The bug is real, but it is a very simple one: not checking a pointer for NULL before using it. No, the code does _NOT_ check for NULL before the dereference, and that is what causes the problem. This has been blown way out of proportion.
For those who know C, this is the relevant code:
struct sock *sk = tun->sk;
...
if (!tun) return POLLERR;
The bug is in the 1st line - it uses tun before checking it for NULL. The check is a few lines below. A very simple bug that happens to the best of us.
Now the exploit is extremely clever, but the bug itself is trivial.
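To make the pattern concrete for readers who don't follow kernel code, here is a minimal standalone sketch of my own (not the kernel source) showing the same ordering mistake next to the safe version; with optimisation enabled, GCC is entitled to drop the late check in the first function, because the dereference above it already implies the pointer is non-NULL:
/* use_before_check.c - illustrative only */
#include <stdio.h>
struct thing { int value; };
int use_before_check(struct thing *p)
{
    int v = p->value;   /* dereference happens first - the bug */
    if (!p)             /* too late: the optimizer may assume p != NULL and delete this */
        return -1;
    return v;
}
int check_before_use(struct thing *p)
{
    if (!p)             /* check first... */
        return -1;
    return p->value;    /* ...then dereference */
}
int main(void)
{
    struct thing t = { 42 };
    printf("%d %d\n", check_before_use(&t), check_before_use(NULL));
    return 0;
}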
It's not GCC's fault that the kernel developers failed to build the kernel with the proper options. There is a specific gcc optimization flag, "-fno-delete-null-pointer-checks", that keeps bugs with this pattern from becoming exploitable the way mine did. This flag will be added in the next stable version of the kernel.
-Brad
I want to know whose idea it was to have the compiler remove NULL pointer checks by default. If you are testing for NULL you are doing it for a reason!
The way I see it, this is not a failure of the kernel team for not specifying the -fno-delete-null-pointer-checks compiler flag; it is a failure of the gcc team in having the compiler do away with such checks by default!
I, for one, would prefer a kernel that spends a few extra cycles testing for bad parameters to getting my system reamed by some pimply script kiddie in China!
Apparently everyone else gets it but Mikov (who posted a similar response on lwn.net): that, from a source review, the bug is unexploitable, and yet I have exploited it, is what makes this 'clever', as every other security expert (and Linus himself) has agreed.
I'm sorry you don't seem to get it, but you don't make yourself look smarter by spamming your response on every site mentioning this vulnerability.
Oh and for reference, Red Hat has marked the SELinux vulnerability I disclosed as "High Severity":
https://bugzilla.redhat.com/show_bug.cgi?id=511143
-Brad
Well, this interested me, so I wanted to check. Here is what "info" says about that gcc flag:
-fdelete-null-pointer-checks
Use global dataflow analysis to identify and eliminate useless checks for null pointers. The compiler assumes that dereferencing a null pointer would have halted the program. If a pointer is checked after it has already been dereferenced, it cannot be null.
In some environments, this assumption is not true, and programs can safely dereference null pointers. Use -fno-delete-null-pointer-checks to disable this optimization for programs which depend on that behavior.
Enabled at levels -O2, -O3, -Os.
I don't know the kernel environment, so I don't know what happens on a NULL pointer dereference there. But, with typical user code, what gcc is doing is reasonable, if a bit extreme.
The bug is in the kernel code, where the check is *after* the dereference. Even if the author knows that that works in the kernel environment, I think it is still a bad idea because it is quite non-obvious. If performance is that critical, then add a comment explaining what is going on. Adding the gcc flag to the kernel compile flags will help.
All IMHO of course - I'm not a kernel developer.
I'm about to demonstrate that I'm not an expert, but why don't non-kernel processes have their (virtual) first page/segment, into which any null (as zero[*]) would point, by default removed from the process' address space? The hardware would then catch it and hand it to the kernel on a plate.
And perhaps get GCC to report check-after-use constructs like this which are clearly wrong.
[*] null != zero in the C spec but in any current machine I'd expect it to be.
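As it happens, Linux does have a knob along these lines: the vm.mmap_min_addr sysctl makes the kernel refuse to map the lowest part of the address space into a process at all. A tiny userspace probe (a sketch of my own, nothing to do with the published exploit) shows the idea:
/* probe_zero_page.c - sketch only */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
int main(void)
{
    void *p = mmap((void *)0, 4096, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED)
        printf("mapping page zero refused: %s\n", strerror(errno));
    else
        printf("page zero mapped - a kernel NULL dereference now reads my data\n");
    return 0;
}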
Let me check: this is a potential *compiling* problem, so the kernel code is sound. The compiler is OK, too. It's just a matter of passing the right options at compilation time. Hardly a Linux problem then. More like a *potential* vendor problem...
Good to flag, so that the self-compiling guys don't get caught with their pants down, but hardly the end of the world. Especially as, from what I gathered, any exploit would need to run with the setuid bit set, which, let's be honest, is not bloody likely to happen in any standard distribution, let alone hardened ones. Dubious setuid programs are likely to be prevented from running in the first place. It looks suspiciously like "Oh my dog, if I run exploit code as root my system might be vulnerable!". Wake up people: regardless of the OS, if you run exploit code as root you're screwed. And any attack that needs admin privilege to be efficient is a non-attack to begin with. If I get admin access to your system, I am totally not going to try and exploit an obscure vuln in the kernel. There are much easier and more interesting things to be done. I side with Linus on this one. Any program running with the setuid bit set *is* a potential hazard and should be carefully reviewed; that's why it's considered bad practice, and that's why it's forbidden (or triggers massive warnings) in most serious distros. Now if your sysadmin is willing to make his system wide open, it's hardly an OS problem, is it?
It's still a clever attack, one which might spread using social engineering to root Ubuntu n00bz. Oh, except that Ubuntu doesn't seem to be vulnerable (yet).
Just one more thing: hardening a system doesn't mean running SELinux. It means (amongst other things) that only trusted code is allowed to run, so this attack code is never going to be allowed to run in the first place. In this regard, the article is misleading: hardened systems are completely, absolutely, positively, 100% safe.
For one second I thought some of my systems could be vulnerable. I'll just relax and have a pint or ten now...
Stepping back for a moment there seem to be a few lingering issues that are not really fully resolved.
A null pointer dereference exception is a nice debugging aid. But in code that is supposed to be secure it can never be trusted. It depends upon the system protecting an area of memory at address zero. Typically a page. Clearly it may fail to trap the dereference if the data structure referenced by the pointer is larger than a page. This is not exactly a common thing, but it isn't impossible. Array indexing through a pointer with large array indexes might also come under this failure, and is a much more common thing.
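As a quick sketch of that larger-than-a-page failure mode (the sizes here are assumptions; take a 4 KiB page):
/* big_struct.c - illustrative only */
#include <stdio.h>
#include <stdlib.h>
struct big {
    char padding[8192];   /* bigger than a typical 4 KiB page */
    int flag;             /* lives at offset 8192 from the start of the struct */
};
static int read_flag(struct big *p)
{
    /* If p were NULL, this would read address 0 + 8192, which is well past
     * any single unmapped guard page at address zero. */
    return p->flag;
}
int main(void)
{
    struct big *b = calloc(1, sizeof *b);
    if (!b)
        return 1;
    b->flag = 1;
    printf("%d\n", read_flag(b));
    free(b);
    return 0;
}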
The point is that secure code can never rely upon null pointers being trapped. Code must always check. Always. The corollary is clearly that optimising out null pointer checks is always incorrect in secure code. Always.
It seems that someone forgot that in kernel space you don't have the possibility of protecting memory like this, and that, a priori, this compiler optimisation is therefore invalid. Always, for all code.
This is sort of worrying really. One would have thought that by now the kernel writers would have spent the time to work with gcc and identify all the optimisations and clearly understand which ones are inconsistent with the peculiar constraints of either secure or kernel mode code.
One might hope that no SUID program anywhere in the OS is compiled with this optimisation. It isn't just the kernel, it is the entire OS build process potentially at risk. So this is an issue that reaches to each and every distribution packager.
Indeed, it might be nice if the gcc developers took a moment out to provide a list of known good optimisations that do not rely on assumptions about memory layout, exception behaviour and likely, but not guaranteed, code structure. Maybe adding a --secure-optimisations-only flag would be a good thing. Depending upon other developers to read the fine print of each and every optimisation flag is clearly not enough.
The reason that quote was included in my exploit was because of the incredible incorrectness of it, as I was indeed exploiting the kernel in every case, and in the case where SELinux was enabled, there was no setuid binary necessary at all. So Linus' analysis at the time was completely off. Linus is no security expert -- I don't understand why you Linux zealots prop him up as one. If you really want to know what Linus thinks about my exploit, why don't you ask him about it now that he's (presumably) actually seen it? I know what he's said about it in private, and he is most certainly not calling it "trivial bollocks." So with him as your idol, do you now also agree it's not "trivial bollocks" or do you have any critical thinking of your own? You ignore the response of every other legitimate security researcher and point to a quote from Linus in reference to a video of the exploit I posted last week, which was included in the exploit precisely because it was so horribly and hilariously wrong.
"Trivial bollocks" that is currently unfixed and rated by Red Hat as "High Severity."
It's exactly this "let's fix the bug, patch the software and get on with it" attitude that perpetuates the cycle of "fix the bug, patch the software and get on with it." That's such a 1992 AD security mentality, which the rest of the world has moved past, while the Linux upstream still lives in the security stone age.
-Brad
I develop code to run on MS platforms, and I know what most of you 'Linux people' are talking about. But the fact is that all software contains bugs and vulnerabilities. Jumping up and down and crying out about who's better than whom is just childish bullshit. Maybe 'we' realize that and 'you' don't? Personally I don't care which platform people choose. Right tool for the right job, I say.
I mean really. When there is a bug on Windows, it is better for *everyone* if it is patched. Likewise when one is discovered on Linux. Nobody wants systems falling down or sending spam (unless you are the one doing the toppling or spamming!).
Cant we all just get along? :)
You'll find that most Windows users are quite well adjusted and not likely to jump on the fanboy bandwagon, the way I see some Linux users/zealots do (I am not saying that all are like this, just enough to make it noticeable).
I have grown tired of all the fanboyism that comes with a story that casts Microsoft / Linux / Mac / etc in a less-than-perfect light. I wonder when the time will come that people realize that it is just a personal choice, that nothing posted on a forum will ever change the minds of others, and that extreme thinking only takes away from your argument.
I suppose this is the same as with Muslims / Christians / Jews: a few extremists will cast the entire religion in a bad light, and then everyone else will assume that a person from another religion is a terrorist / bible-thumping racist / money-grubber.
Life has taught me that there will always be trolls harming the adoption of good ideas, people constantly using inflammatory phrases in an attempt to convert others to their side while actually harming their own position. Perhaps the only way to treat people like this is just to ignore them, to effectively deny them the attention that they so crave.
It's a code bug, not a compiler bug. The compiler ignores the redundant NULL check after the pointer's been used (which, assuming the compiler is smart enough to work out whether a pointer may have changed its referent, is perfectly reasonable). So, surprise surprise, open source doesn't lead to perfect coding, however much some people believe it does. True, this isn't a major issue, but it's still slightly worrying that NULL dereferences aren't checked for with a static analysis tool before releases... I would have thought that was pretty standard practice for something as important as an OS kernel.
> "Setuid is well-known as a chronic security hole," Rob Graham, CEO of
> Errata Security wrote in an email. "Torvalds is right, it's not a kernel issue,
> but it is a design 'flaw' that is inherited from Unix. There is no easy
> solution to the problem, though, so it's going to be with us for many
> years to come."
Um, so doesn't this translate as "Linux is known to have a major security hole that is unlikely to be fixed in the near future"?
This 'exploit' requires the user to have root in the first place, to inject a setuid program into the system (which would be caught by the next run of tripwire and SELinux wouldn't let it run anyway, but let's not let facts get in the way of a good story).
If the bad guy gets root = game over. Anything else they do is just icing.. even SELinux isn't an absolute defence against this.
I agree the optimisation flag on gcc is the real bug - it should be flagging these dereferences as errors, not deleting the tests.
The simple bug is dereferencing tun in the line "tun->sk". The fact that after that there's a NULL test on "tun" which GCC correctly optimises away, doesn't make it a more serious or unusual bug -- although it would certainly be nice if GCC issued a warning "optimising away NULL test because you've already dereferenced it". In particular, your contention that "from a source review the bug is unexploitable" is wrong, unless the source reviewer in question somehow misses the "tun->sk" line with the bug in.
The bug in PulseAudio, which the Reg article somehow conflates with this one, is of course completely separate.
Peter
It's because the M$ camp (me included) all have hangovers on Saturday morning, as we were all out last night with real women in real pubs, not geeking out over some compiler issue.
Obligatory flame
*nix sux - cry yourself to sleep cos some bloke with a beard made a mistake in your shitty OS
(I don't care really - just joining in for the sake of it)
The core problem here is that this is a dangerous optimisation that should only be enabled explicitly, not bundled into -O3. It's dangerous because it makes assumptions about the privilege level of the code being compiled and the system behaviour of the target. It should default conservatively and doesn't.
The source itself is strictly correct but inherently dangerous: it assumes knowledge the compiler doesn't automatically have, and it could have been written more robustly. It's sloppy. Being blindsided by gcc gets them off the hook just once; they need to take this much more seriously. I want robust defensive coding in my kernel, not blame shifting.
"Apparently everyone else gets it but Mikov (who posted a similar response on lwn.net). That from a source review, the bug is unexploitable and yet I have exploited it is what makes this 'clever' as every other security expert (and Linus himself) has agreed."
I expect that's because he based his diagnosis on the much-quoted code fragment...
struct sock *sk = tun->sk;
...
if (!tun) return POLLERR;
If this is the vulnerability, then it is indeed a trivial "used before checked" bug. "tun" is clearly used before it is checked, and any decent data flow analysis would pick it up even if it is buried in a long and confusing routine. LINT has done this sort of thing for years. I doubt the Linux kernel is marked up with all the annotations required, but to suggest that this can't be found by examining the source code says more about you than the state of the art.
If this is not the vulnerability, perhaps you could enlighten Mikov and the rest of us.
I have not looked at the code in question and have no intention of doing so, but according to the previous comments, the problem is either a compiler bug (though I see that this is disputed) or the code checking for NULL AFTER the pointer has been dereferenced, which, even if it is legal on a particular platform, is a bloody stupid thing to be doing!!!
Either way, this should be trivial to fix (ok - fixing the compiler could take a while if that really is the issue, but it's hardly insurmountable).
But being Linux, I suppose those responsible need to slag each other off and argue and talk shite for a couple of months before anything actually happens. Mmm.... I think I'll stick with BSD, thanks.
You should have just printed the 300 lines of comment as the article, very funny, though perhaps in places not intended.
Fair play though, not a bad piece of code, but this is why we don't use brand spanking new kernels. That being said, it is a bit of a non-issue; there are some very specific circumstances and dependencies needed here, and the exploit is a tad flimsy in places - though hey, if it works...
> "This is sort of worrying really. One would have thought that by now the kernel writers would have spent the time to work with gcc and identify all the optimisations and clearly understand which ones are inconsistent with the peculiar constraints of either secure or kernel mode code."
They do, and the gcc team are happy to work with them and add new options or modify optimisations to make the compiler more suitable for their usage. But I don't know of anywhere the kernel team have ever sat down and made a clear list of what they do and don't want the compiler to do; they're a bit reactive rather than proactive. What tends to happen is that some optimisation turns out to cause a problem for some bit of code in the kernel, the kernel team approaches the gcc team and gets the problem addressed, then six months later it all happens again...
>" Indeed, it might be nice if the gcc developers took a moment out to provide a list of know good optimisations that do not rely on assumptions about memory layout, exception behaviour and likely, but not guarenteed, code structure. Maybe adding a --secure-optimisations-only flag would be a good thing. Depending upon other developers to read the fine print of each and every optimisation flag is clearly not enough. "
Now you wait just a cotton-pickin' minute there. Kernel development is hard-core stuff, and not suitable for amateurs and dabblers. You need to know how a computer works from top to bottom to do it: you need to understand everything from hardware and busses and memory accesses and caching to low-level assembly and synchronisation and threading techniques, up to the level of security and usage patterns and efficient algorithm design - and you need to understand how the toolchain works and what it does. Kernel developers have very special and unusual requirements, and a compiler is a general purpose tool for a broad audience. It is for kernel devs to know and clearly explain their requirements, not for non-experts (compiler devs) to attempt to second-guess them. They should absolutely be expected to read the fine print of the optimisation flags they want to use to build their code - it's only one more drop in the ocean of fine print they need to read and understand to write reliable kernel code.
There seems to be confusion here as to whether it was a compiler bug or a coding error, whether the kernel is flawed or not ...
If it is the compiler optimising away something when it shouldn't have, this is a fail for the compiler developers.
If it's a case of the source code being correct and the compiler optimised away a null check this is a fail for the developers who built the kernel and their process, but not a fail for Linux per se.
If the source code is incorrect then that's a fail for the kernel programmers and no amount of buck passing to the compiler or arguments that it's not a bug will wash.
No matter how the bug arose, if it exists, is exploitable and demonstrably so in the field, it's a huge fail for Linux either way, and trying to claim it's not worth worrying over is simply trying to downplay the issue.
If there's no source code error, the compiler did not optimise away something it should not, and no exploitable bug exist then I'll agree it's a storm in a tea-cup. Unfortunately that does not seem to be the case.
Yes, there's one problem with your theory.
The Linux kernel has a known, demonstrably exploitable security problem in the field, and the kernel developers do not wish to fix it.
Trivial or not, apparently it's not so trivial that they'll be fixing it any time soon.
No, the reality is that too many Linux zealots, including the kernel developers, refuse to ever accept that they're wrong about anything.
This is why Linux is never going to gain traction while this attitude is so prevalent, and why it's stuck in a rut. Because Linux developers write the code that Linux developers want to write, usability be damned. Find a security exploit in their code? They'd rather leave it in there, claiming it's not their fault, than accept that they're not perfect and are equally capable of making simple, blatant mistakes.
There are two problems.
The first, and most critical problem, is a bug in GCC, where it optimizes away null pointer checks in some cases where it should be giving a fatal compile time error.
That's right. I'm saying GCC should bomb on this code, complaining that the pointer is used before the null pointer check.
Note: I think GCC can safely optimize away redundant null pointer checks - and I've certainly seen code that has those. But optimizing away a null pointer check simply because it's already seen broken code is stupid.
The second problem is that some of the people working on the Linux kernel apparently do not intuitively grasp the seriousness of this.
The compiler is a red herring. It doesn't delete NULL pointer checks - *UNLESS* you've already dereferenced the pointer, in which case it quite reasonably assumes that you've already crashed before you get to the check anyway - and the exploit proves that it is correct in this assumption, or rather that crashing would be a best case! But testing for a NULL pointer at the end of the routine is just too late: the bug occurs here:
struct sock *sk = tun->sk;
At that moment, because it is allowed for a user process to map memory at address zero, what you have done is inject a user-controlled data structure into the kernel, which implicitly trusts its own data structures. That is the security violation, and it's nothing to do with the compiler, it's more a consequence of a false assumption in the kernel:
1) I can trust all pointers to kernel objects, because they will only point into kernel space and only privileged (i.e. trusted) code can place anything in kernel space.
2) But NULL is a pointer value and it is not in kernel space.
3) The kernel trusts *all* pointer values, including the one that happens not to be in kernel space but is under user control.
Ouch. The false assumption could be stated in a single sentence as "we can trust all pointer values, including NULL, because even though it's technically not a kernel address it will always make a crash if you access it". But no, it won't, that's just not true.
What might work better would be to build an option in the compiler to use a value like 0xffffffff as a NULL pointer instead of numerical zero. Or for a few pages (maybe even a few meg?) down at the zero end of memory to be declared 'honorary kernel space', protected by the same kind of PTEs that prevent the user accessing kernel space, and not mmap'able, although that might only mitigate rather than fully block the entire class of exploit.
I don't understand grsecspender's response:
>"from a source review, the bug is unexploitable"
We've known for some time that dereferencing a possibly-NULL pointer is exploitable, it was first shown by that ARM exploit by Barnaby Jack
http://www.theregister.co.uk/2007/06/13/null_exploit_interview/
then there was the SWF/ActionScript null pointer dereference by Mark Dowd
http://documents.iss.net/whitepapers/IBM_X-Force_WP_final.pdf
so frankly, any source code audit that doesn't ask the question "Could this pointer possibly be NULL?" is not asking the right questions at all. I think it's fair to say that there are two bugs: the use-before-NULL-check bug is one, and the user-is-allowed-to-mmap-NULL is the underlying and more serious bug which is what enables this and the whole class of other similar bugs to be exploitable. (You could probably argue that a third bug is in the code that calls this routine while passing it a NULL pointer in the first place.)
That GCC was optimising out a null-ref check when it could clearly see that the variable had already been used is *expected behaviour* for the compiler and clearly documented in its manual. If the programmer couldn't be bothered to RTFM, that's *his* fault, not the compiler developers'.
That non-time-critical code for an allegedly modern operating system is being written in a portable assembly language that's getting on for nearly 40 years old is the real bug here. Even my humble Nokia 2630 is more powerful than the computers C was created for.
Oh yes: for those who haven't read the original article the Register's piece was based on, take note of the following quote from grsecurity's own website:
"Due to Linux kernel developers continuing to silently fix exploitable bugs (in particular, trivially exploitable NULL ptr dereference bugs continue to be fixed without any mention of their security implications) we continue to suggest that the 2.6 kernels be avoided if possible."
Note that they're referring to *multiple* instances of these kinds of bugs. The line people enjoy quoting was just one example, which the researcher used to write his proof of concept exploit. That the Linux kernel is *riddled* with such bugs reflects poorly on its developers.
This is also the kind of bug which, had the software been written using half-decent tools, would never have made it into the released code in the first place. For all the criticisms of Microsoft's ".NET" languages, the simple fact is that C# wouldn't have let you write such bad code in the first place. When a user interface—and that's all programming languages are—makes unwanted actions easy to perform, it is time to replace it.
Microsoft may have their flaws, but at least they're trying to do something about the appalling tools this industry insists on using. They're still a long way from development nirvana, but at least it's *something*.
(Oh yes: my computer is a Macbook Pro, not a Windows box. So please don't waste time accusing me of fanboyism. There's no such thing as a "best" platform. Only a "least worst".)
>"The first, and most critical problem, is a bug in GCC, where it optimizes away null pointer checks in some cases where it should be giving a fatal compile time error."
It's not a bug. It is an explicitly documented feature in the manual, which also warns you to take care with it:
> >" `-fdelete-null-pointer-checks'
> >Assume that programs cannot safely dereference null pointers, and that no code or data element resides there. This enables simple constant folding optimizations at all optimization levels. In addition, other optimization passes in GCC use this flag to control global dataflow analyses that eliminate useless checks for null pointers; these assume that if a pointer is checked after it has already been dereferenced, it cannot be null.
> >Note however that in some environments this assumption is not true. Use `-fno-delete-null-pointer-checks' to disable this optimization for programs which depend on that behavior.
> >Some targets, especially embedded ones, disable this option at all levels. Otherwise it is enabled at all levels: `-O0', `-O1', `-O2', `-O3', `-Os'. Passes that use the information are enabled independently at different optimization levels. "
>"That's right. I'm saying GCC should bomb on this code, complaining that the pointer is used before the null pointer check."
I'm saying that when the user explicitly tells GCC to assume that it can do this, GCC should assume that it can do this, and if it is not true, the user should not have told lies to the compiler, and the compiler should do what the user tells it. Maybe you *want* to get a SEGV if the pointer is NULL because you plan to handle that elsewhere? The compiler can't second guess you. You told it that control flow cannot possibly reach that test if the pointer is NULL; it would be stupid of the compiler to bother inserting it.
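To illustrate the "maybe you *want* the SEGV" case, here is a hedged userspace sketch of a program that deliberately lets a NULL access fault and recovers in a signal handler; a compiler that refused to build it, or quietly inserted checks of its own, would just get in the way (sketch only - as the manual warns, this relies on platform behaviour):
/* segv_recover.c - sketch only */
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
static sigjmp_buf recover_point;
static void on_segv(int sig)
{
    (void)sig;
    siglongjmp(recover_point, 1);   /* jump back out past the faulting access */
}
int main(void)
{
    struct sigaction sa;
    sa.sa_handler = on_segv;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, NULL);
    volatile int *p = NULL;         /* volatile: force the access to actually happen */
    if (sigsetjmp(recover_point, 1) == 0)
        printf("%d\n", *p);         /* faults here... */
    else
        puts("...and is handled here instead of crashing");
    return 0;
}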
>"Note: I think GCC can safely optimize away redundant null pointer checks - and I've certainly seen code that has those. But optimizing away a null pointer check simply because it's already seen broken code is stupid."
It's not stupid. The vast majority of these checks are not going to come from buggy code like this, but from places where any or all of macro expansion, inlining and templating have combined to generate inefficient code. If the compiler wasn't very aggressive about optimising this stuff, C++ templates would still be the hideous bloated monstrosities they used to be back in the 90s - i.e. practically unusable in anything that has to be the least bit efficient.
You can argue about whether this is a "dangerous optimisation", and should only go in -O3 by default (I might agree with you), or whether it shouldn't be turned on by default at all but always left to the user to request (I'd probably disagree), and you can argue that it should generate a warning (I'd certainly agree with you, but I might consider that sufficient reason to leave it enabled at lower -O levels), but calling it "stupid" is simplistic and lacks insight into the issues. As I think I mentioned once before, GCC is a general purpose tool that must work for a huge range of different applications from small realtime embedded to overnight number crunching batch jobs. No single set of optimisations is ever going to be completely right for all those applications, and the -O levels are crude guidelines, but if you have a very specialised need, you need to take control of how you use your compiler.
Sure, get rid of C, that will free up our time to concentrate on reference counting, garbage collection, bytecode inefficiencies, blah blah blah :).
Linux, Windows and Mac all run on C-based kernels. C may have 30-year-old problems, but at least they're *understood*.
It's worth pointing out that there are many demands on software, not just security. Execution speed comes out pretty high on the list, and nobody wants to cripple their PC with a kernel that already killed their performance for them before they launch their first application.
C is 'portable assembly' because that is precisely what is required for an efficient kernel implementation. If kernel developers could write in something else they would - they are not masochists!
Making it an error would pretty much go against the principles of C++, as there are plenty of reasons why you might want it to actually do that. Even putting it in as a warning is a bit dodgy as far as I'm concerned...
A far better way of managing it would be to have a compiler switch that flags these cases as warnings instead of just optimising them away silently; then when you add new code you could easily keep track of it, especially if you've written some pointer-intensive code.
Surely the reason for the optimisation is (among other things) code like this:
inline char foo(char *p) { if (p == 0) return 0; else return *p; }
char bar(char *p) { *p = 2; return foo(p); }
int main() { char c = 0; return bar(&c); }
If foo gets inlined into bar, the compiler can spot that the null pointer check in the inlined code is unnecessary and remove it. This is a most excellent optimisation (granted, in this example foo and bar do so little work that other optimisations may render it unnecessary).
As far as the C standard is concerned, this optimisation doesn't have to assume that a null pointer dereference would halt the program. The dereference of a pointer which may or may not have been null means that the implementation can thereafter assume it wasn't null. If it was null the behaviour of the rest of the program is undefined anyway, so the tiny detail of the assumption being false doesn't make it invalid. If dereferencing null is valid and is supposed to have predictable behaviour, then you're into non-standard C, so you have to read the compiler docs. GCC's behaviour appears to be (a) standards compliant and (b) documented, so should come as no great surprise to the programmer.
For my example code, the optimisation certainly should not result in a compiler warning or error. There's nothing wrong with either function foo or function bar. It's just that one of them takes the (perfectly reasonable) approach of checking its input, and the other one takes the (also perfectly reasonably) approach of requiring that its callers not pass in null pointers. Standard functions exist taking both approaches - compare for example time() and strlen().
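(For the curious: after inlining foo into bar with the optimisation enabled, bar effectively reduces to something like the following - a hand-written sketch of the idea, not literal GCC output.)
char bar(char *p)
{
    *p = 2;      /* the store lets the compiler assume p != NULL ... */
    return *p;   /* ... so the inlined "if (p == 0) return 0;" is deleted,
                    and this may even be folded to "return 2;" */
}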
Maybe the point is being missed here.
I doubt that anyone regards the existence of an exploitable bug/feature in Linux as a good thing. However, ask yourselves why there is argument about whose "fault" it is...
In a complex system, one always tries to put the right solution in the right place. There are several ways this problem might be fixed. Choosing the wrong one might fix it more quickly, but may cause problems later. If speed is not the over-riding issue (and it seems it isn't) then thinking carefully (and this means arguing) about whose responsibility it is to protect against this problem is the correct response.
Once you have the correct protection installed in the correct place and everyone knows whose responsibility it is to look after this in future, then you have a more robust system. Failing to argue this out and fixing it the wrong way just starts you on the path towards a system that's unmanageable from a security point of view. I think you all know the example I'm thinking of...
@spendergrsec: Brad, you should really stop tooting your own horn, and it would also help if you weren't unnecessarily rude. Everybody so far has acknowledged that the exploit is very impressive. Good work. I really mean that and have said it from the start. But please, don't let that go to your head.
I am trying to clarify, for readers of El Reg who may not be experts in C or the Linux kernel (unlike the crowd on LWN), that contrary to what has been said, this is an ordinary, run-of-the-mill bug, which is easy to spot and fix in a regular code review, and that it is not caused by a flaw in GCC.
@BlueGreen: Normally the hardware would catch the NULL pointer reference and it would result in a kernel oops. However part of the exploit is that it (relying on another bug) first maps valid memory at address 0. It really is a very clever exploit relying on unrelated kernel bugs.
The bug in question itself however is trivially noticeable and fixable. Any tool like LINT would have caught it (in theory; in practice it is not so easy to run LINT on the kernel).
When an article of this nature comes up for Mac or Windows, we flame each other, etc., but at the end of the day the company fixes it. When this comes up for Linux, a whole bunch of code monkeys have a pissing contest ("I know more about coding than you, look at this crap I typed" and "mom, mom, he said I couldn't code, tell his mother so he gets spanked") and argue that it's a compiler issue.
And this is why Linux is shite: it's for code monkeys, who actually like to spend their weekends coding and compiling rather than having a life. As they say, you get what you pay for.
TANSTAAFL
1) Big geek fight over whose fault it is - problem not addressed
2) Much finger pointing - problem not addressed
3) "It's a feature!" (of either the kernel or GCC) - problem not addressed
So Linux out in the field has (or will have...) a critical security flaw and the freetards are too busy waving their pocket protectors about to actually fix the problem. With an attitude like that, is it any wonder most organisations who rely on IT to run their business would not touch Linux with a shitty stick?
Get your act together you bunch of jumped-up primadonnas; you are not doing yourselves, Linux or the open source community any favours with your public bitch-fest.
This post has been deleted by its author
I normally wouldn't bother posting on this subject matter because I'm not a fan of Binux, but I had a feeling the Binux geeks wouldn't be able to post without mentioning Microsoft, and you didn't let me down. Great, so you use Binux, but get over your fascination with Microsoft and your hatred for those who prefer it over your free alternative. Yes, Binux is free and people still prefer the paid alternative LMAO
I wrote: " Depending upon other developers to read the fine print of each and every optimisation flag is clearly not enough. "
AC replied above: "Now you wait just a cotton-pickin' minute there. Kernel development is hard-core stuff, and not suitable for amateurs and dabblers. "
I don't disagree. (I do have the background: I have been involved in a number of OS projects, have written kernel code, debugged production commercial kernels, taught computer architecture, etc.) The point was that - no matter what the needs - it is still clearly, as demonstrated, not enough. It wasn't enough this time, and I will bet it won't be next time either.
One of my pet hobby horses comes to mind here too. A deeper problem that besets pretty well all code now is the whole idea of a null pointer. This is an age-old problem that comes in many forms. Basically you have out-of-band semantics being carried with the in-band data. In this case zero has special meaning for pointers. It could have been any number (ffffffff was mentioned above as an alternative, but it doesn't work any better). The semantics of "not-a-valid-pointer" is carried as a special-case value. We are so used to this as the only way of doing things that we forget an entire generation of computer architectures that never suffered from such issues. Tagged memory was about in the 60's, but here we are, 40 years later, and the total dominance of the wretched x86 has left computer system design moribund. It is like the Apollo missions: 40 years ago great things were done. Now we can't even do the same stuff, let alone progress beyond it.
The problem is that the developers chose a linear, text-based UI designed in the 1970s to convey their instructions to the computer.
The 1970s was the tail-end of the era when most computer applications were non-interactive, transactional processes. E.g. payroll runs, bank account systems and the like. You started the program, let the machines chew through all that (serial) magnetic tape, (serial) paper tape or (serial) punched cards and waited until it went "BING!"
Very few developers write linear, non-interactive applications like that any more. Yet we are still using textual programming languages. Consider that *all* languages are fundamentally linear: we read from left to right (or vice-versa in some societies), top to bottom, from the beginning to the end. Programming languages are *inherently* linear.
When you try and remove that linearity—e.g. with many OOP attempts—you end up with a language so stuffed to the gills with structural scaffolding and similar fluff that you end up with code that's obfuscated behind umpteen layers of brackets, punctuation and meta crap. Because—I repeat—all languages are INHERENTLY linear! No matter what colour you paint your cat, it'll still be a cat.
Now, textual user interfaces are still used in IT today, but they're no longer centre-stage. Most people prefer graphical interfaces. You can convey a lot more information visually than you can with words alone, yet software development tools are stuck with a "text in text files!" mentality that is arguably doing far more harm than good. Just because textual interfaces are how we've always written code in the past, it doesn't mean this is how we should continue writing code in the future.
Linear, text-file-centric programming languages are the wrong tool for the job. That there are practically no mainstream alternatives reflects very poorly on the IT industry and its conservatism. (If the FSF movement really wanted to make a difference, this is where they should be concentrating. The world really does not need more UNIX clones.)
The rise of multi-core CPUs should be a rallying cry to designers of development tools the world over. There's a massive market simply gagging for the right answer. C is not that answer. Neither are C++, C#, COBOL, Object Pascal, x86 assembly language or Java.
To researchers and students who are looking into this field, please do not invent yet another linear, text-based programming language. We have far too many of those as it is.
(I do have my own view on how programming *should* be done, but the essays I've written on the subject are long and unsuitable for a comments box like this. I'll post something to my website when I'm done evaluating CMS software for it.)
Until all the instances of dereferencing the pointer before testing for NULL have been fixed, you just set a compiler flag and the problem goes away.
AFAIK, this kernel has not been released yet.
Because it is open source, you can go and look at the source code and see the extent of the problem. And the kernel coders are 'embarrassed' into fixing it. Contrast that with closed source projects.
"With an attitude like that, is it any wonder most organisations who rely on IT to run their business would not touch Linux with a shitty stick?"
LOL - you clearly know nothing about the industry. I work in IT, and vast swathes of companies rely on Linux, mostly RHEL or SuSE, for their business.
I dissagree. I am writting from my linex net book. It is brilliant for sitting on the sofa or in bed browsing the web. The MS version would have cost alot more to run well and had a disk HD, rather than the small SSD this can have for the same cost. It dose what is needed, and if it all gose tits up ill wipe it and start again.
Wouldent use *nix for much else though, short of servers, for just those reasons. The last thing I want to do when I get home from work is to piss about with code just to get the thing working.
BTW, this crashes about once a week. My Vista PC? Never crashed. Although im sure *nix geeks will call me a liar.
@Crazy Operations Guy + AC 09:19 + Defiant + others: I'm with you on the childishness of bringing MS into it. All shall have bugs: a serious one was a failure to honour FILE_FLAG_WRITE_THROUGH correctly in some circumstances for Win2k, this being fixed in SP3 (ie. it took far too bloody long). This was potentially very serious if you are, as I was, managing large DBs for clients, but did anyone hear about it? Nope. I only found out by accident. MS don't advertise their failures.
@Sean Timarco Baggaley: You were making sense up until the point you started talking about linear text. Do some research on these (eg. Visual Programming Environments, paradigms & systems, Ephraim P. Glinert), actually design a language & learn the lessons (which are: dataflow driven execution behaviour becomes totally obvious and... that's it on the good side) and use a visual language. I & a colleague used several. They were all horrid; unwieldy, slow to program with, had to rely on text to do anything significant etc.
Also, don't confuse the linear representation of text with the concepts the text imparts.
Certain stuff is a very good fit for graphical expression, such as smallish automata, which are much more comprehensible, and probably high-level UIs (perhaps inc. stuff like Mathematica). Much else isn't IME because the necessary abstractions can't be expressed graphically (how do you represent addition but with a + sign? or looping n times without numbers? Or a 'max' function without using the word? or 'pick the nth item from this array'? How do you identify subroutines if you can't use text to name them? etc. forever.)
@Francis Vaughan: Your comments on null make a kind of sense but tagged architectures simply offload complexity from the software to the hardware. Which I think is a fail. Keep hardware simple (& consequently correct) & fast, let the compilers do the work.
@Tzvetan Mikov - thanks.
@Tzvetan Mikov : Normally the hardware would catch the NULL pointer reference and it would result in a kernel oops. However part of the exploit is that it (relying on another bug) first maps valid memory at address 0. It really is a very clever exploit relying on unrelated kernel bugs.
So the exploit manages to (1) get code inserted at address 0, and then (2) exploits (waits for) an (unrelated) null pointer dereference to get it executed? Presumably (1) is the "overlooked weaknesses" and "whole class of vulnerabilities" the article mentioned but failed to provide any details on. Because fixing (2) should be trivial and seems to be a red-herring (along with GCC).
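As a deliberately abstract sketch of those two steps (the fake object layout and the trigger function here are hypothetical stand-ins, not the published exploit), the userspace side looks roughly like this:
/* two_step_sketch.c - conceptual only */
#include <string.h>
#include <sys/mman.h>
struct fake_kernel_object {      /* layout purely illustrative */
    void (*op)(void);
};
static void attacker_payload(void)
{
    /* in a real attack this would run with kernel privileges */
}
static void trigger_null_dereference(void)
{
    /* hypothetical stand-in for whatever syscall path reaches the
     * use-before-check bug in the kernel */
}
int main(void)
{
    /* step 1: map the zero page and plant a fake object there */
    void *page = mmap((void *)0, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (page == MAP_FAILED)
        return 1;                /* blocked, e.g. by vm.mmap_min_addr */
    struct fake_kernel_object fake = { attacker_payload };
    memcpy(page, &fake, sizeof fake);
    /* step 2: provoke the NULL dereference so the kernel reads the fake
     * object and calls its function pointer */
    trigger_null_dereference();
    return 0;
}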
>For those who know C, this is the relevant code:
>struct sock *sk = tun->sk;
>if (!tun) return POLLERR; (Tzvetan Mikov)
This is poorly written code, but it should do one of two things -- trap out, or nothing in particular. It's not good code -- I personally dislike implied casts (just because everyone assumes a NULL pointer is a zero doesn't make it so, and relying on the compiler to fill in the blanks is asking for trouble). Assuming it does give a NULL pointer (more accurately, a NULL segment) trap, then I'm not sure how an attacker could get code into that segment to execute. After all, one of the big weaknesses in Windows used to be that it never really did segmentation -- it just ran a big, flat address space, so you could abuse it, but I think even Windows today will just trap out on a NULL access.
The fix is easy. Just revise the compiler options and rebuild the kernel (then go through the sources looking for this coding problem). That's one-upmanship over Windows....the whole thing can be done in less time than it takes to write a comment.....
"The fix is easy. Just revise the compiler options and rebuild the kernel (then go through the sources looking for this coding problem). That's one-upmanship over Windows....the whole thing can be done in less time than it takes to write a comment....." .... By Martin Usher Posted Monday 20th July 2009 04:13 GMT
And that creates a NeuReal Ruling Elite, Martin Usher, and therefore you can fully understand why the Status Quo Pretenders do not mend themselves/their ways, and are therefore Destined to be Purged Catastrophically and Unceremoniously from Power with the Control Operating Systems Collapse/IMPlosion.
This post has been deleted by its author
Wow. I must have missed all those Linux boxes strewn all over the worker's desks, ramming the control centres and generally saving life as we know it. Gosh and golly. Maybe I'm Linux blind?
Or maybe Linux only has less than 1% penetration for a very good reason. For an example, see this geeky bitch-fest about who has the biggest calculator, while the kernel could be sitting wide open to attack.
Well done. *slow clap* This is why your beloved kernel only sees the inside of geeks' bedrooms and certain niche applications. Which is a bit of a shame really, as it could be quite good if it were managed and dealt with in a professional (and customer/end-user focused) manner. At the very least it might give the incumbent OS maker pause for thought.
But no - you keep waggling your GCC options in the air and missing the point entirely.
This is being blown out of all proportion
Kernel root hacks have been around since the dawn of Linux; I don't understand what is new about this, sorry. Finding vulnerabilities is part of coding.
All operating systems are vulnerable when you have them at terminal level. The main challenge is any web-facing services, ensuring that you have security at that level - this is much more important. Microsoft's track record on this is not great at all compared to Linux.
One comment for the binnex guy: go run your website on bindows without doing updates, see how long it lasts vs a 'binux' box.
For all of you muppets banging on about how C# or Java or some other managed-code sandboxed language would be sooo much better than C - have you cretins stopped to think what actually does the managing? We're talking about code for an operating system here, yes OPERATING SYSTEM, the thing that operates the hardware. If it could somehow be written in a managed language, what exactly, pray, would be doing the underlying managing? A bunch of magic elves?? Or perhaps there should be an even lower hypervisor layer that would do it! Yes, that's it! Oh, but what could you write that in then? Surely not nasty unmanaged C or assembler?
Why don't you muppet apps developers leave the system development discussion to real programmers who have a clue and you get on with writing your cutesy GUI apps in the fluffy managed language of your choice.
1) write a macro to pull out all the windows bashing comments made over the years, swap round the words Linux and Windows and repost them.
2) to keep a note of the url of this story for the next time some Linux user posts to a story about a Windows bug claiming how bullet proof their system is.
...me and some pals took a class on how to write *assembly* code for PICs (ICs). Yes, we were writing code straight in assembler to be assembled and transferred to ICs that would later run it (it was a program designed to drive a simple LCD display).
Building the thing straight out of our feeble assembly skills was downright scary, because the assembler would bomb us without mercy. Guess what: things like null pointer checks of any sort were *disabled* by default, but nonetheless we were bombed, until we did exactly what the teacher told us. Bottom line, the final code came to less than 2k and would easily fit in the IC's memory. The LCD worked perfectly. Of course, the teacher was guiding us around the booby traps we were planting beneath our own feet. Plus we were mapping every memory location and register and nook and cranny used, according to the PIC structure we had, data-sheet in hand.
The teacher showed us the same code produced from a C compilation. The thing bloated to 10k or more, could barely fit in the IC, and ran *nearly* as fast (in fact about 50% slower, but since the PIC and display IC ran at 4MHz, far faster than our tiny code needed, we couldn't tell the difference). The C code seemed so much simpler and care-free, though. It also looked sloppy. Any of us could have written it in half the time we took on the real hardcore assembly stuff.
From this experience of building straight from assembly and opcodes versus using C, I fully understand now the kind of crap we run into, be that Windows or Linux. People are sloppy, they don't care if their code is bloated, or will be once compiled, and it is damned hard to find a bug once it has made it past all the compiler checks without a hitch.
As a mind exercise, try to build *the whole* kernel from hand-written assembly, and come back in 5 to 10 years after you have brushed up every bit and chased down every memory and pointer loophole, and you will have a better kernel. From my point of view and tiny experience in programming, good things don't come easy.
That's why we have the bug, or *feature*, in the first place. That's why we will probably find another. That's why M$ never got their act together. That's why *very* few people proactively search for bugs: they just fall into your lap when you are not looking for them.
Processors, even multi-pipelined and SMP processors, execute each instruction stream in a linear fashion, one instruction at a time.
Data structures can only really be modified by one process at a time; this is why we have mutexes, and well-understood criteria for claiming and releasing resources.
Programs, even with exotic event-driven models, are inherently linear. A task/thread/function is started (possibly asynchronously), it does something, it finishes, the caller picks up the result. That's how things work at the processor level, regardless of what language you're writing in, and for the most part it's how people think too.
We can already partition and refactor code so that an operation can be described in a pseudo-spatial distribution rather than in strict linear fashion, but some degree of linearity in the entire programming experience is inevitable, as the guy at the keyboard will have to press one key after another.
Your argument may as well be "The problem is computers as they exist today." Good luck inventing your hovercraft, but I suspect that at the end of the day you'll just invent your own wheel.
I happen to think you're a troll, but never mind.
@Everyone else, And I do mean everyone.
Conspicuously bad spelling, typing, and grammar stopped being amusing a few years ago, didn't you get the memo?
My main concern is that code like this (using an unchecked pointer) got into the kernel in the first place. It opens up a vector for blackhats to introduce vulnerabilities.
Looks like this has been fixed in 2.6.30.2, according to
http://lkml.indiana.edu/hypermail/linux/kernel/0907.2/00706.html:
"[...] Fix NULL pointer dereference in tun_chr_pool() [...]"
After the patch the code in question looks like this:
struct sock *sk;
if (!tun)
return POLLERR;
sk = tun->sk;
I suspect 2.6.30 and 2.6.30.1 won't appear in many distributions.
and I'm still confused. Isn't the simple answer to just arrow down to the null check on tun, type dd (we are talking Linux after all ;), arrow up to the line initializing sk, and hit shift+p? I get the impression the kernel isn't rife with this problem... as many have pointed out, it *is* something one can find in a code review... what's the big deal?
@call me scruffy: Pedantry was never amusing in the first place. Didn't you get the memo?
"Looks like this has been fixed in 2.6.30.2, according to
http://lkml.indiana.edu/hypermail/linux/kernel/0907.2/00706.html"
So to sum it up, we have a vulnerability that appeared in a kernel release not yet adopted (and now never to be adopted) by a single *release* version of a Linux distribution, for which a fix was available 3 days ago (so almost the same day as disclosure of the bug?), and which apparently required root privileges to exploit anyway - rendering it redundant.
We then have a torrent of Reg "commentards" writing off Linux as an operating system because OMG it has bugs! It's no better on the recent articles about IE and Windows security holes.
It all has the feel of the Daily Mail about it: a sensationalist article which neglects a couple of small but important facts, and then the predictable stream of knee-jerk reaction comments based on people's prejudices against this or that.
@boltar - while I certainly wouldn't suggest that using managed code is necessarily The Future(TM) for OS development, how about you try widening that narrow mind of yours and check out the Singularity project:
http://channel9.msdn.com/shows/Going+Deep/Singularity-A-research-OS-written-in-C/
http://research.microsoft.com/pubs/69431/osr2007_rethinkingsoftwarestack.pdf
This post has been deleted by its author
"You can convey a lot more information visually than you can with words alone"
True, but you can convey information a lot more quickly (and precisely) with words than you can visually. I can write a pseudocode description of how a particular function operates in far less than half the time it would take to create a simple UML sequence diagram that conveyed exactly the same information. That is, on the computer anyway -- on a whiteboard it would be probably about half the time, since I wouldn't have to drag and drop a bunch of icons with a mouse. A picture may be worth a thousand words, but it's often faster to write a thousand words than to paint a picture.
Looks like it is already fixed in the latest kernel - 2.6.30.2 (released today...)
Looks like the open source security method really works.
Compare the speed of this fix with some of MS's well-known vulnerabilities (wasn't there a year-long vulnerability fixed in IE recently?)
I don't think Singularity is particularly mind-broadening at all. It is an emulator for a non-native instruction set (CLR in this case) upon which you run an OS for that arch. There was a project about ten years ago to do exactly this with Java, a language that was rather popular at the time. However, all such projects are doomed, for what (in the present discussion) is a rather interesting reason.
You see, an OS kernel is like an embedded system. You have complete control over all the interfaces and if you ship enough drivers as part of the system then it is entirely possible that every byte of executable code running at kernel level is your own. That makes tools like LINT (or PreFAST, for the Microsofties out there) especially powerful because they can do as much data flow analysis as you can afford. Under such conditions, most of the molly-coddling provided by a managed run-time can actually be done *statically*. Not only does this make the checks infinitely faster, it allows both the compiler and programmer to make further adjustments and improvements to the code.
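To make that concrete, here is a toy example (names made up, nothing to do with the actual driver) of the kind of thing such data-flow analysis reports:

    #include <stdlib.h>

    struct request { int id; };

    /* May return NULL on allocation failure. */
    static struct request *get_request(void)
    {
        return malloc(sizeof(struct request));
    }

    int handle_next(void)
    {
        struct request *r = get_request();
        return r->id;   /* checker: r may be NULL here */
    }

A decent checker traces the possible NULL return out of malloc() into the dereference and complains, and inside a kernel where you control every line of code you can afford to run that analysis over the whole thing.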
One of the comments clarified what was going on: an optimization, enabled by default at higher optimization levels, makes an assumption (about dereferencing a null pointer) which is false for kernel code.
If one is compiling kernel code, I would have thought that one would have to set some special compiler flags in order to do so. Thus, maybe the best fix for this would be to change the compiler so that -fno-delete-null-pointer-checks is on by default, no matter how high the optimization level, if the compiler has reason to believe it is compiling for an environment in which this particular optimization is invalid.
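For anyone who wants to watch the optimisation happen, here is a stand-alone toy version of the pattern (just an illustration, not the tun driver code):

    struct dev { int flags; };

    int dev_poll(struct dev *d)
    {
        int f = d->flags;   /* dereference first...                        */
        if (!d)             /* ...then the check. At -O2 gcc assumes that  */
            return -1;      /* a NULL d would already have trapped above,  */
                            /* so it is allowed to delete this test.       */
        return f;
    }

Compile it with "gcc -O2 -S" and the test tends to vanish from the assembly; add -fno-delete-null-pointer-checks and it stays. In user space the assumption is usually harmless because page zero is unmapped, but in the kernel - where the exploit reportedly arranges for page zero to be mapped - "would already have trapped" is exactly the part that isn't true.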
Back in the 1960's, many of the operating systems were written entirely in high-level programming languages, including their "kernel". For example, Burroughs used Algol for some systems, and the current generation of that line from Unisys still does.
The Multics operating system developed by Honeywell and MIT was written entirely in PL/I. Many of the security and operating system concepts of today originated in Multics.
Many describe Unix as a "child" of Multics, but it is really a child of the earlier phase of the Multics effort by MIT, GE, and Bell Labs, a phase that was abandoned because of issues with the third-party-developed PL/I compiler required to implement everything else. After the split, GE sold its computer division to Honeywell (who could build a PL/I compiler, itself written in PL/I).
While those at Bell Labs were involved with the design and other early work, they left before the widespread changes required to resolve issues found during development and to reflect ongoing research. Since their initial needs were single-user in a protected environment, they initially gave little attention to security, file systems, and such. (Much of this was later "fixed" as AT&T began using Unix for real projects and Unix became available to universities.) Their implementation language was "C", essentially an alternative assembler language for the PDP-11.
VMS/OpenVMS is a truer descendant of Multics on one side and of the DEC TOPS-10 and RSX-11 operating systems on the other, addressing security (e.g. the "Orange Book"), clustering, sharing of resources, and much more, while requiring relatively few people to administer and support even massive networks of systems. Even today, none of the Unix, Linux, MS Windows, or OS X systems come close to matching the reliability, security, clustering, or ease of use found in VMS 25 years ago.
The VMS executive (kernel) is mostly written in BLISS, a system implementation language which DEC had already used to write parts of TOPS and RSX. BLISS has features and limits much like C, lacking language features like I/O and UI, while producing modules that can run bare-bones in restricted contexts, without access to most system functions or libraries. As I remember, the code to process device interrupts had to be written in BLISS or MACRO, with C later allowed.
Any supported language could be, and was, used to implement the ancillary processes, together or separately, because they all used the language-independent calling standard and data structures. The same applies when user applications call system functions, where the automatic validation and verification of arguments and data eliminates errors that routinely plague Windows, Unix, and Linux. Interestingly, when the "Open" software support was added to VMS (thus OpenVMS), the Unix-style system calls sometimes lacked the additional checking of their arguments required by VMS, resulting in several security issues involving VMS. VMS has been ported from VAX to Alpha and then to IA-64, and it is now owned and supported by HP.
Finally, a note on tasking, threads, etc.
We have also known for decades how to avoid all the issues related to multi-tasking, threads, sharing of data structures, deadlocks, rundown/cleanup on abnormal exit, etc. The PL/I standards committee examined this requirement extensively about 25 years ago and determined that none of the differing models found in existing implementations of PL/I could be the basis for a standard. The committee developed a general model for "real-time" work and the language constructs required to add it to PL/I, publishing an approved technical information bulletin to document their research and to guide possible implementors. The model uses block-structured critical regions to contain access to or modification of shared data, high-level scheduling and locking functions, static (compile-time) deadlock prevention when a constraint on nesting shared regions is accepted, dynamic (run-time) deadlock detection without that constraint, and much more.
This model allows you to code things like the following (I forget the exact syntax):
WAIT (InputQueue ^= NULL() & NumHandlers < MaxHandlers)
  LOCK (InputQueue, NumHandlers)
  BEGIN;
    ItemPtr = InputQueue;
    InputQueue = InputQueue->NextQueueItem;
    NumHandlers = NumHandlers + 1;
  END;
Exiting the Begin block by reaching the END releases the locks. A GOTO to a label outside the block also releases the locks.
Compare this with what is required to do this safely in various programming languages.
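To take up that comparison, here is a rough sketch of the same fragment in C with POSIX threads (names invented, error handling omitted):

    #include <pthread.h>
    #include <stddef.h>

    struct queue_item {
        struct queue_item *next;
        /* payload... */
    };

    static struct queue_item *input_queue;   /* shared state */
    static int num_handlers;
    static const int max_handlers = 8;

    static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  q_cond = PTHREAD_COND_INITIALIZER;

    static struct queue_item *take_item(void)
    {
        struct queue_item *item;

        pthread_mutex_lock(&q_lock);
        /* WAIT (InputQueue ^= NULL() & NumHandlers < MaxHandlers) */
        while (input_queue == NULL || num_handlers >= max_handlers)
            pthread_cond_wait(&q_cond, &q_lock);

        item = input_queue;                /* ItemPtr = InputQueue            */
        input_queue = input_queue->next;   /* InputQueue = ...->NextQueueItem */
        num_handlers = num_handlers + 1;   /* NumHandlers = NumHandlers + 1   */
        pthread_mutex_unlock(&q_lock);     /* the END releases the lock       */

        return item;
    }

Nothing stops some other function from touching input_queue without taking q_lock, nothing reminds the producer side to signal q_cond after queueing an item, and neither the compiler nor the runtime will spot a lock-ordering deadlock for you - which is exactly the point about the committee's model.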
"Thus, maybe the best fix for this would be to change the compiler so that -fno-delete-null-pointer-checks is on by default, no matter how high the optimization level, if the compiler has reason to believe it is compiling for an environment in which this particular optimization is invalid."
Exactly, and the fix for this was added at the same time as the fix for the main bug, on Friday 17th July:
http://lkml.indiana.edu/hypermail/linux/kernel/0907.2/00705.html
So much for the rants of some of the Reg commentards. I especially love this one, posted Saturday 18th July 2009 15:21 GMT, nearly a day after the bug was fixed: "The Linux kernel has a known, demonstrably exploitable security problem in the field, and the kernel developers do not wish to fix it. ...No, the reality is that too many Linux zealots including the kernel developers refuse to ever accept they're wrong on anything."
One thing that's interesting about the original
struct sock *sk = tun->sk;
if (!tun)
        return POLLERR;
is that it is valid, working code. "tun" is not dereferenced: the address stored in "sk" is the sum of the address of "tun" and the offset of the sk field within it. Optimising out the test is an unsafe optimization that I would expect will be fixed in gcc before long.
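For anyone trying to follow that argument, the distinction being debated is between taking a member's value and taking its address (a toy sketch with a made-up struct, not the real tun_struct):

    struct tun_like { long pad; void *sk; };

    void *member_value(struct tun_like *tun)
    {
        return tun->sk;    /* reads memory through tun: a load        */
    }

    void **member_address(struct tun_like *tun)
    {
        return &tun->sk;   /* address arithmetic only: no memory read */
    }

Only the second form is pure pointer arithmetic; gcc treats the first as a dereference, which is why the optimisation discussed above feels entitled to drop the later NULL test.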
What's also interesting is all the guff about "this wouldn't happen if the kernel was written in some high-level language".
Possibly.
Actually, no. Not at all. I cannot think of a single language with enough expressiveness to write an operating system that doesn't have its own class of weird, annoying and sometimes not even subtle errors.