I'm surprised there isn't already a #pragma directive to say to the compiler: "Don't be clever here, just do as I say".
How the GNU C Compiler became the Clippy of cryptography
The creators of security software have encountered an unlikely foe in their attempts to protect us: modern compilers. Today's compilers boil down code into its most efficient form, but in doing so they can undo safety precautions. "Modern software compilers are breaking our code," said René Meusel, sharing his concerns in a …
COMMENTS
-
-
-
Tuesday 10th February 2026 11:45 GMT fragcula
Not to mention that the "proper" way to do this is to have separate constant-time implementations (that is, if you don't want to use a cryptolib's memcmp intrinsics).
`-O3` is literally "go as fast as possible and break things". It probably even ignores `noinline` markers.
Your hand-rolled crypto (if you dare write one) should probably be compiled at `-O1`, and with careful boundaries (like not always having LTO on).
-
Monday 16th February 2026 16:19 GMT David Brown 2
Why bother commenting about gcc flags if you have no idea what they do? -O3 does not mean "go as fast as possible", does not mean "break things", and does not mean "ignore no inline markers". When you say "literally", did you bother to read the literature - the gcc reference manual?
-O3 simply means enabling more optimisation passes than -O2, including some that may make the code bigger, may make some code slower rather than faster, and that may take noticeably longer to compile. Barring compiler bugs (which exist, but "wrong code" bugs are very rare in practice), correct C code gets compiled to assembly code with the same effect, regardless of optimisation flags.
The trouble here is that the developer wants "effects" such as constant time execution that cannot be expressed in C - and thus the compiler cannot give any guarantees about it. The developer must therefore find ways to express the semantics he wants, beyond those of C, in a way that the compiler can implement. The "obfuscate" empty inline assembly is a good way to handle this, though the name chosen is poor. (I have used such code myself, for other purposes.)
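The empty-inline-assembly trick mentioned here commonly takes a form like the following (a sketch only; `value_barrier` and `ct_select` are illustrative names, not from the article, and the construct is GCC/Clang-specific):

```c
#include <stdint.h>

/* An empty inline-assembly "optimisation barrier". The asm body is
 * empty, but the "+r" constraint tells the compiler the value may
 * have been changed, so it cannot constant-fold or branch on anything
 * it knew about v before this point. */
static inline uint32_t value_barrier(uint32_t v)
{
    __asm__ volatile("" : "+r"(v));
    return v;
}

/* Branch-free select: returns a if mask is all-ones, b if all-zeros.
 * The barrier discourages the compiler from turning the arithmetic
 * back into a (potentially branchy, variable-time) conditional. */
static inline uint32_t ct_select(uint32_t mask, uint32_t a, uint32_t b)
{
    mask = value_barrier(mask);
    return (a & mask) | (b & ~mask);
}
```

The compiler still compiles the function correctly; it just loses the information it would need to "optimise away" the constant-time structure.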
-
-
-
Monday 9th February 2026 14:07 GMT Bebu sa Ware
there isn't already a #pragma directive
There is and I have had to use it for totally non crypto/security reasons (forcing ctor functions to run in static links.)
[gcc manual 6.62.15 Function Specific Option Pragmas]
#pragma GCC push_options
#pragma GCC optimize ("O0")
int sensitive (...) {
....
}
#pragma GCC pop_options
These attributes can also be applied with the function __attribute__() syntax as someone has also noted.
[6.31.1 Common Function Attributes]
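The attribute form might look something like this (a sketch; GCC-specific, with a placeholder body - `sensitive` just mirrors the name in the pragma example):

```c
/* Per-function equivalent of the push_options/optimize("O0")/
 * pop_options pragma pair: only this function is compiled at -O0. */
__attribute__((optimize("O0")))
int sensitive(int x)
{
    return x * 2;   /* placeholder body */
}
```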
-
Monday 16th February 2026 16:23 GMT David Brown 2
Re: there isn't already a #pragma directive
Relying on "-O0" for effects is a bad idea. C compilation does not have a concept of "no optimisations", and the passes used and code transforms applied at different optimisation levels vary between compiler versions. Modern gcc does plenty of transforms at -O0 that would have been considered "optimisations" in older generations. Sometimes there are specific compiler passes that you might want to disable manually with a pragma or function attribute, but the "empty assembly" trick is generally safer. (Volatile accesses might also be suitable, and more portable, but they may have efficiency issues.)
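The volatile route mentioned here might look something like this (a sketch; `volatile_ct_compare` is an illustrative name):

```c
#include <stddef.h>

/* Reading each byte through a volatile-qualified pointer forces the
 * compiler to perform every load, so it cannot exit the loop early on
 * the first mismatch. Portable standard C, but each access blocks
 * optimisation, so it can be slower than an inline-assembly barrier. */
static int volatile_ct_compare(const unsigned char *a,
                               const unsigned char *b, size_t n)
{
    const volatile unsigned char *va = a;
    const volatile unsigned char *vb = b;
    unsigned char diff = 0;
    for (size_t i = 0; i < n; i++)
        diff |= va[i] ^ vb[i];
    return diff != 0;   /* 0 = equal, 1 = different */
}
```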
-
-
-
Saturday 14th February 2026 11:27 GMT ilpr
This is irrelevant
Protecting against timing attacks is not about instructions, it is about timing: if you have a constant-length wait after an unsuccessful guess you don't need to mess around with any of these.
And proper hashing is meant to prevent guessing the original password since changing one bit will change the whole hash thus making it unguessable.
-
-
-
Monday 9th February 2026 13:13 GMT chuckufarley
There is a section in the Gentoo handbook that specifically warns users against setting -O3 globally in the make.conf $CFLAGS. Linux From Scratch also warns against using it in their handbook. Even the GCC manual itself warns that -O3 can and will break some code because it is so aggressive.
-
Monday 9th February 2026 19:58 GMT bazza
-O3 being dangerous strikes me as somewhat absurd. If the compiler is building code that does not implement the as-written source's functionality, then it's not acting as a C compiler. Instead, it's acting as a nearly-but-not-quite C compiler that goes wrong in exciting and arcane ways.
Worse still, it's misbehaviours like this that make life a lot harder for developers. If one cannot trust the compiler to build correct code no matter what the build options are, then the build system's configuration and binary testing becomes a whole other thing to worry about, on top of whether or not the source code is correct.
Meusel and colleagues are to be deeply congratulated for being alive to the problem and having a means of noticing the change (even if it was luck, or curiosity!). To set up a build / test system that could reliably spot when critical execution time variations have been introduced by the compiler (when nothing but the compiler version has changed) - that's a ton of effort to make work.
Another interesting problem is, it's not just the compiler. All a compiler does is produce op-codes. These days, the op-codes get interpreted by the instruction decoder pipeline and broken down into instructions the core will actually run, and it's the timing of these that actually matters. Really, the only way forward with instruction decoders is to make them do more and more complex analyses, much like compiler optimisations. If they start getting good enough to chop out nugatory sections of code...
Operating systems like INTEGRITY are quite interesting. The versions of that which I have used execute on a single core only, and dole out execution time to processes on a fixed-allocation basis. That's probably the only way to be totally sure of not having timing side channels.
-
Monday 9th February 2026 21:20 GMT Crypto Monad
-O3 being dangerous strikes me as somewhat absurd. If the compiler is building code that does not implement the as-written source's functionality, then it's not acting as a C compiler. Instead, it's acting as a nearly-but-not-quite C compiler that goes wrong in exciting and arcane ways.
It doesn't "go wrong". It transforms your code so that it does the same thing but faster. That's unless:
(1) you're doing something which has "undefined" behaviour in the C spec (which is quite a lot). In those cases, the compiler can make the code behave more or less however it likes. But then, the behaviour is undefined with or without optimization.
Some examples: https://mohitmv.github.io/blog/Shocking-Undefined-Behaviour-In-Action/
(2) you're doing something which depends on timing, which is the case here. The C spec has nothing to say about timing of the generated assembly language, and as long as it gets the same results according to the spec, it can shuffle things around.
Note that this is not limited to gcc. Clang can give equally surprising behaviours:
https://research.swtch.com/ub
To reiterate, this is *not* a bug in the compiler. If anything, it's a bug in the language which explicitly permits your code to be transformed in ways you don't expect.
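A minimal sketch of point (1), in the spirit of the linked examples (the exact output depends on compiler and flags, which is the point - `always_bigger` is an illustrative name):

```c
/* Signed integer overflow is undefined behaviour in C, so an
 * optimising compiler may assume x + 1 > x always holds and fold
 * this function to "return 1" - while an unoptimised build may
 * compute the wrapped comparison and return 0 for INT_MAX. Same
 * source, different observable result, and neither is a compiler
 * bug: the behaviour was never defined to begin with. */
int always_bigger(int x)
{
    return x + 1 > x;   /* undefined when x == INT_MAX */
}
```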
-
Tuesday 10th February 2026 05:41 GMT Blazde
It's not even just the language, but any higher-level language. The old 80-bit (& 40-bit) Intel floating point formats made it basically impossible to have a deterministic result from FP calculations, because the final results depend on whether your intermediate results stay in 80-bit registers or get pushed out to 64-bit memory locations, and abstracting away registers & memory is about the first job a higher-level language has. The only way around this is to break through the abstraction layer and be hardware-aware, but then you're outside any platform-independent language spec.
These days memory ordering is a more complex example. Different hardware does it in different ways and compilers need to be able to abstract that away by saying, essentially, 'results may vary' and you as the programmer must say really clearly when that's not acceptable, and you must use imperfect tools for that, invariably with some idea of the hardware involved.
Or more simply when you tell the compiler to do something but your program then overflows the stack - "I didn't ask for that!" Yet, without 'solving the Halting Problem' the compiler can't know in general when the stack will overflow.
If compilers didn't do all this then compiled code would be unimaginably slower and the language spec would be absurdly pedantic because it would operate to some lowest-common-denominator virtual machine. (But not like the Java Virtual Machine, that introduces a whole new class of side-channel vulnerable non-determinism via the magic of garbage collection...)
-
Tuesday 10th February 2026 07:52 GMT bazza
> To reiterate, this is *not* a bug in the compiler. If anything, it's a bug in the language which explicitly permits your code to be transformed in ways you don't expect.
I've got no problems with compilers spotting and stripping out pointless code that has no side effects or consequences for the final state of the program. My problem is when they generate code that produces a program with a different final state to that specified by the C source code. That's when the compiler is no longer a C compiler.
I have come across this in some compilers. The one in IAR Workbench for AVR's was at one point producing broken code if one turned on optimisation.
I've no idea if -O3 in gcc falls into this category. I can see why a "myth" could build up that it does (especially as it starts moving code order around), but the gcc folk are generally pretty careful.
> The C spec has nothing to say about timing of the generated assembly language
Indeed so. Though it's interesting; C started off back in the 1970s intending to be just a thin veneer atop assembly code, and therefore gross optimisations (like chopping out side-effect-less code) are somewhat contrary to this philosophy (because an assembler wouldn't carry out such optimisations). In this regard, C is no longer a "Systems" language, at least not with today's compilers implementing it; one cannot guarantee in source code alone what the system behaviour will be. System behaviour depends more than ever on how the code has been built. That's a perversion of what was intended. The side effect is that stuff like security (which is pretty important) can get broken, and is vulnerable to being silently broken despite zero code / build script changes.
It's all very well saying "but the standard says this is OK", the problem is that there were some things so intrinsic in expectations that no one thought to put them into the standard. Simply hiding behind the standard is akin to "just following orders" responsibility denial, especially given that the compiler authors are pretty mixed up in the standards creation process itself.
Perhaps the standard should be updated, so that where such things matter the source code can insist on specific system outcomes.
-
Wednesday 11th February 2026 19:05 GMT Blazde
I don't know about the 70s but by the mid-90s assemblers were doing basic run-time optimisations, notably for long/short x86 jumps and other idiosyncrasies, and the assemblers capable of optimising code this way efficiently were idolised by the most hardcore assembly coders. This already breaks side-channel timing (which was also already broken by the hardware) and absolutely nobody cared about it at the time, and this was when C was regarded as slow compared to hand-coded assembly.
If you want to say that there should be a language that respects timing, suitable for the most secure computation including cryptography, I'd wholeheartedly agree. Maybe it could even be based on C. But pretending that was an intrinsic expectation all along is disingenuous. C, and assembly, are general purpose and the general case really did demand efficiency. Even more so in the past than now.
-
-
-
Monday 9th February 2026 21:26 GMT david 12
-O3 being dangerous strikes me as somewhat absurd.
It's always been that way, ever since c compilers were advanced enough to have an "optimizing" state and a "mostly correct" state. And not just for GCC. Corner cases and weird constructs aren't noticed in testing, and high optimization is at the forward edge of compiler development.
then it's not acting as a C compiler.
That's another part of the problem he's identified: that people think that there is a standard c, and that it is enough to build a compiler that implements standard c.
-
Monday 16th February 2026 16:35 GMT David Brown 2
It has never been that way with any of the C compilers I have used (and that's several dozens, on a dozen architectures, over three decades), except on rubbish tools. (And I've seen a few of these.)
It is not uncommon for compilers to have a few "experimental" flags for newer optimisations that are not well tested, but documentation generally marks them as such and they will not be enabled by typical "optimisation level" flags. Like most complex pieces of software, compilers have bugs, and it is fair to say that if you combine obscure or rare source constructs with new or rarely used optimisation passes, your risk of hitting a bug increases. But the risks are still small. (And the developer here is not seeing any kind of compiler bug.)
The reason optimisation appears to be "dangerous" to some people, is that they have subtle bugs in their code. If you have undefined behaviour in your code, you might well see different results depending on the optimisation passes enabled. But that's not because of "dangerous" optimisations - it's because of buggy source code.
And there is a standard C (several versions, the latest being C23). But standard C does not handle timing in any way, so the developer cannot express timing requirements in standard C. Thus he needs to use compiler features beyond that - such as the empty inline assembly.
-
-
Tuesday 10th February 2026 08:18 GMT mihares
In our case, what I think happened (and I am way too lazy to check whether this is actually true or not, it's just how I remember it a few years later) is that -O3 gives the compiler permission, among other things, to reorder the instructions as long as the result looks the same.
For the vast majority of *good* code, this is fine. There are cases where this is not fine even for good code, and for good reasons, hence the warnings about using -O3 in various documentations.
HEP (High Energy Physics) simulation code is most definitely *not* good code and I was most definitely not paid enough to fix it all --> I did not use -O3 to compile it.
-
Tuesday 10th February 2026 13:06 GMT Blazde
Reordering of instructions is way more fundamental to computation, and has a much deeper history, than -O3 in GCC. Practically, it begins at hardware level, quite widely by the 90s, for the sake of efficiency and performance, and once the hardware is doing it why shouldn't the compiler too? At any level of optimisation. So the language specs embraced it after that: "This is fine as long as the result looks the same."
The problem is that 'looks the same' has different meaning depending on your model of computation: Is the end result the same? Or is it okay to be within a couple of epsilon of the result? Or do you require it to use exactly the same time/memory/bandwidth/etc resources in order to avoid any side-channel analysis (good luck figuring out all of the things in this list because it's endless)? Or is it fine if the computation stalls sometimes for a full half a second to run GC and JIT-optimise the code if that makes everything more efficient overall? Or should the almost-accurate result just be produced with some balanced minimum of resource usage possible because common sense says that's what's wanted 99% of the time and sometimes helps us get our result this century, even if that means the compiler pre-computes lots of things and hardcodes the answers?
Clearly, the acceptable computation models for HEP and for side-channel resistant cryptography are very different. Anyone involved in either should have an appreciation for the trade-offs.
-
-
-
-
-
Monday 9th February 2026 12:50 GMT Mishak
The problem here
The problem here is that programmers are trying to get the compiler to do something it is not required to do - the C Standard defines an abstract machine that is used to execute the program, and any attempt to "force" it to do things is likely to fail. Temporal behaviour is not part of the specification, so there are no guarantees as to what will happen if the programmer is trying to write isochronic code (where the execution paths all take the same amount of time).
Any infeasible paths are likely to be removed during optimisation (which is effectively a set of mathematical transformations applied to the code), and "turning them off" may not remove all optimisation. Compilers are also free to "mess" with values, as long as the code behaves "as if" it was what was intended. For example, have a look at this Compiler Explorer example that shows Clang "changing" the values of enum constants and "forgetting" to call an error handler (note that the code produced by Clang is fully Standard Conforming). Reducing the optimisation level to '0' does 'fix' the code (it is still broken, but the infeasibility and undefined behaviour do not manifest).
The example also shows a function that erases a password buffer before it is returned to the memory pool. The write operations used to erase the buffer are, strictly speaking, unnecessary: the values written are never subsequently used, since they are followed by a call to 'free' (and compilers did use to remove such writes). However, compilers are now taking this sort of use case into consideration to help programmers - even when the behaviour is not required by the standard.
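One common defence against this dead-store elimination is to write the zeros through a volatile function pointer (a sketch; `secure_wipe` and `memset_ptr` are illustrative names - C11 Annex K's `memset_s` and BSD's `explicit_bzero` solve the same problem where available):

```c
#include <stddef.h>
#include <string.h>

/* Because the call goes through a volatile pointer, the compiler
 * cannot prove at compile time that it is really memset, so it
 * cannot delete the "dead" stores that erase the buffer before
 * free(). */
static void *(*const volatile memset_ptr)(void *, int, size_t) = memset;

static void secure_wipe(void *buf, size_t len)
{
    memset_ptr(buf, 0, len);
}
```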
-
-
Tuesday 10th February 2026 17:08 GMT bazza
Re: The problem here
Doesn’t help if the machine is going to be expected to process millions of such checks per second. Isochronic code is intended to be as fast as the machine can achieve and no slower. A timer means a compromise.
Also, there are consequences in cache usage which another process can probably sense; the process that goes to sleep having completed its function would have to find something nugatory to do, with the problem that the compiler may strip that out.
-
Tuesday 10th February 2026 20:55 GMT Paul Crawford
Re: The problem here
Doesn’t help if the machine is going to be expected to process millions of such checks per second.
But most sleep() style calls simply suspend the thread to the OS, so you are not a huge drain on CPU resources. True, you might have a lot of threads running at high throughput, but the fixed-time value need not be huge and cause a big memory budget/cache hit on the thread's memory use. For example, if your call takes 0.01-0.03s, you could pad it to 0.1s and only have around a 5-fold increase in thread use.
Alternatively, by forcing a delay on each attempt it makes brute-forcing harder to do unless the attacker can do it from a huge number of IP addresses or whatever to escape the rate-limiting impact of a delay on each caller.
-
-
-
-
Monday 9th February 2026 12:53 GMT The Mole
Optimizers optimize
"Meusel ran a constant-time implementation through GCC 15.2 (with -std=c++23 -O3"
The -O3 is telling the compiler to do optimization, and then he is complaining about it doing optimizations?
Most of the suggested options are, well, dumb: they are trying to trick the optimizer in the hope that it won't get more intelligent in a later release. If you don't want the optimizer to optimize, the best thing to do is explicitly tell it not to - it looks like that functionality exists, by declaring the function with __attribute__((optimize("O0")))
-
Monday 9th February 2026 15:19 GMT GBE
Re: Optimizers optimize
"Meusel ran a constant-time implementation through GCC 15.2 (with -std=c++23 -O3"
The -O3 is telling the compiler to to optimization and then he is complaining about it doing optimizations?
Exactly. He told the compiler explicitly to do whatever it can to make the code run faster, and then bitches about the code running faster.
I've recently discovered a dangerous flaw in the design of my car! When I take my foot off that pedal and press it on that other pedal the car suddenly STOPS!! This could happen right in the middle of the freeway!!! People could be killed!!!! Something must be done — Think Of The Children!!!!!
That said, even when you "turn off" optimization with -O0, the C standard still allows the compiler to do anything it wants that still produces "correct behavior". The definition of "correct behavior" does NOT (and never has) included constant (or even predictable or consistent) execution times.
-
Tuesday 10th February 2026 15:07 GMT JoeCool
Re: Optimizers optimize
Exactly what I was going to say about optimizations, Thanks.
Looks like not every FOSDEM presentation is worth a Reg writeup.
"The Clippy of cryptography" I don't actually understand what's being implied, but I suspect that's a problem with the developer, not the compiler / language.
-
-
-
Monday 9th February 2026 18:32 GMT doublelayer
For strings less than the hash length, the explicitly lengthened comparison is still faster. For something longer than the hash length, comparing hashes is both constant time and faster, except calculating the hashes to compare is not. That was also a simple example of a thing where optimization can introduce additional gaps, and since many sensitive comparisons already use hashing, not the biggest of them.
-
Tuesday 10th February 2026 00:53 GMT Gordon 11
For strings less than the hash length, the explicitly lengthened comparison is still faster.
This is a password check. A user has just typed something (which, to a CPU, is like watching glass flow). Speed should not be an issue here.
So why not just implement a random(ish) wait before returning the result?
-
Tuesday 10th February 2026 06:17 GMT Autonomous Mallard
Probably "good enough" in this example, but there are plenty of applications where random waits would kill performance and constant-time functions are necessary.
It would also be possible to filter out a random wait with enough iterations if the function call can be replayed with the same input.
P.S: Glass is not a slow-flowing liquid (unless it's molten, obviously). We used to be bad at making windows flat, which is why old windows look like that.
-
Tuesday 10th February 2026 14:40 GMT DoctorPaul
Which is why it is called "antique glass" even if it's freshly made, if I recall correctly from those stained glass workshops all those years ago.
As for slow flowing, I thought that I heard that give it a few hundred years and glass panes will be thicker at the bottom, but maybe I hallucinated that ;-)
-
Tuesday 10th February 2026 18:03 GMT brainwrong
Glass
Old methods of plate glass manufacture couldn't get the thickness very even. It makes sense to put the heavier bit at the bottom.
Glass is solid. An experiment you could try is to smash some glass, find some sharp bits, store them for a long time, taking care not to damage the edges, and see if they're still sharp years later. If it were a liquid then you would expect surface tension to blunt the edges. I haven't done this, but maybe your descendants could report back in the future.
-
-
-
Tuesday 10th February 2026 06:33 GMT doublelayer
Because, as we've had to say a few times, the password check thing was an example to simplify the reason for needing constant time so they could get to explaining their point about the compiler. Real password checking does use hashes, and it has nothing to do with constant time. It has to do with not storing the password in plain text.
I was instead responding to "Looping through each character is inefficient". If the string you're comparing is smaller than a hash, it's not inefficient. It's quite a bit more efficient. This is especially true if what you're doing involves a large number of string comparisons as various security algorithms require. In password checking, that's not a problem, and a lot of password hashing specifically uses extremely inefficient hash algorithms to make brute forcing harder. Other algorithms don't do that and that's where timing attacks become a bigger problem.
-
-
-
Monday 9th February 2026 22:13 GMT OhForF'
>The user types in a password, which gets checked against a database, character by character. Once the first character doesn't match, an error message is returned.<
My concern here would not be a side channel attack but how to keep that database secure. Storing the password in clear text in the database in 2026, really?
-
Monday 9th February 2026 13:48 GMT alain williams
Can someone please explain to me ...
We should not be comparing passwords character by character since we should not store passwords in clear text -- as it is a nightmare if/when some cracker gets hold of the password database.
We should store a hash or message digest of the password, and when testing a candidate password, hash/md it and compare that. Converting to a hash/md means reading *all* of the password, and then comparing the result will give you no clue as to which part of the candidate password is bad; so a timing attack will not work.
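The full-length comparison described above is typically written along these lines (a sketch; `digest_equal` is an illustrative name - libraries such as libsodium expose the same idea as `sodium_memcmp`):

```c
#include <stddef.h>

/* Compare the two digests in full, accumulating differences with OR,
 * so the running time depends only on the digest length and never on
 * where the first mismatch occurs. */
static int digest_equal(const unsigned char *a, const unsigned char *b,
                        size_t len)
{
    unsigned char diff = 0;
    for (size_t i = 0; i < len; i++)
        diff |= a[i] ^ b[i];
    return diff == 0;   /* 1 if equal, 0 otherwise */
}
```

(Per the rest of the thread, an optimiser can in principle still rewrite this, which is why the barrier/volatile tricks discussed elsewhere exist.)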
Anyway: the first step of authentication will be getting the password hash/md, very likely from a database. The time to access from a database will take vastly longer than character by character comparison.
So: I do not understand. Can someone please explain.
Thanks
-
Monday 9th February 2026 14:19 GMT Anonymous Coward
Re: Can someone please explain to me ...
@Alain_Williams
Yup......explanation is simple. Namely.....bad guys attended the FOSDEM talk........so the examples were MISDIRECTION!
Probably if legitimate programmers want THE REAL SCOOP on GNU C, they need to pay extra!
So.....not only misdirection, but a marketing ploy to get more revenue.
Is this paranoid enough? Probably not!
-
Monday 9th February 2026 15:58 GMT kmorwath
Re: Can someone please explain to me ...
How do you compare hashes that don't fit wholly in a CPU register? And what if the hash is stored as a string, and the code still does a string comparison?
Hashes do not allow you to reconstruct the plain text easily - but if you can probe a stored hash directly to find a match... it is no different from comparing a plain-text password - the elapsed time will tell you how far you got.
-
Monday 9th February 2026 17:00 GMT Bill Gray
Re: Can someone please explain to me ...
Not quite. It will tell you how much of the (salted) hash of the password you entered matches the (salted) hash of the "real" target password.
So if I know the salting/hashing method used on the server, and enter 'password1' (hashed to 0xf43a...), and then I enter 'letmein' (hashed to 0x314b...), and the first returns faster than the second, I can say that the hashed password starts with 0xf.
Then I generate sixteen passwords that will hash from 0xf0... to 0xff... to get the next digit. And then sixteen more to get the third digit, and so on. The hash may be to bytes rather than to an ASCII string, so I may proceed a byte at a time rather than a digit at the time, but you get the general idea. If the hash is compared four bytes at a time (as 32-bit integers), I'll be in more trouble.
Anyway. Near the end, having carefully worked out digit by digit or byte by byte, I'll have to come up with a password that hashes to most of the target, and eventually one that hashes to the entire target (with cryptographic hashes generally chosen to make that practically impossible.)
I will also have to deal with the fact that the difference in timing due to this effect isn't going to be a heck of a lot, but differences due to network lags will be significant. So I'm looking for a very faint signal in a sea of noise.
tl;dr : can't say I'd worry about this particular attack.
-
Monday 9th February 2026 21:29 GMT mj.jam
Re: Can someone please explain to me ...
> Then I generate sixteen passwords that will hash > from 0xf0... to 0xff…
But this is the hard part. Even given the hash, there is no simple way to generate these passwords. On average, you would have to generate 256 passwords to find one that starts 0xf4, and then 16^3 to find one that is 0xf43, then …
And if these are salted, then you can't even precompute passwords that have certain hashes.
-
Tuesday 10th February 2026 16:18 GMT Bill Gray
Re: Can someone please explain to me ...
Yes, that was exactly my point. The first few digits are easy. Then it gets sixteen times harder with each digit. Then you need a runtime exceeding the age of the universe.
Re salting : if you read the second sentence in my original post, you'll note that I assumed you somehow know both the hashing and salting scheme. I could have added that neither will generally be the case. But to assume those haven't leaked is to rely on "security through obscurity". Evaluation of cryptography usually assumes that the algorithms used are known to everybody.
Basically, to get an even vaguely plausible attack this way, you have to give the attacker every advantage. They have to know the salting/hashing scheme, and they have to be able to spray a bazillion passwords so that they can detect the actual variation in timing.
-
-
-
-
-
Tuesday 10th February 2026 05:04 GMT Fido
Re: Can someone please explain to me ...
I'd go with the time traveller hypothesis.
At this point modern hardware does not run any code in constant time, so expecting a compiler to produce a binary executable that runs in constant time is unrealistic. The focus needs to be on algorithms that don't require constant time execution to be secure.
There is evidence that the particular vulnerability of Rijndael-based cyphers to timing attacks was known but undisclosed at the time the AES was ratified. Hardware implementations of any encryption system can't be easily audited and are still subject to automatic optimisation.
At the same time, I'm surprised a time traveller isn't more focused on quantum-secure algorithms.
-
Tuesday 10th February 2026 17:31 GMT bazza
Re: Can someone please explain to me ...
Pretty sure that AES is simply a data transformation function: it doesn't make any decisions based on the inputs to that algorithm, and will take the same amount of time for equal-length inputs. Corrections welcome, if this is wrong!
The wider system that uses it may well be time variant on different inputs.
-
Wednesday 11th February 2026 03:39 GMT Claptrap314
Re: Can someone please explain to me ...
Oh, wow. CISA actually asserted that table-based timing attacks were not an issue? In 1997? Mind blown.
Equally important is that Pentium & Athlon don't appear to support cache locking, so uggh...
On a single-threaded processor, preloading the table into the L1 might get you out of this, but uggh...
-
-
-
Monday 9th February 2026 18:39 GMT doublelayer
Re: Can someone please explain to me ...
This was an example that could be clearly explained in an article. Describing how timing attacks work on different parts of a cryptographic algorithm would take longer. But it's not as irrelevant as you think because strings other than passwords get compared as well, and sometimes those strings could have value too. For example, usernames aren't hashed, and if you could poke at a system and get valid usernames, you might have more information to use against it. It's never quite that simple, because timing attacks alone won't work against a lot of remote systems due to the randomness of network latency and system load, but with each additional level of comparison, explaining how an attack would work requires more and more typing.
-
-
Monday 9th February 2026 14:23 GMT Chris Gray 1
'strict'
It's been a long time, but doesn't Fortran have a rule that says something about evaluation inside parentheses cannot be moved outside of those parentheses? I believe it related to evaluation of expressions involving very small floating point values.
Because of that I long ago put a 'strict' construct into my Zed programming language. The intent is that statements directly within the range of the construct must be executed as they appear. There were a couple of provisos needed, but something like this would seem to be more reliable and universal than things like setting the optimization level of an entire C function. Maybe for C30? (Kidding!)
-
Monday 9th February 2026 14:38 GMT Bebu sa Ware
"password, which gets checked against a database, character by character"
Hopefully only a straw man for this example otherwise I am not too sure I would want M. Meusel lurking in the vicinity of my code.
I assume (purely because I would leave this sort of coding to those that know how to do it safely) that the clear text password would be read into secure memory (mlocked/mprotected ?) in its entirety before being hashed etc and verified with the clear text obliterated at the earliest opportunity.
The timing for a successful verification would be contrived to be indistinguishable from a rejection (whether rejected on the basis of an invalid identity or incorrect password or whatever.)
This stuff is hard enough without having to worry about compilers' enthusiastic optimisations. Perhaps GCC might be augmented with a -Ocryptographic_sensitive flag, attribute or pragma to render the code deterministic with respect to timing, with code generation skewed towards the security concerns.
-
Monday 9th February 2026 14:48 GMT Richard Tobin
"Hazardous" optimizations
"good C programmers know to fear the aggressive optimization of Boolean logic, which can be hazardous to their finished products". The optimizations described in the linked article are quite different from the problems facing cryptographers. The optimizations in the linked article are just wrong, and a compiler that does them is not conforming to the C standard. The problem for cryptographers is that they want to specify behaviour that isn't part of the C language, namely timing.
-
Monday 9th February 2026 16:12 GMT Anonymous Coward
As someone who spends a lot of time compiling C into binary patches, I can say that GCC is a real pain in the ass, more than any other compiler. There are some optimizations and transformations it does to the code that can't be turned off. I use C as a kind of machine-independent assembly language and it's the perfect language for that purpose with most simple, straightforward compilers. But GCC always tries to be clever and abstract the code in the way that high-level languages do, and the resulting code is often more instructions than necessary, with more stack usage, generally reorganizing constructs in the most awkward way possible. I just want it to do what it's told and turn a simple piece of C into a simple piece of assembler. I think the GCC developers have their heads in the high-level language world and don't understand the use-cases of embedded developers.
-
Monday 16th February 2026 16:44 GMT David Brown 2
C is not, and never has been, a "machine-independent assembly language". One of its aims - which it handles admirably - has been to reduce the need for developers to write assembly code. If you do not understand that C is a high-level programming language, defined in terms of an "abstract machine" and compilers generate code that matches this only on the "observable behaviour" (i.e., volatile accesses and data that goes into and out of the program), then you do not understand the language or the job of a compiler. I have been using gcc for embedded development for nearly three decades on some ten different target architectures - the GCC developers understand embedded development just fine. (Most of the time, anyway.)
-
-
Monday 9th February 2026 17:25 GMT Claptrap314
This is silly
If you are worrying about side-channel attacks, then you better at least consider the implications of physical access. That means worrying about the bit patterns that are flowing through the execution units on the microprocessor. And if timing is your concern, you don't just worry about loops, you worry about memory access patterns. Look into locking cache lines and the like.
NONE of these are part of the C standard, which means that you have to hand-assemble. Even a new version of a compiler might change behavior. You have to hand-assemble. This isn't nearly as nasty as it sounds because 1) gcc at least allows symbolic assembler (so you don't have to worry about register assignments), and 2) modern processors tend to have AES primitives built in.
Oh, and since you ARE doing hand-assembly, you get the flag register now. You're gonna love it. No more expensive (and probably wrong) checks for overflow, for starters.
If that was the point of the talk--to drive home to the newbies what their seniors should have hammered into them already, I get it. But I've never been in that part of the business, and I've known all of this for decades.
-
Monday 9th February 2026 23:46 GMT Anonymous Coward
Re: This is silly
Exactly, and there is plenty of justification for doing this, but the real complaint is that they want to use optimisations without thinking about hardware-level operation, when designing code that needs to act in a specified way at the hardware level. It's obvious any automated functionality could be incorrect.
-
-
Monday 9th February 2026 17:27 GMT billdehaan
This problem existed long before Clippy
I was writing drivers and test software for customized video controllers in 1986 in Microsoft C version 4.
Back then, you could compile to tiny (64K for data and code), small (64K code, 64K data), medium (1MB code, 64K data), or large (1MB code, 1MB data) executables.
Most of the video chips I was coding for addressed 256x256 pixels, but were only addressable by line. In order to change one pixel, I needed to refresh the entire line of 256 pixels it was in. This meant keeping copies of what the other 255 pixels were. An array of 256x256 bytes (8 bit colour) took 64K of data. That amount of memory is a rounding error today, but this was the 1980s, where the 640K barrier was a real thing. Memory was really tight, so it was a struggle to get everything to fit.
Version 5 of the compiler introduced a ton of improvements, as well as the huge model, which allowed for 32 bit data pointers. This was a godsend, and would allow us to use memory above 1MB, so we switched to it.
We recompiled our driver code, and everything broke.
To reset the screen, I built a 256 byte array, filled it with black (0x00) pixels, and then pushed it iteratively to all 256 lines. I then repeated it with all white (0x0F), then black again. This was what the hardware required to initialize the screen.
Instead of a white screen, we got one white line at the top, and nothing else.
Disassembly of the executable showed that only line was being sent, so that made sense. The question was why.
The code (going from 40 year old memory, so possibly not totally accurate) was:
<small>
unsigned char buf[256];
int line;

memset(buf, 0, sizeof buf);
for (line = 0; line < 256; line++)
{
    API_buffer_update(line, buf);
    API_hardware_delay();
}

memset(buf, 1, sizeof buf);
for (line = 0; line < 256; line++)
{
    API_buffer_update(line, buf);
    API_hardware_delay();
}

API_display_refresh();
</small>
What I discovered was that the compiler differentiated between deterministic and non-deterministic functions, and optimized them. It mistakenly decided (some) user libraries were deterministic when they really weren't.
Basically, it thought "hey, you're calling the same function with the same arguments repeatedly in a loop". So as far as the optimizer was concerned, a better way to do this was:
unsigned char buf[256];
memset(buf, 0, sizeof buf);
API_buffer_update( 0, buf);
API_hardware_delay();
API_buffer_update( 1, buf);
API_hardware_delay();
API_display_refresh();
and that's what it generated as output.
The solution was to disable optimizations. This was a shame, as the executable from the v4 compiler ran in about 3 minutes, with the v5 taking only about 30 seconds, a dramatic improvement. Disabling optimization, it was almost 4 minutes, slower than v4, because it disabled all optimizations. But it worked, which was more important.
Fortunately, they released v5.1 of the compiler soon after, which didn't assume all user library calls were deterministic, so it stopped optimizing loops out of existence like that.
The details are different, but the problem remains the same, 40 years later.
-
Monday 9th February 2026 19:56 GMT Richard 12
Re: This problem existed long before Clippy
The "static volatile" type qualifiers exist to tell the compiler "this is a hardware register, do not reorder or eliminate dead stores or reads" for this exact reason.
It's an often abused source of error though.
It goes back to the dawn of C, but I wouldn't be surprised if some early (pre-C89) implementations were broken.
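A minimal sketch of what that qualifier buys you. Here `fake_reg` is a hypothetical stand-in for a real memory-mapped address (on hardware you would cast a fixed address such as `0x40001000` instead of taking a variable's address):

```c
#include <stdint.h>

uint8_t fake_reg;   /* stand-in for real MMIO; hypothetical */
#define VIDEO_CTRL (*(volatile uint8_t *)&fake_reg)

/* Because the accesses are volatile, the compiler must emit both
   stores, in this order, even under -O3. Without volatile, the
   first store is dead and may legally be eliminated. */
void reset_controller(void)
{
    VIDEO_CTRL = 0x00;
    VIDEO_CTRL = 0x0F;
}
```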
-
Monday 9th February 2026 22:36 GMT billdehaan
Re: This problem existed long before Clippy
but I wouldn't be surprised if some early (pre-C89) implementations were broken
Oh, hell yes.
If you know the history of it, Microsoft was initially not an OS vendor, but a language vendor, making interpreters and compilers.
Their initial IBM PC languages were Basic, Pascal, Fortran, MASM, and COBOL. Customers asked for C, which they didn't have, so they sublicenced Lattice C and resold it. Microsoft C v3 was their first homegrown effort, and it was really version 1.0, and buggy as hell. Version 4 was much better, and we used it, but the implementation of things like the "static volatile" and "register" keywords was, shall we say, suspect, at least initially. They were deliberately excluded from our coding standard because of that. They could have been cleaned up later, but until the coding standard was updated, they weren't to be used.
Version 5 of the compiler was a misfire, as per my initial post, but 5.1 was much better.
Version 6 was a monstrous upgrade, and complete overkill for the market I was in, so the companies I worked with tended to move to Turbo C, later Turbo C++, which then became Borland C++.
-
Tuesday 10th February 2026 06:12 GMT Lipdorn
Re: This problem existed long before Clippy
<q>..."static volatile" and "register" keywords were, shall we say, suspect, at least initially. They were deliberately excluded from our coding standard because of that.</q>
Well, if you're purposefully excluding the very keyword that is there to indicate to the compiler to not omit read/writes to a variable...don't be surprised that it does, in fact, omit those read/writes.
-
-
-
Monday 9th February 2026 19:35 GMT ssieler
"As one audience member suggested, perhaps one day a compiler could accept prompts that specify what areas of the code not to tinker with."
The HP Pascal compiler for PA-RISC had that option nearly 40 years ago.
Also, another technique of string comparison that avoids timing attacks is to count the number of characters that match and then, at the end of the loop, compare that sum to the length you expect. I was doing that over 20 years ago. (Of course, straight string comparison of passwords has been unlikely for many decades.)
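A minimal sketch of that counting approach (the function name and signature here are illustrative, not ssieler's actual code). Note that the per-character equality test may still compile to a branch on some targets, which is why other comments in this thread prefer XOR accumulation:

```c
#include <stddef.h>

/* Count matching characters; only compare the tally to the expected
   length at the end, so every character is always examined. */
int count_match(const char *input, const char *expected, size_t len)
{
    size_t matches = 0;
    for (size_t i = 0; i < len; i++)
        matches += (input[i] == expected[i]);   /* no early exit */
    return matches == len;
}
```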
-
Wednesday 11th February 2026 04:06 GMT Claptrap314
Counting is too complex.
If you insist on zero-terminated strings:
/* Data Oblivious Timing compare for zero-terminated strings */
/* returns 0 if strings are equal, some other value if not */
int dot_strcmp(const char *a, const char *b) {
    int d = 0;
    while ((*a) && (*b)) { d |= *a++ ^ *b++; }
    d |= *a ^ *b;
    return d;
}
At least, I think that's the syntax, after these decades...
-
-
Monday 9th February 2026 20:49 GMT Anonymous Coward
Maybe I'm missing something here, but the issue was about timing. Having the password verification routine take the same amount of time regardless of how quickly it determines that the password did or did not match. Why not grab the system time at the start of the routine and then once the match/mismatch has been determined, simply wait an arbitrary amount of time so that it always takes the same amount of time to finish?
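A rough sketch of that padding idea using POSIX clocks. The function name and the 50 ms budget are assumptions for illustration; the budget must exceed the worst-case check time, and note this only equalizes wall-clock latency as seen by a remote caller, not CPU-level side channels like cache or power:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <string.h>
#include <time.h>

#define BUDGET_NS (50L * 1000 * 1000)   /* assumed fixed response budget */

/* Run the (possibly variable-time) check, then sleep until a fixed
   deadline so success and failure take the same wall-clock time. */
bool padded_verify(const char *input, const char *stored)
{
    struct timespec deadline;
    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_nsec += BUDGET_NS;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }
    bool ok = (strcmp(input, stored) == 0);   /* early-exit compare, now hidden */
    while (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, NULL))
        ;   /* retry if interrupted by a signal */
    return ok;
}
```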
-
-
Monday 9th February 2026 23:37 GMT S O
Not a new idea at all, and extremely effective. Human-imperceptible timing changes are already widely used but, much like the optimization being used blindly here and then complained about when a subtlety is removed, they are often bypassed by later developers for testing efficiency.
e.g. Some people just want the compiler to read their mind rather than paying attention to what they need.
-
-
-
Tuesday 10th February 2026 12:08 GMT Dizzy Dwarf
Well, it's a bad example
Nobody stores a password in a database - you store a salted hash and compare those.
But strcmp is the wrong function to use. A human wrote that to return as soon as the answer is known, which is as soon as a character differs. The optimiser had nothing to do with it.
Also, that would be in glibc, your -O3 is not going to recompile that anyway.
You want a function that has to compare every character. Maybe something that returns how good of a match it was.
-
Tuesday 10th February 2026 12:49 GMT heyrick
Assuming we're actually dealing with clear text passwords...
How about step through each character of the password and if the character doesn't match then counter++. If counter is still zero at the end, the password matches. Then there is no early abort and the time taken depends upon the length of the input, not its correctness. If you're paranoid enough to think it's possible to detect the counter++, then increment a different variable if the character matches...
-
Tuesday 10th February 2026 20:51 GMT doublelayer
The compiler will see that you run this loop, which can only ever increment counter and has no other side effects, and then run
if (counter == 0). Therefore, as soon as counter hits 1, the if statement is guaranteed to be bypassed, meaning there is no need to continue the loop as it would have no effect on the result of the program. That's what makes compiler optimization tricky if you want to avoid it: you're effectively playing a puzzle game against the compiler programmers, where they try to find ways you could be inefficient and eliminate them while you try to find better ways to hide inefficient things from them. Compiler writers tend to be quite smart, so that game is hard to win for long and, just to know whether they've won the last round, you have to manually inspect the product of the same source code on each release of the compiler.
-
-
Tuesday 10th February 2026 13:01 GMT Taliesinawen
Failed analogy ..
> The GNU C compiler is excellent with reasoning about Boolean values. It may be too clever. Like Microsoft Clippy-level clever.
I fail to understand the analogy. One is a highly optimized compiler; the other is an annoying paperclip. I'm sure there would have been a better analogy over on User Friendly.
-
Tuesday 10th February 2026 15:04 GMT Spamfast
The Straw Man
As others have pointed out, code generated by a C compiler is allowed to take as long or short a time as the compiler likes regardless of any optimization options - nothing in any C specification has ever had anything at all to say about this.
In fact, with the pipelining & caches in microprocessors, not to mention concurrency in the form of interrupts (including preemptive multi-tasking on a single hardware thread processor) and SMP even a function written in assembler whose execution flow depends entirely deterministically on its inputs can take different amounts of time depending upon what other code has been executed between invocations.
Complaining that a C compiler does what it's supposed to do because you wanted it to do something else is a classic logical fallacy. If this is the best example René Meusel can give at FOSDEM then maybe they should consider inviting someone else to speak.
-
Tuesday 10th February 2026 17:30 GMT PapaPepe
Timing attacks
This article shows two things:
1) In general, software construction can only rarely be called "software engineering": using a C compiler and assuming the result will conform to some assumption about running time is "off-label" use of an important system-building component. This may be acceptable in medicine, but it is not an acceptable practice in engineering.
2) Systems vulnerable to timing attacks suggest to me both their architects and their users would do well to read an old, old position paper by Ben Laurie and Abe Singer: https://www.nspw.org/papers/2008/nspw2008-laurie.pdf
-
Friday 13th February 2026 22:07 GMT captain veg
Can it be fair to require the average programmer to understand inline assembly,
Yes.
Absolutely.
Completely.
I would go as far as to say that anyone who can't understand some kind of assembly is not really a programmer.
Still, the answer to the conundrum at the heart of the article is simply to place the boolean result into some storage location and to return it after a fixed delay on a timer. Isn't it?
-A.
-
Saturday 14th February 2026 06:26 GMT Herby
Old news...
In the distant past (the 60's to be more exact) IBM had optimizations in its Fortran compiler. The "most aggressive" was opt level 2. I remember being told that while it worked most of the time, occasionally it could "optimize" entire loops out of existence. So yes, this is old news.
As for comparing passwords: You should read in all the characters of the password BEFORE doing anything with it. Sure, go ahead and check the hashes, but the error time will always be consistent. If it is BAD, it will take some time. Since it is comparing hashes, the character on which the comparison fails is irrelevant, because which character of the hash is wrong tells little about which character of the plaintext is bad (as that is what hashes are supposed to do!).
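A minimal sketch of comparing fixed-length digests without an early exit (name and signature are mine, for illustration). The OR accumulates any differing bits, and every byte is always read regardless of where the digests first diverge:

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the two n-byte digests are equal, 0 otherwise,
   touching every byte either way. */
int ct_hash_eq(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint8_t d = 0;
    for (size_t i = 0; i < n; i++)
        d |= a[i] ^ b[i];
    return d == 0;
}
```

This is the same shape as cryptographic libraries' constant-time memcmp helpers, though, as the article notes, whether the compiler preserves the property still has to be verified in the emitted assembly.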
-
Saturday 14th February 2026 11:11 GMT ilpr
Problem isn't the compiler here..
Problem isn't the compiler here - it is allowed to do these kinds of optimizations. The problem is what kind of value you are comparing: if you are comparing a plaintext password, sure, an attacker might guess something, but one-way cryptographic hashes like SHA are supposed to be unpredictable and secure - precisely for this kind of purpose. So if you are properly hashing (and salting) the password, it should not matter what the timing is, as you cannot predict the password from the hash.
This all sounds like someone has not understood the meaning and purpose of hashing and has decided to blame the compiler for their iffy code.