Linus Torvalds has loosed the first release candidate for version 5.15 of the Linux kernel, though he isn't sure if it's a big 'un or nothing to get excited about. "So 5.15 isn't shaping up to be a particularly large release, at least in number of commits. At only just over 10k non-merge commits, this is in fact the smallest rc1 …
But he also wondered whether measuring kernels by commits is valid, and suggested measuring new kernel versions by the number of lines changed.
Reg Standards
An opportunity for a new category of Reg Standards to be adopted
The problem with this setting is that a warning from GCC may mean something totally trivial that does not really affect code correctness, or it may mean something you really want to fix. Setting -Werror then causes both of these to make your compilation fail. You can make it more fine-grained, per warning type, but then you end up with a really long list of options.
Also, new GCC versions typically add more warnings, or make existing checks stricter, so after a compiler upgrade you often get lots of errors if you have -Werror.
So in principle a useful option, in practice it gives you a lot of busywork.
One could argue that most of the so-called "busywork" should have been taken care of years ago.
Some people make a big deal of tidying up their kitchen because they only do it once a month or so, whether it needs it or not. Me, I make sure the kitchen is clean every night before I go to bed. Only takes a couple minutes, max.
I had a good laugh about this. I could easily predict what was going to come.
Typically, I program in C++. We treat warnings as errors and try to eliminate them. C++ makes it easy. Whenever I have to include C code, however, I get this stomachache. I know that it will spill tons of warnings at me and I know that it will be pointless fighting them. In C you mostly only have the possibility to “cast them away” employing the blind-cast. This basically disables all checks, thus making the whole caboose worse than with warnings.
Torvalds shot another – predictable – “own goal” with this one, which makes me wonder how far away from reality he has shifted already.
It is a vital core component then, MacroRodent, that targets/exercises enablement of correction or excision of indolent and malevolent code agents assisting the reach [disguising the damage] and extending the life [opening up further engaging fronts] of politically inept incorrect and uselessly adept adopted practices.
If you want a magnificently risen Linux omelette, you have to crack open a few easter eggs and -Werror is the whisker/blender/fork to unify and energetically mix as yet unformed living bodies into an elixir for cooking with ingredients fit for a'gorging and a'feasting.
* .... where a this is actually a that and can be a RAT and whenever entangled and working remarkably well and extremely badly together, quite something else altogether different and challenging and worthy of engagement and support .... or repair and fixing ....... or immediate obliteration to absolute smithereens, with this trio of options really being the only viable ones available to exercise and effect meaningful sustainable progress ‽ .
Always compile with full warnings enabled, and do not accept code until it compiles without warnings. Rules to live by. The problem with imposing -Werror at some later point in a long-running project is that many programmers do (or did) not follow that rule and chose to live with non-standard-compliant idioms in their code. Type-punned pointers, strict-aliasing violations and comparisons between signed and unsigned types are a common few.
For code written before the C11 Standard (or, for that matter, before C89/90 or C99), the current standard now flags what were loose but accepted coding practices in C. Apply the current requirements to the millions of lines of code that make up the kernel project and you can bet there will be howls.
Adding -Werror is the right move, but it will need lead time to be phased in. There simply are not enough man-hours available to wave a wand and say it will be done for this release. Don't get me wrong, it should be done and Linus is 100% correct in his "goal", but rather than a "by next release" fiat, it should be more a "by 2023" fiat (or some other reasonable time-frame).
I did that on one major project, together with one other developer. Well, we didn't turn werror on, we just turned warnings on one after the other, and then removed whatever was easy. Started out with about 4,000 warnings. Most of them could be removed trivially, like int status = somefunction(); giving "warning: unused variable", silenced with (void) status; or incorrect printf formats that happened to work. Many format strings, many || vs && precedence issues, and so on.
We then were left with about 3 or 4 genuine bugs, and some cases where we had to change code to make warnings go away. The most annoying was that you can't write if (x >= lowerLimit && x < upperLimit) if x is unsigned and lowerLimit is a constant zero.
But the real benefit wasn't the three or four bugs fixed. The benefit was that afterwards we never had a bug introduced that would have been found with the right warning, but where the warning was never seen because we had too many. So now there are over a million lines of code with no warnings, and werror turned on. Unless the Indian guys who took over removed it.
I remember getting warnings for “if (i >= 0 && i < limit)” as a range check when i was an unsigned integer. Because i >= 0 is always true. And I wanted to write a check for the lower bound even if it was 0. So I changed to “if (i == 0 || i > 0 && i < limit)”. Another warning. Mixing || and && without parentheses. “if (i == 0 || (i > 0 && i < limit))”.
Apart from that being free of warnings is good because then every warning means something and doesn’t get ignored. I couldn’t use -Werror for ages because we used FIPS, and the FIPS source that you are not allowed to touch or lose your certification produced warnings. I could have changed compiler flags for that source file but considered that cheating.
gcc is right to warn you about unsigned >= 0.
Writing that implies you think it's signed, which leads to other mistakes.
Warnings in 3rd party libraries are why cmake (and other build systems) allow you to give subprojects (and their headers) different warning levels.
Of course, the same technique also gives you some new footguns, which is nice...
"gcc Is right to warn you about unsigned >= 0."
No, it's not. It would be right in a context where I loop until i < 0, because i will never become less than 0, and that's a big trap. It's nonsense in code that is obviously a range check. The range check (i >= 0 && i < limit) will produce the correct result, even if i started at 0 and I subtracted 1 from it.
It's not right, it's just programmed to do it, even in a situation where there is obviously no problem with the code. It is stupid to think that I should write a range check for 1 <= i <= 10 in a different way to a check for 0 <= i <= 10.
there may be a saner fix involving a cast to an integer as long as you know that the range checks will still work properly... or use a variable of the correct type instead?
I run into this kind of signed/unsigned comparison warning a lot with microcontrollers that have 8 bit unsigned integers as counters for "reasons" (like speed+range). I usually just type-cast the warnings away and make sure the code is sane. It's good to hand-optimize microcontroller code anyway.
NOTE: magic numbers in code should use #define or a 'const' type anyway, so you could fix it in the definitions.
and with the original argument
"if(thingy >= 0 && thingy <= limit)"
where 'thingy' is unsigned, and you know it is unsigned, why leave the '>=0' in the actual code? At least use a comment if you need it there for some reference type of reason, maybe:
"if(/* thingy >= 0 && */ thingy <= limit)"
(then add a comment that says 'thingy is unsigned' or similar)
I sometimes do this with an ending 'else', comment out an 'if' following it that indicates the condition in the 'else' if the 'if' would always be true (and also indicate in the comment that it's always true), so that someone reading the code (including me a year later) will see that and go "ok".
"...I sometimes do this with an ending 'else', comment out an 'if' following it that indicates the condition in the 'else' if the 'if' would always be true (and also indicate in the comment that it's always true), so that someone reading the code (including me a year later) will see that and go 'ok'."
Seems like a good idea. My variant on that has been to use
assert( statement that'll always be true);
both to indicate to the reader that it'll always be true, _and_ to enforce it (in debug builds). Sort of an "asserts as comments" method.
In re "...including me a year later" : as a 25-year-old programmer, I didn't comment my code very well, partly because I remembered everything. Thirty years later, I put comments everywhere.
You're assuming "i" isn't modified anywhere else except in your loop. Which of course a compiler suitably concerned by this could verify, possibly, by analysing every codepath involving "i". Or it could just emit a warning and let you sort it out.
> It is stupid to think that I should write a range check for 1 <= I <= 10 in a different way to a check 0 <= I <= 10.
Other languages are available, but if you choose C that's the price you pay. And I say that holding up my hand as someone who has iterated with unsigned bytes/shorts in C and been caught by this very issue.
@gnasher729 “It is stupid to think that I should write a range check for 1 <= I <= 10 in a different way to a check 0 <= I <= 10.”
It is also stupid to think you should write a lower bound test for a range that starts at 0 when the variable is an unsigned integer. Unsigned integers can never be less than 0, therefore a test is not needed.
I would hazard a guess that was what the warning was trying to tell you.
Warnings in 3rd party libraries
If possible I'd rather leave -Werror on (or at least clean up ALL of the warnings prior to release) and if they are in 3rd party libraries:
* patch it in my own branch
* submit fix(es) upstream (and keep the patch files just in case)
this is ALSO related to why I do not like endlessly chasing moving targets in 3rd party libraries... especially when "upstream" suddenly decides to CHANGE THE API WITH NO BACKWARD COMPATIBILITY causing the use of "new version" (to get the most recent fixes) to be EVEN WORSE.
(and propagating "worse" up/down the dependency chain, the stuff of nightmares)
I would like to think this would be LESS of a problem in kernel code, though
FreeBSD has its own branches of contributed 3rd party things that are in the base distribution (including llvm). Fewer moving targets this way.
the compiler should have a warning that using a single-character name for a variable or other entity is incredibly stupid if you ever want to be able to review and/or refactor the code without additional (unnecessary) difficulty, particularly for variables that can easily be mistaken for the number 1 or 0, or a vertical bar. Most people doing this sort of thing are old enough to require glasses to read or see the monitor, and slowing down a manager's gaze at your code because of bad programming habits is likely to PISS HIM OFF (let alone the maintainer who has to fix something 2 years from now).
And I'd rather not try searching for every 'i' in the code instead of 'ii' or 'iterator', Just because a particular IDE has "idiot hand-holding" features does not mean it is an excuse to have BAD HABITS in variable naming.
I once had to modify and re-run Monte Carlo simulation code from a contemporary PhD student in gaseous electronics, as he'd mixed up the difference (IIRC) between collision cross-section and momentum-transfer cross-section. In the code he was keeping track of an average, but he apparently forgot about the FORTRAN standard at the time (at the time before it became Fortran...). Which led to my favourite variable definition ever:
REAL MEAN
:-0!
... not that that would stop the pedants around here from bitching about it. It's not just what they do, it's what they live for. Poor things.
More likely, your comment was so far down the chain that it made it difficult for most of the pedant commentards to figure out what you were talking about, so they left it alone.
Regardless, have a beer. We're human, we make mistakes, we learn, we move on.
"...I remember getting warnings for 'if (i >= 0 && i < limit)' as a range check when i was an unsigned integer. Because i >= 0 is always true. And I wanted to write a check for the lower bound even if it was 0. So I changed to 'if (i == 0 || i > 0 && i < limit)'. Another warning. Mixing || and && without parentheses. 'if (i == 0 || (i > 0 && i < limit))'."
This makes no sense. If I were a compiler I would issue warnings and errors galore and then go and sob in the corner.
If you have an unsigned integer, why would you check if it's greater than or equal to zero? It's like you want your code to run slower or something. Can't you just write it in Javascript?
The opposite was quite a common optimisation for signed integers at one time: Cast to unsigned and just check the upper bound:
int somevalue = SomeFunction();
if((( unsigned int )somevalue ) < somelimit ) // 0 <= somevalue < somelimit
I still sometimes do it today.
There are probably dozens of old performance tricks that I still do from time to time without thinking about it.
Like shifting instead of multiply/dividing by powers of two. I've even had some young whippersnappers complain that things like that are too hard to read.
My first languages were BBC BASIC (school), ARM, <some shitty PC BASIC> (college), C (job 1), 68000/65C816 (job 2), HTML (Pre-CSS; job3), C++/x86 (job 4)
I've even had some young whippersnappers complain that things like that are too hard to read.
They probably learned Python as their first programming lingo. Ask these kids the difference between a signed and unsigned variable type, see if they choke on two's complement math. Bring popcorn and beer.
I have to wonder if any of them could hand-optimize inner loop code by looking at the assembly code generated by the compiler, even when 'objdump' is readily available
bit shifting and bitwise operations used to be a lot more important back in the day, but are still VERY important in kernel drivers and microcontrollers.
To be fair to them, the company wasn't a C++ shop when I started and they freely admitted their knowledge of the language was lacking. But I gently persuaded* them to switch as it was a much better option than their current choice of language. The result is their video player tech out-performs the standard players on iOS and Android.
* During a few months of forced leave when money was tight, I just rewrote their whole stack in C++ and said "we're using this now".
"by looking at the assembly code generated by the compiler"
I wonder how many of them could even understand that that lump of assembly is the same thing as they wrote.
It can be useful for debugging to look not at what the code is doing (because it's easy to be led astray by assumptions on what one thinks it should be doing) but what the executable is actually doing. Sometimes it's a real ballache, other times it's "why the hell are you..." (look at the code, little animated light bulb goes "ping!", followed by the obligatory forehead slap).
I do a lot of microcontroller code and one issue that always pops up is building values for registers within the device.
One family of devices has an internal register where a pointer to a DMA descriptor (and many other things) needs to be loaded. That is interesting, because it must point at the address but the value is in fact a 32 bit unsigned value.
That meant that for every internal pointer register I loaded, there had to be a typecast from uint32_t to uint32_t *.
A bit of a pain but it probably saved my successor(s) some pain. I had a comment in the source to the effect that the various internal register loads needed a typecast to satisfy the compiler.
On shifting vs. multiplication, I once worked with a very interesting system where doing something like x*=2 was actually faster than x<<=1
Fun times.
> On shifting vs. multiplication, I once worked with a very interesting system where doing something like x*=2 was actually faster than x<<=1
I can only suggest to also always try the unoptimized code and see what the compiler makes of it. Many optimizations that were OK two decades ago are actively bad today.
I meanwhile stopped doing the shift-for-multiply stuff because the compiler outsmarts me as a rule. Compiler-writer-appreciation-pint ---->
With platforms/compilers where the optimizer is not so great things may well be different.
There's plenty of hacks I've retired. For example, I used to write flow control as arithmetic, but branch prediction has neutered that.
On the other hand, storing lookup tables in registers has grown in usefulness as the processor width has expanded. 64 bits is like 8 bytes - half the words in this post don't have that many letters - so imagine a whole 8-byte table in a register!
Here's another one from the old time book:
x => x * 0xcd >> 11
Obviously you need one of those fancy, modern high-end processors that can multiply two 8 bit regs and store the results in a composite 16 bit reg. But it's good for all 8 bit values. (I've written the shift as 16 bits but you discard the low byte reg and do a three bit shift on the high one.)
If your chip doesn't have a hardware multiply instruction, other constants may be preferable, depending on the range of input.
And while this was essential back in the day, it remains performant today in many situations.
That's why they're warnings, not errors.
What they're supposed to do is get you to look at the code, because it usually means somebody made a typo.
E.g. a control-flow clause that has no effect or is constant is almost always a mistake.
What some people do is just turn off the warning.
In my current project, there are exactly two warnings turned off: One where the compiler tries to tell me that using a semaphore is inefficient (when the efficiency is totally of no concern, and the workaround would be an awful lot of work and very hard to get right), and one where I intentionally crash the program, like * (int *) NULL = 0; and the compiler won't let me do that without a warning.
Hmmm...can't you simply SIGINT or raise an interrupt known to shut down the beast?
There was a point in a past lifetime where I needed to crash an embedded program if all hell broke loose unexpectedly, so I simply divided 0 by 0. The code reviewers never quite got their arms around that, even with full commenting. So I reverted to calling a trap instruction that would crash the program.
It was a while ago, and recollection is faulty, but I might have called the trap instruction that fielded division by zero...
Hmmm... hadn't tried such a thing, but
int main( void)
{
*(int *)0 = 0;
return( 0);
}
compiled with gcc -Wall -Wextra -pedantic -o z z.c
got no warnings. Chalk one up for clang, though; it realized this was a no-no and warned me. However,
int main( void)
{
int *null_ptr = (int *)0;
*null_ptr = 1;
return( 0);
}
slips past both compilers without warning. (A thank-you to jake for reminding me that one can use HTML tags on these fora.)
The issue (as has been alluded to above) is that we don't have a strong semantic definition of what constitutes a "warning". Warnings live in an ambiguous space between supplementary "information" and actual "error" and, depending on circumstances, could be either. Or both.
If you've simply written something that's ambiguous or confusing, clearly the correct thing to do is fix it to eliminate the message - but, more importantly, to clarify your intent for others.
But what about "Warning, Feature X is deprecated"? You know it's deprecated, but you know it's still supported. Rewriting the code may require further changes you don't want to make now, but turning the warning off would perhaps mean a future developer fails to make the change - this is a scenario that I've come up against in practice (though not with gcc).
Perhaps, in an ideal world, you would be able to digitally sign off certain lines of code as being acceptable for a certain length of time, so you could have clean builds for the lifetime of the waiver without losing the longer-term tripwire.
No, I wrote some perfectly fine code. I wanted to check that 0 ≤ i < n, so I wrote the obvious code. If I don't write "i >= 0" then some reader will assume that I forgot to check the lower bound.
The second code was not worse, it was an obvious workaround against the compiler's stupidity. And ran into another stupid rule of the compiler, where it assumes you don't know the relative priority of || and &&.
Hope that explains it. As for criticising, well, I assume that I know better than you any day.
You should never assume operator precedence. It varies between languages. Even if you know it, the people maintaining your code in the future might not know it. Or they might not know that you know it. In the interests of readable code, always insert extra parentheses to make it clear what you are intending.
ie.
if ((value < lower_limit) || (value > upper_limit))
is more readable than
if (value < lower_limit || value > upper_limit)
Even though they compile to the same code
Just try writing the second form in Delphi/Pascal and you will come unstuck.
Upvote, because people that have looked at my code have in the past whinged about too many brackets.
As you say, operator precedence isn't a fixed immutable thing. It changes depending on the language in use. Therefore, writing code with additional brackets not only makes it extremely clear what relates to what, it also means one isn't relying upon said precedence.
Warnings are by definition not errors, so they shouldn't be treated as such. It's always good to write code that doesn't generate warnings, but some brain-dead compilers generate warnings for the most idiotic of reasons. Of course you can usually restructure the code to remove the warning, but sometimes there are good reasons for not doing so.
Worked on a product where we decided to fix all the warnings. One dev slaved away for a fortnight and fixed *most* of the warnings. He found a few errors, so it was actually worthwhile.
The only place he couldn't fix the warnings was in some test code which achieved 100% code coverage on what it was testing. We decided to just disable 'warnings as errors' on that test code and chalk it up as a win overall.