User topics

Article topics

Log in Sign up

Let's Encrypt? Let's revoke 3 million HTTPS certificates on Wednesday, more like: Check code loop blunder strikes

On Wednesday, March 4, Let's Encrypt – the free, automated digital certificate authority – will briefly become Let's Revoke, to undo the issuance of more than three million flawed HTTPS certs. In a post to the service's online forum on Saturday, Jacob Hoffman-Andrews, senior staff technologist at the EFF, said a bug had been …

COMMENTS

Post your comment

House rules Send corrections

Add to 'My topics'

Tuesday 3rd March 2020 20:32 GMT Twanky

Minor arse-ache

E-mail arrived, checked it for authenticity, ran the renewal processes, done.

I can well imagine that an out of normal sequence cert renewal is going to be a right royal pain in the arse for some though.

Really not great for Let's Encrypt's reputation either.

19 2 Reply
1. Tuesday 3rd March 2020 21:39 GMT whitepines
  
  Re: Minor arse-ache
  
  The real fun starts when you hit LetsEncrypt's API rate limiting after a problem like this. It's entirely possible to be down for days if you cock up your update command more than a couple of times and had previously set a low renewal interval.
  
  Saw a customer site go down because of that, complete with mad scramble to buy a replacement (expensive) commercial cert since once you've hit the request limit, no one can override it (it's all automated 'guv!)
  
  20 0 Reply
  1. Wednesday 4th March 2020 20:39 GMT Gordon Shumway
    
    Re: Minor arse-ache
    
    Well, if you manage to cock up *that* simple of a command, multiple times even, you should seriously be considering a career change. To one that does not involve electrons.
    
    2 2 Reply
    1. Wednesday 4th March 2020 21:41 GMT Glen 1
      
      Re: Minor arse-ache
      
      Because, until you hit that limit there are no consequences.
      
      That said, the staging environment exists for a reason.
      
      3 1 Reply
    2. This post has been deleted by its author
    3. Thursday 5th March 2020 00:07 GMT whitepines
      
      Re: Minor arse-ache
      
      Well, if you manage to cock up *that* simple of a command, multiple times even, you should seriously be considering a career change. To one that does not involve electrons.
      
      The problem is that all it takes is one cock up (e.g. "up enter" accidentally re-running the update command, in an environment primed with "update often" where re-running an issuance command is not expected to be able to knock everything offline). Or even in some cases no cock up at all is required; the company referenced earlier basically did this:
      
      * Set short update interval (which is encouraged, given short cert expiry times!)
      
      * Migrate several subdomains from a commercial cert to LetsEncrypt over a period of several hours (this was before wildcard DNS, the systems involved were quite complex, and under no circumstances was "shut it all down for hours to enable a single, untested cert migration" an option)
      
      * Test the automatic update system with one last manual update. Oh bugger, API request limit hit with that last test, no way to override, sites offline, scramble for commercial cert.
      
      When the limit is hit it shouldn't immediately knock out requests for days. It should soft fall back to minutes and maybe increase over time to hours then days, and in all cases LetsEncrypt should always be able to override it (possibly for a small fee if it was customer induced stupidity versus an external attack).
      
      In fact there's some concern a neer-do-well could effectively knock sites offline via a DoS attack using this little known "feature" of LetsEncrypt, though I haven't really looked into it in detail.
      
      And don't give me any bull about LetsEncrypt having a useful staging environment. It literally goes after your public Web servers on your production domains to even issue the certs in the first place, so you can't have a fully isolated staging environment that would have prevented the sequence above.
      
      1 0 Reply
Tuesday 3rd March 2020 20:48 GMT rcxb

Why the mass revocation? Doesn't sound like anything bad happened. The checks just weren't up to snuff. So LE should go back and re-check the domains that were issued certs to see which, if any, should not have been issued from them, and only revoke those few. The certs will all be gone in a couple months anyhow.

7 2 Reply
1. Wednesday 4th March 2020 09:11 GMT hazzamon
  
  I imagine that the mass-revocation is a requirement of the CA/Browser Forum rules.
  
  7 0 Reply
2. Wednesday 4th March 2020 09:22 GMT richardcox13
  
  Having read read some of the comment threads it seems revoking all potentially insecure certificates within 5 days of the discovery of the flaw is part of the Baseline Requirements agreed between CAs.
  
  14 0 Reply
3. Wednesday 4th March 2020 11:07 GMT tialaramex
  
  The rules for CAA makes it essential to check at the time of issuance, not before or after or just whenever you get around to it. Specifically they require the check to happen no more than 8 hours before you issue the certificate.
  
  Let's Encrypt published a much smaller list of names for which if the check is run today it fails, but there is no way to determine (without time travel) which names, on this list or otherwise, would have failed the CAA check if it had been performed as intended at the time.
  
  5 0 Reply
  1. Thursday 5th March 2020 03:29 GMT rcxb
    
    The rules for CAA makes it essential to check at the time of issuance, not before or after or just whenever you get around to it. Specifically they require the check to happen no more than 8 hours before you issue the certificate.
    
    That's a flimsy argument. You can't model a practical threat where I want to allow LE to issues certs for my domain today, but I didn't yesterday, and so the cert issued yesterday (which also validated domain control and the like) is dangerous. The other case is infinitely more risky.
    
    a much smaller list of names for which if the check is run today it fails
    
    Exactly! Those small number are the ONLY ones that should be revoked. Maybe they were valid when issued, or maybe not, but there you actually have a high risk that the bug caused something untoward to occur. The rest are meh.
    
    0 0 Reply
4. Wednesday 4th March 2020 11:13 GMT Wayland
  
  Yeah right, they used a loop variable wrong.
  
  This is actually a power flex. They are testing their muscles. Potential investors will see the power a Certificate Broker can wield. Google Chrome last year made HTTPS mandatory if you don't want warnings. In some cases you simply can't open the site even if you drive around all the road blocks.
  
  What is Let's Crypt's business model? Give away certificates like the child snatcher gives away candy?
  
  I can see Google buying all the Certificate Brokers and owning all the web browsers. They will then have web superiority over the Internet.
  
  1 16 Reply
  1. Wednesday 4th March 2020 13:07 GMT Anonymous Coward
    
    Their business model is to sell professional packages to companies that have a lot of domain names in a single account, such as hosting companies.
    
    That's what the API limit is for.
    
    10 0 Reply
  2. Wednesday 4th March 2020 21:03 GMT Michael Wojcik
    
    I'm not particularly fond of Let's Encrypt or the HTTPS Everywhere movement, because of increased threats such as typosquatting and IDN homographs, and because they're an excuse to avoid fixing the real problems. But even as conspiracy theories go this is impressively feeble.
    
    Let's Encrypt is primarily a free service achieved by maximizing automation to reduce costs, and funded by premium packages and donations. There are plenty of commercial CAs already, and anyone who's interested can learn as much about the CA business as they like - including what sort of power CAs wield.
    
    Your hypothetical "potential investors" either already understand the CA business, or won't notice this glitch.
    
    If Alphabet (or Apple, or Amazon) wanted to buy up the major CAs, they would have. They haven't because the CA business sucks, frankly. Margins are lousy and no one likes CAs - they're a hassle when they work and a danger when they fail. The CA/BF slaps one band-aid after another onto a fundamentally broken PKI so it can just manage to make the work factor too high for non-targeted attacks. The whole thing is dreary.
    
    2 1 Reply
5. Thursday 5th March 2020 11:31 GMT CrazyOldCatMan
  
  Why the mass revocation?
  
  Because they could (theoretically) have issued certificates to people wo were not entitledto them - if you submit a list of domains and 'accidentally' included a Microsoft sub-domain in the list their old process wouldn' have picked it up.
  
  So, safer to revoke all the certs and get people to renew - and that will pick up the wrongly-issued ones.
  
  1 0 Reply
Tuesday 3rd March 2020 20:52 GMT Anonymous Coward

What's this, a bug caused by a language quirk?

And there was me thinking Go was one of the new breed of languages that didn't have any nasty semantic gotchas like C++ or Java.

Yet again it proves that there's no such thing as a safe language since to do anything useful it has to give you the power to blow your foot off in which case what is the point of these new languages other than an ego trip for its developers and a lock-in to the environment.

40 9 Reply
1. Wednesday 4th March 2020 09:12 GMT MacroRodent
  
  Re: What's this, a bug caused by a language quirk?
  
  > what is the point of these new languages other than an ego trip for its developers and a lock-in to the environment.
  
  Would you then prefer coding in FORTRAN IV or COBOL, since newer languages are just ego trips? Of course not. No language can ever guarantee absence of bugs, but improvements can be made nevertheless.
  
  Ob car analogy: Modern cars are much safer that what people drove in the 1960's, but fatalities still sadly occur. That does not mean all the improvements were pointless.
  
  15 4 Reply
  1. Wednesday 4th March 2020 12:32 GMT Anonymous Coward
    
    Re: What's this, a bug caused by a language quirk?
    
    "Would you then prefer coding in FORTRAN IV or COBOL"
    
    I'd sooner do scientific programming in fortran than Go. And if Go, Rust etc had invented any new paradigms like LISP, Prolog or Smalltalk did then you would have had a point but they haven't. They're just old wine in new bottles.
    
    12 3 Reply
    1. Wednesday 4th March 2020 21:03 GMT Michael Wojcik
      
      Re: What's this, a bug caused by a language quirk?
      
      Rust is significantly innovative, thanks to borrow checking.
      
      I'm not particularly impressed with Go.
      
      (And I'd do my scientific programming in Julia, probably. Python may have the libraries and community these days, but I find Julia much more to my taste.)
      
      2 0 Reply
    2. Thursday 5th March 2020 10:06 GMT DrXym
      
      Re: What's this, a bug caused by a language quirk?
      
      Here is Rust's paradigm - safe by default and portable. Look at the CVE database - it's filled with exploits / vulns caused by null pointers, buffer overflows, data races etc. from code written in C or C++. Realise that NONE of those can happen in Rust because the compiler will kick your ass if you try. The consequence is that compiled code is safer code, slashing the number of bugs which means less development time, better quality and happier customers. Oh and often that code can be more performant because devs can utilitise threading, asynchronous operations etc. because the compiler catches all those aforementioned issues.
      
      Here is Go's paradigm - scaling, portability, stability and convenience. It is very terse, has a very large standard library (crypto, networking, images, math, hashing, encoding etc. all out of the box), is well suited for scaling network applications, and it compiles code into executables. It also does not suffer from the same kinds of vulns as C or C++ thanks to runtime checks.
      
      So yeah you can use some older language if you prefer but you have to realise why these languages exist and it helps to use them to speak from a position of experience. What I don't quite follow is the irrational attitude that always permeates discussion about new languages, as if some old timer sees this new fangled language as a threat or something. Nobody is forcing you to use them, but personally I like to keep my skillset fresh and relevant and using a new language or technology is a way to do that.
      
      1 1 Reply
      1. Thursday 5th March 2020 12:02 GMT Loyal Commenter
        
        Re: What's this, a bug caused by a language quirk?
        
        And I expect it's perfectly possible to create the equivalent of
        do while (true) {}
        in both languages.
        
        If a language is Turing-complete, it's not possible to specify it such that you can't create a bug, in much the same way it is not possible to determine if any program will complete.
        
        3 0 Reply
        
        Thursday 5th March 2020 16:57 GMT DrXym
        
        Re: What's this, a bug caused by a language quirk?
        
        Well that's a stupid straw man argument. I never claimed a language like Rust fixed bugs like application logic errors. But it does fix entire classes of problem that C or C++ allow through - buffer overflows, NPEs etc. It should be very obvious why that is a good thing in a language.
        
        1 0 Reply
        
        Friday 6th March 2020 08:14 GMT MacroRodent
        
        Re: What's this, a bug caused by a language quirk?
        
        As another commentard pointed out, most of the new stuff really has appeared in older languages. Sometimes much older. For example, Simula 67 (from 1967 like the name says) had most of the same features that make Go or Java safer. Managed memory, run-time checks, no wild pointers, compilation with strong typing. Even classes and inheritance. But almost nobody uses it any more.
        
        Go etc. add finesses, and also follow a syntactic style that people are now familiar with. Simula 67 syntax is based on Algol, so long keywords, begin ... end instead of { ... } ...
        
        I think one reason the older innovative languages have fallen by the wayside is that at the time they were introduced, known implementation techniques did not allow making the fast enough for production use, and computers also were slower and had less memory. I recall Bjarne Stroustrup saying he started developing C++ for a project where he first tried to use Simula, but it ran too slowly.
        
        So programmers were enticed by the low-level, anything goes C, and later C++. Then managed languages became more feasible, thanks to faster computers and innovations like better garbage collection algorithms and JIT, but these were applied to new languages instead of attempting to resuscitate old ones. It is easier to "sell" something syntactically C-like to programmers who grew up with it.
        
        I'm fine with this. Having done most of my professional programming in C for decades, I now believe very few programs should be written in it. Mainly kernels and drivers. (Perhaps one should have a license to use it). Everything else should be programmed only in managed, checked-to-hell-and-back languages. Even then programmers will keep making stupid errors, but there is some hope there will be a a bit less of them, and they are caught earlier.
        
        0 0 Reply
2. Wednesday 4th March 2020 11:06 GMT tialaramex
  
  Re: What's this, a bug caused by a language quirk?
  
  Go isn't "one of the new breed of languages". It's from Pike and Thompson whose names may be familiar because they designed Unix. It's young the way a 14 year old Prince starting at Eton is "young". Physically it's not very old, but it does not represent great modernity or innovation.
  
  This mistake would actually be caught at compile time in some "new breed" languages such as Rust.
  
  4 3 Reply
  1. Wednesday 4th March 2020 17:52 GMT Claptrap314
    
    Re: What's this, a bug caused by a language quirk?
    
    At that level, Go is Frankensteinish.
    
    No (initial) support for exceptions, because we hates them!
    
    Channels, while probably not novel, certainly much more accessible way of handling the matter than most.
    
    Two-value return as common case to discourage magic. (And we hates exceptions!)
    
    Screaming fast compile (good for unit testing).
    
    NO unified testing strategy.
    
    Classes? Let's just use interfaces.
    
    Generics? Nah, brah. Type faster.
    
    Perhaps most of these decisions are the right one. But still, quite a mix compared to "new" & "trendy".
    
    3 0 Reply
    1. Thursday 5th March 2020 10:30 GMT DrXym
      
      Re: What's this, a bug caused by a language quirk?
      
      I'd describe Go as a swiss army knife. The standard library is huge and ideally suited for writing scalable web servers, utilities etc. that compile into standalone executables and run a multiple times faster than if they were written in languages such as Ruby, Python and NodeJS.
      
      I hate the way the language handles errors though which is inexcusably crap.
      
      0 0 Reply
3. Wednesday 4th March 2020 13:31 GMT Julz
  
  Re: What's this, a bug caused by a language quirk?
  
  The point of these new languages is so that all of the ip in the stack used by the mega-corp is owned and controlled by the mega-corp. There is also the fact that locking in developers to your language is also a fun and potentially lucrative thing to do. They seem to be, on the whole, chimeras of other languages rapped in whatever shiny the owner can rustle up. Nothing really new, including it would seem, the built in ways to silently cock things up.
  
  6 4 Reply
  1. Wednesday 4th March 2020 16:46 GMT MJB7
    
    Re: The point of these new languages
    
    What are you talking about? Rust is open source, Go is open source. Google have explicitly given up control of the IP in Go, Mozilla never were a mega-corp but they have given up control of the IP in Rust too.
    
    5 0 Reply
    1. Thursday 5th March 2020 09:19 GMT Julz
      
      Re: The point of these new languages
      
      Open source licenses don't really seem to have the effect one would hope with languages. If the main contributor and initial designers of a language moves things in ways that the community doesn't agree with, then the language tends to loose it's wider appeal.
      
      1 0 Reply
4. Wednesday 4th March 2020 14:13 GMT DrXym
  
  Re: What's this, a bug caused by a language quirk?
  
  Every language has pitfalls. Some languages, in particular C and C++ have more pitfalls than others. If there is any doubt about this, search for the causes of the Heartbleed or "goto fail" bugs in OpenSSL.
  
  And regardless of the language, all software can still have application logic errors.
  
  3 0 Reply
  1. Wednesday 4th March 2020 21:03 GMT Michael Wojcik
    
    Re: What's this, a bug caused by a language quirk?
    
    The cause of Heartbleed was a missing length check. It was a trivial error, even if it had non-trivial consequences. Variants of it could have happened even in languages with array boundary checking, for example if the heartbeat code reused sufficiently-large buffers.
    
    Goto-fail wasn't in OpenSSL at all, but in Apple's TLS stack. And it happened because of poor code style and a failure to use an adequate static-code analyzer. (Some C compilers, run with appropriate options, are sufficiently good linters to catch the goto-fail bug, because it leaves dead code; any decent standalone linter or general static-code analyzer should catch it.)
    
    Both are in C, so say nothing about C++.
    
    In any case, there are much more prominent dangers in C, particularly unsafe functions in the standard library (which should be avoided or wrapped in safer abstractions, not used directly in application logic); the use of in-band signaling in strings, stdio formatting, etc; relatively weak typing; and a cumbersome error-handling model.
    
    I agree with your general point - all programming languages have pitfalls, and some are more difficult to use properly than others. But your supporting argument leaves much to be desired.
    
    3 0 Reply
    1. Thursday 5th March 2020 11:05 GMT DrXym
      
      Re: What's this, a bug caused by a language quirk?
      
      No, both of these were caused by problems in the language.
      
      - Heartbleed might have been "trivial" but it was caused by unsafe code that would have thrown a runtime exception in other languages and been instantly caught.
      
      - "goto fail" would have thrown a compile time error in other languages where blocks MUST be delimited or goto isn't even a thing.
      
      The point is the language caused these bugs to enter production with massive consequences.
      
      And yes C++ can mitigate some of these things, but the real world says hello. These things still happen and do happen.
      
      1 1 Reply
      1. Thursday 5th March 2020 18:22 GMT odyssey
        
        Re: What's this, a bug caused by a language quirk?
        
        "goto fail" specifically couldn't have happened in languages where "goto" isn't a thing. But you can create logic errors in every language. As we've seen with this Let's Encrypt bug, you can construct logically wrong code. You could write the equivalent of "goto fail" in a "goto-less" language, by getting a break or termination condition wrong.
        
        For example the Zune player bug where a missing break clause in date calculation code caused an infinite loop. All languages could have such a bug, the language itself can never prevent it. That's why we need careful, rigorous testing. These bugs should really be seen as testing failures. A failure to verify that the code is correct in every possible condition.
        
        0 0 Reply
        
        Thursday 5th March 2020 22:11 GMT DrXym
        
        Re: What's this, a bug caused by a language quirk?
        
        Of course you can create application errors in other languages. The point is that C and C++ have their own errors on top. This honestly isn't up for debate.
        
        0 0 Reply
        
        Friday 6th March 2020 13:20 GMT odyssey
        
        Re: What's this, a bug caused by a language quirk?
        
        No I don't deny that. Higher-level languages make exploitable buffer overflows impossible, for example. They should be used for front-facing systems where possible, and usually are. This bug however isn't due to low-level C memory handling, neither are many other bugs. Therefore it's more rigorous testing that's needed.
        
        0 0 Reply
5. Wednesday 4th March 2020 17:33 GMT Arthur 1
  
  Re: What's this, a bug caused by a language quirk?
  
  Go is a niche throwback language, don't confuse it with truly modern languages. It's made by plan 9ers for plan 9ers, really only works properly on *nix since it has a ton of OS-y things baked in, and really is only useful for massive multithreading on servers. It's well known for playing fast and loose, not a safety focused language as you seem to assume.
  
  As for what's the point of it? Overall there is no better way to write massively lightweight-threaded code in Unix-like environments.
  
  2 0 Reply
6. Wednesday 4th March 2020 19:55 GMT odyssey
  
  Re: What's this, a bug caused by a language quirk?
  
  I'd say Java has very few gotchas. Other than equals() vs == it's hard to make silly mistakes.
  
  0 1 Reply
  1. Wednesday 4th March 2020 21:03 GMT Michael Wojcik
    
    Re: What's this, a bug caused by a language quirk?
    
    One word: Deserialization.
    
    2 0 Reply
    1. Thursday 5th March 2020 11:15 GMT odyssey
      
      Re: What's this, a bug caused by a language quirk?
      
      I wouldn't say deserialization is a language quirk or gotcha. It can be used unsafely, but so can anything that lets untrusted users do things. Go's use of loop variables, the difference between = and ==, etc are gotchas because they look like they should do something different to what they actually do.
      
      0 0 Reply
Tuesday 3rd March 2020 21:12 GMT Steve @ Ex Cathedra Solutions

Easy-peasy

Got email - 3 out of about 30 certs affected, one of which was redundant now anyway. Ran the renewal, done and dusted in 5 minutes.

Compared to the last time a certain company cocked up my certificate issuing and I was down for a week, this was heaven!

I agree it's a bit of overkill - but on the other hand NOT doing it might be worse for reputation.

14 1 Reply
1. Tuesday 3rd March 2020 21:40 GMT Anonymous Coward
  
  Re: Easy-peasy
  
  Didn't affect me but doing this improved their reputation in my eyes.
  
  14 1 Reply
2. Wednesday 4th March 2020 01:11 GMT boxplayer
  
  Re: Easy-peasy
  
  Same here, only one domain of about 50 affected anyway, but deleted the cert file and ran my script and it went through like clockwork.
  
  5 1 Reply
3. Wednesday 4th March 2020 16:01 GMT Anonymous Coward
  
  Re: Easy-peasy
  
  Similar experience here, though just one instance needed the update.
  
  I suspect the possible reputational damage might end up being neutral. LE's transparency on this has been pretty good. The only thing that might have been added is something in the email to indicate it wasn't some scam, perhaps a non-clickable URL to a third party site explaining the email, or somesuch.
  
  As you point out, compared to the experience of shenanigans of proprietary CAs, this was well handled.
  
  0 0 Reply
4. Wednesday 4th March 2020 21:03 GMT Michael Wojcik
  
  Re: Easy-peasy
  
  Not doing it isn't an option. Basic Requirements.
  
  0 0 Reply
Tuesday 3rd March 2020 22:48 GMT tin 2

I picked a bad day to get into working with LetsEncrypt then! (been seeing random API errors all day - I now presume due to load)

0 0 Reply
1. Wednesday 4th March 2020 08:12 GMT knelmes
  
  Looks like I picked the wrong week to quit sniffing glue.
  
  22 0 Reply
  1. Wednesday 4th March 2020 13:09 GMT Anonymous Coward
    
    And don't call me Shirley.
    
    7 0 Reply
    1. This post has been deleted by its author
      1. Wednesday 4th March 2020 13:38 GMT Loyal Commenter
        
        I just wanted to say
        
        Good luck, we're all counting on you.
        
        7 0 Reply
Wednesday 4th March 2020 00:43 GMT VerySlowData

got email

We got the email, followed the advice and the certs for our affected domains were renewed.. no muss, no fuss. Kudos to LetsEncrypt for being upfront about this..

11 1 Reply
Wednesday 4th March 2020 04:52 GMT Temmokan

The only problem is the email notification came approx. 12h before the revocation is supposed to start.

I can consider myself lucky to have only 2 certificates to re-issue, and neither required manual work.

A sad error, yes. Just curious, do they ever run any tests, or "commit directly to production, what can possibly go wrong?"

4 1 Reply
1. Wednesday 4th March 2020 11:07 GMT tialaramex
  
  The code with the mistake in it was written about three months before it went live, and then was live for several months before anybody reported a symptom from that mistake, after which it took several days for Let's Encrypt staff to realise the problem wasn't just cosmetic (a user complained that they got lots of the exact same error, the staff took a while to realise that really was because the exact same FQDN was being tested over and over, instead of a different one each time)
  
  This code had tests when it was written verifying that the correct number of validations is done, but that test doesn't actually check those validations exactly match the set of names to validate. It was also covered by another test to see that all necessary validations are done for a test issuance, but that test uses fresh accounts and so didn't test the scenario where the fault happens. Let's Encrypy's remediation included new unit tests to verify this precise problem.
  
  8 0 Reply
  1. Thursday 5th March 2020 08:40 GMT Anonymous Coward
    
    Which meant:
    
    1) The programmer didn't know how to write code
    
    2) The programmer didn't know how to write tests
    
    Anyway, I'm surprised by the bad design of Go about parameters passing.
    
    1 0 Reply
    1. Thursday 5th March 2020 11:15 GMT odyssey
      
      Same, I have thought that Go looked interesting as a language. But if it behaves in counter-intuitive ways, it clearly hasn't been thought through that well.
      
      0 0 Reply
Wednesday 4th March 2020 06:21 GMT Mike_R

No email so far: 08:17 local time 06:17 UTC

so to be doubly sure ran a certbot update, which took all of 30 seconds, and then check returned success.

Lucky (?) I saw the article!

0 0 Reply
Wednesday 4th March 2020 06:49 GMT demaniak

TDD still a thing?

Seriously, this sounds like something that should have been caught by a (probably very simple) unit test.

Yikes.

3 0 Reply
1. Wednesday 4th March 2020 10:53 GMT Stephen Wilkinson
  
  Re: TDD still a thing?
  
  Agile of course!
  
  2 0 Reply
2. Wednesday 4th March 2020 13:21 GMT baud
  
  Re: TDD still a thing?
  
  I think the test case isn't that trivial, since it's in a case where more than one domain names need to be checked (batch mode) and a domain name other than the first isn't valid.
  
  Edit: from the GitHub repo, they only have 65% code coverage, so it's not that surprising that case wasn't covered.
  
  1 0 Reply
3. Wednesday 4th March 2020 21:03 GMT Michael Wojcik
  
  Re: TDD still a thing?
  
  Yes, it could be caught by a very simple unit test. The problem, as always, is combinatorial explosion: There are a huge number of potential problems that could be caught by a similarly huge number of trivial unit tests. Someone or something¹ has to create those unit tests.
  
  Per a comment above, there was a unit test which exercised the code path in question, but its verification logic didn't check all aspects of the output, and so missed this error.
  
  ¹There's a fair bit of interesting research into automating test creation.
  
  0 0 Reply
Wednesday 4th March 2020 08:16 GMT Mike 137

The real world

Testing is a legacy concept.

Engineer "The tiny software of the Saturn launch computer took two years to test"

Manager "We don't have that long, so just get it out the door"

10 1 Reply
1. Wednesday 4th March 2020 09:56 GMT eamonn_gaffey
  
  Re: The real world
  
  The testing legacy has been replaced by the mentality of : "sling any shit live because my bonus depends on it", in an "agile" fashion, because we need to "act fast and break things(literally)" yawn, yawn, yawn..... Some things are fundamental, like a comprehensive test pack.
  
  16 2 Reply
  1. Wednesday 4th March 2020 14:27 GMT Ozan
    
    Re: The real world
    
    Remember what Agile did to MS and Windows. Al lthose years, I was afreaid to install a service pack.
    
    1 1 Reply
Wednesday 4th March 2020 09:14 GMT Anonymous Coward

Who is governing Public PKI providers ?

This is such a simple coding error, where is the human intervention to sample the certificate requests for integrity.

TLS/SSL is relied upon for assurance. This makes you wonder how other PKI providers automate requests and how they are audited.

Regulations require TLS/SSL yet there is obvious a failing in the regulation of this key architecture artefact.

0 0 Reply
1. Wednesday 4th March 2020 11:13 GMT ItsAFunnyOldGame
  
  That would be the CA/Browser forum, they agree and set the baseline requirements. However the balance of power there, at the moment, lies with the browsers.
  
  2 0 Reply
2. Wednesday 4th March 2020 21:41 GMT Michael Wojcik
  
  This makes you wonder how other PKI providers automate requests and how they are audited.
  
  Well, no, it doesn't, because there's a ton of material readily available regarding the state and history of PKIX and the public X.509 CAs. So anyone who's interested in this question can easily find an answer.
  
  Ivan Ristić's Bulletproof SSL and TLS has some good historical material, for example; and you can stay up to date with Hanno Böck's Bulletproof TLS email newsletter. For background there are the PKIX and HTTP-over-SSL RFCs (5280 and 2818 in particular). The CA/BF makes all sorts of stuff available on their website. And so on.
  
  Anyone who's paying attention knows PKIX and other aspects of the public X.509 PKI such as code signing are a mess. Arguably they're less of a mess than they were, say, a decade ago; Certificate Transparency and the increasing willingness of browser manufacturers to shut out bad CAs have helped. But it's still a sausage factory.
  
  1 0 Reply
3. Thursday 5th March 2020 12:50 GMT Anonymous Coward
  
  > sample the certificate requests for integrity
  
  It's highly likely that all of them were perfectly valid. A human review of the inputs/outputs would not have caught this. Only a review of the actual code or test suite.
  
  0 0 Reply
Wednesday 4th March 2020 09:44 GMT NerryTutkins

CertifyTheWeb

Been using CertifyTheWeb on IIS for a while for letsencrypt certs, they have a new version, if you go in and apply the update, you can then ctrl click the 'renew all' button and it will force all certs to renew, even ones that have not expired.

Just done three servers in about 10 mins. So all in all, I think while the error is a bit of a howler, all parties involved in the clean up seem to have stepped up.

2 0 Reply
Wednesday 4th March 2020 10:11 GMT Blane Bramble

Had one (of many) SSL certificates affected. Read the email, forced renewal, ran a script to put the updated certs where they are needed. Restarted a service.

OK, a bit more notice would have been nice, but it took all of 5 minutes.

2 0 Reply
Wednesday 4th March 2020 11:15 GMT Simian Surprise

FTFY

"The proximate cause of the bug was a common mistake: Go..."

5 3 Reply
Wednesday 4th March 2020 12:03 GMT MtK

Akamai CDN certificates not renewed yet

One of our domains appeared in the list. It's for a certificate issued for an Akamai CDN endpoint. Still not renewed as yet.

0 0 Reply
Wednesday 4th March 2020 13:06 GMT Richard Pennington 1

Whatever happened to code review?

I'm retired now, but in my day (and in my security-critical environment) all critical code was peer-reviewed.

2 0 Reply
1. Wednesday 4th March 2020 13:31 GMT mr-slappy
  
  Re: Whatever happened to code review?
  
  Hear hear!
  
  In my day not only was code peer-reviewed, but specifications (does anyone do those any more?) as well.
  
  I still encourage that approach whenever I am able to (sadly not that often in these days). Peer-review can identify many defects that the author will never see because they are immersed in the work.
  
  5 0 Reply
  1. Wednesday 4th March 2020 19:58 GMT Alterhase
    
    Re: Whatever happened to code review?
    
    Many years ago, I was the team lead for RAS (Reliability, Availability, Serviceability) for a new storage product at a large company.
    
    In that role, I sat in on the code reviews and brought up issues as I saw them. I was explicitly "uninvited" to the review meetings because I was slowing down the process.
    
    Guess what: when the product was shipped it had a number of issues that I had explicitly warned about!
    
    2 0 Reply
2. Wednesday 4th March 2020 13:31 GMT baud
  
  Re: Whatever happened to code review?
  
  Well, it was reviewed, you can take a look at the process there: https://github.com/letsencrypt/boulder/pull/4134
  
  1 0 Reply
3. Wednesday 4th March 2020 16:57 GMT MJB7
  
  Re: Whatever happened to code review?
  
  Code review is great - but it is very easy for errors to slip through code review.
  
  Not exactly the same, but I remember three VERY experience engineers looking at a buggy piece of firmware and agreeing it was buggy - but not the end of the world (just a DOS potential, and it wouldn't be exposed to the internet). Then it all went quiet when one of them realized it was actually a critical security vulnerability. Oops!
  
  2 0 Reply
  1. Wednesday 4th March 2020 21:41 GMT Michael Wojcik
    
    Re: Whatever happened to code review?
    
    Yes. In software as in all things you really want defense in depth. Code review, static analysis, dynamic analysis, fuzz testing, as much unit/functional/system testing as you can manage (ideally with automated and human test generation), ...
    
    But code review, done properly, improves the incentives for writing readable code. That's very important not just for security but for maintainability, and thus for reducing total cost of development. Having looked at that Let's Encrypt bug report, I'm not surprised the error was missed; the code follows the "every letter is expensive!" style popularized by generations of mediocre C developers. (The formatting is also awful, but I assume that was due to pasting it into a blog post that doesn't support proper whitespace-preserving formatting. Like some other sites I could mention...)
    
    2 0 Reply
    1. Thursday 5th March 2020 09:44 GMT baud
      
      Re: Whatever happened to code review?
      
      You can take a look at the actual formatting there:
      
      https://github.com/letsencrypt/boulder/pull/4690/commits/ba6b85ee126b50d19837a53a52cb6d894594935f
      
      0 0 Reply
Wednesday 4th March 2020 17:26 GMT Anonymous Coward

A message for Sir Andrew Parker

0nbb0SWz0Vaw1EcD0Ln41MQp03bX1gQu1GT11gwh

0Pk91lsg0c8=0=dG0Tia0mib18vL1P8P0Vv21NdB

1H4U1mWr17c81bRH149X1WIK01go1GHa1iUJ16FA

0WgQ0Jam0GOd0VBG1CkS1Rj30reB0pMx0YMd0uka

0eZh19n60R6=1cpN0PJz0GyI0V5D0Kif0dY00H9b

13xn0h4=13aH0=up1BCF0KTy0J0X1kT=1P=R1Ud9

0iqI0GcT1B5e13Fb0kjI1GMv1HiU11ft0KWj10CJ

1a8$1eBI1CEi0MGw1WxQ1kIr14AX0cD10c$t0zYP

0ESh15Gw0K3=1i9w0RhL1Mb90jd91EaX1Jof0kym

0jOx1d6x1Xmv1KCZ1Xtn0ffd15xt0BB103P50iV9

0XgE19og1azy0e6U0i$s18mP1cfA0GwY1WGK0JjV

05GV04fp16P41Z911cHr0e7a19cm0ZV40X$h1MOa

0hYk1kN70eQl1eQl0$fv192N0mOp1QaW05p50alB

0MtW1AEU1m=s1GN40ebt1JEz1YTz0Eot017e1b39

117$05E21PiB1f5Z1TkM1h3N1Kxg0VDJ01X004Yx

0Cow18u01jWO0$X51LoY1UXn0pIJ1Juc1Zg=1W=d

07NQ16Cw1fBB043T0Tca1B$d1fkY0k9=0JI=0FE5

0HJ=1JaD1If=05KD1jIe02K70BoV1khy1QwN028o

0BIr0mE60Iex1aeV0l9n0e9r1Uwh1IPR09jJ1UZ7

0Q7X1Qs40y9G0pvI0H$t0j$b0Iwy1Axw1bq019X7

1H4Z0TuC0XlM06bM1bRW1PZN1mVf0LYx1i9Q0CHV

0e8113EP0AxD1XTW1f6d0HQS0L=J1Tsa1DQ404g6

0lnP1g$=19oM0KmO17QL0Mty0KNO1A5M0lKi0WO2

05U50$dG1Io605lc1LT80PvQ1Wtu1laN0C4Y1HZ7

1Uyd1a7i01Jh1E6Y1HGN0PCM1XbP19D60S9W0foz

16pf0rfP1ETj01AI07CS00ak0qEF00fh09MK1hzI

0Mt21Prb0uTX11qG094j0v3v0fkO0nZE0XSm0HAC

16n70s9q11491DAW1fzg0rsI0BX51WnV0OwM0d5S

0iHK0cgI

0 1 Reply
Wednesday 4th March 2020 18:15 GMT Lunatic Asylym Seeker

Obviously the code was written by somebody learning on the go :-)

Well somebody had to say it ....

0 0 Reply
Thursday 5th March 2020 13:53 GMT Anonymous Coward

Here we Go again

Can we please stop creating languages with semantics where common bugs are so easy to create and hard to spot? When your language guru says that "The proximate cause of the bug was a common mistake in Go: taking a reference to a loop iterator variable," the right response is "does the language have to allow or encourage such a common mistake?". Is there a valid use case for taking a reference to a loop iterator variable and more importantly, *is that use case so necessary that it's worth the risk?*.

And yes, I'm sure somebody will be along to say "perhaps you'd rather be programming in FORTRAN or assembler?". Of course old languages have their own problems. That's not an excuse for propagating flaws in C half a century ago, either from deliberate compromises at the time, or where we simply know better now, into new languages.

0 0 Reply

POST COMMENT House rules

Not a member of The Register? Create a new account here.

Other stories you might like

Meet clickjacking's slicker cousin, 'gesture jacking,' aka 'cross window forgery'

Web devs advised to do their part to limit UI redress attacks

Security 3 Apr 2024 | 9

Microsoft slammed for lax security that led to China's cyber-raid on Exchange Online

CISA calls for 'fundamental, security-focused reforms' to happen ASAP, delaying work on other software

Security 3 Apr 2024 | 37

Rust developers at Google are twice as productive as C++ teams

Code shines up nicely in production, says Chocolate Factory's Bergstrom

Devops 31 Mar 2024 | 134

Sleuths who cracked Zodiac Killer's cipher thank the crowd

Fifty-one years of community contributions, software, and clever cryptanalysis contributed

Bootnotes 4 Apr 2024 | 37

Malicious SSH backdoor sneaks into xz, Linux world's data compression library

STOP USAGE OF FEDORA RAWHIDE, says Red Hat while Debian Unstable and others also affected

CSO 29 Mar 2024 | 123

Majority of Americans now use ad blockers

We're dreaming of a white list, because we're just like the ones you used to know

Security 27 Mar 2024 | 113

Easy-to-use make-me-root exploit lands for recent Linux kernels. Get patching

CVE-2024-1086 turns the page tables on system admins

Patches 29 Mar 2024 | 26

US government excoriates Microsoft for 'avoidable errors' but keeps paying for its products

Analysis In what other sphere does a bad supplier not feel pain for its foulups?

Cyber-crime 5 Apr 2024 | 20

Microsoft rolls out safety tools for Azure AI. Hint: More models

Defenses against prompt injection, hallucination arrive as Feds eye ML risks

AI + ML 29 Mar 2024 | 10

In-app browsers are still a privacy, security, and choice problem

Regulators reminded that longstanding concerns haven't been addressed

Applications 27 Mar 2024 | 25

Row breaks out over true severity of two DNSSEC flaws

Updated Some of us would be happy being rated 7.5 out of 10, just sayin'

CSO 26 Mar 2024 | 11

Microsoft squashes SmartScreen security bypass bug exploited in the wild

Patch Tuesday Plus: Adobe, SAP, Fortinet, VMware, Cisco issue pressing updates

Security 10 Apr 2024 | 21

The Register Biting the hand that feeds IT

About Us

Our Websites

Your Privacy

Situation Publishing

Copyright. All rights reserved © 1998–2024