The first point release for Linux 5.10 came out barely a day later because storage bugs broke RAID5* partitions

Hopefully not shooting for parity with Windows, the Linux team followed the weekend release of version 5.10 of the kernel with... an update barely a day later. "Nothing makes me go 'we need another week'," said Linux supremo Linus Torvalds of Sunday's emission. However, perhaps a few more days might have been handy as kernel …

  1. karlkarl Silver badge

    "We're not taking cues from Windows now, are we?"

    If the Linux ecosystem was such that the kernel was packaged and forcibly shoved onto our machines, then yes, that would be the case.

    Luckily individual downstream distros tend to do their own testing and packaging which usually catches this kind of regression. Yes, bugs happen but the key is to keep the user in control which is something that companies are starting to fail to provide.

    That said, I am fairly impressed this doesn't happen more often. Linux is not only inevitably growing more complex, it is also being pushed around by a number of competing entities (both ethical and otherwise) trying to integrate their own agendas. I am very surprised we don't hit more technical conflicts, and thus regressions. Especially since, in my experience, code coming from commercial vendors is typically shoddy and bodged compared to that from open-source enthusiasts, who show some passion and pride in their trade.

    1. Anonymous Coward
      Facepalm

      Speed

      M$ sins are much greater than anything Linux has done.

      But coding to a deadline is never a good idea.

      "Nothing makes me go 'we need another week'." This makes no sense whether it's said by Satya Nadella or Linus Torvalds.

      1. jake Silver badge

        Re: Speed

        "This makes no sense whether it's said by Satya Nadella or Linus Torvalds"

        You have to release it sometime. If there are no obvious show-stoppers, then there is nothing that says "we need to go another week". When the inevitable bugs show up, you patch them. Withholding a release until all the bugs are guaranteed to be gone would mean the kernel would still be back in the 0.x range ...

        Besides, history has shown that Linus knows when to pull the trigger on releases. He'll hold it another week if his lizard hind-brain says "not quite yet ..." Redmond? Maybe not so much. I certainly know who I trust.

    2. jake Silver badge

      "the key is to keep the user in control which is something that companies are starting to fail to provide."

      Starting? Where have you been these last thirty years or so?

  2. John Robson Silver badge

    Bug found on point release.

    No one should have this on any non-dev machine anyway, so the comparison with the Micros~1 bug, which was pushed out to production machines, is somewhat contrived.

    The open nature also allowed the specific patches to be identified and rolled back trivially, and from what little I've read it was a mounting error, not a data loss error.

  3. Anonymous Coward
    Anonymous Coward

    RAID 6 ...

    ... is crap anyway. Use RAIDZ2

    Ok I'm trolling a bit, but not entirely :-D

    1. Colin Bull 1
      Mushroom

      Re: RAID 6 ... nor the Fs

      I have seen RAID 5 fall over catastrophically so many times...

      This should be compulsory reading

      http://www.baarf.dk/BAARF/BAARF2.html

      Needs bringing up to date to include the latest marketing wheezes.

      1. J. Cook Silver badge

        Re: RAID 6 ... nor the Fs

        I've seen RAID6 die catastrophically as well (single drive failure, no hot spare, second drive dies due to the increased load on the array as it rebuilds from the first failure). Sadly, this was the disk pool for the backup server. We lost three days' worth of incrementals, IIRC. We were able to rebuild the array and move on from it without too much headache, but we migrated to new hardware shortly thereafter.

        Then there was the numpty that made a raid 0 single disk 'array' for a quorum drive on a two-node SQL cluster; thankfully, we managed to virtualize the remaining node on it precisely seven days before that drive failed, which took the cluster down for good. (it was being retired anyway, but still....)

        1. DevOpsTimothyC

          Re: RAID 6 ... nor the Fs

          > I've seen RAID6 die catastrophically as well. (single drive failure, no hot spare, second drive dies due to increased load on the array having to re-build from the first failure.)

          So what was the problem? You've described two disks dying at the same time in a RAID array designed to handle two disks dying at the same time. "Single drive failure, no hot spare" sounds like a RAID5+1 (without the +1) rather than RAID6.

          A RAID6 array can most easily be described as a RAID5+1 in which the +1 is an active part of the array, rather than a warm standby that puts the array under load at exactly the moment it's best to remove any load. From memory, the smallest RAID6 array you can create has 4 disks (2 data + 2 parity). Yes, you can create an array in a degraded state if you REALLY want to, but then you're just asking for problems.

  4. Anonymous Coward
    Anonymous Coward

    BTRFS has a bright future behind it

    I used to be very excited when BTRFS came on the scene, but it was seemingly perpetually just a year or two from being ready for production. Since then it has suffered some serious data-loss bugs, and it's not hard to find anecdotes from people with unrecoverable data after a power loss and the like. Thirteen years on and I still wouldn't trust my data to it.

    1. DS999 Silver badge

      Re: BTRFS has a bright future behind it

      It will be stable in time to be used in the datacenter at the first operational commercial fusion power plant.

  5. Anonymous Coward
    Anonymous Coward

    And the fix was done

    > Fortunately, developers keen to have a crack at the final release spotted the problems and the fix was done.

    Well, not really. A revert was done. If they still want to change 'chunk_sectors' from int to unsigned then they'll need to go through the code to find out where it matters and make changes there.

  6. Anonymous Coward
    Anonymous Coward

    Testing...

    Makes me wonder how regression testing is done for Linux. Is there a huge warehouse filled with various obscure hardware where they try out changes?

    1. jake Silver badge

      Re: Testing...

      "Is there a huge warehouse filled with various obscure hardware where they try out changes?"

      Yes. It's called the LKML.

      1. Anonymous Coward
        Anonymous Coward

        Re: Testing...

        Then I can see why they might have a problem with effective regression testing.

  7. Anonymous Coward Silver badge
    Boffin

    > "The latest has cut this down to barely a day. One can but hope the time will not be measured in hours next time around"

    Umm, no, I'd prefer any discovered bugs to be fixed ASAP so that the 'latest' release has those bugs for less time.

    Of course not having bugs would be the ideal, but that's not really possible. Certainly not in 2020.

  8. fishman

    Compiling my own...

    I compile my own Linux kernel - but wait for at least 4 or 5 point releases to have come out.

    I've been compiling my own Linux kernels for 25 years and I've never had a problem.

    1. Anonymous Coward
      Anonymous Coward

      Re: Compiling my own...

      Gosh, you're so clever.
