What it takes to keep an enterprise 'Frankenkernel' alive

Maintaining the kernel of an enterprise distro is not only hard work, it also involves conflicting goals. A talk by Red Hat Principal Kernel Engineer Jiří Benc at this year's DevConf.cz event covered some of the inherent contradictions in keeping an enterprise distro's kernel on its feet. Or at least on somebody – or something …

  1. BinkyTheMagicPaperclip Silver badge

    Just because it's hard doesn't mean you're in the right

    I do sympathise to some extent, but you can't have your cake and eat it. Want to do as you wish? Rewrite/with permission relicense all necessary code.

    This is also still a considerable level below what Microsoft does with Windows. If you read Raymond Chen's Old New Thing blog you'll know Windows has custom installer code built in to shim 32 bit programs that still shipped with 16 bit installers, custom memory allocators to work around broken memory allocation, and many other shims to allow for incorrectly coded but popular applications.

    Also, Windows has a driver model that doesn't involve sticking everything into the kernel, and that moves functionality up to userland/ring 3 where possible. Although I'll admit I'm not up to date on how much progress Linux has made moving driver functionality outside kernel space.

    Alternatively, the OpenBSD route could be taken: security first, (almost) no compromises. If a change breaks applications, the applications have to fix it. In the end it works, but it can involve a lot of short- to medium-term pain.

    1. FIA Silver badge

      Re: Just because it's hard doesn't mean you're in the right

      > I do sympathise to some extent, but you can't have your cake and eat it. Want to do as you wish? Rewrite/with permission relicense all necessary code.

      This... Red Hat has built its business atop a GPL-2 codebase (in the case of the kernel), and it has been successful too.

      According to Wikipedia it was founded some 30 years ago and has annual revenue in the billions, yet suddenly it seems like the opinions of others who licence the code it builds on top of mean nothing because Red Hat engineers work oh so very hard or something??

      If you can grow a company to total assets over 5 billion dollars, whilst complying with your licence requirements, then it doesn't really seem that the model it's built on needs to change.

      If it does need to change then fork FreeBSD (used by many servers around the world), and start selling Red Hat Enterprise BSD. You could keep the source code completely closed, whilst still taking the work of others from upstream, as this is precisely what the BSD licence allows. (See Apple).

      But whilst you build atop the work licenced by people who would like the improvements you make to be freely distributable in source form, it's probably good form to listen to them.

    2. Liam Proven (Written by Reg staff) Silver badge

      Re: Just because it's hard doesn't mean you're in the right

      [Author here]

      > Rewrite/with permission relicense all necessary code.

      But the Hat *does*. It is all out there.

      The kernel tree is right there on GitLab.

      Want the latest bleeding-edge dev code? Fedora Rawhide. Free of charge, no licensing restrictions.

      https://docs.fedoraproject.org/en-US/releases/rawhide/

      Want semi-stabilised semi-annual releases, with everything fresh and current, but at least tested and assembled into a coherent whole? Just like Ubuntu, but you can develop your skills on the RH tools, RH commands, RH config files in RH locations in RH syntax?

      Fedora. 100% FOSS, no proprietary code, free open licence, use it for whatever.

      https://fedoraproject.org/

      Want the stable subset, branched off Fedora every couple of years or so, which allows you to contribute code, fixes, etc.? CentOS Stream.

      https://www.centos.org/centos-stream/

      There is nothing in RHEL you can't get from them.

      What you pay for is tonnes of testing and integration work, rapid fixes, and enterprise support.

      Only want to test? Only want a few boxes? Gratis to you. Free of charge. Free as in beer. Includes source code access.

      https://developers.redhat.com/products/rhel/download

      RH is not holding back code. It is not making anything proprietary. It just wants money for the oldest slowest-moving slowest-changing distro there is in the world.

      It pays for a huge amount of FOSS development and this is how it funds it.

      Nothing secret, nothing proprietary, nothing relicensed. All 100% GPL 1/2/3/AGPL compliant.

      What IMHO it really wants to do is stop Oracle making money from RHEL. That seems fair to me TBH.

      1. Doctor Syntax Silver badge

        Re: Just because it's hard doesn't mean you're in the right

        The linked stream is over 8 hours. Where does the talk come in that?

        1. Liam Proven (Written by Reg staff) Silver badge

          Re: Just because it's hard doesn't mean you're in the right

          [Author here]

          > Where does the talk come in that?

          Sorry about that!

          Somehow the time code got lost. The talk begins at 6h 07m 30s. In theory, this is a direct link:

          https://www.youtube.com/live/nS0Z1OilOas?feature=share&t=22050

          1. Doctor Syntax Silver badge

            Re: Just because it's hard doesn't mean you're in the right

            Thanks. Will take a look at that tomorrow.

  2. TrevorH

    > No API changes, and no internal ABI changes either

    This is a bit disingenuous. The so-called "Stable KABI" almost *always* breaks at a RHEL point release. And since this is Stream, where the kernel will be continually updated with new changes during the lifetime of one RHEL point release, I would expect multiple KABI changes to happen during Stream's lifetime between one RHEL point release and the next. If you run RHEL then you just get used to the "stable" KABI not being stable over a point release. If you run Stream then it could break at any time.

    1. Tomato42

      "Stable KABI" is not the whole "KABI". See https://access.redhat.com/solutions/444773

  3. Pascal Monett Silver badge

    "The goals of any kernel update are simple: [..] no changes in behaviour"

    Somebody needs to brief Borkzilla on that.

  4. Stuart Castle Silver badge

    Good article, but the headline made me think it was an "On Call" story about a Linux sysadmin who'd been forced to build their own Linux kernel, using bits from other kernels, because their employer had a strange set of requirements that couldn't be completely met with one distro.

  5. wtarreau

    Missing fixes are hard to detect, one must not use these dangerous kernels

    I'm not denying Jiri's massive amount of testing, but testing for absence of regressions doesn't mean testing for lack of known bugs. The SOLE purpose of LTS kernels is to provide fixes for all known bugs. An LTS kernel should be seen as a collection of fixes. Whenever you skip a fix from a stable branch to maintain your own, you're in fact keeping a bug that was already fixed in -stable. This is almost undetectable, unless, of course, you know how to test all bugs. But given that reporters themselves don't always know how to test them and only rely on long observation, you cannot verify that you have all the needed fixes.

    So please, distro vendors need to really stop this madness of reinventing a parallel maintenance effort that does not involve users. This huge amount of work would be so much better spent testing LTS kernels! And this particular vendor got caught in the past with severe local vulnerability (local privilege escalation IIRC) that had been fixed in mainline and stable something like two years ago but still present in their kernel as not identified as needed.

    Sure, stable and LTS kernels occasionally regress. When they do so, the offending change is immediately reverted. Nobody's asking vendors to ship the very latest patch; it's perfectly fine if they emit one release out of 5 after much longer testing. But it's really important that they closely follow the stream of fixes that go into mainline and -stable. And if they're doing the backports themselves, considering how many times a subsystem maintainer disagrees with a backport and proposes another one, that's a luxury they don't have here, and they probably keep a list of incorrect or incomplete backports without knowing. I'm still really, really irritated whenever I see non-LTS kernels on LTS distros. This definitely does not contribute to the perception of Linux's quality or security in the field.

    1. Steve Graham

      Re: Missing fixes are hard to detect, one must not use these dangerous kernels

      These were my thoughts too. Why not spend the same effort testing, and if necessary, debugging a kernel release from Torvalds?

      I've been compiling my own kernels from kernel.org for almost 20 years now and have never had one that crashed or wouldn't boot. (Except once when I forgot to include the new-fangled ATA drivers.) OK, so I'm a hobbyist, not an enterprise.

    2. Anonymous Coward

      Re: Missing fixes are hard to detect, one must not use these dangerous kernels

      <disclaimer, I work for Oracle, opinions are my own>

      > So please, distro vendors need to really stop this madness of reinventing a parallel maintenance effort that does not involve users. This huge amount of work would be so much better spent testing LTS kernels!

      Totally agree - Oracle UEK kernels are based off LTS kernels and regularly pull in all LTS fixes, which is of tremendous help in keeping them secure, even if occasional vulnerabilities/bugs are fixed before the fixes land in LTS, when their impact is high and Oracle was notified early.

      > And this particular vendor got caught in the past with severe local vulnerability (local privilege escalation IIRC) that had been fixed in mainline and stable something like two years ago but still present in their kernel as not identified as needed.

      The latest example to date (that I know of) is CVE-2022-42703, which took months to get fixed in RHEL even after GPZ wrote an article on how to exploit it: https://googleprojectzero.blogspot.com/2022/12/exploiting-CVE-2022-42703-bringing-back-the-stack-attack.html
