Data-destroying defect found after OpenZFS 2.2.0 release

A data-destroying bug has been discovered following the release of OpenZFS 2.2.0 as found in FreeBSD 14 among other OSes. This file-trashing flaw is believed to be present in multiple versions of OpenZFS, not just version 2.2.0. It was initially thought that a feature new to that release, a feature called block cloning, …

  1. phuzz Silver badge

    The older I get, the less happy I feel about using a distro (or any software really) that is 'bleeding edge'. The last release is fine for me thanks, I'll let the rest of you check for bugs in the most recent version ;)

    1. mattaw2001
      Holmes

      I'm sticking to mint for that reason, and Debian on my servers, however had to go to the latest kernel and mesa for my laptop to work well.

    2. zuckzuckgo Silver badge
      Trollface

      But now the definition of 'bleeding edge' is any change made since 2013.

      1. Roland6 Silver badge

        Better locate those OS/390 tapes and the hardware to run them on…

    3. Anonymous Coward
      Anonymous Coward

      The only problem with that is that this issue may have existed for some time but occurs infrequently enough to have remained largely unnoticed.

  2. DaemonProcess

    ZFS here we go again

    "ZFS is fast, more fully featured and totally safe"

    " Oh, there's another teeny-weenie buggette that may corrupt some data but most likely not"

    I'm still steering clear of this. From the description it sounds like multi-user / multi-process testing may be a bit lacking at the moment. Could all fanbois please volunteer.

    1. Paul Crawford Silver badge

      Re: ZFS here we go again

      And your choice of totally safe file system is?

      1. DoContra

        Re: ZFS here we go again

        Can't speak for OP, but in me own (worthless) opinion:

        - There is no such thing as total safety in anything (other than the monotonic increase of entropy)

        - Once I accept non-total safety, I'll prefer to partake of the in-tree FSes unless I know I need/want the features on offer

        So far, ZFS hasn't given me any must-have features[0] to patronize it :) (checksumming[1] is near-universally good, but not worth it for me)

        On a more serious note:

        - btrfs (w/zstd compression) plus off-device backups on personal devices/daily drivers with SSDs

        - ext4/xfs on HDDs or when I (feel I) need more reliability and can do without subvolumes/snapshots/other btrfs goodies

        XFS likely has the better code quality for Linux-supported FSes (they literally made/maintain the testsuite), but e2fsck is pure magic.

        [0]: In fairness, I never had a massive JBOD NAS-type device

        [1]: Had a bad experience with checksumming on btrfs: After a period of general machine hangups/unclean shutdowns, I had a couple of corrupt files. btrfs very helpfully flagged this file corruption by very unhelpfully throwing errors on read() and printing the offending inode to dmesg.

        1. Johannesburgel12

          Re: ZFS here we go again

          Any filesystem without full checksums (metadata+data) is out. Sorry, that's technology from the last millennium. There has been some work on checksums in ext4 and XFS, but it's only half-way supported.

          That only leaves BTRFS from your list, which not only still has lots of issues with corruption in every mode (not just RAID5/6), but also still handles a surprisingly long list of everyday situations, like a full filesystem, very poorly. I can't even count the number of times I had to temporarily add a second device to a file system just to get out of ENOSPC.

          ZFS on the other hand only has one single issue: it doesn't support removable devices very well.

          It's a shame Linux doesn't have a single, decent in-tree file system.

          1. Paul Crawford Silver badge

            Re: ZFS here we go again

            "ZFS on the other hand only has one single issue: it doesn't support removable devices very well."

            To be fair for a FS designed to include RAID as part of its mass storage operation then "removable device" is not a typical use-case.

            "It's a shame Linux doesn't have a single, decent in-tree file system."

            Sadly that is true of most OSes (other than the dead-man-walking Solaris, and FreeBSD).

            1. Happy_Jack

              Re: ZFS here we go again

              Are hot-swappable disks not removable devices?

              1. Paul Crawford Silver badge

                Re: ZFS here we go again

                I would not simply pull a hot-swap disk out of RAID and hope for the best. Normally you would identify the failing/faulted disk and remove it from the RAID set first (if not already automatically managed by some NAS appliance with flashing LED to show you it is ready to be pulled, etc).

                The main point of "hot swap" is you don't need to reboot the controller to see the new disk.

        2. John Brown (no body) Silver badge

          Re: ZFS here we go again

          "I'll prefer to partake on the in-tree FSes"

      Using FreeBSD, ZFS is "in-tree", i.e. it's part of the kernel. Linux doesn't have it there due to licensing, but I think that may be changing in some distros now.

          1. Sudosu Bronze badge

            Re: ZFS here we go again

            I only use Solaris derivatives for ZFS nowadays, mainly OmniOS, though it can be hard to find compatible consumer-grade hardware in the HCL.

          2. Graham Perrin

            ZFS in FreeBSD

            Strictly speaking: maybe truer to say that ZFS is in FreeBSD base (not entirely part of the kernel).

            1. John Brown (no body) Silver badge

              Re: ZFS in FreeBSD

              Correct, but I was speaking to Linux people, so put it in words they would understand :-)

              (That's my excuse, and I'm sticking to it)

      2. DS999 Silver badge

        Re: ZFS here we go again

        Maybe not "totally safe", but ext4 is so widely used that it is, at least for my purposes, perfectly safe, so long as you aren't following the bleeding edge via your distro.

        1. Sudosu Bronze badge

          Re: ZFS here we go again

          ZFS is more of a full-stack RAID+file system toolset, whereas EXT4 is a file system that can be applied on top of a RAID or similar configuration.

          I still use EXT4 on my Linux OS drives.

          ZFS also has some facilities for preventing bit rot.

        2. Paul Crawford Silver badge

          Re: ZFS here we go again

          ext4 might be "dependable" in terms of mature code and no bleeding-edge features, but it lacks a lot of the data integrity checks that ZFS has. AFAIK it only checksums the journal by default, and it lacks some of the atomicity guarantees on file replacement (other than a few hacks to detect a move-rename and flush the recently updated file, etc., to keep it in line with the ext3 behaviour).

          1. DS999 Silver badge

            Re: ZFS here we go again

            Data integrity checking didn't help them find a "data destroying defect" during whatever testing they do (which obviously isn't much), so it failed in the place you need it most - because hardware errors that cause bit rot are far, far less likely to happen than software bugs that mangle your data!

            Before you object "well once the data is mangled the checksums will be on the mangled data": if they are doing it that way, they are doing it wrong. Too much software does it like that because it is more efficient to checksum at the output, but the goal is not efficiency; the goal is minimizing the possibility of undetected error. You need to separate the process of checksumming from the process of writing the data to the drives, specifically to catch this case. That's how it is done in the networking world: you checksum directly on the received data, then checksum separately on the data as it is sent. It is done that way specifically to detect any data that is corrupted as it is being moved about during the switching/routing process.

            1. Nanashi

              Re: ZFS here we go again

              The bug here is with hole detection, for sparse file support. Userland software asks "is there a hole in this file?" and in some circumstances ZFS has a small window in which it incorrectly claims there's a hole when there isn't. That's the entire extent of the bug; the data is still correct and uncorrupted in memory and is written to disk correctly, and if you issued a read for the data then you'd get the correct data back. But if you were checking for holes in the file so you could skip reading those parts, you're not going to read those parts.

              I'd argue that describing this bug as a "data-destroying defect" is clickbait, since it doesn't do that, it just gives you the wrong idea about whether there's any data in a file, and even then the wrong answer is temporary (although I do not mean to claim that this bug causes no problems or that data loss isn't one of the possible eventual outcomes of the incorrect hole detection).

              This isn't the sort of problem that checksums in the filesystem can catch. It would be difficult to use checksums to do any validation on a read that doesn't happen.
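              For context, the hole-probing interface in question is `lseek(2)` with `SEEK_HOLE`/`SEEK_DATA`, which Python exposes directly. A minimal sketch of the userland side (on a filesystem without hole support, `SEEK_HOLE` simply reports end-of-file); the ZFS bug was that, in a small window, this call could report a hole where data actually was, so copy tools that skip "holes" skipped real data:

              ```python
              import os, tempfile

              # Make a sparse file: 4 KiB of data followed by an 8 KiB hole.
              fd, path = tempfile.mkstemp()
              try:
                  os.write(fd, b"A" * 4096)
                  os.ftruncate(fd, 4096 * 3)      # extend without writing: a hole

                  # SEEK_HOLE reports the offset of the first hole at or after 0.
                  # A filesystem with no hole support reports end-of-file instead,
                  # so the result always lands somewhere in [4096, 12288].
                  hole = os.lseek(fd, 0, os.SEEK_HOLE)
                  print(f"first hole reported at offset {hole}")
              finally:
                  os.close(fd)
                  os.remove(path)
              ```

              The contract is that data must never be reported as a hole (the reverse - reporting a hole as data - is merely inefficient); the bug was a transient violation of exactly that guarantee.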

      3. iTheHuman

        Re: ZFS here we go again

        Maybe UFS, or ext4. I'll leave file corruption checking to the media and other tooling, same with replication.

        1. Benegesserict Cumbersomberbatch Silver badge

          Re: ZFS here we go again

          When you can tell me how the hardware or other tooling (not built into the fs code) will tell you if your data has been corrupted, we can continue the conversation.

      4. phuzz Silver badge

        Re: ZFS here we go again

        "And your choice of totally safe file system is?"

        Backups. Ideally multiple backups on different media/filesystems/OSs.

        Oh, and check your restores work too.

    2. Graham Perrin

      Re: ZFS here we go again

      How does "here we go again" apply to something that's extraordinary/unique?

      Maybe it's "here you go again".

  3. druck Silver badge

    Checksums

    I'm surprised that file-level checksums aren't automatically enabled for any experimental feature such as this, so problems can be detected immediately, eliminating the risk of undetected corruption reaching backups.

    1. Johannesburgel12

      Re: Checksums

      ZFS always checksums everything. If a scrub doesn't detect any errors, that means the data gets corrupted before the checksum is calculated.

    2. Paul Crawford Silver badge

      Re: Checksums

      The insidious problem here is the corrupted data from the bug is then check-summed, written elsewhere, and then appears good on disk.

      You can get the same issue with ZFS and the like if you don't have ECC memory - the block in memory gets corrupted and nobody knows; it then ends up on disk with a "good checksum" afterwards, and the corruption has been whitewashed.

      Bottom line is it is incredibly hard to write a safe file system, even simple designs have had serious bugs in them.

      1. Graham Perrin

        Integrity of original data

        "… corrupted data from the bug is then check-summed, written elsewhere, and then appears good on disk. …"

        If I understand correctly, original data is unaffected. I mean, the bug may bite when data is written elsewhere i.e. copied; not before (not with the original).

        The first four words of openzfs/zfs issue15526:

        some copied files are corrupted

        1. Bronek Kozicki

          Re: Integrity of original data

          Yes, exactly - the original data is stored fine. There is only a tiny window while it is still being written (to memory, not to disk - it is not IO bound) during which, if read, it will appear to be all zeroes. IF anyone reads the data at this *very* specific moment. And that's a big IF, which is why this bug was actually in ZFS for a very long time. Some have started looking back at Solaris at this point - it might have been there "forever", or for as long as ZFS has supported hole reporting.

  4. bazza Silver badge

    Oooo Nastie

    This kind of bug can be a nightmare to track down. It warrants a thorough code review, as that might just be the quickest way.

    Also, ftrace was designed to help with this kind of thing; I wonder if anyone has interrogated that yet?

  5. ReaperX7

    The specific use-cases where ZFS zpools are being corrupted are not even commonplace in normal usage scenarios, if you actually READ the article.

    Under traditional usage, ZFS zpools will be stable and secure, and will not suffer corruption.

    This article is not meant for general usage scenarios in any way. So please do NOT spread FUD about ZFS or any software to entice a panic that isn't needed.
