OpenZFS 2.3 is here, with RAID expansion and faster dedup

The latest version of OpenZFS offers RAID expansion, plus faster data deduplication donated by iXsystems. The code will be available very soon in the beta of TrueNAS SCALE 25.04. OpenZFS release 2.3.0 is out, and will be in Linux distros that include ZFS, such as Ubuntu, Proxmox, NixOS and Void Linux – and eventually in …

  1. firstnamebunchofnumbers

    Give ZFS another try

    I have used ZFS for home storage briefly over the years, always found it slightly restrictive, and ended up going back to mdraid+LVM. As a basic filesystem I was very pleased with it, but the physical volume management felt clunky. This RAIDZ expansion sounds like it's worth taking ZFS for a spin again, though.

    My main confusion point with ZFS was that I never figured out a sequence (perhaps I just didn't RTFM enough) to move ZFS datasets between pools of physical disks while keeping the mountpoint available and online. Every process of migration seems to need a manual copy of data (or snapshot) and a pool destroy/re-create, which necessitates a short downtime to flip the replacement dataset to the desired mountpoint. It's even more complicated if you are limited on chassis bays/ports/drives, which might require an intermediate copy to a temporary larger-but-lower-redundancy disk pool. With LVM the killer feature is pvmove (I have done SAN migrations with pvmove between LUNs or arrays, and keeping everything mounted and running is superb). Is there an equivalent method for online backend disk/volume migration in ZFS that I have perhaps missed?
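
    For reference, the pvmove flow I mean is roughly this (device and VG names are just examples):

    pvcreate /dev/sdb                  # prepare the new disk/LUN
    vgextend vg_data /dev/sdb          # add it to the volume group
    pvmove /dev/sda /dev/sdb           # migrate extents; filesystems stay mounted and in use
    vgreduce vg_data /dev/sda          # retire the old disk once it's empty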

    1. Liam Proven (Written by Reg staff) Silver badge

      Re: Give ZFS another try

      > move ZFS datasets between pools of physical disks while keeping the mountpoint available and online

      Wow. TBH I have never even contemplated this might be possible, let alone tried to do it.

      1. Freddellmeister

        Re: Give ZFS another try

        Most LVMs can do it.

        Spectrum Scale can do it.

        Very handy for moving underlying storage while the application is up and running.

        I remember being so frustrated when Sun announced ZFS, I felt they already had a much better FS in the form of SAM-QFS.

        ZFS might be great for NAS boxes, but at the time we used Sun machines for active/active enterprise applications stored on SAN.

        1. Anonymous Coward
          Anonymous Coward

          Re: Give ZFS another try

          SAM-QFS came from the acquisition of StorageTek, so the Solaris team's Not Invented Here arrogance would never have allowed it to be included as a standard part of the OS.

      2. firstnamebunchofnumbers

        Re: Give ZFS another try

        Well I am under the impression that online migration of datasets to different vdevs (if that is the term) still isn't possible with ZFS while keeping the data mounted at the same mountpoint. Hence me continually sticking with LVM in almost all scenarios.

        pvmove with LVM is brilliant. About 15-20 years ago I was using it to migrate many PBs of SAN storage to new arrays while applications and LUN consumers were none the wiser (taking care to stay well within SAN IO limits of course).

    2. eldakka

      Re: Give ZFS another try

      > and a pool destroy/re-create, which necessitates a short downtime to flip the replacement dataset to the desired mountpoint.

      why do you need to do a destroy/re-create?

      By default the mountpoint for a filesystem is inherited from the parent pool/filesystem (e.g. poolname tank, filesystems will by default inherit that and be mounted at /tank/<filesystem>/ ...), but you can always manually set (and change) a mountpoint for a filesystem.

      So if you send from tank/mydata, mounted at the default /tank/mydata, to new_tank (so by default mounted at /new_tank/mydata), you can change the mountpoint of new_tank/mydata to /tank/mydata:

      zfs set mountpoint=/tank/mydata new_tank/mydata

      The mountpoint doesn't need to be the same as the pool or filesystem name:

      zfs set mountpoint=/data/fred/bob new_tank/mydata

      Of course, you'll need to either take the old filesystem offline first,

      zfs unmount tank/mydata

      Or unmount the new filesystem, specify the new mountpoint, then unmount the old and mount the new:

      zfs unmount new_tank/mydata

      zfs set mountpoint=/tank/mydata new_tank/mydata

      zfs unmount tank/mydata

      zfs mount new_tank/mydata #which will now be mounted at /tank/mydata

      Note, however, that if you have already manually set the mountpoint on a filesystem, doing a zfs send/receive of all properties will also apply that mountpoint to the new copy (the target of the receive). In that case, if it's on the same host, you probably want to use 'zfs receive -u' to prevent the destination filesystem from being mounted prematurely while the source filesystem is still mounted.
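
      Putting that together, a migration along these lines might look something like this (using the pool/dataset names from above):

      zfs snapshot tank/mydata@migrate
      zfs send tank/mydata@migrate | zfs receive -u new_tank/mydata   # -u: don't auto-mount the copy
      zfs unmount tank/mydata
      zfs set mountpoint=/tank/mydata new_tank/mydata
      zfs mount new_tank/mydata        # now serving the data from the new pool at the old path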

      1. firstnamebunchofnumbers

        Re: Give ZFS another try

        > Of course, you'll need to either take the old filesystem offline first,

        ...

        > Or unmount the new filesystem, specify the new mountpoint, then unmount the old and mount the new:

        My original question was whether I could keep the mountpoint online and data in-use throughout, identical to using pvmove with LVM. The ZFS migration process described here doesn't seem to achieve that.

        So just to follow up: with ZFS is there no way I can move a bunch of data (a dataset) from one set of physical disks (a vdev?) to a different set of physical disks, while keeping the data mounted and fully in use by applications at all times?

        1. eldakka

          Re: Give ZFS another try

          > My original question was whether I could keep the mountpoint online and data in-use throughout,

          Your original post included other statements besides the specific question you asked, which seem to imply doing additional things that may be unnecessary:

          "Every process of migration seems to need a manual copy of data (or snapshot) and a pool destroy/re-create,"

          So I tried to be helpful to point out that some of those steps you listed as doing may not be necessary.

          You can also rename a pool if you want. If you've created a new pool and still want to use the default mount points, for consistency with the pool/dataset name, you can rename the new pool to the old one's name once you've exported (or destroyed) the old one.
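
          i.e. something like this, once the old pool has been exported or destroyed:

          zpool export new_tank
          zpool import new_tank tank   # re-import it under the old pool's name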

          But as for doing the whole thing live, without even a minor hiccup for re-mounting: I don't know of a way to do that, though I am far from an expert in ZFS.

          A better place to ask such questions would be somewhere like the Level1Techs forums, where many 'enterprisey'-type people (e.g. Wendell from Level1Techs) contribute.

  2. K555

    "HP Microservers running TrueNAS"

    The old Gen 7? There was a point where they were going for £190 new (with a 250GB drive and 2GB RAM) with a £100 cashback offer. Lost track of how many of those became very well-priced NAS boxes for customers. You can still pretty much trade them on eBay for as much as they cost new.

    I'm running a pair with an add-in LSI SAS card and a 4x 2.5" HDD adapter in the 5.25" optical bay. 8GB of RAM is becoming a little limiting now, and if I use ZSTD compression rather than LZ4 it can bottleneck on the CPU, but they still perform 'well enough'.

    What I can tell you is that the PSUs really like to go pop and make burning smells! I'm down to my last couple of spares now and I think that'll be what finally kicks me off the platform. I'll take that as a sign I have to suck it up and try the Linux-based SCALE over the FreeBSD-based Core TrueNAS offering. Then grumble about how they used to be fine with 2GB of RAM doing exactly the same job, before shouting 'get off my lawn!' at some kids.

    1. Liam Proven (Written by Reg staff) Silver badge

      Re: "HP Microservers running TrueNAS"

      > The old Gen 7?

      In daily use with TrueNAS Core, one N54L and one Gen 8.

      There's an older N40L with a trifling 0.9TB array made of 4 old 300GB drives that only has 6GB RAM, and its fan controller is shot so it's noisy. It currently runs OMV but I'm thinking of trying Proxmox on it as an experiment. Not much room for VMs in so little RAM, but it wouldn't be for production use.

  3. brainwrong

    Still no

    The lack of raid expansion is the one thing that keeps me away from ZFS. However, what they've added here is half-arsed and inflexible, and doesn't meet my needs. I'm still waiting for Bcachefs with erasure coding, which would be perfect for my needs, but who knows when that will be ready*. Btrfs promised to do what I want, but has been a big let-down thanks to being badly designed (raid56 breaks copy-on-write, and that looks unfixable without adding journaling to btrfs). It seems the only way to get flexible erasure coding with self-healing bitrot detection is mdadm on top of dm-integrity, which is dog-shit slow and comes with a big write-amplification penalty on SSDs. I may just as well have several independent mdadm arrays with a single-disk btrfs filesystem on top, and manually fix any read errors from another good copy.

    * My understanding is that Bcachefs existed and worked before the current effort to integrate it into the Linux kernel, so I don't understand why it seems to be being re-written from scratch.
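
    For reference, the mdadm-on-dm-integrity stack mentioned above looks roughly like this (device names are examples; integritysetup is the dm-integrity tool shipped with cryptsetup):

    # per drive: add a dm-integrity layer so silent corruption becomes a read error
    integritysetup format /dev/sda
    integritysetup open /dev/sda int-sda

    # then build the array over the integrity devices, so md can repair from parity
    mdadm --create /dev/md0 --level=6 --raid-devices=4 /dev/mapper/int-sd[a-d]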

    1. Liam Proven (Written by Reg staff) Silver badge

      Re: Still no

      > I don't understand why it seems to be being re-written from scratch

      I don't think it is at all.

      I think being in the kernel means it's now exposed to millions more people than were prepared to build custom kernels, and that is also subjecting it to far more bug scrutiny.

      TBH I think there is some truth in the argument that it _wasn't_ truly ready for it, but OTOH, without the pressure of being in the main tree, maybe it never would have been -- and now it's having to mature fast and is suffering growing pains.

      There _is_ a long-term plan to rewrite it in Rust, and that will be interesting if it ever happens.

    2. Androgynous Cupboard Silver badge

      Re: Still no

      I switched to ZFS years ago, so it's been a while since I used MD - but I recall you could add disks, and also upsize an array if you replaced (eg) all your 1TB with 2TB. I don't recall being able to change things like RAID parity, but it's a fairly niche requirement.

      ZFS can certainly upsize arrays, and now that it can add a new disk to a vdev I'm curious what you need to do that MD can manage and ZFS can't? I've been pretty satisfied with it overall; the main issues I've had are poor support from GRUB, and the fact it preallocates memory in an odd way, which means it plays badly with other processes that preallocate (like Java). Also a niche requirement, to be fair.
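
      For what it's worth, my understanding is that the upsize route is autoexpand plus replacing each member in turn, and the new expansion is an attach onto the raidz vdev - roughly (pool/vdev/device names are examples):

      zpool set autoexpand=on tank
      zpool replace tank sda sdf       # repeat per member, letting each resilver finish
      zpool attach tank raidz1-0 sdg   # OpenZFS 2.3: widen the raidz vdev with one more disk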

  4. An_Old_Dog Silver badge

    Kitchen-Sink and Reliability

    The new storage-related features seem great.

    Running apps, containers, and VMs on your storage boxes? Why?

    That radical an expansion of the codebase screams "bug-breeding grounds!" Isn't reliability the #1 priority of a storage subsystem?

    1. Liam Proven (Written by Reg staff) Silver badge

      Re: Kitchen-Sink and Reliability

      > Running apps, containers, and VMs on your storage boxes? Why?

      Because the big k8s users will pay $LOTS for it, and that is driving sales.

      iXsystems said FreeBSD wasn't developing fast enough. I do not know about the big-systems side of things, but it certainly is true that FreeBSD lags behind in areas like Wi-Fi chipsets and graphics card support.

      Perhaps this is also true in other cutting-edge hardware areas, like very high-speed networking and storage drivers. That's more in its core territory, but it is not beyond the bounds of the believable. Linux certainly is stronger there.

      1. Androgynous Cow Herd

        Re: Kitchen-Sink and Reliability

        “ Because the big k8s users will pay $LOTS for it, and that is driving sales.”

        Mmmmmm or so the crack marketing team would have you believe.

        There are a few large ZFS implementations in the world, true enough, but those are built on the pure open source, not the commercial offerings of iX or anyone else.

        TrueNAS is not an enterprise product - it is departmental at best. These stunning feature announcements in 2025 actually underscore that: “Hey, you can now expand a RAID pool (but not increase parity! Hope you enjoy those increased rebuild times as your parity thrashes harder to keep up!)” and “Hey, our dedupe now sucks slightly less than it always has” are not super compelling.

        It’s referred to as “Ghetto NAS” in the storage world for a reason.

    2. Len

      Re: Kitchen-Sink and Reliability

      Running apps, containers, and VMs on your storage boxes? Why?

      Those features (and therefore that code) are not part of the OpenZFS code. They're part of the appliances that some companies make and that are built around OpenZFS. OpenZFS is still only a storage subsystem.

      1. An_Old_Dog Silver badge

        Re: Kitchen-Sink and Reliability

        1. I understand that the additional apps/containers/VMs code just came along with (or was apt-get install AppName'd into) whichever Linux distro SCALE 25.04 is now built on, but that's still additional complexity which, though not directly connected to OpenZFS, can affect OpenZFS if a bug in those additional features slows, cripples, or locks up the underlying OS.

        2. Why would Kubernetes users want to run apps/containers/VMs on their storage boxes?

        1. Androgynous Cow Herd

          Re: Kitchen-Sink and Reliability

          Think storage-adjacent services: tree crawlers/scanners, Grafana, snapshot or replication managers or engines.

          Containerizing those sorts of things isn’t a bad architecture at all. Many modern storage vendors do exactly that for their functionality, or even for the protocol stack itself…Ganesha in a container for NFS etc, FTPd or whatever in another.

          But that ability is completely outside ZFS. Of the four (definitely not ZFS) NAS storage platforms at my shop, at least two do exactly this for their various advanced functions, and other systems that have been tested here do this as well. Containerizing the various parts means an OOMing instance of Samba or your whiz-bang GUI doesn’t take down the whole house of cards, for example.

  5. Anonymous Coward
    Anonymous Coward

    Interesting... I've always avoided dedup, as the standard advice is don't bother because it's a slow memory hog, but it might be worth a look now.
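
    If I do give it a go, as far as I know it's still just a per-dataset property, e.g.:

    zfs set dedup=on tank/backups    # enable dedup only where the data is actually repetitive
    zpool list tank                  # the DEDUP column shows the ratio you're getting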

    raidz expansion isn't a thing I've ever needed in practice (generally I just swap in larger drives over time instead... esp. with NVMe storage being limited by lanes), but it should boost takeup.

  6. Throatwarbler Mangrove Silver badge
    Facepalm

    I don't understand the popularity of ZFS

    ZFS seems beloved by the socks-and-sandals brigade, but, having worked with it fairly extensively in recent years, I don't get it. As we see with this announcement, it's been less flexible than other RAID solutions, it's less dependable and more brittle than other filesystems, and it doesn't even perform particularly well. Apart from engendering the same sort of satisfaction that accompanies compiling one's own Linux kernel, what's the point?

    1. Anonymous Coward
      Anonymous Coward

      Re: I don't understand the popularity of ZFS

      I was looking for OS/filesystem options for a home NAS and the main thing that put me off was almost every forum thread where a newbie was asking questions quickly degenerated into "if you're not using ECC/a UPS/enterprise grade hardware/etc then you clearly don't care about your data and you might as well delete it and die!".

      At least now that it offers some form of RAID migration, unsuspecting people wanting to expand their storage slightly won't be told "just buy enough disks for a whole new NAS, create a new pool, and copy the data across."

      "Oh, and make sure they're enterprise drives or you literally hate your data etc.".

    2. Anonymous Coward
      Anonymous Coward

      Re: I don't understand the popularity of ZFS

      ...it's less dependable and more brittle than other filesystems...

      Excuse me? Can you name me one file system that is less brittle than ZFS? You can accuse ZFS of many things - being slow, being a ten-tonne truck when you need a little runaround, being too enterprise for home users - but you can't accuse it of being brittle. Most of the valid criticism of ZFS comes down to the fact that preventing data loss is prioritised over things such as speed, ease of use, or flexibility.

      1. Androgynous Cow Herd

        Re: I don't understand the popularity of ZFS

        Oh, someone’s baby just was identified as ugly.

  7. Luiz Abdala
    Windows

    Steam?

    Could I run a NAS with this thing to run a personal Steam repository? Can it understand usability patterns and move more accessed stuff to faster media? I mean installed games on the drive, not a Netflix-like download cache.

    Another question: Steam Deck with Linux?

    Coming from a zero knowledge of ZFS and Linux.

    1. phuzz Silver badge

      Re: Steam?

      To answer your first question: quite a few different NAS products use ZFS. The big one is TrueNAS, but you might want to look at something designed to be more user-friendly. The second part is what's known as 'storage tiering', and there aren't really any cheap solutions that do exactly what you want (now that Storage Spaces has mostly died), but you can certainly add SSDs to ZFS as a cache, which will mostly do what you want.
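
      (Adding an SSD read cache to an existing pool is a one-liner, something along the lines of:)

      zpool add tank cache nvme0n1   # use the SSD as L2ARC for the pool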

      Your second question isn't really a question, and I guess you already know that the Steam Deck already runs Linux? If you're asking whether you can use a NAS as extra game storage on the Deck, then yes you can; google 'steam deck smb share' for instructions. Obviously this would only be useful when you're at home with your NAS.

    2. K555

      Re: Steam?

      I keep Steam Libraries on mine for a couple of PCs.

      Yes, you can fit SSDs in to cache the most frequently/recently used data, although if it's not heavy random access, HDDs can usually keep up.

      I've then got a 250GB partition on an NVMe drive on the local machine used as fs-cache for the NFS mount, so the most recent stuff is held locally. Even with 1gbps networking, most titles load just like they're purely local in this configuration.
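
      For anyone wanting to copy that, it's just cachefilesd running locally plus the 'fsc' mount option - roughly (hostname and paths are examples):

      # /etc/fstab on the gaming PC - 'fsc' enables FS-Cache via cachefilesd
      nas:/tank/steam  /mnt/steam  nfs  rw,fsc,vers=4.2  0  0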

      1. Luiz Abdala
        Angel

        Re: Steam?

        Exactly what I was looking for. Nailed it. Will study the setup later...

    3. chuckufarley

      Re: Steam?

      I wouldn't try it, because it will turn into a PITA down the road. If you want a RAID, use a simple mdadm, LVM, or btrfs setup with NFS. I actually have my Steam library hosted on btrfs with LVM at home, but I am not using a Steam Deck. However, any *nix, and the Pro versions of Win10 and Win11, can mount NFS shares.

      NFS is simple to set up, even though it has little built-in security. It just works by default. It is well documented, robust, and stable.

      You will have to spend more time researching and deploying the storage system than the NFS share.
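
      A minimal sketch, with made-up paths and addresses:

      # /etc/exports on the server
      /srv/steam  192.168.1.0/24(rw,no_subtree_check)

      # then reload the export table
      exportfs -ra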

      1. Liam Proven (Written by Reg staff) Silver badge

        Re: Steam?

        > simple mdadm, LVM, or btrfs setup

        No. So very much no.

        1. "Simple" does not go with "LVM" and I'd challenge its use with Btrfs.

        2. Btrfs is as unreliable as hell. Btrfs is the reason Bcachefs says it won't eat your data. Btrfs makes the `df` command tell lies, and it doesn't have an `fsck`, and the repair command destroys filesystems. This is easily confirmed: both the Btrfs official docs and SUSE docs tell you _not_ to run it.

        3. You use "or". In other words, you seem to be saying use _either_ mdadm, _or_ LVM, _or_ Btrfs. This means you've missed the point of ZFS, which is that it does the primary functionality of all of those, and it does it in one tool.

        None of those work alone. mdadm makes block devices but then you need a filesystem on top.

        LVM manages partitions dynamically -- good -- in a horribly complicated way -- bad -- and then you need a filesystem on top.

        Btrfs can do some of what both of them do, so it overlaps, because it can't do all of it, so you need to use multiple tools anyway, and the result is more complex than any of them alone. This is bad.

        ZFS does all of that, in one place in one tool.

        It makes arrays, it manages pools, you can add and remove disks, enlarge arrays, and it also manages formatting them and mounting/unmounting them, and dynamically moving volumes around within them.

        It's also very, very solid, unlike Btrfs. Fill a Btrfs volume and kiss your data goodbye: it's dead, gone, irretrievable.

        I've not lost a byte on ZFS yet, in coming up on 5 years. Btrfs died on me at least twice a year and I never _ever_ successfully retrieved anything.

        As if this combination of provincialism, ignorance, and disinformation wasn't bad enough, then in another comment you double down on it:

        > if things go South with your storage server you can boot any Linux live CD/USB then (if needed) modprobe btrfs to have access to your data.

        Yeah, no. Because that applies to ZFS as well, and I've done it. I bunged Xubuntu on a key and manually did a file-level dedupe when I accidentally filled an array.
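
        (Getting at a pool from a live session is just a matter of installing the ZFS tools and importing the pool, something like:)

        sudo apt install zfsutils-linux   # on the live Xubuntu session
        sudo zpool import -f tank         # -f because the pool wasn't cleanly exported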

        And the good thing is, with a full array, it coped fine and didn't drop a single packet.

        No. I call BS.

        Look, you may be familiar and comfortable and happy with Btrfs and mdadm and LVM and maybe all of them. Good for you. I am not saying you are wrong or they don't work. They do. There are big caveats but they are in use by millions of machines.

        But you are using your apparent ignorance of why ZFS exists to decry it, when the responsible adult thing to do is go "hey, people say this is good, I should try it". Because that is what I did. I tried all of them. mdadm is great. I've been using it for decades. It's fine.

        LVM is a PITA and the kernel team should have picked EVMS instead.

        Btrfs is dangerously fragile and I will never trust it again. It is I am sure perfectly possible to plan for its fragility and work around it but you should not have to.

        But they OVERLAP and that is BAD.

        Using two of them means an intersection of functionality. Let's say LVM offers even-numbered tools and Btrfs odd-numbered ones. You need an array across lots of disks. Alice will build a server using tools 1, 3, and 5, while Bob uses 1, 2, and 4. Charlie uses 2, 3, 4 and 5.

        Trying to work out which and why and where is dangerous territory.

        But worse is using all three, when you don't know if the RAID is by mdadm but the encryption is LVM and the snapshots are Btrfs, or was it LVM snapshots on a Btrfs RAID...

        No.

        The point is, overlapping tools are hazardous. Which is _why_ ZFS was built. It does the logical volume management, and the array management, and the encryption, and the mountpoint management, and the monitoring, and it does it all in one place.

        The result may be less flexible but the win from having it all in one place is _significant_.

        Before you pontificate on it, you need to research this stuff and know, and it looks to me like you didn't.

        The result is like a Vim user taking the mickey out of Eclipse without ever taking the time to look and realise that while Vim is faster and they know it well, Eclipse also does a dozen other things that the Vim user had to use other tools _as well_ to achieve.

        1. chuckufarley

          Re: Steam?

          I am sorry you disagree so strongly, but I'll stand by my opinion that, for simple shares using NFS on Linux, ZFS is overly complicated and makes life harder if you ever want to switch distros or recover a system. You are right that fsck should not be used with btrfs. In fact, btrfs even comes with its own scrub command, just like ZFS. Does ZFS have a place? Yes. Is that place a home user storing game files on a NAS? I don't think so.

          As for losing your data to btrfs: I have never lost anything to it in over eight years of use. Sorry that happened to you.

        2. collinsl Silver badge

          Re: Steam?

          I've been using ZFS on Linux for the last 10-15 years (I really can't remember how long TBH, sometime around 2015) and I've also never lost a byte that wasn't my direct fault being stupid. I have made sure to use NAS grade disks and ECC RAM though, plus weekly scrubs of the data to ensure consistency.
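
          The weekly scrub is nothing fancy - just a cron entry along these lines (pool name is an example):

          # root crontab: scrub every Sunday at 03:00
          0 3 * * 0  /usr/sbin/zpool scrub tank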

          I've also used LVM in enterprise spaces in multiple companies in a very basic way, to allow VMs to grow disks provided by a virtualisation platform or a storage array - we never used any advanced features of LVM like thin provisioning. We also never lost a byte that wasn't down to human error because the technology was backed by enterprise hardware and the most advanced thing we ever did with it was a pvmove on a FibreChannel presented disk when we needed to migrate between storage arrays.

          For what we used it for, I'd argue that LVM is simple. Where it starts getting complex is when you get into the advanced topics like thin provisioning and deduplication/compression etc.

    4. Tom 38

      Re: Steam?

      Can it understand usability patterns and move more accessed stuff to faster media?

      Sure. I have a large old ~20TB raidz (it hasn't been powered up in 2-3 years...). Spinning disks are slow, so I have a bunch of memory dedicated to the ARC, and an SSD serving as L2ARC. I was more read-heavy than write-heavy, but writes don't return until they're confirmed on all devices, so you can put the ZIL on a SLOG to speed that up.

      Glossary:

      ARC - Adaptive Replacement Cache - main memory devoted to caching frequently accessed blocks

      L2ARC - Level 2 ARC - an optional cache you can configure on a block device, e.g. an SSD

      ZIL - ZFS intent log - all writes get confirmed in the ZIL first. The ZIL is usually part of your array unless...

      SLOG - Separate log - a dedicated device for hosting the ZIL
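
      Wiring those up is straightforward - something like this (device names are examples):

      zpool add tank cache nvme0n1   # SSD as L2ARC
      zpool add tank log nvme1n1     # dedicated SLOG device to host the ZIL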

  8. chuckufarley

    I am pleased to see the progress...

    ...but I will be sticking with btrfs. Not only is it far more flexible and just as reliable, if things go South with your storage server you can boot any Linux live CD/USB then (if needed) modprobe btrfs to have access to your data.
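
    The recovery path really is that short; from any live environment it's something like:

    modprobe btrfs                  # usually already loaded on modern live images
    btrfs device scan               # find all members of a multi-device filesystem
    mount -o ro /dev/sdb1 /mnt      # mounting any member device brings up the whole fs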

    1. bazza Silver badge

      Re: I am pleased to see the progress...

      I've heard bad things about some aspects of btrfs, and so have avoided using it directly. However, I also run a Synology box, and that's using btrfs and I've had zero problems with that.

      I vaguely recall that Synology have been careful to avoid some of the more tricksy aspects of btrfs. And - for a home NAS - I have to say that it's all pretty slick and easy. Swapping out drives for bigger drives is very simple (one simply needs patience).

      I think the world of file systems is extraordinary. Clearly, it's perfectly possible to put a huge amount of effort into developing one and wind up with a lemon. Good ones seem to be as rare as hens' teeth. They're ferociously complicated, and getting more complex, when the naive view is typically "it's just files, what's the big deal?"

      I can remember years ago when HP were still making a big deal about memristor, and how it'd replace all storage (volatile and non-volatile) because it was superior in all ways to both. I remember thinking then "well, that's the end of filesystems because everything will simply be a memory allocation by an OS". I think I also thought the opposite, that memory allocators would die out because finally everything truly was just a file. We didn't get memristor, of course... And then there's the hoary old question of why isn't a file system a database, and why isn't a database a file system.

      But the lines are somewhat blurred these days. It's going to be interesting to see whether we're still using all of SQL, file I/O and malloc / free in the future.

      1. K555

        Re: I am pleased to see the progress...

        I had to take a look at a Synology box recently, and I think I clocked it using mdraid to handle the disks, then btrfs on top of that.

        I assume that's because it was configured as a RAID5 which is still a no-no on btrfs.

        1. bazza Silver badge

          Re: I am pleased to see the progress...

          Yes indeed, it is md (or at least, the output of ps -ef shows a lot of familiar-looking kernel threads).

          They layer a management system on top that's pretty effective (from the point of view of the end user looking for simplicity, low skill and not much reading). If interested take a look at Synology Hybrid RAID. Obviously it's not meant to delight and entertain a Unix / Linux purist, but it's an effective consumer product. One can opt for whatever RAID system one likes and manage it oneself if one wishes.

  9. RAMChYLD Bronze badge

    No Kernel 6.13 and 6.14 support out of the box tho?

    Surely that's a bit shortsighted?
