XIV goes way of the dinosaurs as IBM nixes fourth-gen storage array

Axel Koester

Insider: Technical differences of XIV Gen3 vs. FlashSystem A9000

Let me offer some technical background about the differences between XIV Gen3 and its successor FlashSystem A9000, and the reasons why we changed things. (Disclaimer: IBMer here.)

First, both share the same development teams and major parts of their firmware. Notable differences include the A9000's shiny new GUI, designed for mobile.

The major reason there is no "XIV Gen4 with faster drives" is that you can already order faster or larger drives for a Gen3 - or even build your own flavor of XIV by deploying a "Spectrum Accelerate" service on a VMware farm, or deploy a Cloud XIV as a service. They're all identical in look and feel.

The original XIV data distribution schema was designed to squeeze maximum performance out of large nearline disks by leveraging SSD caches. For an all-flash storage system, that approach no longer makes sense.
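To make the distribution idea concrete, here is a minimal sketch of hash-based pseudo-random placement of fixed-size partitions across drives. This is illustrative only: the real XIV uses its own distribution tables and places a mirrored copy on a different module; the function name and partition granularity here are assumptions, not the product's actual scheme.

```python
import hashlib

def place_partition(volume_id: int, partition_no: int, num_drives: int) -> int:
    """Map a logical partition to a drive pseudo-randomly.

    Illustrative stand-in for XIV-style distribution: a stable hash of
    (volume, partition) spreads load near-uniformly over all drives,
    so every volume touches every spindle.
    """
    key = f"{volume_id}:{partition_no}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_drives
```

Because the mapping is deterministic, reads always find their partition, and because the hash is uniform, hot volumes are automatically striped across the whole system.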

Plus, we noticed two roadblocks in x86 cluster storage design. First, standard x86 nodes packed with dense SSDs are not up to the task of driving all that capacity (plus distributed RAID) at desirable latencies; NVMe fabrics might relieve some of these bottlenecks in the future.

Second, even the best SSDs wear out after heavy use, and we want to avoid dealing with too many component failures at once. We also preferred a design without opaque third-party SSD firmware mimicking disk drives, which would have imposed serious limitations on lifetime management, garbage collection control, health binning control, etc.

The A9000 therefore leverages FlashSystem's Variable Stripe RAID, which is implemented in low-latency gate logic hardware. Think of "Variable Stripe" as "self-healing" - a feature known from the XIV, but with RAID-5 efficiency. On top sits a data distribution schema that uses a 2:1 ratio of x86 nodes to flash drawers, or even 3:1 when it's just one pod (for lack of workload entanglement, among other things). The result is a system that runs global deduplication plus real-time compression at latencies suitable for SAP and Oracle databases. This also implies that incompressible or encrypted data is not ideal - so it's not a system for ANY workload. But it's definitely not restricted to VDI, like some others. I'd encourage everyone to run simulations on data samples.
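If you want a quick first-order check on your own data before running a vendor sizing tool, something as simple as the sketch below will tell you whether a sample reduces at all. Note the hedge: the array's real-time compression uses different algorithms than zlib, so treat this purely as an indicator - already-compressed or encrypted data will score near 1:1 here and on the array alike.

```python
import zlib

def compression_ratio(sample: bytes, level: int = 6) -> float:
    """Rough compressibility estimate for a data sample.

    Returns original_size / compressed_size, e.g. 2.0 means the
    sample shrank to half. Not the array's actual algorithm -
    just a cheap proxy for "will this data reduce at all?"
    """
    if not sample:
        return 1.0
    return len(sample) / len(zlib.compress(sample, level))
```

Run it over representative chunks of database files or VM images: ratios well above 1 suggest real-time compression will pay off; ratios near (or below) 1 indicate incompressible or encrypted data.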
