Multi-petabyte open sorcery: Spell-binding storage

Mixing petabytes of data and open-source storage used to be the realm of cash-strapped academic boffins who didn’t mind mucking in with software wizardry. The need to analyse millions, billions even, of records of business events stored as unstructured information in multi-petabyte class arrays makes ordinary storage seem like …

  1. Anonymous Coward

    Not to mention open APIs?

    There is a halfway house between open source and closed proprietary: proprietary that implements open APIs. For instance, you might adopt a proprietary product like IBM Elastic Storage for the support and performance; but adopt the discipline of only accessing data through Swift object APIs or POSIX-compliant file APIs to give yourself some measure of portability.

    Is it perfect? No. Is it an alternative worth considering? I'd say yes.
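
As a rough illustration of the "open APIs only" discipline described above, here is a minimal Python sketch. The names `BlobStore`, `PosixBlobStore` and `archive_event` are hypothetical and do not come from any product mentioned in this thread; the idea is simply that application code talks to a small interface backed by ordinary POSIX file calls, so the storage product underneath could in principle be swapped. A backend speaking an object API such as Swift could implement the same two methods, though that is not shown here.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class BlobStore(ABC):
    """Hypothetical minimal interface that the application codes against."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class PosixBlobStore(BlobStore):
    """Backend that relies only on ordinary file operations under a root directory."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def _path(self, key: str) -> Path:
        return self.root / key

    def put(self, key: str, data: bytes) -> None:
        path = self._path(key)
        # Create intermediate directories so keys may contain "/" separators.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return self._path(key).read_bytes()


def archive_event(store: BlobStore, event_id: str, payload: bytes) -> None:
    # Application code sees only BlobStore, never a vendor-specific call.
    store.put(f"events/{event_id}", payload)
```

Pointing `PosixBlobStore` at a mount of whatever file system is in use keeps vendor-specific features out of the application, which is the portability trade-off the comment describes.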

  2. Duncan Macdonald

    Cost of SSD seems overestimated

    You can currently get 1TB SSDs for about £300 each, so a petabyte of storage should cost about £400,000, allowing 33% extra for RAID (this excludes the cost of the controllers, enclosures etc).

    All flash is perfectly possible for a cost that is likely to be small compared to the value of the data.
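
The back-of-envelope figure above can be checked with a few lines of Python. This is only a sketch of the comment's own arithmetic: it assumes a decimal petabyte (1,000 TB), takes the quoted £300 per 1TB drive and 33% RAID allowance at face value, and ignores controllers, enclosures and everything else. The function name `flash_cost_gbp` is made up for illustration.

```python
def flash_cost_gbp(usable_tb: int, price_per_tb_gbp: float, raid_overhead: float) -> float:
    """Drive-only cost for a usable capacity, padded by a fractional RAID overhead."""
    drives_cost = usable_tb * price_per_tb_gbp
    return drives_cost * (1 + raid_overhead)


# Assumption: 1 PB is treated as 1,000 TB of 1TB drives at 300 pounds each.
cost = flash_cost_gbp(usable_tb=1000, price_per_tb_gbp=300, raid_overhead=0.33)
print(f"GBP {cost:,.0f}")  # about 399,000, which rounds to the 400,000 quoted
```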

    1. Ben Norris

      Re: Cost of SSD seems overestimated

      Well, for a start, a £300 SSD does not have a supercapacitor, so it would likely corrupt your RAID at the first power outage. Then there is all the supporting hardware: you can't just buy a petabyte of disks, bung them in willy-nilly and expect any kind of performance or reliability. And how are you planning to back all this up?

      As they say, the devil is in the detail.

  3. Dave Filesystem

    ZFS

    What about this product:

    http://www.dnuk.com/zetavault/zetascale/

    It's software based so you can supply your own hardware.

    1. Anonymous Coward

      Re: ZFS

      Looks like ZFS on Linux with an InfiniBand interconnect. I've used OpenZFS on Linux and the recent versions are getting closer to ZFS on Solaris and BSD in terms of stability and features. But ZFS is not a distributed/parallel/cluster filesystem like pNFS, GPFS, Lustre or Ceph, though it serves very well as a foundation for them (except GPFS, which has its own). To scale out horizontally with multiple NAS heads you will need a distributed filesystem. The Sequoia supercomputer team at LLNL has done a lot of development integrating Lustre and ZFS, which is a bit odd since Sequoia is an IBM Blue Gene and you'd think they would just use GPFS.

  4. pyite

    pNFS - block & object

    IIRC, the pNFS spec is supposed to have an API for block & object storage too. Is this the case, and if so is anyone offering this functionality yet?
