Is it the end of Big Data? Quarta Horribilis for high-end storage

Friday 6th June 2014 16:46 GMT frank ly

I think it's 'quartarius'

(see comment)

0 0
1. Friday 6th June 2014 17:33 GMT tentimes
  
  Re: I think it's 'quartarius'
  
  I hear it's not as messy as the Queen's anus horribilis...
  
  0 0
Friday 6th June 2014 17:58 GMT AnoniMouse

Not the end of Big Data volumes, but a consequence of Big Data approaches

It's not that the (Big) volumes of storage are decreasing, but rather the eclipse of high end (and very expensive) storage controllers, brought about by a) increasing amounts of storage attached directly to servers (e.g. in HDFS clusters) and b) various SAN virtualisation (Software defined Everything) technologies.

1 1
Friday 6th June 2014 20:51 GMT Ian Michael Gumby

Most clusters don't use storage servers.

Most clusters are built on commodity hardware with the disks local to the box.

So as one moves to Hadoop, you lose the need for the large SANs, NASs whatever you want to call them.

2 1
Saturday 7th June 2014 00:22 GMT Anonymous Coward

Flash

The last time "other" led the pack with this much growth was before the 2008-9 downturn. At that point, the vendors who did the most with cheap disks in arrays (Data Domain, Isilon, 3Par, Equallogic, etc.) got bought, and "other" crashed, leading to a solid growth wave for the acquirers. Seems like this could also be the beginning of a repeat, but this time it's the vendors who figured out better early recipes for flash.

0 0
Saturday 7th June 2014 00:22 GMT Anonymous Coward

Not so horrible an anus

Um, doesn't anyone realize that both HDS and EMC are at the trailing stages of their higher end product cycles? HDS has announced G1000 publicly, but big customers were told earlier under NDA and have mostly chosen to wait for it instead of buying the older VSP. It's pretty obvious that a corresponding new VMAX is imminent, with the same purchase delays by EMC's big customers, who've also been notified early.

1 1
Saturday 7th June 2014 00:23 GMT Anonymous Coward

"Put another way; are we seeing the cloud and flash put a temporary or permanent kink in the disk array business?"

No, you're seeing HDFS propagate through the enterprise, and those enterprises have suddenly realised that disk arrays are snake oil; a solution looking for a problem, laden with drawbacks for all use cases. We've won five contracts this year alone, including a couple of major public sector jobbies, and I can tell you none of the hardware purchases have gone to HP or EMC!

ODM sales figures have gone through the roof, as have Cloudera's, Hortonworks's etc.

Flash will continue to have its niche, as will "cloud", but the real disruption here is Hadoop. That shouldn't surprise anyone in the storage business. Its storage costs are on the order of 1/10-1/100 that of a "high end" disk array, but its MPP SQL engine (i.e. Impala/Parquet) outperforms Oracle/Teradata et al. Plus you get MapReduce, Solr, Spark and HBase all for "free" on common commodity hardware with complete linear scalability.

Arrays are, rightfully, falling back to the niche in which they belong.

3 1
1. Saturday 7th June 2014 20:26 GMT Anonymous Coward
  
  Hadoop isn't a multipurpose storage array
  
  I have read some about Hadoop, and I get how cool it sounds and that you can use your own systems and storage and not pay a lot for a bunch of proprietary hardware, but I still don't see how it allows a company to stop using traditional storage arrays, especially if they are using them now.
  
  My company has SQL and Oracle clusters on physical servers, and scores of VMware hosts that use mostly Fibre Channel storage.
  
  We also have loads of enormous CIFS and NFS shares that hold unstructured data for our applications.
  
  All of this is accomplished using one type of storage device (NetApp), and it works very well. It isn't inexpensive, but it's also not too bad for what it does for us.
  
  Again, Hadoop sounds great, but so far I have only found a project to run NFS gateways to it, and that's about it for access. Can you enlighten me on how we can get rid of NetApp with Hadoop without rewriting all the applications to use some new API?
  
  And btw, I and my firm would give my left arm to get rid of Oracle RAC.
  
  2 0
  1. Wednesday 11th June 2014 08:17 GMT Anonymous Coward
    
    Re: Hadoop isn't a multipurpose storage array
    
    It helps if we break it down into a few areas.
    
    The one area Hadoop can't really displace FOGB storage arrays is in driving big virtualisation farms*. But then, at the end of the day, most companies don't have big virtualisation farms.
    
    *(this may well change soon as secondary projects mature and Docker-On-Yarn becomes a one-stop-solution for all distributed computing)
    
    Storage purchases are driven largely by the need to back big databases, Oracle RAC being a prime example. Those bulk storage requirements in databases tend to be driven by "warm" storage - warehousing. We're not usually talking realtime read/write, or supporting massive numbers of users. We're talking write once, change almost never, read often.
    
    In that use case, Hadoop excels - and that should be no surprise as that is exactly what it was designed to do. I've personally been involved in 3 Oracle-to-Hadoop migrations (there was an article here recently about AMD doing exactly that), and am currently involved in a project that was going to be a £30m+ Teradata warehouse consolidation, but is now instead a Hadoop project at a fraction of the cost.
    
    For these cases the development overhead is almost zero. You go to Cloudera (or Hortonworks if you're that way inclined), you pay their trifling license fee, maybe their extortionate day-rates for deployment support, send off to China for your £2k-a-pop boxes and you're pretty much done. After that, in this age of Impala and Parquet, it just comes down to churning out SQL and Sqooping all your tables across. We pay 19 year olds to do it. And it still manages to be faster for something like 19 out of 20 queries.
    
    If performance starts grinding? No worries, just throw more boxes at the problem. No going off to Oracle/Teradata/Whoever to get tuning consultancy or more licenses. It's cheap. Really cheap. That's the main advantage.
    
    Yes, that BI use case is pretty narrow, but it is common as muck and a *major* driver of storage purchases. Another major use case of bulk file storage is more than doable - HDFS now has an HTTP API, and can be mounted as transparently as any other fileshare. Plus you get all the other perks (MapReduce, HBase, Solr etc.) free out of the box.
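
    For anyone wondering what the HTTP API mentioned above looks like in practice, here is a minimal sketch, assuming it refers to the WebHDFS REST interface. The hostname, path and user are made up for illustration, and port 50070 is only the usual NameNode HTTP default on Hadoop 2.x; check your own cluster. The helper `webhdfs_url` is hypothetical, not part of any Hadoop client library; it just builds the URL you would hand to curl or any HTTP client.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, path, op, port=50070, user=None):
    """Build a WebHDFS REST URL for one file-system operation.

    host, port and user are illustrative assumptions; op is a WebHDFS
    operation name such as LISTSTATUS (list a directory) or OPEN (read a file).
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    params = {"op": op}
    if user is not None:
        # Simple (non-Kerberos) authentication passes the user as a query parameter.
        params["user.name"] = user
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(params)}"


# Hypothetical NameNode and warehouse directory.
print(webhdfs_url("namenode.example.com", "/warehouse/sales", "LISTSTATUS"))
```

    A GET on that URL should return a JSON directory listing, which is why existing tools can read from HDFS without a native client. It does not turn HDFS into a drop-in POSIX share, though.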
    
    Stray further from these write-once cases and you're absolutely right that development becomes more difficult (HBase isn't exactly easy), but what we've found is that once organisations are bought into Hadoop and have realised just how cheap-yet-bloody-effective it is, they're quite willing to start pulling in other components to build it out.
    
    0 0
Saturday 7th June 2014 07:33 GMT P. Lee

One possible explanation

It's one no industry seems to recognise.

We bought one. We don't need another one.

3 0
Sunday 8th June 2014 18:41 GMT Trevor_Pott

The important category is "other." All those startups that are so readily disparaged by the fanboois of the big array vendors? They are not all crap. And people with money to spend know it.

0 0
Monday 9th June 2014 08:01 GMT Anonymous Coward

Decline in the growth of out of house storage

Could it be as a result of a lack of confidence in the security of such solutions?

Usually, the value of the data is many times that of the cost of storage, even if you have to do it yourself. Could it be that data owners think the extra cost of storing it themselves is a worthwhile investment?

0 0
1. Monday 9th June 2014 08:15 GMT Trevor_Pott
  
  Re: Decline in the growth of out of house storage
  
  "extra cost of storing it themselves"? Storing it yourself is cheaper.
  
  0 0
Monday 9th June 2014 11:24 GMT John L Ward

I would have replied earlier, but...

...I fell off my hype-cycle.

0 0
Tuesday 10th June 2014 00:09 GMT KrisMac

I had to laugh at The Reg's choice of links on the right of this article:

"HP: You know what's hot right now? Cloud* storage "

http://www.theregister.co.uk/2014/05/25/hp_thinks_cold_storage_is_hot/

0 0
