Store flaw? Naw! The hyper-converged vendor and the 'bug'-bash

A UK consultant claims he had such a poor experience with Maxta's hyper-converged infrastructure software that he asked for his firm's money back. But the vendor has told a very different story. Nick Chapman is the managing director and principal consultant at CloudFlux, which was founded in the UK in 2016 to provide …

  1. Platypus

    I feel bad for everyone involved. For the customer, the reasons are obvious. For Maxta, this is all too reminiscent of experiences I had working at small companies, and especially in storage.

    One of the main culprits seems to have been bad controller firmware. Even companies that control the hardware sometimes have trouble with that one. When you ship software to run on hardware the customer controls, the situation becomes impossible. The second issue sounds like the good old Linux "OOM killer", which was an incredibly stupid idea from the day it was conceived. At both of my last two startups, we ended up having to disable memory overcommit because of the havoc that would result when the OOM killer started running around like a deranged madman shooting random processes in the head. To be sure, Maxta probably could have done a better job controlling/minimizing resource use, but I know that's a difficult beast to fight so I'll cut them some slack.

    Put both of these problems in a context of confused business relationships and expectations, and it's no surprise that a disaster ensued. The lesson I take away is that vendors need to keep the list of Things To Avoid complete and up to date, while customers need to be clear and open about what they're doing to make sure they don't fall afoul of that list. Amateurs and secret-keepers have no place in production storage deployments.
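
    For anyone who hasn't watched this failure mode up close, here's a rough C sketch of the difference disabling overcommit makes (generic Linux behaviour, nothing Maxta-specific). With the default vm.overcommit_memory=0 or 1, phase 1 below happily hands out far more memory than the box has, and nothing dies until the pages are actually touched in phase 2, at which point the OOM killer starts shooting. With vm.overcommit_memory=2 (strict accounting), malloc() instead returns NULL at the commit limit and the program gets to fail politely.

    /* Rough sketch: how overcommit changes where an out-of-memory failure lands. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        const size_t chunk = 256UL * 1024 * 1024;   /* 256 MiB per allocation */
        static char *blocks[4096];                  /* up to 1 TiB of address space */
        size_t i, n = 0;

        /* Phase 1: reserve address space. Under overcommit this rarely fails;
         * under strict accounting it stops cleanly at the commit limit. */
        for (i = 0; i < 4096; i++) {
            blocks[i] = malloc(chunk);
            if (!blocks[i]) {
                printf("malloc failed cleanly after %zu MiB\n", i * 256);
                break;
            }
            n = i + 1;
        }

        /* Phase 2: actually touch the pages. With overcommit enabled, this is
         * where the OOM killer shows up and shoots somebody in the head. */
        for (i = 0; i < n; i++)
            memset(blocks[i], 0xAA, chunk);

        printf("touched %zu chunks without incident\n", n);
        return 0;
    }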

    1. Nate Amsden

      So, as a Linux user for 20 years now, what would you replace the OOM killer with? A simple kernel crash because the system is out of memory? Or maybe just a lockup?

      I equip my VMs with 512MB of swap because I'd rather have them fail than go into swap and choke on their own blood while impacting other systems with the high disk I/O from swapping on shared storage. I've thought about having no swap at all, but a tiny amount seems OK, and in most cases it never gets used so no disk space is consumed.

      1. Platypus

        Way to play the false-dichotomy and appeal-to-authority cards, Nate. I've been a Linux user just as long as you claim, and a UNIX user for a decade before that. There are other options besides a crash or hang. I even mentioned one already: don't overcommit. If there's no swap (really paging space BTW but I don't expect you to know the difference since you don't even seem to realize that allowing overcommit increases the page/swap pressure you so abhor) then memory allocation fails. The "victim" is statistically likely to be the same process that's hogging memory, to a far greater degree of accuracy than any OOM-killer heuristic Linux has ever implemented. If you want to avoid paging, limit your applications' memory usage and don't run them where the sum exceeds memory by more than a tiny amount (to absorb some of the random fluctuations, not steady-state usage). If you fail to follow that rule, adding overcommit will just push the problem around but not solve it.

        There are cases where overcommit makes sense. At my last job we had users who'd run various scientific applications that would allocate huge sparse arrays. Since these arrays were guaranteed to be very thinly populated, overcommit was safe and useful. However, for general-purpose workloads overcommit makes a lot less sense. For the semi-embedded use case of a storage server, which is most relevant to this discussion, it makes absolutely no sense at all. Unconstrained memory use is the bane of predictable performance. Turning performance jitter into something that's easier to recognize and address is actually pretty desirable in that environment, and that's what disabling overcommit will do.
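
        To make the sparse-array case concrete, here's a rough sketch (generic Linux, nobody's production code): reserve far more address space than the machine has, touch only a sliver of it, and overcommit (or MAP_NORESERVE) is what makes that cheap. Under strict accounting (vm.overcommit_memory=2) the same mmap() is refused unless RAM plus swap can actually cover the whole 64 GiB reservation.

        /* Rough sketch: a huge, thinly populated array is where overcommit earns its keep. */
        #include <stdio.h>
        #include <sys/mman.h>

        int main(void)
        {
            const size_t total = 64UL * 1024 * 1024 * 1024;   /* 64 GiB, mostly untouched */
            unsigned char *arr = mmap(NULL, total, PROT_READ | PROT_WRITE,
                                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE,
                                      -1, 0);
            if (arr == MAP_FAILED) {
                perror("mmap");   /* what you'd see under vm.overcommit_memory=2 */
                return 1;
            }

            /* Touch one byte every 64 MiB: only ~1024 pages ever get physical backing. */
            for (size_t off = 0; off < total; off += 64UL * 1024 * 1024)
                arr[off] = 1;

            puts("sparse 64 GiB array populated");
            munmap(arr, total);
            return 0;
        }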

        1. Nate Amsden

          I must be confused, because where is this overcommit coming from? Is it something you enable in the kernel? When I think of overcommit I think of VMware.

          Say you have 16GB of RAM and processes are using 15GB at a steady state, maybe a MySQL server. Some new thing comes in and executes a really terrible query, blowing the memory out by 2GB.

          Or do you mean size your memory with the apps so you can never blow up? In general that seems difficult to do and still be efficient (e.g. not giving an app 5x the memory because it might have a leak that gets tickled now and then). It requires a lot of testing, which in the past 15 years I've never seen done.

          If you regularly encounter the OOM killer, something is wrong. Across the 1,000 or so systems I have it is quite rare, and when it does happen it's usually because someone built the VM with too little memory (in our case the default is 1GB).

          Sorry if I am confused here :)

          1. Platypus

            The overcommit at issue on a storage server is probably not VM overcommit (or oversubscription) but process-memory overcommit. If you allow memory overcommit, what you're saying is that the system can allocate more virtual pages to processes than it can actually back up with physical memory plus swap. It's kind of like fractional-reserve banking, and we've all seen what happens when that goes too far. Everything works great until there's a "run on the bank" and every process actually tries to touch the pages allocated to it. Since it's not actually possible to satisfy all of those requests, the kernel picks a victim, kills it, and reaps its pages to pay other debts. It's just as evil as it sounds. It works to a degree and/or in some cases, but IMO it's an irresponsible default, made worse by the fact that the Linux implementation has always tended to make the absolute worst choices of which process to scavenge.

            In a virtual environment, things get even more interesting. You can allow memory overcommit either within VMs or on the host, or both, and that's all orthogonal to how you size your VMs. Where most people get in trouble is that they oversubscribe/overcommit at multiple levels. Each ratio might seem fine in isolation, but the sum adds up to disaster. The OOM killer within a VM might take down a process, the OOM killer within the host might take down a VM, you can get page storms either within a VM or on the host, etc. It's much safer to overcommit in only one or two places, and then only modestly, but those aren't the defaults.
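
            If you want to see how far down the fractional-reserve road a given box already is, a rough sketch like the one below (mine, generic Linux, nothing vendor-specific) compares Committed_AS, the memory the kernel has promised, against CommitLimit, what strict accounting would actually allow, straight from /proc/meminfo. Run it inside a VM and again on the host and the stacked ratios become obvious.

            /* Rough sketch: report how overcommitted this kernel currently is. */
            #include <stdio.h>

            int main(void)
            {
                FILE *f = fopen("/proc/meminfo", "r");
                char line[256];
                unsigned long limit_kb = 0, committed_kb = 0;

                if (!f) {
                    perror("/proc/meminfo");
                    return 1;
                }
                while (fgets(line, sizeof line, f)) {
                    /* Lines that don't match simply leave the values untouched. */
                    sscanf(line, "CommitLimit: %lu kB", &limit_kb);
                    sscanf(line, "Committed_AS: %lu kB", &committed_kb);
                }
                fclose(f);

                printf("CommitLimit:  %lu MiB\n", limit_kb / 1024);
                printf("Committed_AS: %lu MiB\n", committed_kb / 1024);
                if (limit_kb)
                    printf("committed:    %.0f%% of what strict accounting would allow\n",
                           100.0 * committed_kb / limit_kb);
                return 0;
            }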

    2. Rainer

      Firmware

      That's why everybody with a few brain cells to spare goes with software RAID these days.

      One less layer of problems.

      1. Jon Massey

        Re: Firmware

        An HBA can still have firmware issues that will ruin your day

        1. Rainer

          Re: Firmware

          :: An HBA can still have firmware issues that will ruin your day

          Indeed. The general rule is: don't use OEM HBAs for which firmware updates don't exist.

      2. Lost_Signal

        Re: Firmware

        HBAs and SAS expanders have driver/firmware problems too.

        Ex. SAS buffering bugs, SATA tunneling protocol, T10PI bugs, SES bugs...

  2. Anonymous Coward
    Trollface

    So it's all just a matter of...

    He said... she said...

  3. Nate Amsden

    seems like

    A customer with no money for a real solution grasping at straws for some golden parachute to save them.

    I wonder how experienced the tech team at that customer actually is, and whether they were fighting back against stupid management in wanting to deploy such an unproven solution (having been there myself at another company, and gleefully watching the unproven solution that got bought and installed after I left suffer massive outages for the next 18 months). I was so happy to be in a position where I could leave for a new job within a few weeks, before they could even cut a PO to their vendor for that solution.

    Having become more serious about storage over the past decade, I came to realize how complicated it really is, and because it is persistent data (and not a stateless host or network device) I have become ultra-conservative in my storage decisions (and have still been burned by the likes of Nexenta and HP StoreEasy; fortunately the data sets on those were very small and easy to move).

    Next up: will Isilon be able to handle my NAS workload (which IMO is quite modest, 2 to 3TB of data)? Their software-defined product imploded during testing, though the hardware side has features to prevent that from happening. I won't know for sure until next year, when I expect to start a real POC.

    HCI from any vendor is not mature enough for my needs. Not sure when or if it might get there for me anyway.

    1. Nate Amsden

      Re: seems like

      Can't edit posts on mobile.

      3PAR makes up probably 98% of my exported storage to hosts. It's obviously a fairly mature platform: my oldest system has been going for 5 years (since purchase) with 100% uptime across failures of various kinds.

      But 3PAR has not been trouble-free by any stretch during my decade as a customer, and having known people there for so long I have heard stories about other customers over time as well (don't get me wrong, I am a very happy customer).

      Which just reinforces the belief that getting storage right is hard even when you do control everything hardware and software.

      Shit, I've been waiting for data compression on 3PAR for 5 years now. Something like that would be nice to have, but it is not nearly as vital as a solid hardware architecture.

      As my new manager told me yesterday when he saw our main datacenter for the first time, the 3PAR systems are the heart of the datacenter. That's where the data lives.

      The heart better be solid.

  4. Anonymous Coward

    Latest...

    https://cloudflux.co.uk/maxta-a-follow-up/

  5. pneyman

    As part of an overall product strategy, Waypoint has to be very selective in deciding which products to represent. For our client base, we have to balance a rich feature set, simplicity and affordability. In addition, we have our post-sales engineering team evaluate each product to see how well it lives up to its own marketing claims. For us, Maxta meets those criteria. It gives our clients a low-risk, high-value choice when seeking to move to a hyper-converged strategy that is backed by Intel. We can position Maxta with confidence, knowing it will serve our clients' needs.
