Re: I have wondered that ...
Petrol to get the taste out of your mouth from the things you did at Uni?
21 publicly visible posts • joined 1 Nov 2011
But they taste different...
The big difference between the two is that the Violin resembles a more "Complete Array", with features like de-duplication, compression, thin provisioning, clones/snapshots, remote replication and controller redundancy.
Whereas the TMS RamSANs, beasts that they are, are more akin to a big, fast flash drive without the extra features. They don't have remote replication, compression, dedupe etc. (they need something like NetApp V-Series or SVC in front of them to get those features, and even then there isn't the N+1/HA controller function).
Yes, similar media, similar purpose, but different capabilities.
No, I'm not affiliated with either of them outside of implementation involvement, I don't get any kickbacks from them either - though I wouldn't say no.
So, from reading the info from their website, it appears to make a compressed clone of the DB and then present R/W snapshots for each database.
How is this different to virtually any of the major storage hardware vendors' snapshots out there, other than putting the snapshot in front of the storage array, rather than using the array's clever bits to do the work and relieve the hosts and storage network of excess workload?
It just doesn't seem any different to the PIT, RoW or journaling offerings from the likes of EMC, NetApp, IBM or anyone else really.
But good on them for trying.
@AC, shame on you!
Server based Flash/SSD cache is about solving a problem that can't be solved in the array - yup, HDS included - namely the latency created by the interconnections and layers. This is about putting the data as close as possible to the application that needs it, without much of anything in between. And as much as I love HDS and BlueArc (HDS), I've also seen plenty of issues with IO performance and latency, but the reality is that it can't be helped, it's the architecture; the AMS/HUS and VSP are just that little bit too big to fit inside a normal sized computer, damn designers.....mumble....mumble... can't get the data close enough....grumble... Flash based cache for all then.
I really hope that's not what you do.
Where's your plan 'B'? Snapshots are dependent on the primary data being available, what happens when that's gone? Snapshots should only ever be a plan 'A'.
Whilst snapshots could last for a very long time, you'd need an Andre the Giant sized fistful of capacity to keep several years' worth of back data, and even then, if the primary is gone, you've got to rely on your synced data and snapshots on your remote site. What if the loss was malicious and both primary and replica are destroyed? A bug in the array code, an internal or external hack, mis-configuration of snaps and replication, shipped corruption going back before your available snapshots: these are just some of the reasons that snapshots alone should NOT be considered an alternative to backups, just your first port of call in an event! Backups are your next buffer between recovery and a P45.
Backup provides an airgap for such an event.
"With adaptive caching a 90% write workload would use 90% of the 3par cache which is already better than any other systems with fixed read/write cache sizes."
You are aware that there are a great number of storage arrays which adjust the DRAM based cache to accommodate higher and lower read/write workloads, and some that also cache reads and writes in flash based cache as well?
Aus Storage Guy.
Firstly, if the customer using the 2050 wanted to go to a higher model in the FAS range, it would be a controller upgrade only, not a forklift (I really hate how vendors bandy this around).
Second, both the 2050 and 3240 support FC disks, which would have increased performance; the 3240 also supports SSD and PAM cards, both of which would have increased performance without increasing latency or risk.
I think it's fair to say, there is no such thing as the perfect storage array in this world, or any product for that matter.
To this day, I've not yet seen a product which is free from bugs and faults, however, thankfully, most of these faults are isolated to very few environments.
Now, find me an array which has not had a fault at some time.
Given the number of NetApp FAS arrays deployed, this may well be isolated to only a few select cases, most likely a batch fault with the PAM card in high densities (read: hotter chassis), possibly with issues where the wear levelling of the card is suffering due to high change rates.
Hopefully they'll have it sorted soon.
As for the gag orders, well, that's a bit off, but they all do it.
Now, if you could place the data requiring low latency in the host, whilst still being able to use the array's efficiencies (such as tiering and replication), and keep the data not requiring such ultra-low latency back in the array, then you’re on to a win, and so is the customer requesting it.
As for the view that it “stores an application's entire working set in the server and takes the primary data storage role away from the storage array”, that’s not its role at all.
Taking data which is very frequently accessed by the host, such as the most heavily accessed portion of a database index file, and “caching it”, if you will, in the host means the host can perform the look-up locally and get the remainder of the data from the array.
If that data no longer requires the latency or performance, this would provide the ability (I assume) to tier the data back to the array and down the tiers.
Sure, you could place a much lower cost solution such as Fusion-IO, but as a whole the cost for the entirety of data on the flash would far exceed the savings and again, there would not be the level of protection or efficiency offered by a shared storage array.
Fusion-IO do offer very similar capabilities with ioTurbine and directCache, arguably better in that it’s vendor agnostic (apart from using Fusion-IO cards). However, it could be said that it’s not as robust: it’s not array aware (sure, it’ll see array LUNs, but it isn’t aware of what tiering may be underlying them), nor is it able to provide redundancy against card loss.
That’s what EMC are doing with Lightning. Smart really.
Whilst there may not be a huge sub-species of customers out there needing it, there must have been enough demand for EMC to invest the money in it. And you know there will be others who follow suit with arguably better or similar solutions, such as tiering from host based SATA/SAS and PCIe SSD, but it’s a start.
Aus Storage Guy
Just because customers are not asking Nimble, doesn't discount the need or want of the customer base (or potential customer base for that matter).
The idea of having high speed and ultra-low latency in a storage array is wonderful, but unachievable. The reality is that there are many components in between the array and the host which create latency and are beyond the control of the array designer.
Let's look at an environment where low latency is required and look at what’s in between:
- The application
- Data source
- The operating system
- File System
- File System drivers
- Volume management
- Volume management drivers
- Multipath drivers
- The host bus
- The Host bus adapter
- Protocol layering
- The media from the HBA to switching
- Protocol layering
- Media from the switching to the array
- The array's own internal magic
- The disks
- and back again, each adding latency to data requests.
If, for example, the distance between the host and the array is hundreds of meters (with switches in between), then there is a significant impact on latency. No matter what you do to the array, the media latency is out of the array's control.
However, if you have the ability to place the data in the host, then you’re about as close as you can get to the data itself; any closer and it’d be in RAM.
Whilst I partially agree with you, not everything can be optimised. Almost every customer I deal with now does use dedupe, thin provisioning, compression, archive and a whole range of methods to reduce their consumption of drives. Customers are not reluctant, only wary, and many have adopted such techniques.
But the reality is, even with all that and NFO (interesting way to plug your business by the way; interesting product, I'm going to look into it more), companies still suffer data growth.
Not many companies would be able or willing to allow post-processing of their unstructured data in case there was a compliance issue, and most, but not all, systems/applications work well with thin provisioning, compression or even dedupe.
I agree that many more businesses could and should approach this, but to use your car analogy, customers rarely buy a Prius to be green when they need to cross a rugged wilderness and then expect the 4x4 to be every bit as efficient - it's horses for courses.
I second J.T's comments, EMC as with almost every other major array vendor do not use the same bog standard hard drives used in desktops.
Most enterprise array vendors use specially qualified disks designed for 24/7 operation (vs desktops' 9-5), often with several major differences including:
*520-bytes/sector vs 512 for desktop
*Time limited Error Recovery vs not
*End to End ECC vs desktop with often only write ECC
*Bit Error Rate (risk of data corruption) of <1 in 10^15 (for NL-SAS or NL-SATA) vs <1 in 10^14 for desktop
*Higher Rotational Vibration tolerance designed for multiple drives vs limited RV for desktop
*Interposer cards and special "caddies" to further reduce vibrations
*Dual-Processors and Dual Ports (sometimes on the interposer card) for redundant path access vs single for desktop.
*FC or SAS interfaces (and sometimes SATA with expensive FC or SAS interposers) vs just SATA for the desktop.
*Extensive batch testing and qualification of drives
*Often the drives have screws at both ends of the spindle vs desktop's single
The list of differences is endless, but the result is the same, EMC as with almost every other vendor uses different disks to those used for desktops.
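To put the bit error rate figures in the list above into perspective, here's a rough back-of-the-envelope sketch. It assumes a 2TB drive read end to end once, and treats the quoted rates as exact averages (they're really upper bounds), so take it as an order-of-magnitude guide only. The function name is made up for illustration.

```python
def expected_read_errors(capacity_bytes: float, bit_error_rate: float) -> float:
    """Expected unrecoverable bit errors when reading the whole drive once."""
    bits_read = capacity_bytes * 8
    return bits_read * bit_error_rate


DRIVE_BYTES = 2e12  # assumption: a 2TB (decimal) drive

desktop = expected_read_errors(DRIVE_BYTES, 1e-14)   # desktop-class rate from the list
nearline = expected_read_errors(DRIVE_BYTES, 1e-15)  # NL-SAS/NL-SATA rate from the list

print(f"desktop: {desktop:.3f} expected errors per full read")
print(f"nearline: {nearline:.3f} expected errors per full read")
```

Under those assumptions that works out to roughly 0.16 expected errors per full read for the desktop drive against 0.016 for the nearline one, a tenfold difference.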
It's easy to build a killer array that oozes performance, but there are many more reasons why companies chose enterprise arrays.
It's got a lot of neat tricks, but when a company picks a T1 array like USP/VSP or DMX/VMAX, it's because of the multi-engine reliability.
Much like when you're travelling across the great blue span that is the Pacific, you want to be on a plane with 4 engines (or more, but not likely) like a 747 or A380. Even if you could, you'd probably not want to try with a 2 engine plane like the 737 or A320. When 1 engine fails, you get a sudden surge of adrenaline when you realise there is only 1 more engine.
That's the same with T1 arrays: you've (typically) got more than 2 engines, or at least have the option.
Then they also have things like FICON, ESCON, multipoint replication, array virtualisation (USPv VSP), Whole clones, zero size clones.
When it comes to T2 arrays, you also get things like file and block, tiering, remote replication, 3rd party virtualisation.
The other areas where I'm finding Pure Storage a bit vague are:
*Remote Replication - do they have it? Is it bandwidth optimised? Is it SYNC or ASYNC? Multipoint or 1-to-1?
*NVRAM - is it battery backed? If so, how long does the battery last? Does it de-stage if the outage is longer than the battery can handle?
I'm not pooh-poohing Pure Storage. I think that it sounds great, but only from a straight out speed perspective. It's definitely a big step up from Violin, TMS and the like in terms of high-availability, but so far as I can read, it's in no way a competitor to the likes of NetApp, HDS, EMC, HP or even Dell for that matter.
Sure it's faster, but even EMC have an all Flash array and most of the others could do it without too much re-jigging.
But speed and capacity are not the sole reasons to buy an array.
@AC: You're right in that this is very interesting technology, fast, seemingly capacious and a damn sight cheaper, however, that's not the reason why businesses choose Tier 1 arrays, such as HDS USP/VSP and EMC DMX/VMAX's or Tier 2 arrays like AMS, VNX, FAS, EVA etc.
The Fusion-IO Octal card as it stands is a FAST and solid card, but it is dependent on other means to achieve redundancy. In the card itself there's no replication, multi-pathing, QoS, failover or reporting, and only limited redundancy.
Additionally, PCIe cards, like individual flash drives tend to be a 1 trick pony - not being able to share their high speed goodness with other devices easily and achieve the same performance.
These cards and others like them are very fast and simple to implement, considerably more so than a SAN; however, there's a place and situation for both.
You could of course "roll yer own" solution to achieve similar results, but remember, companies like EMC, NetApp and HDS spend millions if not billions on R&D and support to achieve the levels of protection that you'd not get with a "roll yer own" solution.
Plus, when you look further into it, as a card, it's limited in capacity (I know it's 10TB and you could achieve 20TB in 2U and 40TB in 4U or more as per their release), but beyond that, if you need 100TB+ of the stuff, you're going to run into problems. (Yup, there are businesses and groups out there that need that much and more.) That's where big SANs come in.
Cheaper than VMAX (or any decent storage array for that matter)? Definitely
Faster? Quite possibly.
Better technology? Not by a long shot.
I'm certainly seeing more of the PCI based cards as well, but as most customers who take this approach are discovering, they're not shared and as a result, constrained to the host/s the card is installed in and of course the lack of or limited protection offered by these cards.
Whereas, from a SAN perspective, there may be inherently higher latency than with PCIe cards accessed directly, as a SAN has to account for the latency introduced by having to go from Bus>HBA>Fibre>Switch>Fibre>Array>disks and then back again, but the SAN based solution offers protection against a single point of failure. These cards (FusionIO/VeloDrive etc.) do not.
And as an added benefit, SANs share the available capacity and performance, whereas the cards do not. For example, the smallest FusionIO drive is 160GB, but if you only need 40GB, then you waste 120GB, whereas in a SAN environment you could re-use that 120GB elsewhere on another application that could do with it.
And, again, get the protection of not having a single point of failure.
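The stranded capacity point above is easy to put numbers on. A minimal sketch, assuming one 160GB card per host; only the 40GB host comes from the example above, the other two hosts and their requirements are made up purely for illustration.

```python
CARD_GB = 160  # smallest card size quoted above

# Hypothetical per-host capacity needs in GB; only the 40GB host is from the example.
needs_gb = {"host-a": 40, "host-b": 150, "host-c": 90}

# Dedicated cards: every host gets a whole card, and unused space is stranded there.
stranded_gb = {host: CARD_GB - need for host, need in needs_gb.items()}
total_cards_gb = CARD_GB * len(needs_gb)

# Shared pool: capacity is carved out of one pool, so only the sum matters.
total_needed_gb = sum(needs_gb.values())

print("stranded per host:", stranded_gb)
print("bought as cards:", total_cards_gb, "GB")
print("needed from a shared pool:", total_needed_gb, "GB")
print("difference:", total_cards_gb - total_needed_gb, "GB")
```

The per-card waste adds up quickly across hosts, whereas a shared pool only has to be sized for the total (plus whatever protection overhead the array needs, which this sketch ignores).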
What Project Lightning is doing is introducing a hybrid of the best of both worlds:
-Caching frequently used data closer to the application using it; via the PCIe bus with a lower latency and;
-Providing all of the goodness and protection that a SAN array provides. (ie. No single point of failure)
(which is a lot more than anybody else really)
What really counts here is that offering converged storage and compute gives customers the ability to have a consolidated environment, opening many more potential customers up to EMC. (Or any other vendor who takes the same approach.)
- If EMC were to take the converged storage/server capabilities into the VNXe line, customers would be given the opportunity to utilise virtualised servers within virtualised storage, all pre-qualified with all or most of the bits they would ever need in a low cost package.
That would cover the SMB market very well as the VNXe has a very low cost of establishment.
- Put it in the VNX line and you'd have the SMB, mid-market and upper mid-market squared away.
- With the VMAX lineup, you'd cover the upper mid-market (VMAXe) through to the big end of town with the VMAX.
Whichever market segment customers choose, having all of the bases covered increases footprint, and this can only be good for EMC and their customers.
The biggest advantage to the customers is, of course, having it built as pre-qualified, pre-tested, high-grade storage and servers doing exactly what they need; and because it's all pre-qualified and tested, customers wouldn't have to worry about what will and won't work.
As for Project Lightning, I'd suggest it's more of an extension of FastCache rather than FastVP, which would make sense as it would behave as a local cache rather than a component of a tier, similar to PAM-II/Flash Cache but at the host side.
You're right, auto tiering is not typically "Real Time" but for the most part it doesn't need to be.
It's all about concurrency of data, if data is new/hot, it will mostly go in the top tier, but if it's cold, the lower tiers.
SSD's can handle concurrency very well, SATA not so (10/15k Drives a bit better) - But that being said, if data is tiered down to SATA, it's not being accessed very often, and therefore - in most cases anyway - the SATA drives are not being accessed very much and therefore handle the odd request quite well.
For many of the tiering algorithms in use, it only takes a few "out of the norm" requests to have data tiered up, and many decent arrays handle additional workload in SSD/DRAM based cache very well.
If it were real time, any time some joe decided to start streaming his movies uploaded to the server, he'd get SSD straight away, sacrificing real workload. This is "Noisy Neighbour", where a LUN or segment of data may create unnecessary workload, impacting other data around it.
QoS is a factor for tiering: some (most decent) arrays give you the option of dictating which LUN should get which tier, such that the financials get the SSD during EoM and a mix at other times, and the file server gets SATA all of the time.
It's also possible to set QoS so that a LUN gets the IOPS it requires, guaranteed (most of the time).
As for file tiering, it's almost always possible, but not always practical: the DBA will get very shirty with you should you tier his DB files to SATA if they need 15k+ speeds. Sub-LUN block tiering, on the other hand, is almost always practical, but not always viable (Exchange 2010). In the DBA example, only some of the tablespace may be hot and therefore can live in the SSD, whereas the rest may be cold and can move to more economical storage tiers.
For the most part, many of the arrays are able to track a usage pattern and determine over different lengths of time which tier data should belong to.
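The scheduled, usage-pattern-driven approach described above could look something like the toy pass below. This is only a sketch of the general idea, not any vendor's algorithm: the tier names, thresholds, extent names and the one-tier-per-window rule are all assumptions made for illustration.

```python
from collections import Counter

TIERS = ["SSD", "15k", "SATA"]  # fastest first


def rebalance(current_tier: dict, window_hits: Counter,
              promote_at: int = 100, demote_at: int = 10) -> dict:
    """Return the new tier placement after one evaluation window.

    Extents move at most one tier per window, and only at the window
    boundary, so a short burst never displaces other data mid-window.
    """
    new_tier = {}
    for extent, tier in current_tier.items():
        idx = TIERS.index(tier)
        hits = window_hits[extent]  # Counter returns 0 for extents never touched
        if hits >= promote_at and idx > 0:
            idx -= 1  # busy enough: move up one tier
        elif hits <= demote_at and idx < len(TIERS) - 1:
            idx += 1  # cold: move down one tier
        new_tier[extent] = TIERS[idx]
    return new_tier


# Hypothetical extents and hit counts for one window.
placement = {"db-index": "15k", "fileshare": "SATA", "old-logs": "15k"}
hits = Counter({"db-index": 5000, "fileshare": 40, "old-logs": 2})
print(rebalance(placement, hits))
```

Here the busy index extent moves up to SSD, the barely touched logs drop to SATA, and the file share stays put because 40 hits in the window isn't enough to justify promotion.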
Take your example of a 2TB LUN where only 10% is hot. As you said, that's 200GB of data that is easily tiered up and 1800GB which goes down a tier or two. Now, if the overall IOPS requirement of the LUN is 3500 IOPS, and of that 3000 is coming from the 200GB (10%) alone with the remaining 1800GB seeking 500 IOPS, then tiering the 200GB up to the SSD and leaving the rest on 15k and/or NL is the most economical use of the SSD storage.
But if you were to stick all of it on 15k RPM drives you'd need (assuming 50/50 read/write and RAID 5) about 58~60+ drives, plus the trays, connectivity, power and cooling to accommodate the IOPS.
If you stuck it on SSD only, you'd only need 1 or 2 drives to accommodate the IOPS, but to hold the capacity (picking 200GB (~190GB formatted) drives and RAID 5) you'd need about 12 very expensive drives.
If you were outright daft enough to try and pull that off with SATA/NL-SAS drives (7.2k RPM), you'd be looking at around 110 drives and the associated costs that go with that (assuming 50/50 and RAID 5 again).
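For anyone wanting to check the drive counts above, here's a rough sizing sketch. It assumes the usual RAID 5 write penalty of 4 back-end I/Os per host write, uses the ~180 and ~80 IOPS per-drive figures quoted for 15k and 7.2k drives further down in this post, and throws in a more conservative 150 IOPS for 15k as my own assumption. It ignores cache, hot spares and RAID group size limits, and the function names are made up.

```python
import math

RAID5_WRITE_PENALTY = 4  # read data + read parity + write data + write parity


def backend_iops(host_iops: float, write_fraction: float) -> float:
    """Back-end disk I/Os needed to service the host workload on RAID 5."""
    reads = host_iops * (1 - write_fraction)
    writes = host_iops * write_fraction
    return reads + writes * RAID5_WRITE_PENALTY


def drives_for_iops(host_iops: float, write_fraction: float, iops_per_drive: float) -> int:
    return math.ceil(backend_iops(host_iops, write_fraction) / iops_per_drive)


def raid5_drives_for_capacity(capacity_gb: float, usable_gb_per_drive: float) -> int:
    # Simplification: one RAID 5 group, so a single drive's worth of parity.
    return math.ceil(capacity_gb / usable_gb_per_drive) + 1


print("back-end IOPS:", backend_iops(3500, 0.5))
print("15k @ 180 IOPS/drive:", drives_for_iops(3500, 0.5, 180))
print("15k @ 150 IOPS/drive:", drives_for_iops(3500, 0.5, 150))
print("7.2k @ 80 IOPS/drive:", drives_for_iops(3500, 0.5, 80))
print("SSD by capacity:", raid5_drives_for_capacity(2000, 190))
```

The 7.2k and SSD results land on the 110 and 12 quoted above. For 15k, 180 IOPS per drive gives nearer 49 drives; the 58~60 figure lines up with a more conservative ~150 IOPS per drive, so treat the exact count as dependent on which per-drive number you trust.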
Now, if we kept it simple (I know it doesn't work this way, but just focusing on the 2TB LUN in the example) and said that 10% of the LUN is hot, 40% is warm and 50% is cold, we would be able to use tiering as such:
1x 200GB SSD (3000+ IOPS)
2x 300GB 15k RPM (360 IOPS (~180 IOPS each))
2x 2TB 7.2k SATA (160IOPS (~80 IOPS each))
That gives us 3,520 IOPS, bang on the 3,500 needed, with the hot 200GB (10%) covered in the top tier, and the rest filtered down to the lower tiers. (Again, I just want to make clear that I've over-simplified the arrangement, as most (if not all) arrays would require the drives to be protected by RAID, and therefore more drives would be needed.)
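As a quick sanity check of the three-tier layout above, the raw IOPS can be totted up like so. The per-drive numbers are the ones quoted in the list; RAID overhead is left out, same as in the simplified example.

```python
REQUIRED_IOPS = 3500  # overall requirement of the 2TB LUN in the example

tiers = [
    # (description, drive count, IOPS per drive) as quoted in the list above
    ("200GB SSD", 1, 3000),
    ("300GB 15k RPM", 2, 180),
    ("2TB 7.2k SATA", 2, 80),
]

total_iops = sum(count * iops for _, count, iops in tiers)
total_drives = sum(count for _, count, _ in tiers)

print("drives:", total_drives)
print("raw IOPS available:", total_iops)
print("headroom over requirement:", total_iops - REQUIRED_IOPS)
```

Five drives against the dozens needed on a single tier is the whole point, though a real configuration would need more once RAID protection is added.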
As for File Tiering, most arrays with NAS capability and file servers attached to arrays for that matter can do file tiering quite easily with additional software/licenses/other, such as Enterprise Vault, EMC Rainfinity FMA, F5, etc. and a whole host of other systems out there, or even your own scripts if you're brave enough.
- Compellent's strategy is not to write to the top tier, but to the highest tier with available capacity, be that SSD, 15k/10k RPM FC/SAS or SATA, depending on the pool config. This is normally pretty good, unless the top tiers are full already.
- EMC FAST VP will write to the block in its last tier of designation, with writes absorbed by FASTCache: if the block is currently living in the SATA/NL-SAS tier, then writes are buffered in the DRAM/SSD cache and de-staged to disk. If the data was in the SSD tier, it'd just write straight through to the SSD. The advantage being that just because the data is on slow disk doesn't mean the writes have to be slow.
- XIV does do tiering, of sorts, in that if a disk has multiple blocks of data which are hot, it will attempt to locate other disks which are not so active, but it's still limited by the 7.2k drives in a sense. (Cache handles a bit of it well enough most of the time.)
- EVA does a similar thing to the XIV.
- Hitachi does tiering in a similar way to the EMC VNX/DMX/VMAX, though with a little more granularity.
- SVC/V7000: Barry's right about the size of the pages, but so is the AC who follows him. The V7000 also buffers data to cache.
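The "highest tier with available capacity" write placement described for Compellent in the list above can be sketched roughly as follows. This is my own toy illustration of the idea, not Compellent's actual code; the class, function and tier sizes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    free_gb: float


def place_write(tiers: list, size_gb: float) -> str:
    """Pick the highest tier with room; tiers must be ordered fastest first."""
    for tier in tiers:
        if tier.free_gb >= size_gb:
            tier.free_gb -= size_gb  # consume the space in the chosen tier
            return tier.name
    raise RuntimeError("pool is full")


# Hypothetical pool: a nearly full SSD tier above roomier 15k and SATA tiers.
pool = [Tier("SSD", 0.5), Tier("15k", 100.0), Tier("SATA", 2000.0)]

first = place_write(pool, 0.25)   # still fits in what's left of the SSD tier
second = place_write(pool, 1.0)   # SSD can't take it, so it falls to 15k
print(first, second)
```

It shows the caveat from the list nicely: once the top tier fills, new writes land lower down until the tiering engine frees up space.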
Anyway, the point of tiering is a bit like paper documents:
- You have the most recent/urgent documents in the in/out tray on your desk. (SSD)
- Your recently used documents in the filing cabinet behind your desk. (15k)
- Your least used/older files go to the records room. (SATA)
- and if you want to go into the Archiving Tier, possibly to an offsite records management facility. (TAPE/Other)
Each decreasing in the speed at which you can access said paper document.
So yes, tiering is good, useful and money saving - In most cases!
Dimitris, we get it, your devotion to NetApp is commendable; but surely even you understand its limitations. It's ok to like Engenio now, NetApp own them.
The FAS Series does have lots of speed, but the WAFL architecture and the copious abstraction layers before even presenting a LU are its own downfall when it comes to sequential workloads. Even NetApp as a whole recognised this, which is why they acquired the Engenio business.
The Engenio range is an excellent, solid array. Whilst not feature rich by any standards, it does its work nicely due to the fact that it is quite simply traditional RAID groups (or volume groups) with LUNs (volumes), without all of the overhead that comes with:
Nor does it have to find contiguous blocks in the same way that WAFL does.
As a side benefit, the Engenio also has a better fail-over function: its failover is near instantaneous (more akin to EVA, Clariion, AMS, Compellent) vs. NetApp with a 90+ second failover (or 120+ seconds without an RLM card), which I personally find interesting. (I've nursed more than enough WAFL checks to have no sense of humour on this subject.)
So Dimitris, seriously buddy, I respect your right to an opinion, but you've got to get a sense of reality about the FAS range.
Seriously good hardware/software, but not suited to all workloads or environments, and neither is the Engenio (or anything else for that matter).