That's where something like a BlueArc NAS with tiered storage comes in handy: you can tier things file by file.
Storage arrays are getting so good at managing data that they can now move it between different types of drives in the array so you don't store low-value data on expensive drives. Is it really as easy as this? Let's suppose that you yourself are running this part of a hypothetical block-access drive array's operation. You can …
No tiering in XIV is actually a benefit, given how complex storage tiering is (according to the description in this article).
The bottom line is: what is the performance the storage array is providing your applications, and how much time & effort does the storage administrator invest (waste?) on performance management?
For performance - check out the recent SPC-2 results for XIV Gen3 (and soon, SPC-1 too). For performance management with XIV - well, there isn't any.
Auto tiering sounded pretty nice to me until I saw it wasn't anywhere close to real time; it seems pretty common for the tiering tasks to run once, or maybe a few times, a day at most.
I do like Compellent's strategy where all writes go to the top tier by default, that sounds really nice.
Beyond that I'm not sold on auto tiering myself. It is nice that some (most? all?) manufacturers have tools that can show you whether or not you'd benefit from the technology before making the decision to purchase it.
Tiering by file (BlueArc) is better than full-LUN tiering of course, but depending on the file size it could be worse than sub-LUN tiering, though as the article mentions some are more efficient than others. BlueArc, being a filer, has the advantage of being able to tier to other storage arrays as well, instead of just within a single array. So you could have their HDS storage as your tier 1 and the crappy LSI storage as tier 2. I think it can even tier to a Data Domain box for archive (since DD supports NFS).
You're right, auto tiering is not typically "Real Time" but for the most part it doesn't need to be.
It's all about the currency of data: if data is new/hot it will mostly go to the top tier, but if it's cold, to the lower tiers.
SSDs can handle concurrency very well, SATA not so much (10k/15k drives a bit better). That being said, if data has been tiered down to SATA it's not being accessed very often, so in most cases the SATA drives see little traffic and handle the odd request quite well.
For many of the tiering algorithms in use, it only takes a few "out of the norm" requests to have data tiered up, and many decent arrays handle the additional workload in SSD/DRAM-based cache very well.
If it were real time, any time some Joe decided to start streaming the movies he'd uploaded to the server, they'd land on SSD straight away, sacrificing real workload. This is the "noisy neighbour" problem, where a LUN or segment of data creates unnecessary workload that impacts other data around it.
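The windowed, non-real-time behaviour described above can be sketched in a few lines. This is a hypothetical illustration, not any vendor's algorithm: the tier names, thresholds, and the `retier` function are all made up, but it shows why one burst of "out of the norm" requests within a window, rather than a single request, drives promotion.

```python
# Minimal sketch of a periodic (not real-time) tiering pass: count I/Os
# over a sampling window, then promote hot extents one tier up and demote
# cold extents one tier down. Hypothetical thresholds and tier names.

TIERS = ["ssd", "15k", "sata"]  # fastest to slowest

def retier(extents, hot_threshold=1000, cold_threshold=50):
    """extents: dict of extent_id -> {'tier': str, 'io_count': int}."""
    for ext in extents.values():
        idx = TIERS.index(ext["tier"])
        if ext["io_count"] > hot_threshold and idx > 0:
            ext["tier"] = TIERS[idx - 1]      # promote one tier
        elif ext["io_count"] < cold_threshold and idx < len(TIERS) - 1:
            ext["tier"] = TIERS[idx + 1]      # demote one tier
        ext["io_count"] = 0                   # reset for the next window
    return extents

extents = {
    "a": {"tier": "15k", "io_count": 5000},   # hot: promoted to SSD
    "b": {"tier": "15k", "io_count": 10},     # cold: demoted to SATA
    "c": {"tier": "sata", "io_count": 200},   # in between: stays put
}
retier(extents)
print({k: v["tier"] for k, v in extents.items()})
# → {'a': 'ssd', 'b': 'sata', 'c': 'sata'}
```

Because the pass only runs per window, a one-off streaming burst has to sustain itself across a whole sampling period before it displaces genuinely hot data.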
QoS is a factor for tiering: some (most decent) arrays give you the option of dictating which LUN gets which tier, such that financials get the SSD during EoM (end of month) and mixed tiers other times, while the file server gets SATA all of the time.
QoS can also be set so that a LUN gets the IOPS it requires, guaranteed (most of the time).
As for file tiering, it's almost always possible but not always practical: the DBA will get very shirty with you should you tier his DB files to SATA when they need 15k+ speeds. Sub-LUN block tiering, by contrast, is almost always practical but not always viable (Exchange 2010). In the DBA example, only some of the tablespace may be hot and can therefore live on SSD, whereas the rest may be cold and can move to more economical storage tiers.
The IBM Easy Tier is capable of varying the chunk size anywhere between 16MB and 1GB.
HP/3Par Adaptive Optimization uses 128MB blocks.
EMC FAST VP uses 1GB blocks on VNX/Clariion/Celerra but uses smaller chunks on the VMAX I think.
Having used auto-tiering arrays, I can say that while there may be "hot" data that should not be on fast storage, in general having auto-tiering is better than not having it at all.
For the most part, many of the arrays are able to track a usage pattern and determine over different lengths of time which tier data should belong to.
Take your example of a 2TB LUN where only 10% is hot: as you said, that's 200GB of data that is easily tiered up and 1800GB that goes down a tier or two. Now, if the overall IOPS requirement of the LUN is 3,500 IOPS, with 3,000 of that coming from the 200GB (10%) alone and the remaining 1800GB needing 500 IOPS, then tiering the 200GB up to SSD and the rest onto 15k and/or NL is the most economical use of the SSD storage.
But if you were to stick all of it on 15k RPM drives, you'd need (assuming a 50/50 read/write mix and RAID 5) about 58-60+ drives, plus the trays, connectivity, power and cooling to accommodate the IOPS.
If you stuck it on SSD only, you'd need just 1 or 2 drives to accommodate the IOPS, but depending on the size of the drives (picking 200GB (~190GB formatted) drives and RAID 5), you'd need about 12 very expensive drives for the capacity.
If you were outright daft enough to try and pull that off with SATA/NL-SAS drives (7.2k RPM), you'd be looking at 110 drives and the associated costs. (Assuming 50/50, RAID 5 again.)
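The back-of-the-envelope maths behind those drive counts can be reproduced as follows. The per-drive IOPS figures are rules of thumb I've assumed (~150 for 15k, ~80 for 7.2k, ~5,000 for SSD), and `drives_for_iops` is my own helper, not any vendor sizing tool; the key step is the RAID 5 write penalty of 4 back-end I/Os per front-end write.

```python
import math

# Rough drive-count estimate for the 3,500-IOPS / 2TB LUN in the example,
# assuming a 50/50 read/write mix and RAID 5 (write penalty of 4).

def drives_for_iops(front_end_iops, drive_iops, read_ratio=0.5, write_penalty=4):
    reads = front_end_iops * read_ratio
    writes = front_end_iops * (1 - read_ratio)
    backend = reads + writes * write_penalty   # RAID 5: 1 write = 4 back-end I/Os
    return math.ceil(backend / drive_iops)

print(drives_for_iops(3500, 150))    # 15k RPM @ ~150 IOPS → 59 drives
print(drives_for_iops(3500, 80))     # 7.2k SATA @ ~80 IOPS → 110 drives
print(drives_for_iops(3500, 5000))   # SSD @ ~5,000 IOPS → 2 drives (IOPS only)

# For SSD the capacity dominates: 2TB usable on 200GB (~190GB formatted) drives
print(math.ceil(2000 / 190) + 1)     # ~11 data drives + 1 parity ≈ 12
```

The outputs line up with the figures above: ~59 15k drives, ~110 SATA drives, and an SSD count driven by capacity (~12) rather than IOPS (1-2).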
Now if we kept it simple (I know it doesn't work this way, but just focusing on the 2TB LUN in the example) and said that 10% of the LUN is hot, 40% is warm and 50% is cold, we would be able to tier as follows:
1x 200GB SSD (3000+ IOPS)
2x 300GB 15k RPM (360 IOPS (~180 IOPS each))
2x 2TB 7.2k SATA (160IOPS (~80 IOPS each))
We'd be bang on at 3,520 IOPS, with the 200GB (10%) covered in the top tier and the rest filtered down to the lower tiers. (Again, just to be clear, I over-simplified the arrangement, as most if not all arrays would require the drives to be protected by RAID and therefore more drives.)
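The arithmetic in that mix checks out in a few lines (same rule-of-thumb per-drive IOPS figures as above, not vendor specs):

```python
# Sanity-check the tier mix: 1x SSD, 2x 15k, 2x 7.2k SATA against the
# 3,500 IOPS / 2TB LUN. Per-drive figures are rules of thumb.
tiers = [
    # (label, drive_count, iops_per_drive, gb_per_drive)
    ("SSD",       1, 3000,  200),
    ("15k RPM",   2,  180,  300),
    ("7.2k SATA", 2,   80, 2000),
]

total_iops = sum(n * iops for _, n, iops, _ in tiers)
total_gb = sum(n * gb for _, n, _, gb in tiers)

print(total_iops)  # → 3520, just over the 3,500 IOPS requirement
print(total_gb)    # → 4800 GB raw, easily covering the 2TB LUN
```

Five drives instead of 59 (all-15k) or 110 (all-SATA) is the whole economic argument for tiering in one number.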
As for File Tiering, most arrays with NAS capability and file servers attached to arrays for that matter can do file tiering quite easily with additional software/licenses/other, such as Enterprise Vault, EMC Rainfinity FMA, F5, etc. and a whole host of other systems out there, or even your own scripts if you're brave enough.
- Compellent's strategy is not to write to the top tier, but to the highest tier with available capacity, be that SSD, 15k/10k RPM FC/SAS or SATA, depending on the pool config. This is normally pretty good, unless the top tiers are already full.
- EMC FAST VP will write the block to its last designated tier, with the write absorbed by FAST Cache, such that if the block currently lives in the SATA/NL-SAS tier, writes are buffered in DRAM/SSD cache and de-staged to disk. If the data is in the SSD tier, it just writes straight through to the SSD. The advantage being that just because the data is on slow disk doesn't mean the writes have to be slow.
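That write path can be illustrated with a toy sketch (my own illustration of the idea, not EMC's implementation; the function and tier names are made up): writes to slow-tier blocks are acknowledged from fast cache and de-staged later, while SSD-resident blocks are written straight through.

```python
# Toy illustration of write handling in a tiered array with a fast cache:
# writes landing on slow-tier blocks are buffered and flushed later,
# writes to SSD-resident blocks go straight through. Hypothetical sketch.

def handle_write(block_tier, destage_queue):
    if block_tier in ("sata", "nl_sas"):
        destage_queue.append("dirty block")   # ack from DRAM/SSD cache now,
        return "buffered in SSD/DRAM cache"   # flush to slow disk later
    return "written through to SSD"

cache = []
print(handle_write("sata", cache))  # → buffered in SSD/DRAM cache
print(handle_write("ssd", cache))   # → written through to SSD
print(len(cache))                   # → 1 dirty block awaiting de-stage
```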
- XIV does do tiering, of sorts, in that if a disk holds multiple blocks of hot data, it will attempt to relocate some of them to less active disks, but it's still limited by the 7.2k drives in a sense. (Cache handles a bit of it well enough most of the time.)
- EVA does a similar thing to the XIV.
- Hitachi does tiering in a similar way to the EMC VNX/DMX/VMAX, though with a little more granularity.
- SVC/V7000: Barry's right on the size of the pages, but so is the AC who follows him. The V7000 also buffers data to cache.
Anyway, the point of tiering is a bit like paper documents:
- You have the most recent/urgent documents in the in/out tray on your desk. (SSD)
- Your recently used documents in the filing cabinet behind your desk. (15k)
- Your least used/older files go to the records room. (SATA)
- And if you want to go to the archiving tier, possibly an offsite records management facility. (Tape/other)
Each step down decreases the speed at which you can access said paper document.
So yes, tiering is good, useful and money saving - In most cases!