Just out of curiosity, what kind of small businesses generate 1TB or even 15TB of new data each month? Aside from video production work, I can't think of anything that qualifies. An uncompressed high-resolution multi-spectral aerial scan of Alberta's oilfields could get close to those figures, but every month?
Hold it! Don't back up to a cloud until you've eyed up these figures
Online data vaults are everywhere. On the small storage side, we have options such as Google Drive, Dropbox, and Teamdrive. My Synology NAS, Microsoft's upcoming Server 2012 suite and any number of virtual appliances can all back up bulk data to the cloud. The software side of things may be settled, but is this all truly …
-
Thursday 15th November 2012 11:49 GMT JimmyPage
Doesn't surprise me ...
Storage is cheap, processing is expensive (and slow). So if you are into *big* data analysis, you'll build massive cubes to cut down on the processing needed for slice-n-dice multi-dimensional reports.
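For anyone unfamiliar with the idea, here's a toy sketch of that kind of pre-aggregation in Python/pandas (the column names are made up for illustration):

```python
# Toy sketch of cube-style pre-aggregation (hypothetical column names).
# The point: aggregate once, overnight, so ad-hoc slice-n-dice reports
# hit a small summary table instead of re-scanning the raw facts.
import pandas as pd

raw = pd.DataFrame({
    "region":  ["EU", "EU", "US", "US"],
    "product": ["A", "B", "A", "B"],
    "month":   ["2012-10", "2012-10", "2012-11", "2012-11"],
    "sales":   [100.0, 250.0, 175.0, 90.0],
})

# Pre-aggregate every dimension combination into one "cube" table.
cube = raw.groupby(["region", "product", "month"], as_index=False)["sales"].sum()

# A slice-n-dice query is now a cheap filter, not a full scan.
eu_sales = cube[cube["region"] == "EU"]
print(eu_sales)
```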
A company I worked for could easily build a 10GB series of cubes overnight. Each one is unique, so no differential backup is possible.
-
Thursday 15th November 2012 12:49 GMT Buzzword
Re: Doesn't surprise me ...
If I were doing big data processing, I'd be tempted to shift the whole lot to the cloud. Just have a thin local client application, while the cloud machine crunches the numbers. I'd even be tempted to have a hosted Windows 7 desktop, network latency permitting.
Photographers and video editors are admittedly a special case. Though I didn't expect 60 GB per shoot!
-
Thursday 15th November 2012 13:27 GMT The Wegie
Re: Doesn't surprise me ...
Even a home user can generate vast amounts once you start playing around with video. I'm currently taking my recordings of this year's Tour de France and encoding them in mp4 so that my brain-dead Sony Bravia can play them off my NAS and I can have some space back on my DVR. The raw data files run at something like 160 GB!
-
Thursday 15th November 2012 11:54 GMT AllyourComputers
As a small business that offers IT support in the UK, I have two customer sectors that generate ridiculous amounts of data, making online backup simply not viable.
1. Professional Photographers - I have one client who reckons an average shoot is approx 60GB of photographs - and she had 90 appointments in the first three months of the year...
2. Professional Video Producers
Hard drives continue to grow at an almost alarming rate, yet online storage hasn't kept pace. It is a big problem for certain market sectors.
-
Thursday 15th November 2012 16:14 GMT Trevor_Pott
@allyourcomputers: I ran these numbers only with my photographer clients. The 3D render chaps and the video chaps produce so much it wasn't even worth doing the analysis. I knew the answer before I started.
Let's not even start with the medical imaging folks, the mass spec labs, the geologists...
-
Thursday 15th November 2012 13:33 GMT Rampant Spaniel
As a photographer, the maths run something like this. Per billed hour of shooting:
200-500 shots, depending on the length and type of shoot. A 90-minute wedding will have way more shots per hour than a 4-hour studio shoot. RAWs (leaving film scans aside) run about 30MB each, so let's say somewhere around 12GB of raw files.
Edited shots (lossless compressed TIFFs) add another 5-7GB.
JPG deliverables add another 2GB, so near enough 20GB per hour of shooting.
Working on 4 hours shot per day, 5 days per week (on average over the year), it's 80GB a day, 400GB a week and something like 1.7TB a month.
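A quick back-of-envelope check of those figures (using the midpoints above):

```python
# Back-of-envelope check of the per-hour and per-month figures above.
shots_per_hour = 400                    # midpoint of the 200-500 range
raw_gb  = shots_per_hour * 30 / 1000    # ~30MB per RAW file -> ~12GB
tiff_gb = 6                             # midpoint of the 5-7GB edited TIFFs
jpg_gb  = 2                             # JPG deliverables
per_hour = raw_gb + tiff_gb + jpg_gb    # ~20GB per billed hour

per_day   = per_hour * 4                # 4 hours shot per day -> 80GB
per_week  = per_day * 5                 # 5 days per week      -> 400GB
per_month = per_week * 52 / 12          # ~1.73TB a month

print(f"{per_hour:.0f}GB/hour, {per_day:.0f}GB/day, "
      f"{per_week:.0f}GB/week, {per_month/1000:.2f}TB/month")
```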
I tried 'cloud backup' years ago, and when my primary drive died they said I couldn't restore over the net and had to send me a $300 DVD or something like that. I just used my own slightly older local backup.
These days I buy external drives from Costco. I get them in sets of three, all duplicated across each other: one stays at home, one goes to a storage unit (climate controlled; every six months or so I power up the drives and check they're OK) and one goes to a family member's house. The cost of three drives is around 400-450 dollars. USB 3 is quick; even driving to get a backup is far quicker than downloading. All JPGs also get uploaded to SmugMug. Chances are that if something happens bad enough to kill all three backups, the SmugMug account and my local RAID array, I will have other things to worry about.
God only knows what videographers need storage-wise. I know a lot of the locals here run between 50 and 250Mbps using at least two rigs; they must chunk through some serious storage!
-
Thursday 15th November 2012 11:19 GMT McVirtual
Hmmmm
'Some' good points here. This is why it is important to understand your Cloud Service Provider's (CSP) Safe Harbour and EU data protection policies, amongst other things.
However, there are always three factors in a decision-making process of this type: cost, risk and service.
This article addresses some of the raw costs, but as your bag-of-salt statement implies, there are many other intangible costs, and other non-functional costs that need to be included.
Needless to say, these are all re-baselined when we look at the corporate space and begin to leverage other IT commodities, bandwidth and so on.
-
Thursday 15th November 2012 11:20 GMT Pete 2
Backup's not the problem
Restoring it all is the problem.
Sure, for a home user the "A" part of ADSL means you can (in theory, at least) pull data back off your cloudy storage faster than you can push it up there. But try restoring the worst case, a whole 1TB data set in one go, and see how far you get. Even with a 50Mbit/s fibre connection you're talking 2½ DAYS to restore, assuming you can get full speed the whole time (and don't run into data caps). If you're using flaky backup/restore software, you could find that a break in the connection means you have to start again.
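For anyone who wants to run the numbers themselves, a rough sketch (raw line-rate arithmetic only; real-world protocol overhead, throttling and retries push the result toward the figure above):

```python
# How long does a restore take at a given line speed?
def restore_days(data_tb: float, link_mbps: float) -> float:
    bits = data_tb * 1e12 * 8          # decimal TB -> bits
    seconds = bits / (link_mbps * 1e6)
    return seconds / 86400

print(restore_days(1, 50))    # ~1.85 days at a flat-out 50Mbit/s
print(restore_days(1, 8))     # ~11.6 days on a typical 8Mbit ADSL line
```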
So, the best you can hope for, if you're running a business is that it'll be half a working week before you can get 1TB of stuff restored. How does that fit into your DR plan? That's assuming the plan works - and almost NONE of the DR plans I've seen have ever been tested in a "fire practice" situation.
So far as backups go, store stuff off site - that's just sensible. But remember that no network has the same bandwidth as a van full of DVDs.
-
Thursday 15th November 2012 13:44 GMT PyLETS
wrong way around
If the sensible place to keep your data is where you process it and where your users can best access it, then keeping the main copy on a hosted server in a datacentre with professional operators, high-speed multiple routed links close to the Internet backbone, secured power supplies and rigorous physical access controls makes more sense than keeping it where I am. So in my situation the data is processed in the so-called "cloud" and backup occurs over the faster side of my ISP link, i.e. my download bandwidth. It also makes sense to automate it, encrypt it, and only download the differences. Rsync, SSH and cron are my friends here.
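A minimal sketch of the sort of job cron might fire nightly (host and paths are hypothetical; it assumes rsync is installed and SSH keys are already set up):

```python
# Nightly pull of only the differences from a hosted server over SSH.
import subprocess

cmd = [
    "rsync",
    "-az",                             # archive mode, compress in transit
    "--delete",                        # mirror deletions too
    "-e", "ssh",                       # tunnel over SSH
    "user@example.com:/srv/data/",     # remote master copy (hypothetical)
    "/backup/data/",                   # local mirror (hypothetical)
]
subprocess.run(cmd, check=True)        # raises if the transfer fails
```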
I guess the exception described in the article is the seemingly legacy business model where most of the access and nearly all of the users are local to the site where you work. I guess that way around still applies in some internal, data-heavy environments, as opposed to businesses where the bulk of input and output relates to external rather than internal relationships.
-
Thursday 15th November 2012 18:32 GMT Anonymous Coward
Re: Back up?
Probably the best point in the article-plus-comments I've read so far.
Cloud-stored data is in somebody else's hands. You don't even own a "thing." It doesn't matter how big the name is: who ever thought Woolworths would go broke?
Anyone who uses cloud for primary data storage needs to fail their own security audit.
-
Thursday 15th November 2012 11:54 GMT b166er
I saw a briefing once, where a local CDP unit did dedupe and cloud sync to an identical CDP unit.
In the event of a catastrophe, the firm would take a complete image from the remote CDP unit onto a new CDP unit, then courier you that unit.
Can't remember the name of the firm, but I thought at the time it was the best possible solution if you want your data off-site in a cloud.
-
Thursday 15th November 2012 11:54 GMT Anonymous Coward
I would question the concept "back up"
By all means use the cloud as a resource, but it shouldn't be the be-all and end-all of your business continuity/DR planning. Quite aside from the technical risks, you're a hostage to other countries' political machinations (looking at you, PATRIOT Act merkins).
One interesting trend is companies using the cloud to avoid investing in their own network infrastructure, especially with outfits now offering a private cloud overlaid on the public one.
-
Thursday 15th November 2012 12:00 GMT petur
Personal Cloud
A much better solution is to create a personal cloud, easy to do with a couple of NAS boxes. You can put them next to each other for the initial sync, then drive them over to another location and do differential backups to them. When a restore is needed, go fetch the NAS and bring it on site for a speedy restore.
Another solution (if data is limited) is to use external drives (USB 3 or eSATA recommended) and have a rotating set of those. Make them encrypted (my QNAP NAS will do that for me) and just store them at a location you visit frequently (home/work is a nice combination). Can't go any cheaper for the reliability given!
-
Thursday 15th November 2012 14:47 GMT Random K
Re: Personal Cloud
Storing backups at home? Doesn't sound like good access control (in the classic paper accounting sense). What happens when the employee doing this gets canned or the company owner keeping backups is suspected of fraud? A safe deposit box at your local bank branch is a much better place for those external drives, and they're often either free or cheap-as-chips with a business account. That way you have a record of who has accessed the drives that is kept by a disinterested third party. It also comes with the advantage of being able to send a minion to get the drive (then directing the whole thing remotely) if you happen to be out of town when things go titsup. In the SMB space there is usually a good chance that IT is a one man/woman show.
-
Thursday 15th November 2012 12:03 GMT Steady Eddy
Cloud = clown
Doesn't matter what your SLA says, if a bloke in a JCB digs up the cable outside your building, you're stuffed.
And you can jump up and down and escalate it with your Account Manager at Lowest Bidder plc, but the time to rectify is however long it takes for someone's subcontractor's subcontractor to splice the fibre back together.
-
Thursday 15th November 2012 12:14 GMT JimmyPage
where managers earn their money
Circuits and lines are assessed as part of a business's BCP plans. If you have predicated your business on a single telecoms provider and circuit, then it needs to be flagged as an issue and either rectified (get a second supplier and circuit) or given a compensating control (which may be to run your entire internet pipe through a 3G dongle). Our BCP has an off-site war room set up with a third party (Sungard) where essential staff would be transferred in the event of a building becoming compromised (i.e. no internet access).
BCP/DR is a serious business - getting it wrong can result in going under.
-
Thursday 15th November 2012 13:45 GMT Anonymous Coward
Re: where managers earn their money
If only that were hypothetical - I've seen it happen several times over the years.
Clue: if you want real diversity you must specify it to your providers and make sure it's coming in on different sides of the building. It really is a case of: if you don't ask, you probably won't get it.
-
Thursday 15th November 2012 15:00 GMT Anonymous Coward
but
It's all very well specifying it to your providers. But what do you do when they ignore you? I had a SQL cluster fall over once. When I asked our hosting company why the standby machine failed, they replied (unfortunately for them, in an email) that it was on the same power bus. This was despite their written assurances to me that they split machines over data centres and power grids.
The first rule of planning for disaster is to distrust everything and everyone.
So despite specifying separate circuits, it would be as well to factor in a total loss of connectivity. As I have suggested before, possibly moving buildings, if it's that critical to you.
Remember, "disaster" can come in many forms. One company I knew lost two days because their head office was sealed off after a murder in the park next door. I am sure their connectivity was 100% available throughout.
-
Thursday 15th November 2012 12:48 GMT Scarborough Dave
We just had an adventure on the cloud
Someone nicked the copper from outside the provider's site (which was in fact fibre) and we had no contact with our cloud services and media for three days.
You also have to rely on the provider's availability and recovery plans, which may not be as robust as they say: until you see them in action yourself, you should not rely on what you are told in some salesman's spiel or an advert.
Backups we do by Wi-Fi to a remote location (it helps if you know other businesses in the area with similar issues), along with taking physical backup discs home with us (all encrypted, etc.).
-
Thursday 15th November 2012 14:09 GMT technohead95
It's worth noting that not everything you back up to the cloud is something you need instant access to. For that there is Amazon's Glacier service, which allows you to archive data. You get significantly cheaper costs compared to S3 but lose instant access: retrieval takes a few hours (which might be fine for rarely needed data).
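For illustration, a minimal sketch of archiving to Glacier with Amazon's Python SDK, boto3 (the vault name and file are hypothetical, and the vault must already exist):

```python
# Archive a file to an existing Glacier vault and keep its archive ID.
import boto3

glacier = boto3.client("glacier")
with open("archive-2012-11.tar.gz", "rb") as f:
    resp = glacier.upload_archive(
        vaultName="monthly-backups",            # hypothetical vault
        archiveDescription="November archive",
        body=f,
    )
# Keep the archive ID: you need it to request a (slow) retrieval job later.
print(resp["archiveId"])
```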
-
Thursday 15th November 2012 16:36 GMT Anonymous Coward
Law enforcement is a risk for the cloud too.
What I consider the most compelling reason to be careful about what you're doing: in the cloud you're using a virtual instance which runs on one or more shared physical servers. When the Feds (or any other police force) suspect foul activity, they usually get warrants to inspect, investigate or confiscate an entire server.
Very nice if that server happens to be running a dozen virtual clients on top of it and one of them is yours.
-
Thursday 15th November 2012 17:23 GMT Beachrider
On 1 TB/month...
I have some questions:
1) If the cable link is 'limited' to 1TB a month and you back up 21 times a month, then each backup is 'limited' to ~48GB, no?
2) If Amazon is limiting you to twin self-managed 500GB cloud disks, don't you need to manually groom the disks to retain restore points from several parts of the month? How does that get done?
3) Does Amazon provide 'point in time' images of your cloud disks?
...Just because we run into these concepts with internal backup/recovery scenarios...
-
Thursday 15th November 2012 19:16 GMT Trevor_Pott
Re: On 1 TB/month...
@beachrider any decent "cloud backup" solution backs up from your local stuff to a "buffer" appliance, dedupes the ever-living crap out of it, then fires the blocks up to the cloud. S3-aware setups can just keep spinning up new instances of storage in 500GB increments and filling them with blocks as needed. Amazon's "backup" offering is called Glacier, and is offline tape managed by their robot.
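A toy sketch of the block-dedupe idea (fixed-size chunks and a hash index; real appliances are far cleverer about chunk boundaries):

```python
# Hash fixed-size chunks; only ship chunks the store hasn't seen before.
import hashlib

CHUNK = 4 * 1024 * 1024   # 4MB blocks (size is arbitrary here)
seen = set()              # stand-in for the buffer appliance's index

def blocks_to_upload(path):
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in seen:    # new block: ship it to the cloud
                seen.add(digest)
                yield digest, chunk
            # known block: store only the reference, upload nothing

# Usage (hypothetical file): iterate and upload each unique block.
# for digest, chunk in blocks_to_upload("backup.img"): ...
```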
-
Saturday 17th November 2012 21:39 GMT Infernoz
Build and use FreeNAS 8.3 boxes, over the costly and flawed file systems in off-the-shelf NAS
Yes, cloud capacity and ISP capacity are a joke for backup, and I've seen 80Mbit UK fibre regularly get congested. I would also worry about data corruption if I was not sure the data was stored in a ZFS parity RAID! The worst problem is actually the latency and data transfer time: this will be at least an order of magnitude slower than local or stored backup disks, and that is the main fatal flaw for cloud backup, and Wi-Fi backup too!
RAID 1 is not good enough, because standard RAID 1 won't protect you from hidden in-line or on-disk corruption; see http://en.wikipedia.org/wiki/ZFS
DVDs are not reliable; I have seen so many unreadable discs and so much corruption that I stopped burning DVDs years ago. Even a cheap USB flash stick or a cheap bare USB hard disk is better!
FreeNAS includes support for:
* Commodity and some high-end PC hardware
* 32-bit and 64-bit CPUs
* trivial installation, given it runs off a small flash stick of at least 2GB.
* config backup/restore via a web browser
* OS updates via a web browser, with digest check
* scheduled snapshots (with configured timeout) for each ZFS Dataset, so that multiple earlier snapshots can be viewed _LIVE_, which is even better than differential backups, especially if the Dataset is used directly for storage.
* scheduled push or pull replication and rsync, and scheduled backup.
ZFS v28 includes support for:
* 128-bit storage addressing, so only limited by hardware!
* multiple software RAID models
* single, double or triple parity RAID!
* ZFS Datasets, so none of this stupid fixed-size partition nonsense
* concurrent transactional filesystem processing, rather than the common unsafe logged filesystem processing, so the filesystem can never be corrupted
* heavy-duty data corruption detection and repair (a toy sketch of the checksum idea follows this list)
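To illustrate the corruption-detection point, here is the checksum-on-read concept in miniature (this is just the idea, not ZFS's implementation; ZFS does it per block inside the pool and repairs from mirror/parity copies):

```python
# Store a checksum with every block and verify it on read.
import hashlib

def write_block(data: bytes):
    return data, hashlib.sha256(data).digest()

def read_block(data: bytes, checksum: bytes) -> bytes:
    if hashlib.sha256(data).digest() != checksum:
        # ZFS would now repair from a good mirror/parity copy
        raise IOError("silent corruption detected")
    return data

block, ck = write_block(b"payload")
assert read_block(block, ck) == b"payload"
```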
So you could have a primary FreeNAS box and keep swapping one or more slave FreeNAS boxes, so you are never without a backup.
See:
http://www.freenas.org/
Plenty of features.
Loads in the manual.
One happy FreeNAS user :)
-
Monday 19th November 2012 16:13 GMT katsnelson
Why not use Amazon Import/Export instead of upload/download?
Working on Big Data, we routinely have to transfer terabytes into the cloud, and we would not use upload for anything over a terabyte. Amazon offers an Import/Export service which allows you to send physical media (think cheap SATA drives); for $80 per disk they will import it for you and return the disk to you. For large volumes of data it is the only way to go.
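A rough comparison of upload time versus shipping (the figures are illustrative; only the $80/disk fee comes from the post above):

```python
# When is shipping a disk faster than uploading?
def upload_days(tb: float, mbps: float) -> float:
    return tb * 8e6 / mbps / 86400   # TB -> megabits, divided by line rate

for tb in (1, 5, 10):
    print(f"{tb}TB at 100Mbit/s: {upload_days(tb, 100):.1f} days uploading; "
          "courier + import is a fixed few days, plus $80 per disk")
```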
Working on Big Data, we routinely have to transfer terabytes in to the cloud and we would not use upload for anything that is over a terabyte. Amazon offers Import/Export service which allows you to send physical media (think cheap SATA drives) and for $80/disk they will import it for you and return the disk back to you. For large volumes of data it is the only way to go.