
Meh.
Gluent will be out of business within 2 years.
Everyone wants to have an easy way to pull data from silos into RDBMSs.
It's pretty easy if you know what you're doing.
Three storage startups have hoved into view: Gluent, Tachyum and VAST Data. Gluent is a data silo virtualiser, linking RDBMS to Hadoop data for easier and simpler data analysis. Tachyum aims to, it appears, defeat Moore's Law by devising a new data processing architecture, and VAST Data, the stealthiest of the three, must be …
Ha, you're not grasping the concept. While it's relatively easy to extract, load and duplicate your data between all the silos that generate and consume the data - you will end up with dataset duplication and a lot of ETL overhead (a jungle of data feeds). Our customers have application constellations where the same large dataset (that all app modules in different databases need to use) is duplicated over 20x. This is 20x storage cost, 20x ETL overhead and you will still not see the latest state of your data unless you set up & run real-time replication to all those 20 databases.
Gluent's approach allows you to not duplicate data in RDBMS & SAN storage at all, just keep it in Hadoop once and consume it with native SQL queries of your database - on demand, on the fly - without having to first load & duplicate data anywhere. And that's just part of the story - now that the bulk of your incoming (or offloaded) data resides in Hadoop, we can also transparently push the related processing down to Hadoop, close to the data, radically reducing the CPU licenses and SAN storage throughput needed for your RDBMS. Exporting/importing files around is not data virtualization.
See you in 20 years! :-)
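For anyone following along, the "push the processing down, close to the data" idea in the reply above can be sketched in a few lines of Python. This is only a toy illustration, not Gluent's implementation: `RemoteStore`, `query_without_pushdown` and `query_with_pushdown` are hypothetical names invented here, and an in-memory list stands in for the Hadoop side.

```python
# Toy illustration of filter pushdown. All names here are hypothetical;
# an in-memory list stands in for data that lives in a remote store.
from dataclasses import dataclass


@dataclass
class RemoteStore:
    rows: list
    rows_shipped: int = 0  # counts rows sent back to the caller

    def scan(self, predicate=None):
        """Return rows, applying the predicate at the source if one is given."""
        result = [r for r in self.rows if predicate is None or predicate(r)]
        self.rows_shipped += len(result)
        return result


def query_without_pushdown(store, predicate):
    # Pull every row across, then filter on the database side.
    return [r for r in store.scan() if predicate(r)]


def query_with_pushdown(store, predicate):
    # Ship the filter to the store; only matching rows come back.
    return store.scan(predicate)


rows = [{"id": i, "region": "EU" if i % 10 == 0 else "US"} for i in range(1000)]


def is_eu(row):
    return row["region"] == "EU"


naive = RemoteStore(rows)
pushed = RemoteStore(rows)
result_naive = query_without_pushdown(naive, is_eu)
result_pushed = query_with_pushdown(pushed, is_eu)

print(naive.rows_shipped, pushed.rows_shipped)
```

Both queries return the same rows; the difference is how many rows have to travel from the store to the database, which is the cost the reply argues pushdown reduces.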
No, it's you who doesn't understand.
I posted this Anon because I happen to know a wee bit about this problem and how to solve it.
What Gluent wants to do will not work the way they think it will.
If you wanted a large clustered RDBMS, IBM had it when they acquired Informix, but they killed it off in favor of DB2. The advent of Tez and its related technologies, along with Drill and Impala, shows why you need to look outside of Hadoop's core to a distributed query engine. And those will only scale to a point, and even there, you don't want to do OLTP.
And didn't you get the memo? Storage is cheap and getting cheaper.
Clearly you haven't worked with Hadoop or RDBMSs long enough to understand the problem.