Inside the Hekaton: SQL Server 2014's database engine deconstructed

It's 1996 and Mission: Impossible has just arrived on cinema screens. RAM is $10 per megabyte and falling. Against this backdrop, Microsoft has quietly commenced its own seemingly impossible mission: a complete rewrite of a little-known database which will be released two years into the future, and will be known as SQL …

COMMENTS

This topic is closed for new posts.
  1. Richard Wharram

    Data Warehouse customers

    Aren't they supposed to be using PDW rather than SQL Server?

    1. Jon Reade

      Re: Data Warehouse customers

      Hi Richard,

      The Parallel Data Warehouse (PDW) is a data warehousing appliance based on SQL Server. There's a short, high-level overview article on TechNet at:

      http://blogs.technet.com/b/dataplatforminsider/archive/2012/11/09/seamless-insights-on-structured-and-unstructured-data-with-sql-server-2012-parallel-data-warehouse.aspx

  2. Steven Raith

    Hekaton?

    So does that mean the desktop versions of this (the old SQL Express thingies) will be called Waggleton P. Tallylicker?

    Apologies.

    Steven R

  3. Anonymous Coward

    The changes would be useful for us

    We use Sybase in a performance environment where every database access is counted (and yes, we've employed solutions where the most accessed data is on flash), so Microsoft's improvements look good. Problems? It would mean migrating a lot of legacy software from Solaris to Windows, I think (suspect having the database software on a separate box would cause issues of its own). I think Windows is the single biggest barrier to Microsoft increasing its market share.

    I'm not surprised Microsoft seems to have dumped replication. Sybase replication struggles at the transactions per second we require, and we often face problems external to us when dealing with geographic redundancy anyway (over African infrastructure). Of course, this is Telecoms, where Windows Server is only ever accepted for serving media (and Linux is eating into that). What we do is rarely elegant/often expedient. Still, if Sybase were to match what Microsoft have done, we'd find a use for these features.

    1. Anonymous Coward

      Re: The changes would be useful for us

      You should look at Sybase In Memory Database option which launched some years ago. It's at least as good as what MS have done.

      You might also want to look again at Rep Server as it normally does very well with high transaction loads. Have you tried replicating the SQL instead of the traditional converting of the log records to inserts/updates/deletes? Have you looked at the option to use bcp when it sees lots of inserts?

      1. Anonymous Coward

        Re: The changes would be useful for us

        Hi.

        I believe we've looked at the various options. On the Sybase side, the insert rate is relatively low compared to the update rate, so we've never had to consider bulk loading. Ultimately, for us I suspect the best mix of risk and reward comes from constant monitoring and massaging - problems are often fixed before the customer realises they need to call us.

  4. Małopolska

    A pinch of salt

    This article is clearly written by someone very enthusiastic about SQL Server, which is fine, but when summarising new features that haven't yet been used much in anger, a lot of important detail and caveats get lost.

    A few specific comments:

    MVCC has been implemented in Oracle for over a decade, so I don't see how it's a technique that's only been enabled by storing data in memory instead of on disk. Unless SQL Server's implementation is different, it doesn't eliminate locks: it just means that readers are not blocked by writers. Also, the comment about blocking and locking being a major bane for DBAs describes something that has only caused me significant pain when working with SQL Server. On Oracle, Informix and, in my limited experience, DB2, this is generally not the case. Others' experience may vary, of course.

    The in-memory database sounds severely restricted, and I can't see that many existing applications would be suitable for migration to it, even partly, unless things like constraints don't matter to you. In most organisations DBAs, at least production support DBAs, are unlikely to drive adoption of this feature.

    Column stores will always be better suited to warehouses. While having them in a primarily OLTP database can be advantageous, it is usually report-style queries that see the significant speed-ups.

    Using SSD as a secondary buffer cache is nice, but many SANs effectively offer this already. Is this feature needed at the RDBMS level? The author is also in danger of giving the impression that in a typical OLTP system most reads go to disk (I am sure he must know this isn't generally true). It is usually possible to satisfy over 99% of reads from the database buffer cache.
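To put a rough number on that hit-rate point, here's a quick sketch of effective read latency at different buffer cache hit rates. The latency figures are illustrative assumptions for DRAM and spinning disk, not measurements from any particular system:

```python
# Effective read latency for a buffer cache in front of spinning disk.
# Latency figures below are illustrative assumptions, not benchmarks.
RAM_READ_S = 100e-9   # ~100 ns for a buffer-cache (DRAM) hit
DISK_READ_S = 5e-3    # ~5 ms for a random read from spinning disk

def effective_latency(hit_ratio: float) -> float:
    """Average read latency given a cache hit ratio in [0, 1]."""
    return hit_ratio * RAM_READ_S + (1.0 - hit_ratio) * DISK_READ_S

# Even at a 99% hit rate, the 1% of misses dominates the average:
print(f"99%   hit rate: {effective_latency(0.99) * 1e6:.1f} us")
print(f"99.9% hit rate: {effective_latency(0.999) * 1e6:.1f} us")
```

At 99% the average read still costs about 50 microseconds, which is why pushing the hit rate (or the miss cost, via SSD) matters so much.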

    About the author’s summary, all the database vendors are looking at what the others are doing and integrating similar features into their products. There isn’t a feature listed, save perhaps the secondary buffer cache, which is not available from another vendor today in some form or other.

    1. Tony Rogerson

      Re: A pinch of salt

      Multi-Version Concurrency Control (MVCC) is indeed implemented without any locking; it uses a compare-and-swap (CAS) operation and timestamp-based versioning. Versions remain in memory for as long as an existing transaction (in snapshot isolation) requires them.

      SQL Server's implementation is indeed different: look up the BwTree.
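For readers unfamiliar with the idea, here is a toy sketch of CAS-installed version chains with snapshot reads. It is illustrative only: Python has no hardware CAS, so a lock stands in for it, and none of this reflects SQL Server's actual BwTree internals.

```python
import threading
from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    value: object
    begin_ts: int                      # timestamp of the writing transaction
    next: Optional["Version"] = None   # link to the older version

class MVCCCell:
    """Toy version chain: writers install new versions at the head via a
    (simulated) compare-and-swap; snapshot readers walk the chain to the
    newest version visible at their snapshot timestamp."""

    def __init__(self, value, ts=0):
        self.head = Version(value, ts)
        self._lock = threading.Lock()  # stands in for a hardware CAS

    def _cas_head(self, expected, new):
        # Simulated CAS: swing the head only if nobody beat us to it.
        with self._lock:
            if self.head is expected:
                self.head = new
                return True
            return False

    def write(self, value, ts):
        # Optimistic write: build the new version, retry if the CAS fails.
        while True:
            old = self.head
            if self._cas_head(old, Version(value, ts, next=old)):
                return

    def read(self, snapshot_ts):
        # Snapshot-isolation read: skip versions newer than our snapshot.
        v = self.head
        while v is not None and v.begin_ts > snapshot_ts:
            v = v.next
        return v.value if v is not None else None
```

A reader with snapshot timestamp 3 on a cell written at ts 1 and then ts 5 sees the ts-1 value; the ts-5 version stays in memory for as long as some snapshot could still need its predecessor.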

      The thing you are missing is that SQL Server is moving to the cloud, and the cloud (certainly AWS and Azure) is not built on SANs; it's built on commodity servers with commodity storage, with software handling data replication for fault tolerance and distribution.

      In the cloud space buffer pool extensions and in-memory OLTP can be a real help in mitigating the latencies with commodity spindle storage.

      The industry is moving away from SANs, certainly in the SQL Server space, and that move is only going to accelerate in the years to come; look at what Violin have done: the embedded version of SQL Server runs as an appliance with their flash solution.

      1. Małopolska

        Re: A pinch of salt

        Thank you for your reply. Having read (most of) a Microsoft Research paper on the BwTree and re-read the article here, I've realised that the authors are talking about the underlying implementation of MVCC, which was not immediately obvious to me before.

        1. Jon Reade

          Re: A pinch of salt

          Yes, it's worth reading about; the implementation is that rare thing, intelligent simplicity. Again, I'd point interested readers to Tony Rogerson's presentation on Hekaton if you're looking for a more in-depth understanding of how things are implemented. It's fascinating and answers a lot of the questions everyone has when they first encounter this technology, the authors included.

  5. Slawek

    SQL Server's revenue market share is growing. It is also the fastest large database (check the TPC benchmarks).

    1. Jon Reade

      "SQL Server revenue market share is growing. It is also fastest large database (check TPC benchmarks)."

      Hi Slawek,

      SQL Server's market share is, I believe, now over 50% of corporate database platform purchases, though I can't recall if this is by volume or by dollar sales.

      With regards to the TPC benchmarks, I think these are a good indicator of performance, though it's essential to look at those particular tests that match the workload that you intend to present to the platform. Ultimately, all of the major platforms are good, relatively stable pieces of software that have exceptional performance. However, a purchasing decision, even on a greenfield project, will be influenced by factors such as available skill set, legacy investment in other supporting technologies and the requirement for supporting specific applications or development requirements.

      Although I pinned my colours to SQL Server's mast many moons ago, I think it's important to recognise it may not be the database of choice, even if it is the fastest, for all shops, no more than Oracle or DB2 may be. But from a production and operational cost point of view, I'd consider any one of these platforms in preference to a less mature platform - the licensing costs are often dwarfed by the installation, ongoing maintenance and overtime costs incurred in a less reliable product, a criticism I would have equally applied to SQL Server back in 1996.

      In my opinion, where Microsoft definitely score bonus points is in the security of their product (not praise it could once have been considered worthy of), the tightly integrated development tools, and of course the ETL, OLAP and reporting products that ship with it out of the box, which can often be a costly additional purchase with other platforms. But equally, I do think Microsoft have to watch their licensing policy, as it's causing much consternation and consolidation out here in the real world, and it's causing it right now.

      Jon.

  6. Ian Ringrose

    -> “SQL Server's in-memory database engine is fully integrated into the core product, and at no additional cost”

    WRONG: it costs a lot more, unless you need the most expensive edition of SQL Server for some other reason.

    1. Jon Reade

      Hi Ian,

      Point conceded, I personally deal with mostly Enterprise Edition installations, and we should have pointed that out. Thank you for your correction, noted.

      Jon Reade.

  7. Ken Hagan

    That million-fold difference.

    "Data retrieval latency is orders of magnitude slower than memory. We're talking milliseconds compared to nanoseconds, a million-fold difference."

    Good luck getting nanosecond latency out of the terabyte-sized memory mentioned in an earlier paragraph.

    On a CPU running at a few GHz, you'll get nanosecond latency out of your L1 cache. By the time you are hitting DRAM or flash, the latency is more like a microsecond. You've lost at least two of those orders of magnitude, maybe three. On the other hand ... that still knocks seven kinds of shit out of a disc. Back on the first hand, a decent disc cache subsystem will have delivered most of that performance already, even on DBs that are slightly too large to live entirely in memory.
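The orders-of-magnitude arithmetic in the paragraph above can be checked in a few lines. The figures here are rough illustrative assumptions, not benchmarks:

```python
import math

# Rough, illustrative latency assumptions (not benchmark figures).
latencies_s = {
    "L1 cache": 1e-9,       # ~1 ns
    "DRAM": 100e-9,         # ~100 ns
    "flash": 100e-6,        # ~100 us
    "spinning disk": 5e-3,  # ~5 ms random read
}

disk = latencies_s["spinning disk"]
for name, t in latencies_s.items():
    orders = math.log10(disk / t)
    print(f"{name:13s}: {orders:.1f} orders of magnitude faster than disk")
```

On these assumptions, L1 is about 6.7 orders of magnitude faster than disk, but DRAM only about 4.7, which is the "lost at least two orders of magnitude" point.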

    So it will be interesting to see if this actually makes any measurable difference.

  8. david 12

    The migration wizard ... will no doubt improve in time

    This surprised me. What is the justification for believing that the migration wizard will be improved?

    PS: I won't say that no-one had ever heard of Sybase before MS bought the product. Just that it was a minority.

  9. DJ

    Greed is good. At making you stupid.

    Perhaps the main reason for SQL Server's growth was its affordability.

    Now, the same brain trust that forced TIFKAM onto a server operating system (Windows Server 2012) feels they can go head-to-head with Oracle on price. Many firms using SQL Server are in for an eye-watering surprise when they receive their next bill. Not a small increase, not a modest increase, but a jaw-dropping price increase. At least one shop I know of is working on replacing SQL Server with PostgreSQL; they simply can't afford to continue using SQL Server.

    Sadly, these same geniuses (at Microsoft) didn't add value corresponding to the price increase. Hekaton and friends are nice but limited, incremental improvements, and far from justifying the pricing of the new core-based licensing model.

    Expect more defections as the new reality sinks in.

    1. Anonymous Coward

      Re: Greed is good. At making you stupid.

      I have passed a kidney stone... the 2014 price increases brought similar pain.

    2. Jon Reade

      Re: Greed is good. At making you stupid.

      DJ - agreed. I've seen clients suffer sharp price increases in their SQL Server estates, and more will do so over the next 12-24 months as they upgrade. It's still cheaper than the competition by a long way, but I'm very concerned that Microsoft will kill the goose that laid the golden egg. I'm equally critical of their pricing policy, and of the feature set that is being gradually eroded, or left to stagnate, in SQL Server Standard Edition.

