The Great Graph Debate: Revolutionary concept in databases or niche curiosity?

Welcome to the latest in our series of Register Debates, in which writers discuss technology topics, and you the reader choose the winning argument. The format is simple: we propose a motion, the arguments for the motion will run this Monday and Wednesday, and the arguments against on Tuesday and Thursday. During the week you …

  1. Steve Button Silver badge

    It's the wrong question

    "MOST OF THE SAME USE CASES"

    That would mean that it's possible that up to 49% of use cases might be better off using Graph DBs, but the motion would still pass.

    It's an interesting discussion, but I would have thought "ALMOST ALL" instead of "MOST" would have been a better question (and probably even then should still pass).

    That's not to say that they might not have a place, but I'm just a DevOps / Sysadmin and the data architects get to make these types of decisions. I've always suspected that often they just want to play with the latest "shiny", but I don't know enough about this to be sure. Which is why I find this an interesting debate.

    More like this please.

  2. elsergiovolador Silver badge

    Performance

    Back in the day, the 1.0 of software would have this code:

    void doSomethingImportant() {
        for (;;) {
            // doing some important stuff
            ...
            // then:
            Sleep(10); // comment out for version 1.1
            Sleep(10); // comment out for version 1.5
            Sleep(50); // comment out for version 2.0
            Sleep(50); // comment out for version 3.0
        }
    }

    Then the press release would come out "Our new version 1.5 offers improved performance over 1.1. Please buy."

  3. Steve Channell
    Facepalm

    Graph is a view, often stored as relations

    There’s no getting away from the fact that “Graph Databases” were woefully inadequate for the “Graph Database” problem of our time: Pandemic contact tracing.

    The great thing about a graph model {Node, Edge, Property} is that it can present all information as a common model that aggregates “Fred buys Car from Jane” as edges between {Fred, Car, Jane} without the complexity of pretending that a Person and a Product are somehow the same (both being nodes) or complex rules to prevent purchase of people.

    Regardless of {Graph, Relational, Object, Network, Key/Value, etc} storage, data is manipulated in memory as graphs (e.g. Bill-of-Parts) – as computer memory gets bigger, storage-organisation becomes less important.

    The best use-case for graph databases is scenarios where you don’t understand the data – it might be old-fashioned, but understanding is still important.

    1. katrinab Silver badge
      Meh

      Re: Graph is a view, often stored as relations

      Obviously how you store the information depends on why you are storing it, but maybe the car is the relationship between Fred and Jane?

      Or maybe you don’t have a graph problem, and therefore this just isn’t the right tool for the job?

      1. Steve Channell
        Unhappy

        Re: Graph is a view, often stored as relations

        RDF purists assert that everything is a {subject -> predicate -> object} triple constrained by an ontology. They assert that Neo4j is a Property-Graph because properties can be generalised as triples; <car> could be an edge between [Fred] and [Jane] with properties like ‘Reg’, ‘VIN’ and ‘Bill’, but you lose the ability to search by car (bummer if you need to search by ‘Reg’) – the CODASYL problem.

        Graph ontologies suffer the same modelling problem as relational schema, but with navigational rules ([Fred] <owns> [Cortina] where [Fred] <is a> [Person] and [Cortina] <is a> [Product] and [Cortina] <owned by>.Count = 1). It is much simpler just to use “Node” as a view and “Edge” as a link-view projected from a relational model.

        Everything is a graph problem; the best example is Financial Crime (Layering, Placement), where you’re looking for cycles, but graph-databases are a poor solution. For almost every real graph-problem, there’s a better solution than a graph-database.
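
The "Node as a view, Edge as a link-view" idea in the comment above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not anyone's production design: the `person`, `product` and `sale` tables, their columns, and the sample registration value are all hypothetical, and SQLite (via Python's standard `sqlite3` module) stands in for whatever RDBMS is actually in use.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL, reg TEXT);
    CREATE TABLE sale (
        id         INTEGER PRIMARY KEY,
        buyer_id   INTEGER NOT NULL REFERENCES person(id),
        seller_id  INTEGER NOT NULL REFERENCES person(id),
        product_id INTEGER NOT NULL REFERENCES product(id)
    );

    -- "Node" is a view: every person and every product, tagged with its type.
    CREATE VIEW node AS
        SELECT 'person:'  || id AS node_id, 'Person'  AS label, name FROM person
        UNION ALL
        SELECT 'product:' || id AS node_id, 'Product' AS label, name FROM product;

    -- "Edge" is a link-view: each sale row projects to two labelled edges.
    CREATE VIEW edge AS
        SELECT 'person:' || buyer_id  AS src, 'buys'  AS label,
               'product:' || product_id AS dst FROM sale
        UNION ALL
        SELECT 'person:' || seller_id AS src, 'sells' AS label,
               'product:' || product_id AS dst FROM sale;

    INSERT INTO person  VALUES (1, 'Fred'), (2, 'Jane');
    INSERT INTO product VALUES (1, 'Cortina', 'ABC 123');
    INSERT INTO sale    VALUES (1, 1, 2, 1);
""")

# The graph view of "Fred buys Car from Jane".
edges = conn.execute("SELECT src, label, dst FROM edge ORDER BY src").fetchall()

# The car stays a first-class row, so searching by 'Reg' still works.
by_reg = conn.execute("SELECT name FROM product WHERE reg = 'ABC 123'").fetchall()
```

Because the product remains a row in its own table rather than an edge property, the search-by-registration case raised above is still an ordinary indexed lookup.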

    2. Tim99 Silver badge

      Re: Graph is a view, often stored as relations

      A long time ago I wrote relational systems for managing "Heritage". Traditionally this was for storing information for physical "object" items for museums, galleries, and archives - These included artefacts, artworks, photographs, documents, etc. Our problem for connecting these with other things to create "stories" was solved by using "subjects" - People, places, events and topics. Each object could be connected to a number of other objects (like a camera, and the photographs that it took) using an object/object relationship table. Each subject could be connected to another subject (like a person lived in a particular place); and all subjects and objects could be connected - Like a camera that was owned by a particular person; who took a photograph of an event; like the opening of a new town hall (a place); covering a topic like "Local Government". All done easily in a traditional database with a few simple relationship tables…

    3. smarrteboje

      Re: Graph is a view, often stored as relations

      > There’s no getting away from the fact that “Graph Databases” were woefully inadequate for the “Graph Database” problem of our time: Pandemic contact tracing.

      Please tell me you aren't pinning the failure of various world governments to successfully implement a contact tracing system at breakneck speed for a novel virus entirely on their choice of technology.

      1. Steve Channell

        Re: Graph is a view, often stored as relations

        No, but if you're in the UK and you got COVID, the length of the graph connecting "patient zero" to "you" is between 1 and a number with 18 million digits: too big for recursive search.

        Fast graph databases don't store triples as rows, but with a sorted columnar store - fast read, slow write... too slow for real-time update.

        The problem was too complex for a graph database.

        1. szilardbarany
          FAIL

          Re: Graph is a view, often stored as relations

          You have no idea what you are talking about. A modern graph database stores a graph in its native form: as vertices with their outgoing edges. Not in tables, not in docs, not in K/V stores. There are MPP graph databases that can ingest data in real time and can run complex analytics (including several dozen graph algorithms, like shortest path or cycle detection) with very fast response times. Multi-hop queries (several dozen hops over edges) are easily possible. Storing and processing billions of vertices and edges is possible.

          Have you ever seen a graph database?

          1. Steve Channell

            Re: Graph is a view, often stored as relations

            Yes: Neo, RDF, SQL Server and in-memory heap models. You can run any data model across an MPP, and they all scale. TigerGraph is a leader in graph scalability (which I believe you're referring to), but even TigerGraph does not purport to be a replacement for a normalized data store, only an analysis tool.

            The topic of the discussion was whether you could replace relational databases with a graph store, rather than whether they had any value.

  4. Martin Gregorie

    IME this question seems to miss the point

    I've designed and implemented systems using both graph (CODASYL) and relational databases, the latter ranging from Sybase and PostgreSQL through early data warehouses, i.e. an RDBMS optimised for star schemas such as Red Brick, as well as the simple indexed file schemes used by 4GLs such as Sculptor.

    In my opinion it doesn't matter much which flavor of database you choose, simply because if you dive into writing a system specification without some careful data analysis that is followed by producing a properly validated Data Structure Diagram before you do any detail design work on:

    * the database schema

    * the user dialogue used to maintain and access the database content

    * the software architecture that links them together

    the application is unlikely to meet its users' expectations for usability or performance.

  5. Charlie Clark Silver badge

    Relational often gets the blame for SQL's shortcomings

    Modelling graph traversal on a relational system isn't really difficult. But doing it in SQL used to be unnecessarily verbose, as was true for many relational queries. That has changed in recent years with syntactical changes allowing for more compact and expressive querying of the sort that SQL was always supposed to support.

    Relational is nearly always the best place to start, with incredible support for domain specific types, indices and functions and the flexibility to cope well with nearly every task.
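
One example of the more compact syntax that could be meant here is the recursive common table expression (`WITH RECURSIVE`); that is an assumption, since the comment names no specific feature. The sketch below uses SQLite through Python's standard `sqlite3` module, and the `edge` table and its four rows are invented for illustration. It deliberately includes a cycle to show that a reachability query can still terminate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical edge list; a -> b -> c -> a forms a cycle.
    CREATE TABLE edge (src TEXT NOT NULL, dst TEXT NOT NULL);
    INSERT INTO edge VALUES ('a', 'b'), ('b', 'c'), ('c', 'a'), ('c', 'd');
""")

# UNION (rather than UNION ALL) discards rows that were already produced,
# so the recursion stops once no new node is found, cycle or not.
reachable = [row[0] for row in conn.execute("""
    WITH RECURSIVE reach(node) AS (
        SELECT 'a'
        UNION
        SELECT edge.dst FROM edge JOIN reach ON edge.src = reach.node
    )
    SELECT node FROM reach ORDER BY node
""")]
```

This only answers "which nodes are reachable"; carrying the path or hop count in each row makes rows distinct again, so a query that needs those would have to bound the depth or track visited nodes itself.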

    1. Steve Channell

      Re: Relational often gets the blame for SQL's shortcomings

      Common table expressions (SQL Server) and CONNECT BY (Oracle) are fine for recursive graphs (directed acyclic graphs), but not for cyclic data structures such as "Fred buys car from Jane, Jane bills Fred". Graph databases are supposed to be good for cyclic data structures but performance gets slower and slower when links are recursive - that's one reason they couldn't be used for pandemic contact tracing.

      For pandemic contact tracing, the problem was inverted by distributing the information over billions of agents (smartphones) with an old fashioned relational database ("has been contacted" flag) in the center. If you don't need a graph database for small graphs and can't use a graph database for big graphs, you have to wonder what the point of them is.

      1. Charlie Clark Silver badge

        Re: Relational often gets the blame for SQL's shortcomings

        Apart from implementation design the biggest problem with contact tracing was that it was a fata morgana, the sort that the tech industry loves to propose to show how good it is. It was always going to be behind the curve and was unceremoniously replaced by sewage water sampling, which is cheaper and more likely to identify potential outbreaks.

      2. smarrteboje

        Re: Relational often gets the blame for SQL's shortcomings

        On what day were they contacted? By whom were they contacted? Did they become sick afterwards, or did people they contacted become sick? When you reduce the nature of the relationship down to a single boolean you lose pretty much all the context.

    2. EarthDog

      Re: Relational often gets the blame for SQL's shortcomings

      Too many people don't know the difference between a relational database and a scripting language interface to a database API. *cough* NoSQL *cough*

  6. This post has been deleted by its author

  7. anderlan

    Is a database debate a detabate?

    Could the next Vulture site specifically for database debates be called DetaBates?

    1. mote

      Re: Is a database debate a detabate?

      I think you meant "databate"

      1. ecofeco Silver badge

        Re: Is a database debate a detabate?

        GO AWAY! BATE-IN!

      2. Anonymous Coward Silver badge
        Trollface

        Re: Is a database debate a detabate?

        That's a minor datail

    2. jmch Silver badge
      Trollface

      Re: Is a database debate a detabate?

      I'm not sure about that.

      More importantly, is someone pretty sure at debates a master-de-bator?

  8. mote

    Graph DBs are the "everything else" in "RDBMSs and everything else"

    If many organizations have adopted graph DBs, the reason is not necessarily a functionality or performance feature but an ease-of-access advantage, and I think the point in the comment above about "scenarios where you don’t understand the data" is one of their greatest features. There are known knowns and known unknowns and, as Donald Rumsfeld once said, "unknown unknowns."

    Also, science (data science included) isn't about speed, but results.

  9. Paul Smith

    Modelling graph traversal on a relational system isn't really difficult, but then, writing the entire DB system in Perl is also possible, but neither is actually advisable. If you have an RDBMS problem, then use an RDBMS to solve it. If your problem is graphing relationships, then a graph database is the most natural tool to use. This is not an either/or debate.

  10. The Velveteen Hangnail

    Exceptionally specific use cases

    The problems with graph databases are exactly the same as those of any other non-RDBMS, such as key-value/document stores.

    They are useful for very specific use cases that are tailored to their specific strengths. But using those systems comes at significant cost, the biggest one being that more generalized use of the data is damned near impossible.

    With a conventional RDBMS, you have standardized ways of approaching a problem, applying constraints, analyzing the data, etc. An RDBMS design can be trivially adjusted to emulate any of these other databases, and RDBMSs are appropriate for use in 90-98% of use-cases.

    With any other database system, you now have a custom query language (at best) or you have to write entire subroutines just to get the data out of the system in the way you want. They are able to solve one or two specific scenarios, but at the detriment of all others. The end result is your hello world database that targeted your original need quickly becomes mindbendingly complex to manage or extend when other needs inevitably arise. MongoDB is a low-hanging-fruit example where you literally can't do any kind of analytics without writing entire applications to parse the data.

    RDBMSs won the database wars for very good reasons. They provided a very good compromise for all the different possible ways that data might need to be sliced and diced, never mind the inevitable evolutionary requirements that any long-running database will experience.

  11. deadlockvictim

    RDBMS as standard

    I take the RDBMS as standard and ask the question: what can x (in this case graph databases) do better (faster, with fewer resources etc) than the standard RDBMS?

    From what I have seen so far, graph databases are very good at finding paths. As was mentioned above, the contact-tracing system used in the recent pandemic should have been the chance for them to shine. They did not and I don't know why. Too little expertise in the right places perhaps?

    The examples that I was shown were public transport connections. If you are in Lyon and you want to get to a rural village in Bavaria, graph databases seem to be better than relational databases at finding routes based on cost, time & geography. Mapping the spread of disease also looks promising and I'm waiting for some lab to apply them to neural pathways.

    I'm not sure that I would want a financial system based on graph databases but then this is an area where RDBMSs shine.

    All in all, just as cubes are better than relational databases at aggregating massive amounts of data and presenting me quickly with an answer so, I'm sure, graph databases have their niche where they do their job spectacularly well and much better than relational databases. And, like cubes, I expect that they will complement relational databases well.

  12. RobLang

    "Most of the same use cases"

    What use cases? Dynamic graph traversal is easier in a graph database. You can do it in RDBMS but then you'll have to code a bunch of views and functions. Why do that when a graph database does it already? At risk of sounding like my Dad: right tool does the job right. This debate feels like engineering conflict where there isn't any.

  13. shaye

    Stored routine

    Excellent article, but it seems to have overlooked one important advantage of graph databases over RDBMSs (with or without graph support): out-of-the-box graph algorithms. Although very niche, graph algos such as nearest neighbors, page rank, community detection, etc., are offered natively by graph DBs and provide great value if you find yourself needing them.

    I'm sure RDBMSs can offer such algos at least to some degree, though. Personally I have implemented page rank in SQL once.
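
For readers who have not met it, page rank is small enough to write by hand, which is part of why it can be ported to SQL at all. The following is a plain-Python sketch of the usual power-iteration form, not the SQL version mentioned above (which is not shown in the thread); the function name, the damping factor of 0.85, the iteration count and the three-node sample graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of node -> list of outgoing links."""
    nodes = set(links) | {dst for dsts in links.values() for dst in dsts}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        # Every node passes an equal share of its rank along each outgoing link.
        for src, dsts in links.items():
            for dst in dsts:
                new_rank[dst] += damping * rank[src] / len(dsts)
        rank = new_rank
    return rank


# Hypothetical three-page graph: a links to b and c, b links to c, c links to a.
ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

In this sample, `c` ends up ranked highest and `b` lowest, since `c` receives links from both other nodes while `b` receives only half of `a`'s share.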

    1. Charlie Clark Silver badge

      Re: Stored routine

      As long as you can describe the algorithm logically, you can implement it for an RDBMS. But the more important question might be do you want to index the results? Recursive traversal can quickly get quite slow and having the right indices is usually the solution to most database queries.

      FWIW a sample PL/pgSQL function from 2013. Quite verbose, but I suspect this could now be written more compactly. Functions exist via extensions for most domains (Geodata, statistics, etc.)

  14. Anonymous Coward

    Trivial until it isn't?

    The use case I'm interested in is getting a dependency graph over all kinds of different entities, so I can find out if something changes in table A, it impacts items in table Q and R, without having to bake a query beforehand that explicitly lists all of the ways you could follow keys from table A to B and C, etc.

    Surely this would be quite useful in say, tracing supply chain security?

    If there's some nice (performant?) way to use SPARQL on top of a conventionally designed RDBMS, I'd be glad to use it.

    1. szilardbarany

      Re: Trivial until it isn't?

      It's not SPARQL that you need but a graph query language. An RDF triple store is not a property graph database. A knowledge graph is not a graph database. A real property graph database (be it a typed one, like TigerGraph, or a labelled one like Neo4j) can store and process large and complex networks of interconnected data. A supply chain is a good example, or a complex ETL/ELT process; in general any dependency related problems.

      See my other comment for this article.

  15. MrBanana

    well-architected relational databases

    I've been in the RDBMS business for over 30 years, as implementer, designer, support engineer, trainer, consultant... The number of well-architected relational designs I have seen in all that time is probably less than a dozen.

    1. Tim99 Silver badge
      Angel

      Re: well-architected relational databases

      I started with DEC Rdb and Oracle 4. Obviously, by the time I got to SQL Server 6, my systems were nearly perfect :-)

    2. EarthDog

      Re: well-architected relational databases

      +1 most programmers I've known are functionally illiterate at RDBMS, or any DB model in fact.

  16. EarthDog

    Marketing

    Basically we have Graph Databases due to marketing. "Hey look at our new shiny shiny graph databases! Pay no mind to what the hot new shiny shiny graph databases were in the 70s".

    Codd saw the problems with graph and hierarchical databases and defined the Relational Model that solved those problems and more. He also created the Mathematics to back it up.

    I've worked with both Graph and (at least quasi) Relational Database models. Relational (at least quasi) wins hands down. But hey, if you want to waste your money on a Graph Database where things take longer to build than they should and they often require constant maintenance, then I'm happy to take your coin.

    A Graph DB Model is a subset of a Relational DB Model because a graph is a subset of a relation.

    Here's what I see: graph DBs are liked by those who:

    1) Don't know databases

    2) Don't understand graph DBs

    and/or

    3) Don't understand the relational model. Related: that SQL is a scripting interface to a DB API and NOT a way of modeling data.

    Yes I sound like a cranky old codger. In fact I am a cranky old codger who is tired of seeing the same mistakes made over and over. Now get off of my lawn!

    1. szilardbarany

      Re: Marketing

      You are indeed a cranky old codger. And you haven't seen a proper graph database and/or a proper graph use case.

      I worked with/for some of the biggest RDBMS vendors, have modelled OLTP and DW databases, wrote plenty of SQL, but I also used a proper graph database to tackle proper graph use cases where an RDBMS would not have a chance. Graph DBs have their place in the landscape. Currently it's a niche, but as more and more people/companies understand what they really can do, the market slice will grow too.

      See my other comment here.

  17. davidmakovoz

    We can argue until the cows come home about the advantages of GraphDB vs Relational DB. We seem to be in agreement that graph-like queries are of great benefit.

    And the fact is that until Neo4j, TigerGraph and the like came and started taking a bite out of the market share, Oracle and the like didn't bother to create a graph query interface. They were perfectly content with the status quo and in no rush to innovate. And if they (the Oracles of the world) manage to squash the Neo4js, they will go back to their complacent status quo.

  18. szilardbarany

    If you have a graph shaped problem, nothing beats a graph database

    "Graph databases – in which relationships are stored natively alongside the data elements – do not provide a significant advantage over well-architected relational databases for most of the same use cases."

    Why would you want to use a graph database for a use case that can be solved in a relational database? Solve problems instead that would be too expensive/too time consuming/impossible to do in an RDBMS. Try to figure out if this banking transaction is in connection with any known fraudsters within 10 hops. Try to find out who were those small farmers in Ecuador, Nigeria and Papua New Guinea from whom the raw ingredients were purchased for a product that was complained about when the process between the final product and the raw ingredient includes a massive and complex network of processing, transporting and internal trading steps. (FYI: 12 sec in graph DB vs 15 hours/timed-out query in RDBMS.)

    What the article states is: "Screwdrivers do not provide a significant advantage over a well polished hammer for most of the same use cases (e.g. dealing with nails)."
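
The "within 10 hops" question above is, at its core, a depth-limited breadth-first search. The sketch below shows that logic in plain Python so the shape of the query is clear; it says nothing about how any particular graph database or RDBMS executes it, and the function name, the account names and the adjacency dict are all hypothetical.

```python
from collections import deque


def within_hops(graph, start, targets, max_hops):
    """Return the hop count to the nearest target within max_hops, else None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if node in targets:
            return hops
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:  # the visited set makes cycles harmless
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return None


# Hypothetical data: account names and links are invented for illustration.
graph = {
    "acct1": ["acct2"],
    "acct2": ["acct3", "acct1"],
    "acct3": ["fraudster"],
}
hops = within_hops(graph, "acct1", {"fraudster"}, max_hops=10)
too_short = within_hops(graph, "acct1", {"fraudster"}, max_hops=2)
```

Here `hops` is 3 and `too_short` is `None`. The cost in practice comes from how many neighbours each hop fans out to, which is where the storage layout argued over in this thread starts to matter.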

    1. CRConrad Bronze badge

      Exactly:

      Why would you want to use a graph database for a use case that can be solved in a relational database?
      And since all use cases can be solved in a relational database, why would you ever want to use a graph database?
