Welcome back to the latest Register Debate in which writers discuss technology topics, and you the reader choose the winning argument. The format is simple: we propose a motion, the arguments for the motion ran on Monday and today, and the arguments against on Tuesday and tomorrow. During the week you can cast your vote on …

COMMENTS

Wednesday 8th March 2023 10:48 GMT Filippo

I'm not sufficiently involved with DB architecture to provide a reasoned answer. However, I can't help noticing that this is a poll about a negative clause, and I wonder how many of those who have voted were confused as a result. It's not even the first time a Reg Debate does that, and I seem to recall it being discussed last time as well. Wouldn't these article series be improved, if the poll was not formulated as a negative?

16 2 Reply
1. Wednesday 8th March 2023 10:54 GMT Tom 38
  
  Wouldn't these article series be improved, if the poll was not formulated as a negative?
  
  I think you mean "Would these article series be improved if the poll was formulated as a positive?"
  
  32 0 Reply
  1. Wednesday 8th March 2023 13:31 GMT katrinab
    
    No, because there is a middle ground between "positive" and "negative". "not negative" includes that middle ground.
    
    4 1 Reply
    1. Thursday 9th March 2023 10:05 GMT Lucy in the Sky (with Diamonds)
      
      Don't say ever no never again...
      
      My Ol' Pappi always said to me, "Don't no say no nothing about double negatives to nobody, never again, or I won't tell what I wouldn't do to you."
      
      2 0 Reply
2. Wednesday 8th March 2023 12:44 GMT Steve Button
  
  I got to this part... "Vectorized Query Execution – Using a vectorized processing model [PDF] (as opposed to a tuple-at-a-time model)" and at that point my brain just started acting like Charlie Brown when the adults are talking in the room.
  
  So, I guess you could say I'm also not sufficiently involved in DB architecture.
  
  Although I know a bad one when I see it, and I've seen plenty of those.
  
  6 0 Reply
  1. Wednesday 8th March 2023 15:19 GMT pavlo
    
    This is called a "query processing model". I cover this topic in my DB class. The lecture is on YouTube:
    
    https://youtu.be/ck3PkXTOueU?t=1824
    
    -- Andy
    
    10 0 Reply
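
As a rough illustration of the two processing models mentioned above, here is a minimal Python sketch. It is not taken from the linked lecture or from any real DBMS: the function names, the `PRICES` column, and the batch size are all assumptions made for the example. The tuple-at-a-time version pulls one value through each operator per call, while the vectorized version passes small batches between operators.

```python
from typing import Iterator, List

# Hypothetical toy column of integers; not from any real DBMS.
PRICES = list(range(1, 11))  # 1..10


def tuple_at_a_time(values: List[int], threshold: int) -> int:
    """Pull one value at a time through scan -> filter -> sum."""
    def scan() -> Iterator[int]:
        for v in values:
            yield v

    def filter_gt(child: Iterator[int]) -> Iterator[int]:
        for v in child:
            if v > threshold:
                yield v

    total = 0
    for v in filter_gt(scan()):
        total += v
    return total


def vectorized(values: List[int], threshold: int, batch_size: int = 4) -> int:
    """Pass batches (vectors) of values between operators instead of single values."""
    def scan() -> Iterator[List[int]]:
        for i in range(0, len(values), batch_size):
            yield values[i:i + batch_size]

    def filter_gt(child: Iterator[List[int]]) -> Iterator[List[int]]:
        for batch in child:
            # One operator call handles a whole batch in a tight loop.
            yield [v for v in batch if v > threshold]

    return sum(sum(batch) for batch in filter_gt(scan()))
```

Both functions return the same answer. In a real engine the benefit of the batch form would come from fewer operator calls and from loops that suit CPU caches and SIMD instructions, which plain Python does not demonstrate.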
Wednesday 8th March 2023 11:41 GMT Greybearded old scrote

Riiight

It's all very well showing great performance with experimental extensions, but when will it be implemented in production DBMSs? Real Soon Now huh?

11 2 Reply
1. Wednesday 8th March 2023 17:13 GMT EarthDog
  
  Re: Riiight
  
  Considering that the paper was written in 2023, maybe a year. Seriously, this isn't like coming up with a new power up pack in a video game.
  
  1 0 Reply
  1. Wednesday 8th March 2023 17:31 GMT ifadams
    
    Re: Riiight
    
    A year is optimistic for anything beyond early POC/Beta. While I'm not arguing that an RDBMS can't get such functionality, I think there is a serious issue in trivializing how hard it is to add functionality to these systems given the enormous complexity of maintaining consistency at scale, nor am I convinced the complexity of shoving it into an RDBMS is worth it.
    
    1 1 Reply
    1. Wednesday 8th March 2023 17:49 GMT yetanotheraoc
      
      Re: Riiight
      
      Still have Oracle 11 here (2007). Trying to get the remnants moved to Oracle 19 (2019), but it's not easy due to regulations.
      
      1 0 Reply
2. Wednesday 8th March 2023 17:32 GMT yetanotheraoc
  
  Re: Riiight
  
  Good point.
  
  "Their results show that DuckDB, with the above extensions, outperforms the native graph DBMS by up to 10x. These are state-of-the-art results from January 2023 and not from five years ago."
  
  Top researchers using bleeding edge tech describes precisely zero places I have ever worked. One question I have is what the results might have been if the researchers had put similar effort into extending the native graph DB and compared that to off-the-shelf DuckDB. On the other hand, if all we care about is results today and throughout our business cycle, it may not matter which approach is theoretically superior. Just count the clever people working on each approach.
  
  2 3 Reply
  1. Wednesday 8th March 2023 19:01 GMT Michael Wojcik
    
    Re: Riiight
    
    All the technologies you use now were "bleeding edge" at some point.
    
    And the force of the argument here is precisely against needing "bleeding edge" technology. It's that an evolution in RDBMS capabilities, which is already underway thanks to SQL/PGQ standardization, removes the need to switch from an established technology to a significantly different one.
    
    Your question about "the researchers had put similar effort into extending the native graph DB" doesn't make sense. The capabilities they added to DuckDB are already in graph DBMSes that support GQL, because they're a subset of GQL. What the paper shows is that it's feasible to add the SQL/PGQ enhancements to an existing analytic RDBMS and when they did so the performance was superior to the existing GDBMS.
    
    I'm not qualified to have a strong opinion on the debate here, but this particular line of argument is irrelevant to it. The question at hand is whether GDBMSes fundamentally handle a significant subset of use cases better than RDBMSes. "GDBMSes have X and RDBMSes don't have X yet, but it's been shown they can have it" supports the RDBMS side, not the GDBMS one.
    
    9 0 Reply
    1. Wednesday 8th March 2023 19:48 GMT yetanotheraoc
      
      Re: Riiight
      
      `Your question about "the researchers had put similar effort into extending the native graph DB" doesn't make sense.`
      
      You purposely showed it in the worst light.
      
      Suppose you have a competition between a race car and a trolley car. The trolley car is far more efficient travelling on the tracks. So the race car engineers get busy bolting some extra steel wheels on the race car, and now it beats the pants off the trolley car when travelling around the tracks.
      
      Me: What would have happened if they had spent the same effort improving the trolley car?
      
      You: The trolley car already has steel wheels.
      
      0 6 Reply
      1. Wednesday 3rd April 2024 07:17 GMT CRConrad
        
        Re: Riiight
        
        Yeah, because a trolley with added rubber tires will be much better at racing around the streets than a car... You (inadvertently, I bet) came on the exact right metaphor: Trying to make a trolley beat a race car would mean turning it into a race car.
        
        0 0 Reply
Wednesday 8th March 2023 12:26 GMT SUDO-SU

With RDBMS I've seen a lot of garbage. Most people don't have the skills or a dedicated dba that will solve these problems for them or architect a good schema.

So if a graph DB can provide the benefits of a well-architected relational DB without requiring an advanced user, then graph DBs may become something.

4 5 Reply
1. Wednesday 8th March 2023 13:02 GMT F. Frederick Skitty
  
  In my experience, Graph DBs just allow incompetent programmers to make an even worse mess than they would with a Relational one. The complexities of graph query languages don't help. I was forced to use MarkLogic at one company, simply to justify a license that we'd acquired thanks to a former colleague's lack of impulse control. Technically impressive but an absolute nightmare to work with (MarkLogic that is, not my former colleague - he was just a nightmare).
  
  22 0 Reply
  1. Wednesday 8th March 2023 17:36 GMT yetanotheraoc
    
    The research paper we would all like to see
    
    Graph vs relational - Which one produces the worse shit-show in the hands of incompetent programmers?
    
    10 0 Reply
    1. Sunday 12th March 2023 23:35 GMT spireite
      
      Re: The research paper we would all like to see
      
      Graph I wager....
      
      When I see NoSQL stuff implemented, it's usually because they can't be arsed to organise and 'schemaless' fits the bill.
      
      NoSql is a perfect 'product' for the lazy dev.
      
      3 0 Reply
  2. Wednesday 8th March 2023 19:06 GMT Michael Wojcik
    
    Honestly, I'm having trouble thinking of a technology that turns bad programmers into good ones. The only one I can think of that works in a significant number of cases is education.
    
    COBOL was the first widely-deployed attempt at that, as far as I can recall, and it didn't succeed. COBOL may have let non-programmers write parts of programs, but it didn't make them good programmers. Functional programming has its advantages, but it didn't turn bad programmers into good ones. Same for structured programming, object orientation, 4GLs, StackOverflow, GitHub Copilot, and so on.
    
    13 0 Reply
    1. Thursday 9th March 2023 10:42 GMT sgp
      
      It's a bit of a moot argument. If you stick to the basics, relational data modelling really isn't that hard. If you can't deliver, it's probably because what you are designing is way too complex and you need an expert anyway.
      
      4 0 Reply
Wednesday 8th March 2023 14:56 GMT Anonymous Coward

I've spent a lot of time over the last couple of decades working with systems developed on top of classic relational databases...always, and I mean ALWAYS the developers end up creating something that requires endless money thrown at more resources to handle the garbage that they've built...it's never their logic, it's always that the database doesn't have enough RAM or CPU...yes, you can make incredibly efficient solutions with relational databases...but you can't do it on the budgets available because as people have said above, there usually isn't any money available for a DBA...especially early on in the development cycle.

Stop me if you've never had to deal with that one ugly query that has tons of JOINS and UNIONS in it that brings one specific part of the solution to a crawl because someone, at some point decided..."well, we probably need to see everything on this page in one table".

Furthermore, database hygiene with relational databases is a massive problem...again, stop me if you haven't heard "we'd better leave the old data there, just in case we need it!" despite your begging and pleading, highlighting that keeping that old useless data means you're dragging back 9 million rows in a single query and that it's costing them a fortune to keep the old data around.

Yes, you can build efficient solutions, no your average non-technical user doesn't understand the implications of the things they ask for.

Anything that prevents non-technical people from shredding the fucking gusset out of their own project is a win, it's not re-inventing the wheel. It's taking the corners off and making the wheel round instead of square.

7 2 Reply
1. Wednesday 8th March 2023 19:07 GMT Michael Wojcik
  
  Anything that prevents non-technical people from shredding the fucking gusset out of their own project is a win
  
  <Fred Brooks>There is no silver bullet.</Fred Brooks>
  
  3 0 Reply
2. Wednesday 8th March 2023 23:35 GMT Code For Broke
  
  Database hygiene is a fine topic. But you piss me off by implying that tons of crappy data is a problem native to RDBMS. It's a problem native to human nature and is also well evidenced in essentially every computer storage solution ever conceived, including graph dbs.
  
  Seriously, not cool.
  
  9 0 Reply
  1. Thursday 9th March 2023 09:04 GMT Anonymous Coward
    
    I didn't imply that it is a problem native to RDBMS...I implied that it is a problem made worse by RDBMS.
    
    0 1 Reply
    1. Thursday 9th March 2023 10:45 GMT sgp
      
      Except that it isn't. In your example, why would you not be able to tweak your query to not include the 9 million rows that are not needed? What kind of cost do you think is associated with that?
      
      3 0 Reply
3. Sunday 12th March 2023 23:37 GMT spireite
  
  Entity Framework and ORMs in general have a lot of blame in this ugly query thing you mention
  
  1 0 Reply
Wednesday 8th March 2023 15:34 GMT Pascal Monett

"there usually isn't any money available for a DBA"

So if they can't do it right the first time around, then they can pay forever to keep it hobbling along until they understand that they need to do it right.

Excuse me if I have little patience with Borkzilla-era managers who know nothing about how things work but are happy as soon as they see their pet project on their screen, and costs be damned.

So, let's forget relational because it's too complicated. Well bugger, but IT is complicated.

Personally, if you're a manager and you haven't yet understood that, you should be condemned to working with paper and punchcards until you get it.

That not being possible, paying ever more for ever more resources is an acceptable substitute.

2 0 Reply
1. Wednesday 8th March 2023 16:58 GMT EarthDog
  
  Re: "there usually isn't any money available for a DBA"
  
  DBs are complex because the things they model are complex. E.g. human relationships.
  
  3 1 Reply
  1. Friday 10th March 2023 18:15 GMT _olli
    
    Re: "there usually isn't any money available for a DBA"
    
    More often, the things stored in DBs aren't even inherently that complex. There's just this school of system architects who think that "complexity is good" and oh boy can they coat everything with additional layers of complexity. Presto, mess is ready.
    
    1 0 Reply
Wednesday 8th March 2023 17:10 GMT EarthDog

I've said it once and I will say it again

First off +1 to @sudo-su and @Pascal Monett

A big chunk of the problem is that most programmers are functionally illiterate when it comes to database models. There is a trend to get rid of DBAs. In one case I commented on, they got rid of the DBA and replaced them with a programmer on the team, basically ending up with a programmer with possibly no background and/or interest in data who is actually an embedded DBA and bound to make all the same old mistakes and re-invent the wheel time after time.

9 1 Reply
Wednesday 8th March 2023 17:15 GMT yetanotheraoc

Now we're getting somewhere

Good stuff to chew on in this article. Let's see what Thursday brings.

1 0 Reply
Wednesday 8th March 2023 17:50 GMT recharged95

The keywords in this debate are "well-architected" and "most", as in most typical cases.

Now we've reached the scrum argument tactic: if the RDBMS can't do what you need, it's not "the RDBMS", it's "you're doing it wrong".

Sure you can incorporate a lot of the advantages of a graph db into a rdbms, but having it retain purity/usability of a normal row/col is hard. OracleText comes to mind back in the day. Most architects know a good schema is hard, complex. Graph dbs just make it easier. But yes if done right, schema/rdbms can be blazingly fast.

1 1 Reply
Wednesday 8th March 2023 18:24 GMT Ian 3

More to database architecture than performance? Won't someone think of the data modellers?

The motion is about providing a 'significant advantage' but only raw read/write query performance advantages seem to be used in support. For use cases with a lot of many-to-many relationships and objects that can be linked to the same other objects for different reasons at the same time, the significant advantage isn't performance but ability to model with reasonable clarity, and I'm sometimes very happy to sacrifice performance for that (especially when performance is still perfectly reasonable). The way real world things relate to other real world things is rarely a neat, hierarchical relational model with nice foreign keys, and sometimes modelling as a graph has a 'significant advantage'. (And before someone mentions it, if your RDBMS model is full of FK to FK mapping tables with added relationship meta-data, then you've just built a graph, and your SQL will be 'interesting' and hard to manage)

3 0 Reply
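
To make the "FK to FK mapping table with relationship meta-data" point concrete, here is a small sketch using Python's built-in `sqlite3` module. The `entity` and `relationship` tables, their columns, and the sample rows are all hypothetical, chosen only to show two objects linked to each other for two different reasons at the same time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Hypothetical mapping table: two foreign keys plus relationship metadata.
    -- In graph terms, each row is a labelled edge between two nodes.
    CREATE TABLE relationship (
        src_id INTEGER NOT NULL REFERENCES entity(id),
        dst_id INTEGER NOT NULL REFERENCES entity(id),
        kind   TEXT    NOT NULL,
        since  INTEGER,
        PRIMARY KEY (src_id, dst_id, kind)
    );

    INSERT INTO entity VALUES (1, 'Alice'), (2, 'Acme Ltd');
    INSERT INTO relationship VALUES
        (1, 2, 'employee_of',    2019),
        (1, 2, 'shareholder_of', 2021);
""")

# Reading the edges back already needs two joins against the same table.
rows = conn.execute("""
    SELECT s.name, r.kind, d.name, r.since
    FROM relationship AS r
    JOIN entity AS s ON s.id = r.src_id
    JOIN entity AS d ON d.id = r.dst_id
    ORDER BY r.kind
""").fetchall()
```

Even this two-row example needs a double self-join just to print the edges, which hints at why the SQL gets harder to manage as the number of relationship kinds and hops grows.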
Wednesday 8th March 2023 18:43 GMT Steve Channell

Nobody loves SQL, but it works

Lost to time is the fact you could use SQL to query IDMS (CODASYL network database) and IMS (mainframe hierarchical database), but the performance couldn't match relational (unless it was organized for your specific query).

While SQL queries are wordy compared to a call API, a parameterized prepared query "SELECT * FROM CUSTOMERS WHERE CUSTOMER_ID = @id;" can be cached and reused (especially for DB2, where it is compiled to an application plan, or Sybase/MS SQL Server, where prepared statements are compiled to TSQL procedures).

SQL relational has the significant advantage that tabular/column-store tables can be scanned in parallel by highly tuned (often automatically) DBMS and sharded over many nodes when partitioned. It can also take advantage of vector instructions and GPGPU - the performance advantages RDBMS enjoy will only get bigger over time.

Where relational suffers is recursive queries, but even this can be addressed with in-memory databases (either on the server or local copy of immutable history) and LSM (log structured merge).

The outlier is not fashionable graph-databases, but time-series and document databases - even these can be optimized by splicing databases that break them into an array of key/value pairs - nobody in their right mind would store an FpML contract as a document, when the only thing that changes is the price and cashflow schedules.

2 0 Reply
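
The parameterized query quoted above can be sketched with Python's built-in `sqlite3` module. This is only a stand-in for the DB2, Sybase and SQL Server behaviour described in the comment: the table contents and the `find_customer` helper are assumptions, and the placeholder is written `:id` here because that is the named-parameter style the `sqlite3` module documents, rather than the `@id` form in the quoted query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])

# The SQL text never changes; only the bound value does, so the
# driver or server can reuse the compiled statement between calls.
QUERY = "SELECT * FROM customers WHERE customer_id = :id"


def find_customer(customer_id):
    """Hypothetical helper: look up one customer by primary key."""
    return conn.execute(QUERY, {"id": customer_id}).fetchone()


first = find_customer(1)
second = find_customer(2)
```

Because the value is bound rather than spliced into the string, the same statement text is reused on every call, and the input cannot alter the structure of the query.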
1. Thursday 9th March 2023 16:47 GMT smarrteboje
  
  Re: Nobody loves SQL, but it works
  
  I think you're being quite flippant with your assertion that recursive queries only need be addressed with an in-memory database. Indexing and query complexity are both pretty big factors when you need to start doing this.
  
  I'm also totally unsold on this argument that relational databases are the only format that allow for optimized memory reads, or sharded reads across threads/servers, as if somehow a graph database storage engine is just throwing blobs of data across time and space in a random fashion and the only way to traverse them is in linear time. The reality is totally different.
  
  0 0 Reply
  1. Monday 13th March 2023 17:47 GMT Steve Channell
    
    Re: Nobody loves SQL, but it works
    
    I wasn't suggesting that recursive queries only need to be addressed with in-memory databases, but that in-memory databases address many of the issues associated with them.
    
    There are two types of recursive queries (three if you include graph extensions): [1] programmatic recursion (e.g. Entity-Framework, Hibernate, etc) where each node visit uses lazy-loading to fetch the next tree, [2] Common-Table-Expressions/connect-by where the RDBMS recursively calls the execution engine to build the result set. In-memory databases help these recursive calls because the data is already in memory.
    
    You're right to highlight that recursive calls that do not have sympathetic indexes are a particular problem because it is not clear which session needs to be killed to stop performance problems. Cyclic Graphs need max-recursion limits to prevent never-ending queries.
    
    Relational databases have no special advantage with parallel scans, but they have been doing them for decades and optimizer support is mature.
    
    0 0 Reply
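
A minimal sketch of the Common-Table-Expression style of recursion on a cyclic graph, again using Python's built-in `sqlite3` module. The `edge` table, its contents, and the `MAX_DEPTH` value are assumptions for illustration. The depth guard plays the role of the max-recursion limit mentioned above: with `UNION ALL` and no guard, the cycle would keep producing rows indefinitely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE edge (src TEXT NOT NULL, dst TEXT NOT NULL);
    -- a -> b -> c -> a is a cycle; c -> d leads out of it.
    INSERT INTO edge VALUES ('a', 'b'), ('b', 'c'), ('c', 'a'), ('c', 'd');
""")

MAX_DEPTH = 5  # hypothetical limit; choose one that fits the data

reachable = conn.execute("""
    WITH RECURSIVE walk(node, depth) AS (
        SELECT 'a', 0
        UNION ALL
        SELECT e.dst, w.depth + 1
        FROM walk AS w
        JOIN edge AS e ON e.src = w.node
        WHERE w.depth < :max_depth      -- stops the walk on cyclic data
    )
    SELECT node, MIN(depth) FROM walk GROUP BY node ORDER BY node
""", {"max_depth": MAX_DEPTH}).fetchall()
```

The query returns each node reachable from `a` with the shortest hop count found within the limit. An index on `edge(src)` would be the kind of "sympathetic index" the comment refers to, since every recursive step joins on that column.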
Wednesday 8th March 2023 20:00 GMT Robert Grant

It depends on what "the same use cases" is.

Graph databases are an optimisation: they're amazing at queries like "count all my friends of friends of friends" but they are bad at queries like "give me the mean of all the ages of people in this database". Unsurprisingly, they're good at graph-like traversal but bad at relational operations.

Relational databases are pretty good at everything, but not the best at many things. Choosing them is not the premature optimisation choice, which is good.

7 0 Reply
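
For comparison, both kinds of query mentioned above can be written against a relational store. The sketch below uses Python's built-in `sqlite3` module with hypothetical `person` and `friend` tables and a tiny directed friendship chain. The aggregate is a one-liner, while the three-hop traversal needs one self-join per hop.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE friend (person_id INTEGER, friend_id INTEGER);
    INSERT INTO person VALUES (1, 'Ann', 30), (2, 'Bob', 40), (3, 'Cy', 50), (4, 'Di', 60);
    -- Directed chain for simplicity: Ann -> Bob -> Cy -> Di.
    INSERT INTO friend VALUES (1, 2), (2, 3), (3, 4);
""")

# Relational-style aggregate: a single scan over one table.
mean_age = conn.execute("SELECT AVG(age) FROM person").fetchone()[0]

# Graph-style traversal: friends of friends of friends, one join per hop.
third_hop_count = conn.execute("""
    SELECT COUNT(DISTINCT f3.friend_id)
    FROM friend AS f1
    JOIN friend AS f2 ON f2.person_id = f1.friend_id
    JOIN friend AS f3 ON f3.person_id = f2.friend_id
    WHERE f1.person_id = ?
""", (1,)).fetchone()[0]
```

On four rows the difference is invisible. The point is only the shape of the queries: each extra hop adds another join, whereas the aggregate stays the same however many people are stored.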
1. Thursday 9th March 2023 11:25 GMT Roland6
  
  Re. "the same use cases"
  
  I noted that the implications of this caveat are being (deliberately?) overlooked.
  
  To me the debate only really has any relevance if graph databases are better than relational for the business mainstay use cases/applications: CRM, HR, Finance, ERP etc.
  
  So far it seems graph DB’s are good for some specific use cases/problem domains which are outside of mainstream business applications, hence don’t really compete with relational.
  
  2 0 Reply
  1. Monday 13th March 2023 17:49 GMT Robert Grant
    
    Re: Re. "the same use cases"
    
    Yep. Anything graph-related, e.g. a social network, or a large-scale representation of things that relate somehow (e.g. if you're Amazon and want to query across all your networked devices, or an IoT provider wanting to query all your devices in the field and how they relate in a mesh) it works.
    
    I will say at least Neo4J is also quite good for exploratory work, but that's more because it doesn't enforce schemas and has a nice visualisation technology, than anything to do with it using graphs.
    
    0 0 Reply
2. Thursday 9th March 2023 16:47 GMT smarrteboje
  
  Yes, it's almost like the debate motion itself is poorly formulated. I don't think there is a single database storage model that could be said to be the best for most use cases. Like in every technology decision, there are better choices and worse choices, trade offs and time pressures. At best, I think if FOR wins this debate then we will have learned that the El Reg readership wouldn't choose a graph database to solve most problems, which is hardly learning much.
  
  2 0 Reply
Thursday 9th March 2023 05:13 GMT Ashto5

Small is beautiful

Small is beautiful and quick.

If the data is not needed TODAY then move it to an archive DB and let the good times roll.

2 1 Reply
Thursday 9th March 2023 08:10 GMT xyz

Oar in....

There are uses for graph dbs but I've never had to go there, even though I've been prodded in that direction by management wanting to be seen to be cooool and hip.

On the other hand let loose the code first nazis and their ORM bombs on a relational db and you get a world of hurt.

I've been extracting ndf files (datawarehouse type science data files) recently and God does that lot feel so old fashioned.

All in all, you choose your tool for what you need, not what is the new shiny. Data, in whatever format, should be readily accessible and to be honest there is always a way to get it and a bit more work to get it fast. The key things for me are understand, test, learn how to be smarter and produce stuff that works.

BTW, with Reg specials like this, I always smell a salesman in the background.

2 0 Reply
Thursday 9th March 2023 10:17 GMT James Anderson

He lost me.

It was over 30 years ago that I encountered my first relational DB and have been using them, on and off, ever since, so I would expect to be on the RDBMS side of the vote.

However it’s also about 30 years since I first got my hands on a unix box (Sun Microsystems Shoebox) and I wholeheartedly endorse the “do one thing and do it really well” philosophy of early unix systems.

So rather than admit there are a couple of things that are better done using a different tool he proposes adding yet more functionality to an already bloated and complex system. Commercial database products are already full of features and extensions that very few customers ever use but add to the cost and complexity of the system.

0 1 Reply
1. Thursday 9th March 2023 10:53 GMT sgp
  
  Re: He lost me.
  
  The same is true of unix (like) systems. The idea is that each component of that system does one thing (well). If you think of the relational database as a system of components, it's entirely the same thing. In 20 years time, graph db systems will be full of technical debt and features that almost nobody uses (anymore). But why should you care about this? This is the same kind of reasoning that leads to so many new programming languages while the same kind of features can be added to existing ones.
  
  1 0 Reply
Thursday 9th March 2023 10:51 GMT Binraider

Surely the key problem for a non-relational database logical model, above anything else, is that humans mostly think in either hierarchical or relational terms?

Regardless of what clever tech is used hierarchical or relational terms have serious advantages at the user-end.

And to be blunt, given corporate IT policies concerning rolling out tools, you're lucky half the time if you can get a copy of 64-bit Excel & Access, rather than a server to experiment on...

1 1 Reply
Thursday 9th March 2023 14:14 GMT Alan Bourke

Is this like how NoSQL was the death of the RDBMS

despite the fact you can't get a report out of it.

3 0 Reply
Friday 10th March 2023 09:01 GMT Tom 7

Most of my experience with DBs

suggests this may be a new way to obfuscate the customer's data in ways that SQL didn't allow.

2 0 Reply
Monday 13th March 2023 13:15 GMT Anonymous Coward

You just can't get good staff these days

It's axiomatic that the language (and database) every developer uses is the best, for them.

I care more about getting someone in to make changes 5 years later. All those Perl and PHP scripts from 2002 are of no interest to the Python and R kids graduating 2022.

Relational may not be interesting but it generally works, the big 3 generally bolt on a few features at 2 year intervals that do 80% of what the new and cool DBs can, while the performance of the servers will increase to run them a bit faster and the efficiency of the DBMS will be a little better than before.

If you want to use a GraphDB, go for it, but don't pretend it's for technical reasons any more than running Apple vs Android.

1 0 Reply