Wow. I've been quoted. I feel so honoured. :)
The nodes have it in the Great DB debate: Reg readers pick graph
This week, Register readers debated the motion Graph databases do not provide a significant advantage over well-architected relational databases for most of the same use cases. It was a lively discussion between two experts, who were evidently not satisfied by the prospect of merely winning or losing a debate and added public …
COMMENTS
-
-
-
Friday 10th March 2023 11:45 GMT elsergiovolador
The database equivalent of a rap battle
Listen up, fam, it's time to start the showdown,
Relational Database and Graph Database, who gonna wear the crown?
We got Relational chattin' all slick and that,
Talking 'bout their structured data, like they ain't no brat.
Graph Database, on the other hand, acting all bad,
Thinking they're the real G's with nodes and edges, like they ain't no fad,
But let me tell ya, they're just a hype ting,
Their lack of consistency will leave 'em in a spin.
Relational Database, just a bare liar, innit,
Can't handle complex data, that's the truth, admit it,
Joins so slow, schema's too tight,
Can't handle unstructured data, they ain't even right.
Graph Database, think they're all loose and that,
But they'll never match the power of SQL, that's a fact,
Too flexible, queries taking ages,
They'll never be the boss, just stuck in stages.
Relational Database, too stiff and straight,
Like a cell block, not lettin' data come through and act,
ACID properties are just too far out,
They'll never be as flexible as us, there's no doubt.
Graph Database, just a toy for the little mans,
Can't handle transactional workloads, not even part of the plans,
Their scalability just ain't enough to be true,
They'll never be able to handle data like we do.
So the battle's done, who's gonna be the top dog, innit?
Team Relational or Graph Database, which one you gonna stick with?
-
Tuesday 14th March 2023 12:08 GMT breakfast
Re: The database equivalent of a rap battle
If your data's relational in stationary state and all the paths of graphs just make you laugh
When your tables are stable each index and key from foreign relation to primary
Could be seen as an edge between simple nodes - but move just one field and the whole lot explodes
So look at it face on: what kind of structure can you base on JSON?
It's hard to forecast scale and size when every query returns a surprise
So check the mission, escape the long text
The right decision will count on the context
-
-
Friday 10th March 2023 12:17 GMT katrinab
Graph Databases probably won't overtake relational databases any time soon, or indeed ever. But that doesn't mean they are useless.
Neither type of database is going to overtake Excel, but likewise, that doesn't mean that everyone should migrate the business logic over to Excel spreadsheets.
Use the right tool for the job rather than try to shoehorn everything into the same platform.
-
-
Friday 10th March 2023 20:00 GMT yetanotheraoc
Re: It does help.......
Excel is great. It seems to be a good fit for the amount of thinking the average business user is willing to do. Anything more complicated gets either "I don't want to know about it" or "Can you do such-and-such for me?" (but best not try to explain how you did it, or you get the first one again).
-
Wednesday 22nd March 2023 15:22 GMT Morten Bjoernsvik
Re: It does help.......
>Excel is great.
I hated programming in rows and columns. I used this module: https://metacpan.org/pod/Excel::Writer::XLSX
Banking people got nice HTML reports from their data warehouse and paid me to convert them to Excel.
A customer had a >1TB database with financial data. They wanted extracts in Excel, so I created a script that generated around 15,000 spreadsheets every morning.
After some time they wanted a way to import changes from the Excel spreadsheets back into the database. That was really hard. Worked a long time on it.
Ended up adding "alter transaction" operations that we fed into the transaction system.
They were wizards in Excel, but learning a few lines of Splunk, Perl or Python was not an option.
Nowadays I program with dataframes in R and Python and present the results in Jupyter notebooks on a website. A dataframe is just a simplified CSV version of an Excel spreadsheet.
-
-
-
-
Wednesday 22nd March 2023 22:46 GMT Random Commenter
Re: Excel ...
Excel (and MS Access) offer just enough power for non-devs to create something that appears to be useful, but then turns into a business-critical application.
One investment company I worked for used a spreadsheet to record share trades that were placed manually. This was done because an exec had 'flexed' policy to allow only one of his customers to place trades after the daily 11.30 deadline.
But the time it was placed also affected the price, so this created a whole downstream industry of reconciliation activities, which could only scale by throwing more people into this thankless task.
The Excel sheet had a fixed range in place, but nobody remembered, so people merrily added more rows until someone noticed that nothing new was coming through.
They pasted directly into the sheet, but pasted pound signs were treated as text, not numbers.
They had a numpty macro that generated direct bank transfer agreements from this garbage. Very large numbers. Often wrong.
When things went south on occasion, a senior manager would need to spend the best part of a week unpicking the fallout.
And all because someone who found part of their job tedious thought to ease their pain by creating a spreadsheet
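The pound-sign problem described above can be illustrated with a minimal Python sketch. The function name `parse_amount` and the sample values are hypothetical, invented for illustration; the point is only that a pasted cell should either become a number or fail loudly, rather than quietly staying as text.

```python
from decimal import Decimal, InvalidOperation


def parse_amount(cell):
    """Convert a pasted cell such as '£1,234.50' into a Decimal.

    Raises ValueError rather than silently keeping the value as text.
    """
    if isinstance(cell, (int, float, Decimal)):
        return Decimal(str(cell))
    # Strip whitespace, the currency symbol and thousands separators.
    cleaned = str(cell).strip().replace("£", "").replace(",", "")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        raise ValueError(f"not a number: {cell!r}")


# Hypothetical pasted values: two text cells and one genuine number.
pasted = ["£1,234.50", "  £99 ", 250]
total = sum(parse_amount(c) for c in pasted)
print(total)
```

Using `Decimal` instead of `float` is a deliberate choice for money amounts, since it avoids binary rounding surprises.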
-
-
-
-
-
Tuesday 21st March 2023 11:16 GMT Nick Ryan
But that wouldn't work with the "modern" developer who doesn't like to have to think about things such as error handling or performance. They can just let exceptions cascade up from somewhere and present gibberish errors to end users, although some like to make this more "friendly" for the end user by informing them that "An error has happened" and leaving it at that. After all, their code worked perfectly when they used it briefly (it compiled) and therefore anything beyond this is the user's fault. Likewise for database performance: the problem is always that the database does not have enough resources, so what should be trivial databases, which are incompetently designed (if at all; most like to defer "design" to whatever flavour of toolkit they decide to use this month), must be given more resources until the entire database can be cached in RAM. More CPUs assigned would be good too. As for reporting, that's solved by making a copy of the database, ensuring that it is out of sync with the master database whenever possible, and applying expensive transforms to this copy in order to answer simple queries that should take milliseconds in a half-well-designed database.
Incompetent database design is usually the problem - I've seen relational databases where the numpty who coughed them up had no clue about the concept "relational", let alone what a damn index is. Don't forget to use VARCHAR(MAX) for absolutely every column too, regardless of the expected field length. These clowns then decide that the database server is at fault when attempting to search 40,000 rows of data for text in a single column, and that an indexed search engine needs to be added on top of this mess of their own incompetence. That way they can defer their incompetence to the configuration of the indexed search engine, blame the lack of resources on this, and the end users will just have to suffer. Alternatively, the column could be changed to a sensible length so it's stored inline with the rest of the row data (and it's no longer possible for an attacker to store a 200k JavaScript exploit in the first name field of a user), and an index added to the column: search working and down to millisecond query times, as expected.
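The effect of adding an index can be sketched with Python's built-in sqlite3 module. This is an illustration under stated assumptions, not the commenter's actual setup: SQLite stands in for whatever server was involved (VARCHAR(MAX) suggests a different product), SQLite does not enforce the declared VARCHAR length, and the table, column and index names are invented. Note also that an ordinary index helps equality and prefix lookups; arbitrary substring searches are a different problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, first_name VARCHAR(50), email VARCHAR(254))"
)
# 40,000 made-up rows, matching the row count mentioned above.
conn.executemany(
    "INSERT INTO users (first_name, email) VALUES (?, ?)",
    [(f"name{i}", f"user{i}@example.com") for i in range(40000)],
)


def plan(sql, params=()):
    """Return the query planner's description of how it would run sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 is the detail text


query = "SELECT id, email FROM users WHERE first_name = ?"

before = plan(query, ("name39999",))  # no index yet: full table scan
conn.execute("CREATE INDEX idx_users_first_name ON users (first_name)")
after = plan(query, ("name39999",))   # planner can now use the index

print(before)
print(after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan reports a scan of the table and the second names the index.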
I've seen similar with NoSQL databases, where the concept of consistent column names and data validation is something that the designer considers beneath them, and even when a particular tag/column is required, it doesn't matter anyway as it can be patched up in column equivalence calculations at reporting time. The missing data in some forms doesn't matter and neither does having two different data values for the same data under different tag/column names. Oh, and as it's slow, add some more memory too.
-
Friday 10th March 2023 18:08 GMT KingsEnglish
Choice of database? Luxury (and you try telling the youth of today that!)
Wouldn't it be wonderful if we actually had a choice over the tech we buy?
If we buy SaaS, we buy into the tech that the vendor has chosen. So in fact, we are buying the outcomes, not the tech.
If we use our own resources, be they internal, contract or outsourced, the reality is that we will get what they know how to use, regardless of how suitable it might appear in a truly impartial analysis. So in fact, we are still buying the outcomes, not the tech.
I like Relational when I fully understand the data in that domain, down to an atomic level.
I like Graphs when I need something that helps me uncover and understand the relationships, especially when they are constantly changing.
But that feels like re-opening Inmon vs Kimball all over again..? Real life is far too interesting and important to allow tech to get in the way of making the most of it.
-
Friday 10th March 2023 23:29 GMT The Velveteen Hangnail
Democracy
> "The best use-case for graph databases is scenarios where you don't understand the data "
I cannot even begin to describe how terrifying that statement is. If you don't understand the data, GTFO and bring someone in who does.
The biggest problem with Democracy is that it depends on an educated public. I bet a week of morning coffees that the overwhelming majority of "pro-graph" people don't have the foggiest idea how databases even work, relational or otherwise. All they care is that graph databases are newer, and therefore automatically better.
I've seen this during the NoSQL days. Why did they choose MongoDB? Because it was "easier". SQL was "confusing". MongoDB was "Faster".
Not one single reason was because MongoDB was demonstrably the best choice for the job. Not one. Everything revolved around the perception that MongoDB was inexplicably better.
These same people then spent inordinate amounts of time writing a crapton of convoluted code in order to do the most basic manipulation of the data they had. Work that would have taken 5-10 minutes with a reasonably written query.
One time, I was administering a DB server for a team of app developers. They routinely ran into performance problems because not a single blessed one of them understood what a WHERE clause was. They even went so far as to accuse me of sabotaging their database server and making their code run poorly. It was such a common occurrence that the first thing I did when someone opened their mouth was ask, "Did you make sure your SELECT query was bounded?".
My absolute favourite was, in the same above scenario, they had gotten a new "Team Lead" who thought he was god's gift to software development. They ran into yet another one of these performance issues. Before they had even completed the sentence of the problem they were having, I immediately said, "This issue has come up repeatedly. Check your SQL and make sure you limit your results so you aren't sending back several 10s of millions of rows of data". No no no no no it couldn't be that! They literally spent an ENTIRE MONTH debugging the issue. In the end they solved it.... it was an unbounded SELECT and they were literally returning a couple hundred megabytes of data from the database. For a web page load that needed exactly one record out of the table.
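The difference between an unbounded and a bounded SELECT can be shown with a small sketch using Python's built-in sqlite3 module. The table name, columns and row counts are invented for illustration and are not taken from the system described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(f"customer{i}", i * 1.5) for i in range(100000)],
)

# Unbounded: every row crosses from the database to the application,
# which then throws away all but one of them.
all_rows = conn.execute("SELECT * FROM orders").fetchall()
wanted = [r for r in all_rows if r[0] == 4242]

# Bounded: the WHERE clause lets the database return only the one
# record the page actually needs.
row = conn.execute(
    "SELECT id, customer, total FROM orders WHERE id = ?", (4242,)
).fetchone()

print(len(all_rows), "rows fetched by the unbounded query")
print(row)
```

Both approaches end up with the same record; the unbounded one simply transfers the whole table to get there.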
I got occasional sneak peeks of other code they had written. There's no polite word to describe it. The code was such guttural trash that it would give a decent developer PTSD. This was a consumer-oriented project too... and I made sure to avoid it like the plague it was.
So the lesson is: If your developers cannot figure out how to query a relational database, it's not the technology. Your developers just suck so bad that they shouldn't be allowed near a keyboard. And if you don't understand your data well enough to be able to say "Graph is better", then it's virtually guaranteed that it isn't.
And where is NoSQL now? Zero people talk about it anymore.
-
Monday 13th March 2023 22:45 GMT Michael Wojcik
Re: Democracy
I thought it was because MongoDB is "web scale".
But seriously, yeah. Adopting a tool because it lets you make wild stabs in the dark is a good way to cut yourself.
-
Thursday 16th March 2023 20:49 GMT EarthDog
Re: Democracy
I keep saying basically the same thing but it doesn't sink in. Programmers are by and large NOT database literate. They are not data literate. I've worked as a Sys. Admin, DBA, SQA, Programmer, team lead, Department Head, Business Analyst etc. I am sick and tired of developers blaming others, esp. the DBA, for their crappy code.
They think their code will do the job better than a database engine developed and optimized by people smarter than most of us over a 50 to 60 year period. Oracle was invented c. 1977, before the web, the internet, ethernet, and other tech we are used to today. It has a clunky mainframe feel to it but the core tech. is solid. A smart person will NOT blame the DB before having overwhelming evidence.
-
Monday 20th March 2023 03:08 GMT Anonymous Coward
The Unsaid
I find the lack of discussion about the fundamental issues surrounding the differences a bit disturbing...such as Schema and Knowledge Management. For instance, RDBMSs "require" an established schema to manage the store, leading to an industry of DB sys admins who must periodically manipulate the DB structure due to the fallacious idea that the schema must or can know everything about the data...the "closed world" theory. This is very costly...and accepted...because "go with what you know." Science did not progress with that mentality.
What I find refreshing and sensible is the GDB model, such as RDF and OWL (btw, the only real standard in the GDB domain), where the schema (or rather ontology) is used to augment the data store, adding rules that can be changed AS NEEDED to derive and capture insight into your data...actual Knowledge Management instead of static nonsense. It's a different way to use the data. It mitigates much of the down time for schema changes.
Then, it's the dynamic nature of GDBs compared to the static nature of RDBs that is at issue, and they both have very valid and separate use cases. GDBs let you learn and adapt as needed. RDBs let you manage well-known structures efficiently. However, GDBs also reduce data to highly normalized and simple structures...node--edge--node. The data can be represented as a data cube, with the efficiencies that go with it. With it, especially using the Semantic Web standard, we can derive hidden information implied in the data, not just use it as intended. Intentions change over time. So use a system that helps with that process rather than one that hinders it.
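The node--edge--node idea, and the notion of deriving information that is only implied by the data, can be sketched in a few lines of plain Python. This is not RDF or OWL and uses no real triple store: the facts, the predicate names `manages` and `is_above`, and the function `infer_transitive` are all invented for illustration.

```python
# Each fact is a (subject, predicate, object) triple: node--edge--node.
triples = {
    ("alice", "manages", "bob"),
    ("bob", "manages", "carol"),
    ("carol", "manages", "dave"),
}


def infer_transitive(facts, predicate, derived):
    """Return facts plus (a, derived, c) for every chain of `predicate` edges."""
    inferred = {(s, derived, o) for (s, p, o) in facts if p == predicate}
    changed = True
    while changed:  # repeat until no new triple can be derived
        changed = False
        for (a, _, b) in list(inferred):
            for (b2, _, c) in list(inferred):
                if b == b2 and (a, derived, c) not in inferred:
                    inferred.add((a, derived, c))
                    changed = True
    return facts | inferred


# The rule is applied on top of the stored facts; nothing about the
# stored triples has to change when the rule is added or removed.
knowledge = infer_transitive(triples, "manages", "is_above")
print(("alice", "is_above", "dave") in knowledge)
```

The stored data never states that alice is above dave; that triple exists only because the rule was applied, which is the kind of derived insight the comment describes.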
-