It's not only junior developers that commit this type of SNAFU
There's nothing more dangerous than an over-confident senior developer who has write access to the production database.
248 publicly visible posts • joined 12 Jun 2009
As a Ph.D. student at Liverpool University in the mid 1980s, I tried to use the X.25 JANET FTP protocol to transfer a set of data files from a VAX at the Royal Greenwich Observatory (of blessed and glorious memory) at Herstmonceux to the IBM mainframe at Liverpool. It turned out to be a non-trivial task, and even when the head of operations at Liverpool got involved, he was unable to get it to work, and advised me to ask my collaborators at the RGO to put the files on tape and send it by Royal Mail.
There are Babylonian clay tablets containing cuneiform text that are almost 4,000 years old and still readable. Some of them contain records of solar eclipses and other astronomical events that are still being used by modern astronomers to investigate how the Earth's rotation rate has changed over the millennia. Now THAT is durable data storage!
If I were cynical, I might wonder whether this is simply a case of AI being used to generate greater volumes of the kind of corporate BS that blights the lives of the people who do the real work. I think readers of El Reg will know what I mean: emails, reports and presentations from HR, senior management and consultants, full of buzzwords but oddly devoid of any actual meaning. On the plus side, if I can toss these into Copilot and ask it for a one-line summary in language that a 7-year-old can understand, I'll save myself actually having to read all of the tedious crap and save myself a lot more than 26 minutes each day.
Ah, you're talking about MyISAM, the bane of every MySQL DBA's life. Yes, legacy applications that rely on MyISAM are a world-class PITA. I used to work with a team that refused to switch to InnoDB because they liked to move entire schemas between instances using filesystem-level copying of the data files. You could (with care!) do that with MyISAM. The manual says that you can now do it with InnoDB using the "Transportable Tablespace" feature, but any user who comes to me and asks about that will get the Paddington Bear Hard Stare.
"MySQL broke this by supporting, but ignoring the syntax, and using table scans whenever tables were joined, causing many developers to think, not unreasonably, that JOINs were the problem."
I've read and re-read this several times, but I still don't understand what you're saying. MySQL has always used indexes to perform joins, if there are suitable indexes. Its query analyser may not always pick the best index for the job, but it does have the EXPLAIN command, and an experienced developer or DBA will always run a new JOIN query through EXPLAIN to find out what indexes the query analyser is planning to use. MySQL also has a handy extension called index hints, which allow you to say to MySQL "no, don't use that index, use this index instead", when EXPLAIN shows you that it's picking the wrong index itself.
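To make that workflow concrete, here is a minimal sketch in Python. It uses the standard library's sqlite3 module as a stand-in so that it runs without a MySQL server: SQLite's EXPLAIN QUERY PLAN plays the role of MySQL's EXPLAIN, and its INDEXED BY clause plays the role of a MySQL index hint. The customers and orders tables and the index names are hypothetical, invented purely for illustration.

```python
import sqlite3

# In-memory database with two hypothetical tables and two candidate indexes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        status TEXT
    );
    CREATE INDEX idx_orders_customer ON orders (customer_id);
    CREATE INDEX idx_orders_status ON orders (status);
""")


def plan(sql):
    """Return the planner's description of how it intends to run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


# Step 1: ask the planner what it would do if left to its own devices.
default_plan = plan("""
    SELECT c.name, o.id
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    WHERE o.status = 'open'
""")

# Step 2: override its choice by naming the index we want used on orders.
forced_plan = plan("""
    SELECT c.name, o.id
    FROM customers AS c
    JOIN orders AS o INDEXED BY idx_orders_status ON o.customer_id = c.id
    WHERE o.status = 'open'
""")

print("planner's choice:", default_plan)
print("with the hint:   ", forced_plan)
```

In MySQL itself, the equivalent would be to prefix the query with EXPLAIN and add a hint such as FORCE INDEX (idx_orders_status) after the table name. One difference worth knowing: SQLite's INDEXED BY is strict, so the statement fails outright if the named index cannot be used.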
Occasionally, the query analyser will determine that none of the available indexes will give better performance than a full table scan. That can happen if you have an index on a column that has only a couple of values, and the data are split 50:50 between them. That index is pretty useless, and a full table scan will in fact be faster. That's not MySQL's fault, of course, but trying to explain that to some developers is like trying to explain quantum mechanics to my cat. Pointless and annoying for both of us.
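The intuition can be shown with a few lines of Python. This is only a rough sketch of the selectivity idea, not of MySQL's actual cost model, which weighs many more factors; the function name and the sample columns are made up for the example.

```python
from collections import Counter


def worst_case_match_fraction(column_values):
    """Fraction of rows an equality lookup returns for the most common value."""
    counts = Counter(column_values)
    return max(counts.values()) / len(column_values)


# A hypothetical two-valued column with the rows split 50:50.
status = ["open", "closed"] * 50_000

# A hypothetical unique column, such as a primary key.
order_id = list(range(100_000))

print("status:  ", worst_case_match_fraction(status))
print("order_id:", worst_case_match_fraction(order_id))
```

An index on order_id narrows a lookup to one row in 100,000. An index on status still leaves half the table to fetch, and fetching half the table via an index generally costs more than simply reading the whole table in one pass.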
"It is a long time ago but I remember having to fix a website where a single user had two different ids… and the root cause was the lack of data integrity across tables that a FOREIGN KEY would have ensured."
I've seen this myself, back when Ruby-on-Rails was (briefly) the must-have framework that all the cool kids were using. I was told, by someone whose job title was "senior developer", that they had no need for foreign key constraints, because ActiveRecord (RoR's ORM layer) took care of data integrity. A couple of months later, the team leader asked me to look at the possibility of adding foreign keys. If you know MySQL, then you'll know that it won't allow you to add a foreign key if it would be violated by existing data. That was the case with several of the critical tables, so I had to tell the team leader that the data were already inconsistent and would need to be cleaned up first. Thankfully, that wasn't my problem.
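For anyone facing the same clean-up, the first step is to find the orphaned rows that would violate the new constraint. Here is a minimal sketch, again using Python's sqlite3 module as a stand-in for MySQL, with hypothetical users and orders tables. The LEFT JOIN query is ordinary SQL and should need little or no change to run on MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, item TEXT);

    INSERT INTO users  VALUES (1, 'alice'), (2, 'bob');
    -- Order 12 points at user 99, who does not exist.
    INSERT INTO orders VALUES (10, 1, 'book'), (11, 2, 'pen'), (12, 99, 'lamp');
""")

# Child rows whose parent is missing: exactly the rows that would stop
# a foreign key on orders.user_id from being added.
orphans = conn.execute("""
    SELECT o.id, o.user_id
    FROM orders AS o
    LEFT JOIN users AS u ON u.id = o.user_id
    WHERE o.user_id IS NOT NULL
      AND u.id IS NULL
    ORDER BY o.id
""").fetchall()

print(orphans)
```

Only once this query returns no rows can the foreign key be added. What to do with the orphans (delete them, reassign them, or recreate the missing parents) is a decision for whoever owns the data.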
And that's why, whenever I hear a developer say "MySQL has changed my data!", I roll my eyes.
Well, the developers *would* say that, wouldn't they. After all, every developer will tell you that their code is perfect, and if weird sh*t happens, it must be the database, right? :-)
Seriously, if there were a problem like the one you describe -- MySQL randomly changing data -- then support forums like Stack Overflow would be full of threads on this subject, and it would be front-page news here at The Register. Googling for reports of MySQL changing users' data unexpectedly, I found just one Stack Overflow thread (https://stackoverflow.com/questions/49594025/mysql-loss-of-data) where a user claimed that MySQL was changing their data, and the leading reply was as skeptical of the claim as I am.
Occam's Razor says that the simplest explanation is the most likely: the bug is in the application, not the database.
"PostgreSQL's backup facilities are to say the least primitive"
That may have been true ten years ago, but the pg_basebackup tool has been part of the standard PostgreSQL distribution for more than a decade now. It allows you to make hot backups of both local and remote PostgreSQL clusters, and re-building a working database cluster from the resulting backup fileset is so easy that even an intern could be trusted not to screw it up. The capability to back up a remote database cluster also makes pg_basebackup the perfect tool for setting up standby clusters quickly and easily. I'd hardly characterise those kinds of capabilities as "primitive".
"I am speaking of experience with a database holding 3 million entities (a song metadata database). It would always lose records and never knew the exact number of songs."
Is it possible, do you suppose, that your software developers weren't very good at their jobs, and that the records went missing because the application was defective?
I ask because I've worked with MySQL for 25 years, as an application developer and as a DBA, supporting databases with tables holding hundreds of millions of rows, and I've never seen the kind of behaviour that you describe.
NASA had plenty of rapid unplanned disassemblies in the early days. In the movie "The Right Stuff", those clips of rockets exploding just seconds after lift-off are real NASA archive footage. The courage of the Mercury astronauts, especially Alan Shepard, cannot be overstated. Those guys watched rockets explode, then went ahead and sat atop one anyway.
Around 30 years ago, I worked for a while at a large UK university. The HR department had a reputation for leaving people on the payroll after they had moved elsewhere, so in my last month, several memos were sent to HR to remind them to remove David Harper from the payroll. A few months later, I started getting mail forwarded to me from the university, but for a David Harper in a completely different department. I emailed him to let him know that I had his mail and would send it back at once. He wrote back to tell me that after I left, HR had stopped paying *him* too. The poor guy missed two months' salary before it was sorted out. Not funny for him at the time, obviously!
As reported back in September by the Open Rights Group and the3million (https://the3million.org.uk/sites/default/files/documents/Loss%20and%20Liability%20-%20Glitching%20immigration%20status%20as%20a%20feature%20of%20the%20British%20border%20after%20Brexit%20-%20Sep2024.pdf) and more recently in the Independent (https://www.independent.co.uk/news/uk/home-news/evisa-uk-immigration-status-help-b2678643.html).
The Home Office's euphemism for this is entanglement. It sounds almost quaint until you remember that it means that your personal data is being shown to a complete stranger, or a complete stranger's personal data is being used to deny you the right to return to your home in the UK.
I did that too, when I was a Ph.D. student using a VAX 11-750 as a visitor at a famous but now sadly defunct government research institute. I got a royal bollocking from the system administrator who had to fish the end of the tape out of the reservoir of the tape drive and painstakingly thread it back onto the reel. I only made that mistake once. Happy memories.
The room/desk-booking system at my company now features an AI "assistant". I asked it to tell me the airspeed velocity of an unladen swallow, but it did not understand the question. It was also unable to tell me the answer to Life, the Universe and Everything. Clearly, its training data did not include any of the classics.
I'm deeply disappointed by the knee-jerk negativity of the earlier comments. This is a major success story for British science. Fast DNA sequencing of clinical samples from patients during the pandemic allowed the UK to track mutations in the COVID-19 virus as new strains emerged, and that fed into the public health response, as well as enabling more effective versions of the vaccines to be developed within months, rather than years.
Even before the pandemic, rapid DNA sequencing of clinical samples in NHS hospitals, combined with techniques developed in British universities and research institutes, allowed doctors to more effectively control the spread of outbreaks of MRSA and other nasty bacteria in hospitals.
So instead of doing down some excellent British science which will benefit all of us, let's celebrate it.
... you had to write programs that used memory and CPU efficiently, because both were severely limited resources, even on a company mainframe. Perhaps it's time to teach people how to write memory- and CPU-efficient programs again. That would cut the need for bloated data centres.
The numbers don't even have to be irrational or transcendental. Decimal 0.1 is neither, but it has no binary floating-point representation that is finite and exact, as anyone who cut their programming teeth on, say, FORTRAN 77 will attest: read 0.1 into a REAL variable and print it out, and you're not guaranteed to see 0.1 appear on the screen. You may have better luck with DOUBLE PRECISION, but it depends on the compiler, the run-time implementation of READ and WRITE, and possibly also the phase of the Moon.
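You don't need a FORTRAN compiler to see this. Here is a short Python sketch that shows what actually gets stored for 0.1. It assumes IEEE 754 arithmetic, with a 4-byte float standing in for REAL and Python's native float standing in for DOUBLE PRECISION; that mapping is an assumption, since older machines used their own floating-point formats.

```python
import struct
from decimal import Decimal
from fractions import Fraction

# Python floats are IEEE 754 doubles. Converting one to Decimal reveals
# the exact value held in memory, with no rounding for display.
stored_double = Decimal(0.1)

# Round-trip 0.1 through a 4-byte float to mimic single precision.
single = struct.unpack("f", struct.pack("f", 0.1))[0]
stored_single = Decimal(single)

# As an exact fraction, the stored value has a power-of-two denominator,
# which is why it can never be exactly 1/10.
as_fraction = Fraction(0.1)

print("double:  ", stored_double)
print("single:  ", stored_single)
print("fraction:", as_fraction)
print("0.1 + 0.2 == 0.3?", 0.1 + 0.2 == 0.3)
```

Both stored values are close to 0.1 but not equal to it, and the single-precision one is noticeably further off. Whether you see "0.1" on the screen depends entirely on how many digits the output routine chooses to print.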
And they'll tell you that this is far too small a telescope to expect a detailed view of Saturn. At that price, the optics are likely to be low-quality, so you're going to get a lot of chromatic aberration, and the image quality is going to deteriorate rapidly if you try to shift to higher magnifications. It's an unfortunate but unavoidable truth that if you want to take good pictures of objects like Saturn, you'll need to be prepared to spend well over a thousand pounds/dollars.
The Research Machines 380Z, to be precise. It was the first computer that my school bought, back in 1980. I was allowed to use it in my free time, and I taught myself BASIC programming on it. The school later acquired a Commodore PET, which was easier to use, but I'll always have fond memories of the RM 380Z. It was the starting point for a 40+ year career in scientific computing.
No, this is the Reporting of Foreign Bank and Financial Accounts law, commonly known as FBAR. It is mandatory for any U.S. citizen who has a bank account in a non-US bank with a balance of $10,000 or more during each calendar year, and is independent of any liability to pay US income tax.
As for your 'life hack', the US government has made renunciation of citizenship extremely difficult. And in any case, why *should* a US citizen have to renounce their citizenship?
The hypocrisy of the U.S. government is breathtaking. If, like me, you're a non-American married to an ex-pat American, living outside the United States, then the U.S. Treasury Department demands access to details of any joint bank accounts that you and your American spouse hold in non-U.S. banks if the balance ever exceeds $10,000. Many non-American financial institutions now refuse to take on ex-pat Americans as customers, because the U.S. government threatens non-compliant banks with severe penalties.
Ian Dunt explains the underlying (and systemic) causes of the lack of technical expertise in the higher level of the Civil Service in chapter 6 of his recent book "How Westminster Works ... and Why It Doesn't". I highly recommend Dunt's book. It will depress the hell out of you, but at least you'll understand why big government IT projects always end in failure.
Many years ago, I worked at a UK university for several years. When my fixed-term contract ended, I moved on, but after a couple of months, I started receiving forwarded letters and journals for another David Harper who still worked at the university, but in an entirely different department. I contacted him to let him know about the screw-up. He told me that wasn't the worst of it. When I left, they stopped paying his salary as well as mine. It seems the HR department had deleted all the David Harpers on the payroll, just to be on the safe side.
Ian Dunt's new book "How Westminster Works ... and Why It Doesn't" looks at the various parts of government, and explains why each of them is dysfunctional. In chapter 6, he examines the civil service, and concludes that it has an institutional bias against in-depth expertise, especially of a technical nature. Civil servants, especially those in Whitehall, gain promotion not by becoming experts, but by moving from one department to another every couple of years. Inevitably, then, any large project is going to be managed by a series of civil servants who come in knowing nothing about the project, and leave two years later taking any accumulated knowledge with them.
QUOTE
IDG defines DataOps roles as using "a combination of technologies and methods with a focus on quality for consistent and continuous delivery of data value, combining integrated and process-oriented perspectives on data with automation and methods analogous to agile software engineering."
/QUOTE
Oh my, the folks at IDG have drunk deep from the Kool-Aid of corporate-speak. Or is ChatGPT writing their material these days?