
I would like to thank MySQL..
.. for making life difficult for Facebook.
Well done.
Facebook has had all sorts of no fun trying to migrate from MySQL 5.6 to version 8.0. A post from the social network’s engineering team reveals that Facebook's last MySQL upgrade – to version 5.6 – took "more than a year". Moving to version 8.0 has taken "a few years so far". Some of the reasons for the slow rollout will …
I think they are making life difficult for themselves by (a) making such a big version jump (MySQL 8 is a very different beast from MySQL 5.x), and (b) applying their own custom patches to the server which means that they now have a bespoke system, not one supported by any project team.
When Facebook was originally built - early 2004 - it was a fairly basic PHP application which used MySQL for all storage (user profiles etc). I don't think it used anything else except maybe memcached or possibly Redis for caching.
I doubt MySQL would scale to hold all of the user data that it now needs to. What are they still storing in MySQL and what other storage technologies are they using? There must be an absolute myriad of data especially given that you can pretty much go back to when you opened your account and find posts.
People are quick to criticise Facebook but the way in which they've scaled that is nothing short of incredible.
MySQL does scale pretty well in read-heavy environments. Unfortunately, it's shit™ for migrations due to the way tables are managed: "simple" ALTER TABLE changes can take forever because the server copies the whole table to a temporary one and back. And that forever gets even longer on an ever-growing system.
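To be fair, since 5.6 InnoDB's online DDL can often avoid the table copy, and you can ask for that explicitly so the statement fails fast instead of silently rebuilding. A sketch (table and column names made up):

```sql
-- Request an in-place, non-blocking change; MySQL rejects the statement
-- outright if this particular ALTER would still require a table copy,
-- rather than quietly spending hours rebuilding the table.
ALTER TABLE users
    ADD COLUMN last_login DATETIME NULL,
    ALGORITHM=INPLACE, LOCK=NONE;
```

Not every ALTER supports INPLACE, which is exactly why tools like pt-online-schema-change exist.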
If you have a system running on a DB you need to have a migration strategy in place. This might even include no migrations to new versions. But the real problem here seems to be running custom extensions, which obviously seemed a good idea at the time…
They could switch to Postgres and use FDW (Foreign Data Wrappers) to hook up the legacy databases while they work on a data migration strategy (in-place updates are not always possible) that minimises the dump-and-load problems.
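For anyone curious, the wiring is roughly this — assuming the mysql_fdw extension; the host, credentials and schema names here are all made up:

```sql
-- Sketch of exposing a legacy MySQL database inside Postgres via mysql_fdw.
CREATE EXTENSION mysql_fdw;

CREATE SERVER legacy_mysql
    FOREIGN DATA WRAPPER mysql_fdw
    OPTIONS (host '10.0.0.5', port '3306');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER legacy_mysql
    OPTIONS (username 'app', password 'secret');

-- Make the legacy tables queryable locally, so reads can span both
-- systems while the actual data migration proceeds in the background.
IMPORT FOREIGN SCHEMA app_db
    FROM SERVER legacy_mysql
    INTO public;
```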
Or they could ask Oracle for help…
I'm not really an expert, but I believe MySQL separates the front end from the storage back end, so MySQL can scale indefinitely if you have Facebook's engineering resources to work on the back end.
A year for a hefty DB migration sounds fast to me.
Must be fun working there, big problems to solve, money to burn and zero responsibility: matters little if Facebook disappeared tomorrow.
... is apparently now move fast with stable infrastructure.
And yet their unstable infrastructure seems to be holding them back. It's almost as if it were all just empty sloganising.
As I said in my first post, people are quick to criticise Facebook. I bet none of those people would have the skills to create something that went from a bedroom project to concurrently supporting billions of users and holding nearly two decades' worth of data for all those people.
Generally speaking Facebook is very stable and that is _incredibly_ hard to achieve when you have over 2.8 billion users. Even if only 50% of them were active that's way more than your average application. The sort of people who criticise them generally don't even have 1 million users on their own platforms.
I'm yet to see an example of anybody who has taken something from an amateur project and scaled it to what it has become. Granted, it's somewhat annoying when the use-case is for the Karens of the world to share posts about how disappointed they are at their local school... but the tech and processes behind it are nonetheless very impressive.
I can take a project from embryonic to mega-buster
It's exactly this sort of bullshit I'm talking about. Go on, name your project that has 2.8 billion users and virtually zero downtime in ~16 years?
I very much doubt I'm alone in that on these forums.
Agreed, it's full of people who also criticise yet never seem to be able to mention an equivalent/better thing they've done themselves. Primarily because they haven't, and can't.
It just seems that FB let it run away a bit too much.
... and yet, it continues to work without any significant problems.
It's a different project, but I think the people behind Stack Overflow lifted the curtain a couple of years ago about how they managed to be "webscale" with about eight servers. A good architecture won't immediately support millions of users but it will make it possible.
It’s good that they are actually biting the bullet and getting on with the upgrade. Too many organisations (I could name a name, but I’m not going to) have stuck with a single platform well beyond its use-by date and because they never had a plan for updates of any kind they now don’t have the resources to tackle a problem that has been in the making for years, if not decades.
I’m not a fan of Facebook the platform, but this upgrade and the willingness to tackle it deserves recognition.
MySQL v8.0 was released on 19 April 2018 and v5.6 was released on 5 February 2013 - 8 years ago. So I would argue that Facebook have used MySQL v5.6 "well beyond its use-by date and because they never had a plan for updates of any kind" and are still struggling to tackle this problem that has been in the making for over 5 years (since the release of v5.7).
I updated a non-profit from MySQL 5.x to 8.x two jobs ago. (actually we went with the MariaDB equivalent because MySQL v8.0 was only available to paying Oracle customers at the time)
I agree that scaling a service to essentially planetary dimensions, when it was not designed to do so, is a very significant achievement. That it's possible at all is down to the fact that it consists of relatively small individual chunks of data which are assembled by the browser into the page that you see.
The clever bit is the way the routes to those individual bits of data can be dynamically altered as compute resources are spun up and taken down - along with the infrastructure management that decides which resources to activate and when. That's what makes the scaling possible.
What prevents its using the world's entire electricity supply is getting the most out of each individual compute unit and here FB seem to have taken a very proprietary approach - whether it's heavily customising MySQL or its PHP to C++ converters or its Hack language. The trouble with that approach is that maintenance of your short-term fixes simply adds to your long-term workload and diverts effort from developing and maintaining the primary application code: it also imposes an additional learning burden on staff you bring in to the company.
FB had to grow very quickly to keep the money coming in - but the money was coming in in copious amounts and could be chucked at the scaling issue. The results are impressive and operationally stable but at the cost of development fragility going forward.
The most impressive thing about FB is, of course, that none of this matters. It will either continue to make massive amounts of money regardless of its technical merits or it will wither and die and live on only as the answer to a future pub quiz question.
Facebook's throughput is impressive but, of those 2.8 billion users, the only number that matters for maintenance is how many concurrent users there are. Even then, it's still largely a read-heavy environment which means caching, caching, caching for the users. All the real development work goes on the advertising side and this is why they don't talk about it.
But for real scale, it's difficult to beat what Google gets up to with YouTube alone.
Once again, Postgres is superior in that regard. I recently moved a largish app from MariaDB to Postgres and luckily there was very little custom SQL to deal with. It was not without quirks, however. Worth doing, as the number of concurrency issues went from four or five each day to one every two months or so.
A friend of mine who I consider authoritative when it comes to comparing databases said the same thing: if you want to get serious about databases (and certainly if you require future proofing and expect data changes), you best stick with Postgres.
Maybe Zuck should consider this switch too? It's not going to be the last time he or his people come up with ways to abuse the data they have milked out of their users.
Much as I dislike MySQL, this is mainly for the long list of weird bugs that never seemed to matter. Credit to Oracle for actually working through them once it took over. But the main problem with MySQL was trying to use it for things it wasn't designed for. It was originally a fast tabular storage engine with an SQL query engine, but it was not designed as an RDBMS and, hence, generally sucked at both the relational and management parts. As long as you stuck within the MyISAM limits you could have a fast and generally reliable database, and it was quite a bit faster than Postgres at the time. But it's sort of stayed there. Yes, it now does the relational part with InnoDB but the management part is still largely MIA, which is why there are tools to fix your DB for you. And one of the side effects was a generation of programmers who created applications that suited the denormalised model.
Over the same time Postgres, which always managed your relational data for you reliably, has grown faster and scaled in all dimensions: reading, writing, replicating, storage types and analysis. It also has a well-established extension system which has been used to bring new features and improved performance before things go into core. And it has almost always encouraged best practice, such as refusing to create foreign keys that reference columns without a unique constraint…
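For anyone who hasn't hit that strictness, a minimal illustration (table and column names made up):

```sql
-- Postgres refuses a foreign key whose referenced column lacks a
-- unique constraint or primary key:
CREATE TABLE authors (name text);           -- no unique constraint
CREATE TABLE books (
    title  text,
    author text REFERENCES authors (name)   -- rejected by Postgres
);
-- ERROR: there is no unique constraint matching given keys
--        for referenced table "authors"
```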
it even requires its own backend database! Not sure whether this is an attempted brag, or an admission of unsuitability for the task…
The rest of the whinging about how difficult it all was can, I think, be answered with the reply "hire lots more DB staff then, you can afford it".
What a coincidence*. Literally this morning I received the following email, purportedly from the UK's Driver and Vehicle Licensing Agency (DVLA) which states:
"We are currently upgrading our database and all drivers are required to update and verify there driver's license details.
To complete your license verification with us you are required to fill out the form in the link below.
<REDACTED>
Drivers that refuses to upgrade his or her details within two weeks of receiving this verification email will lose his or her driver's License and will have to take a fresh driving test.
We sincerely apologise for any inconveniences this might have caused you."
Just slightly concerned about the spelling "there" when it should have been "their", "License" when it should be "licence" (UK English spelling, not US), "inconveniences", and the use of the past tense, but I'm sure it must be ok, after all the sending email address is:
"service-update-accountinfo-datagovuk-securechristophe.taymans@telenet.be"
Which is obviously a genuine government email account.
*(Genuinely a coincidence, I got it this morning. I've not replied yet.)
The techie in me is massively impressed by the skills required to build and maintain such a huge database with fantastic uptime.
The human in me despises what they do with that data - harvesting personal information for profit through advertising and misinformation, and its data feeds into the three-letter spy agencies.
Every time that I read a story like this my immediate thought is that the entire Internet has become a house of cards that inevitably will collapse under its own weight. The Akamai crash this week was just the latest demonstration that we've built in far too many single points of failure.
This is why anything that actually matters lives on my local hard drive, not in the cloud, and why I back up anything that I'm forced to keep online - I just don't trust any of the major Internet behemoths to keep my data safe, or available when I need it.
I guess it's a factor of age, but I can remember when the Internet was a cool new thing that added considerable utility in some areas, but wasn't yet critical to nearly every task and operation of our lives. The fact that it's been allowed to grow so fast, so willy-nilly, and so entirely unregulated is cause for great concern.
And of course (Internet Archive notwithstanding) it saddens me greatly how much of our history has already been lost forever when web sites and news services just close up shop and disappear. Future historians will shake their heads and condemn us for the great gaps in our archival processes. Even sadder are the individuals and families who have entrusted their entire family histories and photo albums to Facebook or similar services without understanding that they could lose everything at a moment's notice.
Ultimately though what gives me pause is a firm belief that at the end of the day none of the big social media companies give a sweet god-damn about individual customers and users. Their utterly abysmal customer service is evidence that they really couldn't care less if you or I are harmed, locked out, or just leave. That attitude surely spreads to every corner of the operation, and leaves us all at considerable risk.
On a related note: don't try accessing the citations at the end of Wikipedia articles. The majority of them are dead links nowadays and nobody seems to be interested in cleaning them up. When my daughters were in school, they were forbidden to use a Wikipedia article as a source because of this - those dead links may as well be fakes.
.....in a land far away, there was a retailer who adopted PeopleSoft.
.....but the SENIOR people (you know....C-Level folk who know everything) decided that the standard PeopleSoft package was woefully short of retail smarts.
.....so began a million dollar project to customise PeopleSoft with all the required retail smarts. Fantastic job.....just what the doctor ordered.
.....except that for the next ten years it was impossible to apply PeopleSoft patches and upgrades. Oops!
.....then (different) SENIOR people (also C-Level folk) decided on....go on......guess! They decided on another million dollar project to implement the latest version of PeopleSoft....this time plain vanilla....no cute ideas.
Fantastic story......especially for the consultants.....a million dollar project.....then ten years trying to square the circle.....then another million dollar project.
The moral of this story (if you have any sense) is -- implement STANDARD packages!
The moral of this story (if you are in the consulting business) is -- get your client to apply all the modifications they can dream up!!