GitHub explains outage string in incidents update

Code shack GitHub is offering an explanation for a succession of lengthy outages this month - it's the fault of resource contention issues in its primary database cluster during peak loads, but more investigation is needed. It's fair to say last week was not a fun time for either GitHub or its users. The service was unavailable …

  1. Mishak Silver badge

    It was MySQL, with the resource contention, in the database cluster

    Has someone released a modern version of Cludo?

    1. Korev Silver badge
      Coat

      Re: It was MySQL, with the resource contention, in the database cluster

      Cludo by four?

  2. Robert Grant Silver badge

    It can't be MySQL

    I heard MySQL's fine. After all, Github uses it?

    1. Charlie Clark Silver badge

      Re: It can't be MySQL

      Yes, but is it webscale?

      Shirley, this is mainly a read-heavy environment?

  3. Androgynous Cow Herd

    smells like...

    someone tried out a new backup technique for the database cluster....

    just guessing based on reading between the lines and consulting teal leaves and chicken entrails.

    1. Charlie Clark Silver badge

      Re: smells like...

      What are teal leaves? And where can we get them? The Github store?

      1. LionelB Silver badge

        Re: smells like...

        Small greenish-blue ducks - on vacation at a pond near you.

        1. captain veg Silver badge

          Re: smells like...

          But not until they actually leave.

          -A.

  4. Kevin McMurtrie Silver badge

    DBA?

    Nobody has a DBA anymore. Companies "scale up" their cloud configuration and consider it solved, even if that scale-up costs far more than a DBA. If GitHub did have one, this overload should have been no surprise.

  5. karlkarl Silver badge

    "It's fair to say last week was not a fun time for either GitHub or its users"

    The whole point of Git is that it is decentralised. We didn't even notice.

    I assume that if it halted your workflow, you really must be using Git incorrectly. Perhaps try to reduce your overconsumption of GitHub-specific features?

    1. AndrueC Silver badge
      Meh

      I vaguely recall an incident when I was trying to approve a PR but by the time I'd reported it to the rest of the team it was working again.

  6. Steve Davies 3 Silver badge
    Windows

    Cue news that MS is dumping MySQL

    After this, and their reluctance to be audited by Oracle, they will be moving it all to SQL Server.

    Only joking, but...

    1. Korev Silver badge
      Joke

      Re: Cue news that MS is dumping MySQL

      I wonder what could trigger that?

    2. Anonymous Coward
      Anonymous Coward

      Re: Cue news that MS is dumping MySQL

      Maybe they are administering it as if it were SQL Server and don't realise there is a difference.

  7. Anonymous Coward
    Anonymous Coward

    It's crazy how many other companies use MySQL but don't have this issue.

    1. This post has been deleted by its author

    2. Nate Amsden Silver badge

      A lot do have this issue; not many companies are public about what causes their outages. DB contention is a pretty common issue in my experience over the past 18 years of dealing with databases in high-load (relative to what the app is tested for) environments. I've seen it on MySQL, MSSQL and Oracle, and in every case I've been involved with, the fault was with the app design rather than the DB itself (which was just doing what it was told to do). (Side note: I am not a DBA, but I play that role on rare occasions.)

      I remember in one case on MSSQL the "workaround" was to restart the DB and see if the locking cleared; if not, restart again, and again, sometimes 10+ times before things were OK again for a while. Fortunately that wasn't an OLTP database. The most critical Oracle DB contentions involved massive downtime because that was our primary OLTP DB. MySQL contentions mainly just limited the number of transactions the app could do; adding more app servers, more CPU, more whatever had no effect (if anything, it could make the issue worse), because the row lock times were hogging up everything.
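
      For anyone who hasn't watched that happen live, the shape of it is easy to reproduce. A toy sketch of InnoDB row-lock contention, assuming a throwaway local MySQL, the pymysql driver, and a made-up jobs table containing a row with id = 1 - none of it Github's actual schema or tooling:

        import time
        import pymysql

        def connect():
            # Placeholder credentials for a disposable local instance.
            return pymysql.connect(host="127.0.0.1", user="app", password="secret",
                                   database="demo", autocommit=False)

        holder, waiter = connect(), connect()

        # Session 1 takes the row lock and deliberately never commits.
        with holder.cursor() as cur:
            cur.execute("SELECT state FROM jobs WHERE id = 1 FOR UPDATE")

        # Session 2 queues behind that lock until it is released or
        # innodb_lock_wait_timeout expires; at scale this is what eats the
        # throughput, and adding more app servers just adds more waiters.
        start = time.time()
        try:
            with waiter.cursor() as cur:
                cur.execute("UPDATE jobs SET state = 'done' WHERE id = 1")
            waiter.commit()
        except pymysql.err.OperationalError as exc:
            print(f"blocked for {time.time() - start:.1f}s, then: {exc}")
        finally:
            holder.rollback()  # releasing the lock lets other sessions proceed
            waiter.close()
            holder.close()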

  8. Yes Me Silver badge

    Play safe

    Such a great idea to commit your project to someone else's disks, I've always thought.

    Personally I keep a local copy of everything (in addition to the cloned repo).

    1. teknopaul Silver badge

      Re: Play safe

      Git kinda does that for you.

      I doubt anyone was unable to code. Github wobbles only prevent publishing code to Github; setting up a new remote is trivial if you'd rather not depend on Github, even in the short term (there's a sketch below).

      You may be locked into Github's non-git features, of course.

      Github and gitlab are convenient (when they work) and both better than average in terms of avoiding lock-in, due to the nature of git.
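
      A minimal sketch of pointing an existing clone at a second remote, assuming Python 3 with git on the PATH and a made-up self-hosted fallback; the remote name and URL are illustrative only:

        import subprocess

        def git(*args):
            # Run a git command in the current working copy and return its output.
            return subprocess.run(["git", *args], capture_output=True,
                                  text=True, check=True).stdout

        # Add a second, self-hosted remote alongside the existing Github one
        # and mirror the local branches and tags to it.
        git("remote", "add", "fallback", "ssh://git@git.example.com/team/project.git")
        git("push", "fallback", "--all")
        git("push", "fallback", "--tags")
        print(git("remote", "-v"))  # confirm both remotes are configured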

  9. DevOpsTimothyC Bronze badge

    Perhaps it's all the additional bloat that M$ has been adding

    Here's an idea: perhaps remove all of the additional bloat that Microsoft has been adding in an effort to a) turn it into a social platform and b) get more insights and provide additional unwanted services.

  10. DomDF

    Could the problems be related to the algorithmic feed? Even if not directly, as the feed itself should be read-only, it could be driving traffic to new repositories where users then star, fork and create issues and PRs, all of which add write demands.

  11. Anonymous Coward
    Anonymous Coward

    Hopefully Microsoft hurry up and migrate it to something a bit more enterprise-ready, such as SQL Server.
