Easynet blames network collapse on dodgy router update

Early investigations of a nationwide collapse of the Easynet network have pointed the finger at a software update to a Cisco router. The outage began at 8.32am on Wednesday, and for some customers lasted most of the working day. "An Easynet edge router crashed due to a software bug condition triggered by an invalid update …

COMMENTS

This topic is closed for new posts.
  1. Alasdair Russell
    WTF?

    Hmmmmmmm - do we need a translation

    "an invalid update from an external source into the Easynet network"

    Does this translate as an external attempt to hack a Cisco Router with a default or easily guessed password?

    1. Anonymous Coward

      Invalid update

      "an invalid update from an external source into the Easynet network"

      Could be a software update from Cisco, or possibly a bad routing table update sent from another ISP causing a routing loop.

  2. ElNumbre
    Stop

    Who?

    Who do they think they are, BT?

    Good ISPs have self healing networks with routers which can failover automagically, if not seamlessly, then at least within a couple of minutes.

  3. Anonymous Coward

    Was the SLA breached?

    Is the SLA 99.9% per day, or 99.9% overall? I would imagine overall surely

    When was the last time EasyNet went down? By my count, if it had been up without outage for the past 25 days solid, then a 6 hour outage still keeps it within 99.9% uptime

    If you make it only business hours, say 7 hours per day and exclude weekends, then they are still 99.9% if they have been up solid for the past 12.5 weeks

    1. Chris Williams (Written by Reg staff)

      Re: Was the SLA breached?

      The availability guarantee is quarterly. I have added that to the story.

      So 6+ hours down is getting on for 0.3 per cent of the quarter.
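
The uptime arithmetic in this thread can be checked with a minimal sketch. It assumes availability is measured around the clock (24 hours a day) and that a quarter is 91 days; the function names are hypothetical and the story does not say exactly how Easynet measures its guarantee.

```python
HOURS_PER_DAY = 24  # assumption: the guarantee is measured around the clock


def allowed_downtime_hours(period_days, availability=0.999):
    """Hours of outage that still meet the availability target over the period."""
    return period_days * HOURS_PER_DAY * (1 - availability)


def downtime_fraction(outage_hours, period_days):
    """Share of the period lost to an outage of the given length."""
    return outage_hours / (period_days * HOURS_PER_DAY)


def days_needed_to_absorb(outage_hours, availability=0.999):
    """Length of measurement period (in days) for which the outage just meets the target."""
    return outage_hours / (1 - availability) / HOURS_PER_DAY


if __name__ == "__main__":
    print(f"Allowed per year:    {allowed_downtime_hours(365):.2f} h")
    print(f"Allowed per quarter: {allowed_downtime_hours(91):.2f} h")  # 91 days assumed
    print(f"6 h of a quarter:    {downtime_fraction(6, 91):.2%}")
    print(f"Days to absorb 6 h:  {days_needed_to_absorb(6):.0f}")
```

Under those assumptions, 99.9% allows about 8.76 hours a year but only about 2.18 hours a quarter, a 6-hour outage is roughly 0.27% of a quarter, and it takes a measurement period of about 250 days for a 6-hour outage to stay within 99.9%.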

  4. Lee Dowling Silver badge
    FAIL

    An edge router

    A single edge router took down an ISP. Nice redundancy / monitoring / replacement policies there. I thought the point of buying very expensive Cisco hardware was that this sort of thing wouldn't affect the connectivity as a whole, rather than having to have some Cisco guy read commands to you over the phone when things go wrong?

  5. Ivor

    Three nines.

    Doesn't 99.9% give them just over 8 hours in a year? What's the duration of their SLA?

  6. Anonymous Coward
    FAIL

    Carrier Class?

    Well if you will use an enterprise class solution, don't expect it not to fall over sometimes!

  7. Andy Barker
    FAIL

    A single router?

    I am always amazed when it is reported that a single router buggers up so much of an ISP's network.

    Redundancy?

    Failover?

  8. Chris Miller

    99.9 per cent uptime guarantee

    Over what period? I'd expect monthly (although the publicly available material is silent on the topic), but if they sneakily measure annually you could fit in a single 8-hour service interruption and still meet the guarantee.

  9. Tom 15

    Eh?

    Unless I'm understanding something wrong 99.9% uptime allows up to 8.76 hours of downtime per year so 6 hours doesn't break that?

  10. Candy
    FAIL

    SLA broken?

    Um, not unless there were additional outages or the 99.9% uptime mentioned was inaccurate. Three 9s buys you up to 8hrs 45 minutes (and change) of outage per annum

  11. Kevin Gurney
    Thumb Up

    Note the word "QUARTERLY"

    Easynet sells its SureStream ADSL product as an alternative to a leased line, with a 99.9 per cent uptime guarantee (quarterly).

    1. Andy ORourke
      Joke

      You're assuming......

      People read the article BEFORE they jumped in to comment that it was really bad that a single router could take down a substantial amount of an ISP's network and then go on to debate the 99.9 uptime guarantee figures :-)

  12. Chris 211

    Sounds like

    They did a misconfiguration of the router which tried to automatically update itself. OMG, why! Don't people research images before applying them, and apply images to routers put into standby mode? Old fashioned it may be, but if you're going to update a critical router it's best to have an 'engineer' involved, not a script.

  13. Candy

    Oopsie.

    Mea culpa...

  14. Fuzzysteve
    Grenade

    redundancy's all fine and good

    What makes big networks break is when their routing tables get hosed by the bad updates being spewed by a badly configured/misupdated router.

    No redundancy can stop that from happening. Though it shouldn't take 6 hours to fix.

  15. Matt 80
    Thumb Down

    We are still down....

    but we are getting a new FREE router in the morning !! Will it make any difference ? I doubt it.

  16. Anonymous Coward

    external ROUTING update, which is something real routers do all the time!

    There was a major event around that time which impacted several makes of routers (apparently from multiple vendors), causing their BGP sessions to go haywire. However, it was quickly suppressed at the source (well, likely at their upstream provider), and the only ones who seemed to have noticed were those with alerts set up. I suspect in Easynet's case this was probably a trigger event that exposed an unrelated problem with their network, which caused the extended outage.

    However, their comments on the outage do seem to imply a single router with no automatic failover, which is rather worrying...

    more details: http://www.merit.edu/mail.archives/nanog/msg14487.html

  17. sw1sstopher

    these aren't the redundancy figures you're looking for...

    maybe if they hadn't made so many engineers redundant this wouldn't have happened...

  18. Matt 80
    Thumb Down

    Guess what ?

    On checking this morning at my place of work, we still don't have a link to the outside world - I really hope the new router which is on its way works......
