Mozilla 404s '404 Not Found' pages: Firefox fills in blanks with archive.org copies

Mozilla is trying out a new experimental feature in Firefox that lets you smash through annoying 404 dead-ends. The "404 No More" feature uses copies of webpages from the Internet Archive's Wayback Machine to replace 404 "not found" errors with something more useful. If you visit a link to a page that's disappeared, Firefox …

  1. Sorry that handle is already taken. Silver badge

    Will the Internet Archive be able to handle the extra traffic if this goes mainstream, and/or will Mozilla run a cache for popular requests?

    1. Ole Juul

      resources

      I had the same question about resources. It could also slow down your browsing on a slow connection as you encounter yet another wait, though I suppose you'd turn it off in that case.

  2. ashdav
    WTF?

    What's the point

    You have tried to access a page that no longer exists from a website that has deprecated links.

    So they send you to an archive.

    And...........?

    1. Cynic_999

      Re: What's the point

      ... And in most cases the archived page will be the one that the link you clicked on originally referenced, so you will get the information you wanted.

  3. Sebastian A

    As long as you can turn it off, I'm fine with it.

  4. HCV

    Doesn't strike me as a good idea at all

    I really don't like it when my web browser tries to outguess reality. If a page is missing, that's data of a sort right there. And instead showing me what is by definition outdated information? No. Don't.

    I'm especially not crazy about the potential for a chilling effect. Sites will quickly learn that they can't actually delete a page -- unless they request that it be deleted from the Wayback Machine, too!

    I've already seen archived sites get wiped from the IA at the request of a new domain owner. This is going to cause even more disappearances.

    1. paulf
      Meh

      Re: Doesn't strike me as a good idea at all

      You make a very good sub-point. When a domain is acquired by a new owner any request they make to remove pages from IA should only go back to the point they bought the domain - it shouldn't remove content that predates their ownership of that domain.

      I've also seen the complete and only archive of a site removed from IA simply because the new owner of a domain (that had otherwise been dormant for 10 years) decided they didn't want any of their content archived. Having their new content omitted from IA is fine, if that's what they want, but they shouldn't be allowed to tell IA to wipe historic data of that domain that predates their ownership (content they had no hand in creating).

  5. Peter Prof Fox

    Why can't they

    Add something like 'Click here to find the archived page' rather than serve up out of date/irrelevant information by default.

    PS. I gave up on my blog because moderating the comments was grief. 80% were nasty-spam and then 19% were nasty-spam. I don't want Mozilla going round graveyards digging up putrid corpses.

    1. Geoffrey W

      Re: Why can't they

      Complaining before trying is always so much more fun than commenting on what you actually know. What you suggest is just what they do. Silly boy.

    2. Mark Simon
      Thumb Down

      Re: Why can't they

      Good idea. Oh look they did that already. Before you suggested it.

      https://testpilot.firefox.com/experiments/no-more-404s

  6. perfgeek

    Slash-dotting archive.org

    Sounds like a good way to slash-dot or otherwise DDoS archive.org, and if not, bring it to the attention of the right-to-be-forgotten cabal.

  7. This post has been deleted by its author

  8. gobaskof

    As long as there is a big clear message saying that this is an archive and the real page is missing.. then this is a good idea. I assume they will do this, but who knows..

  9. Geoffrey W

    Aren't you all a bunch of naying nellies. Did any of you even try it? It still displays the 404 page but you get a bar at the top of the page if there is an archive entry wherein you can click to view said archival page. I think it's neat. I already found it useful. I had a dead link in my bookmarks which I never got round to checking wayback for, and lo! there it is! I think it's a neat idea. Oh, and yes, when it shows the page it has a bar at the top telling you it's from the wayback machine archives. So quit griping and find something more worthwhile to bitch about.

    1. VinceH
      Thumb Down

      "Aren't you all a bunch of naying nellies. Did any of you even try it? It still displays the 404 page but you get a bar at the top of the page if there is an archive entry wherein you can click to view said archival page."

      Perhaps they didn't feel the need to try it because the article told them what it does... inaccurately, by the sound of it:

      The "404 No More" feature uses copies of webpages from the Internet Archive's Wayback Machine to replace 404 "not found" errors with something more useful. If you visit a link to a page that's disappeared, Firefox will fetch from archive.org a version of the page before it vanished.

      Perhaps your criticism should therefore be directed at El Reg for incorrectly describing what it does, rather than readers for reading it.

  10. GrapeBunch

    It's tedious for the site-owner to do this by hand

    If you write a web page with external links, typically 90% of those links will be 404 after a few years. The pages largely still exist, but the site has been moved or the pages in the site reorganized. Installing the latest and greatest content management system used to break most of the links on a site.

    It's faster to change the links to point to IA than, in general, to find them anew. After one boring session of editing links this way, I was tempted to suggest pointing to the IA version from the very start. Another pitfall avoided is when the link still exists, but its content changed. For example when a domain is allowed to expire, then is taken over by pr0n or worse.

    I hope that IA doesn't delete, but rather deep-archives sites that are requested removed from public access. Eventually, what's on them will flow into the public domain. OK, most would be best forgotten, but the gems justify the overburden.
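
The link-editing chore described above can be roughed out in a few lines. This is only a sketch under stated assumptions: `wayback_url` and `rewrite_dead_links` are hypothetical helper names, the list of dead URLs is supplied by hand, and the `https://web.archive.org/web/<timestamp>/<url>` pattern is the Wayback Machine's usual public URL layout, where the timestamp is taken to be `yyyymmddhhmmss`. A regex over `href="..."` is a shortcut, not a proper HTML parser.

```python
import re
from typing import Set

# Assumed public URL layout of the Wayback Machine: /web/<timestamp>/<original URL>
WAYBACK_PREFIX = "https://web.archive.org/web"

# Naive pattern: only matches double-quoted href attributes.
HREF_PATTERN = re.compile(r'href="([^"]+)"')


def wayback_url(original_url: str, timestamp: str) -> str:
    """Build an archive link for original_url at the given capture timestamp."""
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


def rewrite_dead_links(html: str, dead_urls: Set[str], timestamp: str) -> str:
    """Point every href listed in dead_urls at the archive; leave the rest alone."""

    def swap(match):
        url = match.group(1)
        if url in dead_urls:
            return f'href="{wayback_url(url, timestamp)}"'
        return match.group(0)

    return HREF_PATTERN.sub(swap, html)
```

Picking a timestamp close to when the page was written also sidesteps the expired-domain pitfall, since the link then points at the content as it was at that time rather than whatever lives at the address now.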

    1. John Brown (no body) Silver badge

      Re: It's tedious for the site-owner to do this by hand

      "I hope that IA doesn't delete, but rather deep-archives sites that are requested removed from public access. Eventually, what's on them will flow into the public domain. OK, most would be best forgotten, but the gems justify the overburden"

      Archaeologists just love digging through rubbish tips and analysing coprolites. Maybe 100/200/300 years in the future, all those old archived Geocities pages will be gold to social historians.

  11. Winkypop Silver badge
    Meh

    404.5

    I can see some end-users getting mightily confused, more so than normal anyway.

    1. Ole Juul

      Re: 404.5

      "I can see some end-users getting mightily confused"

      Then send them to 404 101.

      1. Anonymous Coward

        Re: 404.5

        Never underestimate the incompetence of the public (customers).

        It scares me that these people vote, drive cars and raise kids....

  12. bombastic bob Silver badge
    Devil

    better tha click-jacking the 'bad' URL

    well, better to see an automatic archive redirect than a click-jacked advertisement page for web hosting...

    and the 'right to be forgotten' cabal can take it up with the archive hosts

    1. Geoffrey W

      Re: better tha click-jacking the 'bad' URL

      Doesn't work that well. For instance try looking for an old Geocities web site; You still get web hosting ads. Seems to need an actual 404 error. If the server no longer exists then you get whatever you get redirected to, or nothing.

      1. Alister
        Facepalm

        Re: better tha click-jacking the 'bad' URL

        Seems to need an actual 404 error. If the server no longer exists then you get whatever you get redirected to, or nothing.

        Well yes, that's what it's there for, it says so in the name. If the server is no longer there it's not a 404 error, is it?

        1. Geoffrey W

          Re: better tha click-jacking the 'bad' URL

          Yes, that was my point but you put it more succinctly, but why couldn't they take it one step further - server not found is not that different than page not found. Can't be that hard to just check an unknown server against the archives and offer to show that if found.

          1. Alister

            Re: better tha click-jacking the 'bad' URL

            @Geoffrey W

            Without looking into it in any depth, I would guess it currently works by examining the response header returned by the server, to determine when to spring to life.

            If there is no server response (i.e. the server isn't there) then it can't do that.

            If they also built it to work when there was no server response received, you would be in danger of flooding the archive with requests for non-existent or incorrectly typed URLs.
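
The guess above (act only on a genuine 404 status, never on a missing server) can be sketched as follows. This is not the experiment's actual code, just an illustration under assumptions: the function names are hypothetical, `None` stands in for "no server response", and the endpoint and JSON shape follow the Internet Archive's public availability API as commonly documented. Nothing here touches the network; the caller would fetch the query URL and pass the body to `closest_snapshot`.

```python
import json
from typing import Optional
from urllib.parse import urlencode

# Internet Archive availability API endpoint (public; response shape assumed below).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def should_query_archive(status_code: Optional[int]) -> bool:
    """Spring to life only on a real HTTP 404.

    None models the no-response case (server gone, DNS failure), which is
    skipped so that mistyped hostnames do not flood the archive with lookups.
    """
    return status_code == 404


def availability_query(url: str) -> str:
    """Build the lookup URL for the page that returned the 404."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': url})}"


def closest_snapshot(response_body: str) -> Optional[str]:
    """Return the archived copy's URL from the JSON reply, or None if there is none."""
    data = json.loads(response_body)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None
```

A 302 to a "page not found" landing page fails the `should_query_archive` check too, which matches the Geocities behaviour reported earlier in this thread.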

  13. Dabooka
    Stop

    I'm surprised peeps on here...

    assume it's an 'invisible' serving of the WBM archive. It isn't.

    Give it a shout, or don't. However you shouldn't knock it before actually trying it

  14. Tannin

    People aren't stupid

    People aren't stupid, you know. They read the article, and the article is at pains to say that Firefox will redirect 404s to the archive. It does not, repeat does not, bother to make it clear that (according to various grumpy comments above which I have no reason to disbelieve) this isn't a redirect at all but a glorified error page that offers to serve the archive page instead. (A very different - and much more sensible - thing.)

    Subject to who you are and how plausible your message is, people tend to believe things you tell them. When you are a writer for the Register, we tend to think you probably know your stuff and take it at face value. (Stand aside one loopy science malreporter, of course.) When what you write seems plausible (e.g., when you suggest that Mozilla management have come up with an ill-considered "improvement" of dubious value - just to pick an example completely at random), people tend to believe it.

    In short, don't bloody criticise people for posting perfectly sensible responses to the (you would have thought) trustworthy news they read. Instead, criticise the highly misleading, headline-chasing article they are responding to.

    Thank you, Mr Grumpy and your friends, for pointing out that Mozilla haven't been as stupid (this time) as the article makes them out to be. (Assuming you have your facts right, of course, which I am happy to do.) No thanks for the manner in which you did so.

    1. Geoffrey W
      Unhappy

      Re: People aren't stupid

      I assume you mean me so, sorry. This is the internet though and rule one should be "Trust No One" and rule two "The Truth is Out There". It gets so tiresome to find hordes of (well, a few) people jumping up immediately and being negative before they really know. Instead of "Oh that sounds interesting, I'll go and look further" they offer "That's just Stupid!" I went and became a test pilot and flew the feature, up in the air and everything, without a seat belt wearing only my big goggles, so what I reported is what I saw.

      The Register may have people who know their stuff but they aren't above trolling their readers, and perhaps just being wrong in their assumptions too without researching what it actually does. Research these days often seems to comprise reading the press release, searching Google, then reading the headlines and perhaps the first paragraph if you feel like going really deep.

      I did try to not be totally flamey and attempted an amused tone by using a silly phrase like naying nellies. I've had more than my share of down votes this week. Sorry.

  15. Cthonus

    Real world issues

    You know there are valid reasons pages get junked from websites.

    Given the number of photographs I've had copyright abused over the years I wouldn't be happy if some of the dead links resurrected images I've had ISPs remove from the archives.

    What about articles that went on to be considered libellous and were taken down? Given how lazy people are there must still be hundreds of broken links/bookmarks cached and never updated, even on legitimate sites...

  16. david 12 Silver badge

    Who gets 404 from dead links?

    I get landing spam, or (302) "page not found" pages. I can't remember the last time I got an actual 404 response like The Register link: it wasn't recently.

  17. Mr Dogshit

    Doesn't this break an RFC?

  18. Spudley

    Why is this being built-in, rather than a plugin?

    This seems to me like a perfect example of a feature that ought to be implemented as a browser plugin, not as a built-in feature.

  19. Velv
    Childcatcher

    Appearing in a Court near you soon...

    Given the "Right to be forgotten" rulings it isn't going to be long before someone gets upset at links being magically offered up after they've already been "removed".

  20. Anonymous Coward

    This is great

    This is a wonderful new feature.

    I recently decided to go back and play a computer game that is from about 10 years ago.

    What I found is that this generation has pretty much abandoned text based web pages in favor of monetized youtube videos. There was once a plethora of fan web sites for this game with tons of information that you could search and find. Now those pages are gone, replaced by hundreds of youtube videos/channels that are almost impossible to accurately search for specific data.

    I had come to rely on the wayback machine to resurrect the old missing sites for my game, and started to wonder why no one had yet devised some plugin that would automatically redirect 404s to try the wayback, as I was getting tired of getting a 404, copying the original link and then going to the wayback and pasting the link in.

    1. Anonymous Coward

      Re: This is great

      So true, one of the worst things about the modern web is the fashion for using Youtube videos to convey information that would be more clearly expressed as text or still images.

  21. Anonymous Coward

    Pulling the last available site before it went kaput? Hmm...

    People with personal pages that don't change often (like that raw HTML homepage you tried to host and learn in 1995) can simply load their websites, wait for archive.org to mirror them, and take them down.

    Free hosting FTW.

    And I wonder what happens to the likes of piratebay, and image repositories...

  22. Richard Lloyd

    It's a banner across the top that asks if you want to load the archived version

    Unlike most people here (or the original article author I suspect), I actually went ahead and installed Test Pilot. Here's one of the screen shots showing the 404 not found thing in action:

    https://testpilot-prod.s3.amazonaws.com/experiments_experimentdetail/2/4/24ddd4335aca6b96cca9106a6f3411d2_image_1470245154_0440.jpg

    This seems like a good way to do things - show a banner at the top of the page giving the option of seeing the archived version. If the original article here at El Reg had made that clear, I suspect there'd be a lot less outrage. I, too, think that auto-replacing a 404 not found page with an archive.org version is a very bad idea - that's what this article implied...

  23. FatGerman

    Firefox?

    Is that still a thing?

    1. Geoffrey W

      Re: Firefox?

      Pointless comments? Is that still a thing?

  24. Midnight

    I don't get it.

    I understand the github reference, but what's so amazing about Bloomberg's 404 page?

    Aside from having over 110k of scripting and menus, the page just says "404. Page Not Found / Unfortunately, this page does not exist. Please check your URL or return to the Home Page".

    Am I missing something, or are those two sentences just that much more amusing than anything else Bloomberg ever reports on?

    1. Geoffrey W

      Re: I don't get it.

      I saw a man in a suit smashing a computer from a table then something weird happens and his head falls off as do all his other bits and he collapses upside down on the floor in a heap of suit.

      1. Not That Andrew

        Re: I don't get it.

        Remarkably badly animated, but it's just a 404 page

  25. TRT

    Does this...

    have to comply with the right to be forgotten?
