Remember WordPress' Pingbacks? The W3C wants us to use them across the whole web

Something called Webmentions – which looks remarkably like the old WordPress pingbacks, once popular in the late 2000s – is grinding through the machinery of the mighty, and slow-moving, World Wide Web Consortium (W3C). But don’t be deceived. Lurking behind that unassuming name lies something that might eventually offer users …

  1. Hans Neeson-Bumpsadese Silver badge

    Hmmm....

    Given that the next article on the Reg home page is...

    http://www.theregister.co.uk/2016/02/18/ddos_dingbats_enslave_tens_of_thousands_in_wordpress_pingbacks/

    ...I assume that this would be a Bad Thing?

    1. Michael Wojcik Silver badge

      Re: Hmmm....

      It's a Bad Thing, and then there are the security issues.

      Now if you'll pardon me, I need to find my shaking cane.

      (In all seriousness, I don't particularly care; there's no requirement to participate. But phrases like "participate in the larger conversation" elevate my hackles. It's like all those early Internet-obsessed Habermasians gushing about the web. Oooh, discourse in the public sphere! Yes, and it's done so much to usher in a utopia of communicative rationality.)

  2. Warm Braw

    Scaling?

    So, every time someone claims to have referenced content on your website, you have to fetch and parse the source at the proffered URL and verify that it contains a link, even if you're not sure exactly what format the source material takes. You potentially have to do this every time a further post is made to the conversation, and potentially for every contributor.

    No wonder the spec suggests processing notifications asynchronously, "to prevent DoS attacks".
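
    In webmention terms, that verification step amounts to something like this (a minimal sketch in Python, standard library only; the function names are mine, not the spec's):

        import urllib.request
        from html.parser import HTMLParser

        class LinkFinder(HTMLParser):
            """Collect the href of every <a> tag in the fetched source page."""
            def __init__(self):
                super().__init__()
                self.hrefs = set()

            def handle_starttag(self, tag, attrs):
                if tag == "a":
                    self.hrefs.update(v for k, v in attrs if k == "href" and v)

        def verify_webmention(source, target):
            """Fetch `source` and confirm it really links to `target`.

            One network fetch plus one HTML parse per notification, which
            is why the spec says to queue these and process them later
            rather than inline in the request handler.
            """
            with urllib.request.urlopen(source, timeout=10) as resp:
                body = resp.read(1_000_000).decode("utf-8", errors="replace")
            finder = LinkFinder()
            finder.feed(body)
            # Exact-match check; a real receiver would normalise URLs first.
            return target in finder.hrefs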

    Also, no wonder that TBL quickly discarded the idea of two-way linking in his search for a practical hypertext system.

    The best suggestion I've seen for finding out who's linking to your content is simply to ask Google.

    1. AndyS

      Re: Scaling?

      I've just tried that, and it doesn't seem to work. Google for "link:google.com" and you get no results.

      Has Google dropped the "link" qualifier for searching, or am I doing it wrong?

      1. Anonymous Coward
        Facepalm

        Re: Scaling?

        If you dig into Advanced Search, "link:" is there but returns no results. Good job, Google.

      2. Warm Braw

        Re: Google for "link:google.com"

        Oh, no! You've broken the Internet!

        It does still seem to work for other sites, but the option has disappeared both from Google's "advanced search" page and its description of search operators. Curious.

      3. cd

        Re: Scaling?

        Still works on DuckDuckGo.

    2. Anonymous Coward

      Re: Scaling?

      > Also, no wonder that TBL quickly discarded the idea of two-way linking in his search for a practical hypertext system.

      Nah. WWW was an incremental improvement on Gopher. TBL didn't know he was changing the world. If he had, he would've made drastic architectural changes to prevent spam and DDoS, he would've addressed bidirectional linking, and he would've added an automatic content replication feature as a safeguard against dead links. But that wouldn't be easy.

      1. Andrew Orlowski (Written by Reg staff)

        Re: Scaling?

        I'm not sure the GBL ever thought of bidirectional linking, but others did, and had added it to the specs by 1996:

        http://www.cogsci.ed.ac.uk/~ht/new-xhl.html

        "he would've added an automatic content replication feature as a safeguard against dead links. But that wouldn't be easy."

        Those are implementation details. Replication and caching (eg Squid) don't belong in a language spec.

        1. Anonymous Coward

          Re: Scaling?

          Interesting. So they didn't get far with XHL, but it does raise the question: could this feature be built on top of the existing web, despite its dirty content and ephemeral URLs? Say, a crawler/reader package that anyone can install locally or on a public server, which crawls a list of sites, caches any new content and first-level linked content, and serves it up Google Reader fashion. But decentralized. Grassroots implementation might succeed where top-down standards failed. If it became popular, websites would bow down to it instead of Google. WordPress's REST API project might be anticipating that outcome... and Webmentions is apparently a comments-only version. </naiveoptimism>
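
          A back-of-envelope sketch of that crawler's core loop (Python, standard library; the cache layout and the one-level depth are my own assumptions, not any existing project's):

              import hashlib
              import pathlib
              import urllib.request
              from html.parser import HTMLParser
              from urllib.parse import urljoin

              CACHE = pathlib.Path("webcache")  # hypothetical local cache directory

              class Links(HTMLParser):
                  """Collect every <a href> so first-level content gets cached too."""
                  def __init__(self):
                      super().__init__()
                      self.found = []

                  def handle_starttag(self, tag, attrs):
                      if tag == "a":
                          self.found += [v for k, v in attrs if k == "href" and v]

              def fetch_and_cache(url):
                  """Download one page and store it under a hash of its URL."""
                  with urllib.request.urlopen(url, timeout=10) as resp:
                      body = resp.read()
                  CACHE.mkdir(exist_ok=True)
                  (CACHE / hashlib.sha256(url.encode()).hexdigest()).write_bytes(body)
                  return body.decode("utf-8", errors="replace")

              def crawl(sites):
                  """Cache each site plus its first-level links, nothing deeper."""
                  for site in sites:
                      page = Links()
                      page.feed(fetch_and_cache(site))
                      for href in page.found:
                          try:
                              fetch_and_cache(urljoin(site, href))
                          except (OSError, ValueError):
                              pass  # dead link; the cached copy is the safeguard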

        2. Michael Wojcik Silver badge

          Re: Scaling?

          "I'm not sure the GBL ever thought of bidirectional linking, but others did, and had added it to the specs by 1996: http://www.cogsci.ed.ac.uk/~ht/new-xhl.html"

          There's also Hyper-G, another attempt to add bidirectional links to HTML, which was also developed in the mid-1990s. There's a commercial implementation (HyperWave) and some free clients, but it never caught on.

          I think the problems they were trying to solve simply weren't compelling to the vast majority of users. And I think the same is true of pingbacks and Webmentions and the like. The W3C may standardize them, but they'll never be used "across the whole web"; they'll be a niche technology.

          1. kevinmarks

            Re: Scaling?

            The one-way nature of links was necessary for the Web to achieve what it did. The globally connected nature of the web as a small-world network is built on a scale-free distribution of linkage. If all links are required to be two-way, this rapidly becomes unwieldy and cumbersome - imagine if the front page of Apple.com showed all the inbound links to it. The unidirectionality created the permission-free linking culture the web depends on, and reversing those links in a useful way is an interesting problem.

            Me in 2004, when I was trying this: http://epeus.blogspot.com/2004/02/technorati-xanadu-and-other-dreams.html

    3. kevinmarks

      Re: Scaling?

      Hi there - as I built the crawler for Technorati to do this in a centralised way at web scale, I can confirm that a decentralised version scales better. As mentioned in the article, there are already services that you can delegate webmention handling to, which is handy if you have a static site.

      The other point is that how you respond to webmentions is up to your site - you can of course implement a whitelist, a blacklist, or the Vouch protocol extension, which demands that the webmention sender show proof that you have linked to them before.
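
      A sketch of that policy layer (Python; the Vouch handling is simplified: the sender supplies a vouch URL, which only counts if it sits on a domain this site already trusts; a real implementation would also fetch the vouch page and confirm it links to the sender):

          from urllib.parse import urlparse

          def accept_webmention(source, vouch=None,
                                allowlist=frozenset(), blocklist=frozenset(),
                                trusted=frozenset()):
              """Decide whether to even queue a webmention for verification.

              allowlist/blocklist hold sender domains; `trusted` holds the
              domains this site has linked to, which is what Vouch trades on.
              """
              domain = urlparse(source).netloc
              if domain in blocklist:
                  return False
              if domain in allowlist:
                  return True
              # Simplified Vouch check; see the caveat above.
              return vouch is not None and urlparse(vouch).netloc in trusted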

  3. Anonymous Coward

    The corporate web won. I don't think this spec is going to bring back the golden age of blogging.

    1. joeW

      There was a golden age of blogging?

      1. Anonymous Coward

        You joke, but yes there pretty much was.

        Before Facebook (for celeb gossip and commodity fetishism), and before Stack Overflow (for programming tips and tricks), most useful information was stored in blog posts, and Google's search algorithm was heavily weighted towards finding said blog posts.

        1. Michael Wojcik Silver badge

          ...most useful information was stored in Usenet posts...

          FTFY.

          Frankly, I'd say that from the early days of the web up until Google became popular, most of the useful information on the web was either professionally-produced content like product manuals and article reprints, or on personal single-topic pages that were manually maintained. (And the best way to find them was often with Yahoo!'s human-curated encyclopedic index, which was based on an actual information model, not "let's throw everything into one bag".)

          While there were no doubt some useful blogs, the majority, in my experience, were just the usual vanity publications, and the information therein tended to be highly subjective, apocryphal, or misleading.

          But of course others will have different impressions.

          1. Anonymous Coward

            Don't say FTFY. You didn't fix jackshit m8. Usenet was dead (side-lined, bleeding mind-share, mostly spam and binaries) by 2000.

            Blogger - 1999

            Google Groups - 2001 (Deja News bought and strangled to death by Google)

            Drupal - 2001

            Technorati - 2002

            WordPress - 2003

            Google buys Blogger - 2003 (Google highly favours Blogger posts in its own search algorithm)

            This era was dominated by content management systems on cheap PHP hosting. Not long after, spam brought the whole thing crashing down. Google no longer favoured "fresh" content, but established domains. Around 2005, corporate silos began to dominate.

            Facebook - 2004

            Wikipedia reaches critical mass - 2004-2005 (top search results for most encyclopedic topics)

            Reddit - 2005

            Disqus - 2007

            Stack Overflow - 2008

            Apple app store - 2008

            Google play store - 2008

            From 2010 onwards, the open APIs that defined the undefinable "web 2.0" started to be shut down. Openness was no longer seen as necessary or relevant.

            Corporate domination complete.

  4. Pete 2 Silver badge

    Bait and switch

    So Alice publishes something, and Bob "webmentions" her writing in his response. Then Alice sees what Bob has said - something complimentary - and decides to link her stuff to his stuff.

    Fair enough so far. We have two pieces of compatible material.

    Now, after a day or so, Bob (or Alice) swaps out their original text and replaces it with an advertisement for bodily elongation, loan applications, political endorsements or pr0n. How is the weblink policed?

    So long as the link stays the same, would the process be able to detect changes, whether benign, such as a correction or update, or nasty, underhand or fraudulent?

    1. Anonymous Coward

      Re: Bait and switch

      Maybe it should have been crypto-based, where you send a hash of the content. Of course that'd be way more complex when you want to edit a post, and all the linkers would need to re-scrape.
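
      Something along these lines, presumably (a sketch; nothing like this is in the actual spec):

          import hashlib
          import urllib.request

          def content_digest(url):
              """Hash the current contents of the linked page."""
              with urllib.request.urlopen(url, timeout=10) as resp:
                  return hashlib.sha256(resp.read()).hexdigest()

          # Record the digest when the link is made; re-fetch later and
          # compare to spot a bait-and-switch (any edit, benign or not,
          # changes the hash, which is the complexity mentioned above).
          def has_changed(url, recorded_digest):
              return content_digest(url) != recorded_digest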

    2. kevinmarks

      Re: Bait and switch

      Link rot is a problem with any link on the web; webmention won't magically fix that. As Tim Berners-Lee has said, 'eventually every URL is a porn site'. However, if Bob resends the webmention after changing the text, it will get revalidated. If Bob doesn't, his original text will be cached on Alice's site.

  5. FatGerman

    Yes, the corporate web might disappear, but so can sites hosted by individuals. If I have a conversation on Facebook it's all in one place. With this system, rather than seeing a conversation all in one convenient place, the component parts get spread out across sites hosted by various random individuals. So when one individual stops paying his ISP, because he's got a life and is bored of blogging, half the conversation disappears, rendering any replies to his parts less than meaningless.

    Say what you like about Facebook, but it's popular because it's convenient and easy. Pingbacks were only ever an ego-boosting tool used to make you feel good that somebody referenced you. They never really got used for anything important because they didn't do the very thing you describe Twitter and Facebook as being good at - aggregating things. This seems to be exactly the same but with a different name.

    They've rebadged it you fool.

    1. Doctor Syntax Silver badge

      "Pingbacks were only ever an ego boosting tool used to make you feel good that somebody referenced you. They never really got used for anything important because they didn't do the very thing you describe Twitter and Facebook as being good at - aggregating things."

      Usenet & IRC are the non-corporate way of aggregation. All pingback achieved was disrupting discussion threads on blogs.

    2. John Lilburne

      "all in one place ..."

      "If I have a conversation on Facebook it's all in one place."

      I had one of those the other month. At the end, the user whose feed it was deleted the entire thing. Most links on Wikipedia are dead, and most of the Flickr accounts they took images from are deleted. My Drupal-powered site has 8,000 pages indexed on Google and several thousand images. It's linked to by a number of academic institutions, but it could go tomorrow, or in a couple of months' time if I decide not to renew the hosting.

      The web in whatever guise is completely ephemeral.

      1. Doctor Syntax Silver badge

        Re: "all in one place ..."

        "The web in whatever guise is completely ephemeral."

        Far from complete - and usually the missing bit is the bit you want - but there's archive.org as a last resort.

        1. John Lilburne

          Re: "all in one place ..."

          "but there's archive.org as a last resort."

          A few years back I discovered that a web page that my wife had created had her full name on it. That name is pretty unique, and pasting it into a search engine hacked up her full address etc. The page was on archive.org, but a quick edit to robots.txt and an email fixed that, and 12 hours later all was gone.

    3. kevinmarks

      A silo like Facebook seems more robust, but when they vanish, they take out a big chunk of the net with them. See http://indiewebcamp.com/site-deaths for examples.

      Distributing comments across websites is more robust, and webmention gives the site receiving the comments a way to cache them that can persist even if the commenting site does go away.

  6. John Lilburne

    Interesting. The more ways in which we can breach these corporate silos, and claim back the internet for individuals, the better.

  7. M man

    What does google do...

    "The best suggestion I've seen for finding out who's linking to your content is simply to ask Google."

    Google's strength is in indexing references and links to sites and comparing who has the most/best.

    Your search results are driven by this information.

    Thier ads ride off the back of your searches.

    Thier revenue comes from ads.

    If the web effectively did this by itself... it could destroy the root of thier power.

    1. Tom 7

      Re: What does google do...

      Is that the first, second or third thier?

  8. Old Handle

    Pingback? Was that the thing where blogs would be cluttered with faux comments that seemed to contain nothing but around ten words excerpted from the original post? Or am I thinking of something else?

  9. kevinmarks

    Here's the inventor of pingbacks switching to webmentions (note the webmentions in response, which, unlike pingbacks' default presentation, don't look like incoherent excerpts):

    http://www.kryogenix.org/days/2014/11/29/enabling-webmentions/
