Hmmm....
Given that the next article on the Reg home page is...
http://www.theregister.co.uk/2016/02/18/ddos_dingbats_enslave_tens_of_thousands_in_wordpress_pingbacks/
...I assume that this would be a Bad Thing?
> Something called Webmentions – which looks remarkably like the old WordPress pingbacks, once popular in the late 2000s – is grinding through the machinery of the mighty, and slow-moving, World Wide Web Consortium (W3C). But don’t be deceived. Lurking behind that unassuming name lies something that might eventually offer users …
It's a Bad Thing, and then there are the security issues.
Now if you'll pardon me, I need to find my shaking cane.
(In all seriousness, I don't particularly care; there's no requirement to participate. But phrases like "participate in the larger conversation" elevate my hackles. It's like all those early Internet-obsessed Habermasians gushing about the web. Oooh, discourse in the public sphere! Yes, and it's done so much to usher in a utopia of communicative rationality.)
So, every time someone claims to have referenced content on your website, you have to fetch and parse the source at the proffered URL and verify that it contains a link to your content, even if you're not sure exactly what format the source material takes. You potentially have to do this every time a further post is made to the conversation, and for every contributor.
No wonder the spec suggests processing notifications asynchronously, "to prevent DoS attacks".
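For the record, the per-notification work is roughly this - a minimal sketch in Python, standard library only; the LinkFinder helper, the queue wiring and the URLs are my own illustrations, not anything the spec prescribes:

```python
# Minimal sketch of the verification a Webmention receiver has to do for each
# notification: fetch the claimed source and confirm it really links to the
# target. Names, URLs and the queue wiring are illustrative assumptions.
import queue
import threading
import urllib.request
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collects href values so we can check whether the target URL is linked."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def verify(source, target):
    """Fetch the claimed source and check that it links to our page."""
    with urllib.request.urlopen(source, timeout=10) as resp:
        body = resp.read(1_000_000).decode("utf-8", errors="replace")
    finder = LinkFinder()
    finder.feed(body)
    return target in finder.hrefs


# Per the spec's advice, keep this off the request path: accept the
# notification immediately, verify later from a queue.
pending = queue.Queue()


def worker():
    while True:
        source, target = pending.get()
        try:
            ok = verify(source, target)
        except Exception:
            ok = False
        print(f"{source} -> {target}: {'accepted' if ok else 'rejected'}")
        pending.task_done()


threading.Thread(target=worker, daemon=True).start()

# Someone claims to have referenced our post:
pending.put(("https://example.com/their-post", "https://example.org/our-post"))
pending.join()
```

Even done asynchronously, that's a fetch and a parse of somebody else's page for every notification anyone cares to send you, which is exactly where the DoS worry comes from.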
Also, no wonder that TBL quickly discarded the idea of two-way linking in his search for a practical hypertext system.
The best suggestion I've seen for finding out who's linking to your content is simply to ask Google.
> Also, no wonder that TBL quickly discarded the idea of two-way linking in his search for a practical hypertext system.
Nah. WWW was an incremental improvement on Gopher. TBL didn't know he was changing the world. If he had, he would've made drastic architectural changes to prevent spam and DDoS, he would've addressed bidirectional linking, and he would've added an automatic content replication feature as a safeguard against dead links. But that wouldn't be easy.
I'm not sure TBL ever thought of bidirectional linking, but others did, and had added it to the specs by 1996:
http://www.cogsci.ed.ac.uk/~ht/new-xhl.html
"he would've added an automatic content replication feature as a safeguard against dead links. But that wouldn't be easy."
Those are implementation details. Replication and caching (e.g. Squid) don't belong in a language spec.
Interesting. So they didn't get far with XHL, but it does raise the question: could this feature be built on top of the existing web, despite its dirty content and ephemeral URLs? Say, a crawler/reader package that anyone can install locally or on a public server, which crawls a list of sites, caches any new content and first-level linked content, and serves it up Google Reader fashion. But decentralized. Grassroots implementation might succeed where top-down standards failed. If it became popular, websites would bow down to it instead of Google. WordPress's REST API project might be anticipating that outcome... and Webmentions is apparently a comments-only version. </naiveoptimism>
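A toy version of that crawler/reader is only a few dozen lines - a hedged sketch under my own assumptions (the SITES list, the flat-file cache and the regex link extraction are placeholders, not a real design):

```python
# Toy version of the crawler/reader idea: poll a list of sites, cache each
# page locally, and also cache anything it links to (one level deep).
# Everything here (site list, cache layout) is an assumption for illustration.
import hashlib
import os
import re
import urllib.request
from urllib.parse import urljoin

SITES = ["https://example.com/"]   # whatever the user subscribes to
CACHE_DIR = "webcache"


def fetch(url):
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read(1_000_000)


def cache(url, body):
    os.makedirs(CACHE_DIR, exist_ok=True)
    name = hashlib.sha256(url.encode()).hexdigest()
    with open(os.path.join(CACHE_DIR, name), "wb") as f:
        f.write(body)


def crawl():
    for site in SITES:
        try:
            body = fetch(site)
        except OSError:
            continue                       # site already gone; keep old cache
        cache(site, body)
        text = body.decode("utf-8", errors="replace")
        for href in re.findall(r'href="([^"]+)"', text):
            linked = urljoin(site, href)   # first-level linked content
            try:
                cache(linked, fetch(linked))
            except OSError:
                pass                       # dead link; the cache is the point


if __name__ == "__main__":
    crawl()
```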
> I'm not sure TBL ever thought of bidirectional linking, but others did, and had added it to the specs by 1996:
> http://www.cogsci.ed.ac.uk/~ht/new-xhl.html
There's also Hyper-G, another mid-1990s attempt to add bidirectional links to the web. There's a commercial implementation (HyperWave) and some free clients, but it never caught on.
I think the problems they were trying to solve simply weren't compelling to the vast majority of users. And I think the same is true of pingbacks and Webmentions and the like. The W3C may standardize them, but they'll never be used "across the whole web"; they'll be a niche technology.
The one-way nature of links was necessary for the Web to achieve what it did. The globally connected nature of the web as a small-world network is built on a scale-free distribution of linkage. If all links are required to be two-way, this rapidly becomes unwieldy and cumbersome - imagine if the front page of Apple.com showed all the inbound links to it. The unidirectionality created the permission-free linking culture the web depends on, and reversing those links in a useful way is an interesting problem.

Me in 2004, when I was trying this: http://epeus.blogspot.com/2004/02/technorati-xanadu-and-other-dreams.html
Hi there - as I built the crawler for Technorati to do this in a centralized way at web scale, I can confirm that a decentralised version scales better. As mentioned in the article, there are already services that you can delegate webmention handling to, which is handy if you have a static site.
The other point is that how you respond to webmentions is up to your site - you can of course implement a whitelist, a blacklist or the Vouch protocol extension that demands that the webmention sender show proof that you have linked to them before.
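The policy layer really is tiny. A rough sketch of what "up to your site" might look like - the domain lists are stand-ins, and the vouch check is just the gist of the extension as described above (the sender points at a page on a domain you've already linked to), not a complete implementation:

```python
# Sketch of the "how you respond is up to your site" part: allow-list,
# block-list, and a very rough Vouch-style check. The domain sets below are
# stand-ins for whatever your site actually stores.
from urllib.parse import urlparse

BLOCKLIST = {"spam.example"}
ALLOWLIST = {"friend.example"}
previously_linked_domains = {"friend.example", "trusted.example"}


def accept_webmention(source, vouch=None):
    domain = urlparse(source).hostname or ""
    if domain in BLOCKLIST:
        return False
    if domain in ALLOWLIST:
        return True
    # Vouch-ish rule, per the description above: the sender has to point at
    # something showing a prior relationship; here we simply require the vouch
    # URL to live on a domain we have already linked to ourselves.
    if vouch and (urlparse(vouch).hostname or "") in previously_linked_domains:
        return True
    return False


print(accept_webmention("https://friend.example/post/1"))            # True
print(accept_webmention("https://stranger.example/post/2"))          # False
print(accept_webmention("https://stranger.example/post/2",
                        vouch="https://trusted.example/blogroll"))   # True
```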
You joke, but yes there pretty much was.
Before Facebook (for celeb gossip and commodity fetishism), and before Stackoverflow (for programming tips and tricks), most useful information was stored in blog posts, and Google's search algorithm was heavily weighted towards finding said blog posts.
...most useful information was stored in Usenet posts...
FTFY.
Frankly, I'd say that from the early days of the web up until Google became popular, most of the useful information on the web was either professionally-produced content like product manuals and article reprints, or on personal single-topic pages that were manually maintained. (And the best way to find them was often with Yahoo!'s human-curated encyclopedic index, which was based on an actual information model, not "let's throw everything into one bag".)
While there were no doubt some useful blogs, the majority, in my experience, were just the usual vanity publications, and the information therein tended to be highly subjective, apocryphal, or misleading.
But of course others will have different impressions.
Don't say FTFY. You didn't fix jackshit m8. Usenet was dead (side-lined, bleeding mind-share, mostly spam and binaries) by 2000.
Blogger - 1999
Google Groups - 2001 (Deja News bought and strangled to death by Google)
Drupal - 2001
Technorati - 2002
WordPress - 2003
Google buys Blogger - 2003 (Google highly favours Blogger posts in its own search algorithm)
This era was dominated by content management systems on cheap PHP hosting. Not long after, spam brings the whole thing crashing down. Google no longer favours "fresh" content, but established domains. Around 2005, corporate silos begin to dominate.
Facebook - 2004
Wikipedia reaches critical mass - 2004-2005 (top search results for most encyclopedic topics)
Reddit - 2005
Disqus - 2007
Stackoverflow - 2008
Apple App Store - 2008
Google Play store (then Android Market) - 2008
From 2010 onwards, the open APIs that defined the undefinable "web 2.0" started to be shut down. Openness was no longer seen as necessary or relevant.
Corporate domination complete.
So: Alice publishes something, and Bob "webmentions" her writing in his response. Then Alice sees what Bob has said - something complimentary - and decides to link her stuff to his stuff.
Fair enough so far. We have two pieces of compatible material.
Now, after a day or so, Bob (or Alice) swaps out their original text and replaces it with an advertisement for bodily elongation, loan applications, political endorsements or pr0n. How is the weblink policed?
So long as the link stays the same, would the process be able to detect changes, whether benign (such as a correction or update) or nasty, underhand or fraudulent?
Link rot is a problem with any link on the web; webmention won't magically fix that. As Tim Berners-Lee has said, 'eventually every URL is a porn site'. However, if Bob resends the webmention after changing the text, it will get revalidated. If Bob doesn't, his original text will stay cached on Alice's site.
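In practice that just means Alice keys what she stores on the (source, target) pair, so a resend overwrites the snapshot and a vanished source leaves the cached copy in place. A rough sketch - the SQLite schema here is mine, not anything from the spec:

```python
# Rough sketch of why a resent webmention revalidates and why the receiver
# keeps a copy regardless: store a snapshot keyed on (source, target), so a
# resend overwrites it and a vanished source leaves the cache intact.
import sqlite3
import time

db = sqlite3.connect("mentions.db")
db.execute("""CREATE TABLE IF NOT EXISTS mentions (
    source TEXT, target TEXT, cached_html TEXT, fetched_at REAL,
    PRIMARY KEY (source, target))""")


def store_mention(source, target, cached_html):
    """Called after (re)verification; a resend simply replaces the snapshot."""
    db.execute("INSERT OR REPLACE INTO mentions VALUES (?, ?, ?, ?)",
               (source, target, cached_html, time.time()))
    db.commit()


# Bob's first mention, then his edited version a day later:
store_mention("https://bobs.example/reply", "https://alices.example/post",
              "<p>Something complimentary</p>")
store_mention("https://bobs.example/reply", "https://alices.example/post",
              "<p>Something rather less complimentary</p>")
```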
Yes, the corporate web might disappear, but so can sites hosted by individuals. If I have a conversation on Facebook, it's all in one place. With this system, rather than seeing a conversation all in one convenient place, the component parts of the conversation get spread out across the internet, across sites hosted by various random individuals. So when one individual stops paying his ISP because he's got a life and got bored of blogging, half the conversation disappears, rendering any replies to his parts less than meaningless.
Say what you like about Facebook, but it's popular because it's convenient and easy. Pingbacks were only ever an ego boosting tool used to make you feel good that somebody referenced you. They never really got used for anything important because they didn't do the very thing you describe Twitter and Facebook as being good at - aggregating things. This seems to be exactly the same but with a different name.
They've rebadged it you fool.
"Pingbacks were only ever an ego boosting tool used to make you feel good that somebody referenced you. They never really got used for anything important because they didn't do the very thing you describe Twitter and Facebook as being good at - aggregating things."
Usenet & IRC are the non-corporate way of aggregation. All pingback achieved was disrupting discussion threads on blogs.
"If I have a conversation on Facebook it's all in one place."
I had one of those the other month. At the end, the user whose feed it was deleted the entire thing. Most links on Wikipedia are dead; most of the Flickr accounts that they took images from are deleted. My Drupal-powered site has 8,000 indexed pages on Google and several thousand images. It's linked to by a number of academic institutions, but it could go tomorrow, or in a couple of months' time if I decide not to renew the hosting.
The web in whatever guise is completely ephemeral.
"but there's archive.org as a last resort."
A few years back I discovered that a web page that my wife had created had her full name on it. That name is pretty unique, and pasting it into a search engine hacked up her full address etc. The page was on archive.org, but a quick addition to robots.txt and an email fixed that, and 12hrs later all was gone.
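For anyone wondering what the quick addition looks like: at the time the Wayback Machine honoured robots.txt retroactively, so a couple of lines like these (ia_archiver being the crawler's traditional user-agent), plus an email, did the job. They've since relaxed how strictly they follow robots.txt, so treat this as historical:

```
User-agent: ia_archiver
Disallow: /
```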
A silo like Facebook seems more robust, but when they vanish, they take out a big chunk of the net with them. See http://indiewebcamp.com/site-deaths for examples.
Distributing comments across websites is more robust, and webmention gives the site receiving the comments a way to cache them that can persist even if the commenting site does go away.
"The best suggestion I've seen for finding out who's linking to your content is simply to ask Google."
Google's strength is in indexing references and links to sites and comparing who has the most/best.
Your search results are driven by this information.
Their ads ride off the back of your searches.
Their revenue comes from ads.
If the web effectively did this by itself... it could destroy the root of their power.