* Posts by JN

25 posts • joined 11 Jan 2009

How Google's black box Knowledge Graph can kill you


Re: What's new?

"This issue is not new, formerly if it was printed in black and white it was fact, when it wasn't always and still isn't."

This is true, but when in the past was there ever a book or newspaper with the instant, daily global reach of Google/Wikipedia (and all those who automatically copy their info from them)?

I mean, Facebook is "not new" either. People have always talked to their friends and family, and sometimes told each other "fake news". But we're learning the hard way that if you put a global near-monopoly and artificial intelligence tools together, you end up with a different sort of beast altogether.

Why is Wikipedia man Jimbo Wales keynoting a fake news conference?


Re: Wikipedia infaillible ? No!

"studies have confirmed that in general, Wikipedia was better than Britannica"

Beg pardon? You do realise that the famous Nature piece concluded that Britannica was better, only not by as big a margin as expected? Moreover, that comparison focused on science topics only. Even Wikipedia itself (see "Reliability of Wikipedia") does not claim that Wikipedia generally wins Britannica comparisons.

Wikipedians generally advise people not to take anything written in Wikipedia on faith, but check the sourcing whenever it's important. (Wikipedia is not considered a reliable, citable source in Wikipedia itself.)

Wikipedia's quality level depends on the topic area and varies much more than that of Britannica – it ranges from truly excellent to questionable (historical examples: PR pieces authored by companies, biographies written by their subjects or PR agents, plastic surgery articles written with promotional spin by commercial plastic surgeons, etc.) to complete garbage (outright hoaxes and defamation, people writing about scholarly topics they do not understand).

Wikipedia quality generally is a function of how many people watch and have edited an article, with the most serious problems tending to occur in articles and passages that no other regular contributor has ever reviewed.

Golden handshakes of almost half a million at Wikimedia Foundation


Ah, okay. :)

The WMF runs A/B tests on its banners throughout the year, as part of its efforts to maximise their effectiveness. When that happens, a small proportion of users is shown the most recent fundraising banner designs, and the designs are then compared for fundraising effectiveness.

I haven't seen any such banners here at my end in recent weeks, so I think you may just have got caught in the test group. The big fundraising campaign is always in late November and December. (Come to think of it, I think there may also be some countries where the banner runs at a different time of the year.)



ElReg had three articles on this and was quite probably instrumental in stopping that fundraising campaign early, after showing that the takings had already exceeded the Wikimedia Foundation's publicised target by several million dollars.

The articles are here:

1. https://www.theregister.co.uk/2016/12/16/jimmy_wales_wikipedia_fundraising_promise/

2. https://www.theregister.co.uk/2016/12/19/jimmy_wales_breaks_promise_more_chugging/

3. https://www.theregister.co.uk/2016/12/30/el_reg_just_saved_your_wiki_xmas/

Jimbo announces Team Wikipedia: 'Global News Police'


Re: Filtering Fake news

Having news organisations with known biases has its advantages. If you read a left-leaning or right-leaning newspaper, you know what to expect. You can read both sides and can mentally compensate for the bias of each. Having named journalists with established track records and known stances on various issues is similarly useful. You know where they're coming from.

In a semi-crowdsourced news organisation this becomes a lot more difficult. It will be hard to know, for any given contentious issue, which "side" has discovered Wales' new project first, shaping the site's reporting in its own image.

Wikipedia, at any rate, has more than its fair share of political activists, PR people and social entrepreneurs who studiously keep their identity and affiliations secret, shaping public perceptions under the cover of anonymity (or at least trying to).

I wonder whether WikiTribune's "volunteers" will protect their identities and affiliations as zealously as Wikipedia's contributors do, and whether the articles will have bylines providing author names like "Darklord" or "Rocketman12".

Happy birthday: Jimbo Wales' sweet 16 Wikipedia fails


Re: Citation needed

"But that happened, long before Wiki."

Sure it did.

And in a way, Wikipedia takes us back to how things were centuries ago, when there was, say, only one standard work on South American fauna, and everybody else copied from it. (Ironically, the Cambridge University Press book linked in the article, The Legacy of Dutch Brazil, edited by Michiel van Groesen, discusses exactly one such case of scholars simply copying from their predecessors... and in the process makes two mistakes: firstly copying the fake "Brazilian aardvark" moniker from Wikipedia and secondly referring to the coati instead of the agouti Buffon was actually writing about ... but I digress.)

So in the 18th century, you had a situation where there was one standard work, and everybody else, for the most part, copied from it, rather than presenting their own research. With Wikipedia, we're now effectively returning to this situation.

Centuries ago, the problem was that there weren't enough sources to choose from. Today, there are too many sources for people to choose from. The choice is too bewildering, and people look for a one-stop shop. But the outcome is the same: one source dominates everything else, and its errors and biases propagate.


Re: Citation needed

"If you find 10 other websites (or even books) quoting the same fact that doesn't make it true if they all used wiki as their original source, but there is no way to determine that"

The only way to check is to first figure out when the info was added to Wikipedia (you have to go to the article's revision history and perform a "revision history search"), and then look for sources mentioning the same info that were published prior to that date (if there aren't any, it's likely to have been a Wikipedia invention). The process is extremely cumbersome, and will become more and more difficult as time passes.
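The date comparison at the heart of that check can be sketched roughly as follows. This is only an illustration: the dates, the source names and the helper `independent_sources` are all made-up placeholders, and finding the real revision date and the real publication dates is still the cumbersome manual part.

```python
from datetime import date

# Hypothetical inputs: the date the claim first shows up in the article's
# revision history, and candidate sources with their publication dates.
claim_added_to_wikipedia = date(2008, 7, 1)
candidate_sources = [
    ("Source A", date(2005, 3, 14)),
    ("Source B", date(2010, 9, 2)),
    ("Source C", date(2014, 1, 20)),
]


def independent_sources(sources, added_on):
    """Return the names of sources published strictly before the claim entered Wikipedia."""
    return [name for name, published in sources if published < added_on]


earlier = independent_sources(candidate_sources, claim_added_to_wikipedia)
if earlier:
    print("Published before the Wikipedia edit:", earlier)
else:
    print("No earlier source found; the claim may have originated on Wikipedia.")
```

Only sources that predate the edit count as possibly independent; anything published afterwards could simply have copied from Wikipedia.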


Yes, that was a good one too.


Re: Citation needed

I was responding to the statement that "Any encyclopedia is essentially just a compilation of sources." Wikipedia articles can indeed be used like that, but traditional encyclopedia articles aren't footnoted.


Re: Citation needed

Reinforcing the false impression that Wikipedia consistently achieves a level of accuracy comparable to that of Britannica really doesn't help. Interestingly, it's a hoax in itself that compounds the problem.

For a start, Wikipedia contains millions of articles on pop culture, sports, companies, business schools, villages and other minor topics that no one will ever be able to conduct a comparative study on, simply because other encyclopedias don't cover them. (Britannica has no article on Amelia Bedelia, for example, or the Boston point-shaving scandal.) The Wikipedia articles on Neptune, the Aral Sea or Barack Obama may be quite excellent, but it's the more obscure articles where Wikipedia's vetting lets the site down, and bad edits slip through.

And those bad edits are a qualitatively different problem from those Britannica and other traditional encyclopedias suffered from: they did not outsource some of their content writing to stoned sophomores, hoaxers, political extremists, revenge peddlers or PR companies operating under the cover of anonymity.

People see Wikipedia described as an encyclopedia and mistakenly assume that all its content is vetted by the site's administrators before publication, in much the same way that editors and specialists did their best to verify the content of conventional encyclopedias before going to press. (Any Wikipedian can tell you otherwise.)

It's an impression the Wikimedia Foundation often fosters, praising the vigilance of its anonymous volunteers and citing studies on Wikipedia's reliability that gave the site a passing grade. In doing so, they're doing knowledge a disservice. It would be much more helpful if they told people to be alert, check references and so on (as Amir Aharoni, to his credit, did in that interview).


I don't know if you've ever read a traditional encyclopedia, but they're not "just a compilation of sources". Traditional encyclopedia articles didn't even list their sources. They didn't have to, because they employed expert writers trusted to have a full grasp of the topic's literature, and the ability to summarise key findings for the public. Listing sources for everything is Wikipedia's invention, as an alternative way of establishing content credibility.

Nature's famous Britannica–Wikipedia comparison, by the way, only compared a small number of science topics and firmly concluded that even in this topic area, which is one of Wikipedia's strongest, the site contained a third more errors than Britannica, and was much less well written. I'm not aware of any studies concluding that Wikipedia is more trustworthy than Britannica, but if you have seen one, I'd be interested in looking at it.

Will Wikipedia honour Jimbo's promise to STOP chugging?


WMF: Fundraiser will continue, but with a Xmas break.

The Wikimedia Foundation has posted an update. The short story: They are happy to have reached their target in record time, but will continue fundraising anyway – though they will take the banners down during Christmas, and then put them back up again at the end of the year.


"This year, we are happy to report we’ve reached our goal of US$25 million in record time. This is a testament to the importance of Wikimedia and how much support we have from people all over the world.

"Given this momentum, we believe that it would be wise and worthwhile to continue to fundraise more in the month of December, for the following reasons: [...]

"Here is what we will do: We intend to continue with the banners for a few more days. We would then take them down over the Christmas holiday, before making an end-of-year push in the final couple days of the year. (Many people choose to give at the very end of the year, and they are expecting to hear from us as usual -- so it is an opportunity to give people who plan to give the easiest means to participate)."

Link: https://lists.wikimedia.org/pipermail/wikimedia-l/2016-December/085712.html

We Googled the ex-Google guy and Google said he was clean, says Wikimedia


Pathetic, if not sinister.

It wasn't on the first page of Polish Google? Pathetic.

What's bordering on sinister is that those board members -- like Wales -- who did have some inkling of the issue didn't alert their colleagues before they asked them to rubber-stamp the appointment of the man.

Maybe they didn't really want anyone to dig too deeply. And one would have thought that an organisation like Wikimedia that goes on about transparency and openness had a more democratically constituted board.

How to save Wikipedia: Start paying editors ... or write for machines


"You couldn't possibly pay the massive amount of people making edits today"

That assertion is worth looking at, because the number of people editing Wikipedia is actually fairly small. The English Wikipedia's core community, for example, is a little over 3,000 people; another 30,000 people or so make perhaps one edit a week.

The total number of edits made in all Wikimedia projects together (Wikipedia, Wikidata, Wikimedia Commons, Wiktionary, Wikisource etc.) over the past 15 years is about 2.5 billion at the time of writing. That sounds a lot, until you remember that Google's ad revenue alone is $17bn a quarter. Google's Wikimedia-based Answer Boxes and Knowledge Graph panels are an important factor driving that revenue. They train you to look at and click in those areas of the search engine results page where the paying ads are.

Now, if Google gave just a single day's ad revenue (about $200m) to Wikipedia editors, each Wikipedia edit would average out at $200m/2.5bn = $0.08. For a Wikipedia editor who's made 150,000 edits over the past 10 years or so, that would be $12,000.
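For anyone who wants to check the back-of-envelope sum, here it is as a few lines of Python. The figures are the rough ones quoted in this comment (about $200m for a day's ad revenue, about 2.5 billion edits, 150,000 edits for the example editor); the variable names are just for illustration.

```python
# Rough figures from the comment above, not audited numbers.
daily_ad_revenue_usd = 200_000_000   # about one day of Google ad revenue
total_wikimedia_edits = 2_500_000_000  # all Wikimedia projects, roughly 15 years
editor_edits = 150_000               # example long-term editor

# Average value per edit if one day's revenue were shared out evenly.
per_edit_usd = daily_ad_revenue_usd / total_wikimedia_edits

# Share for the example editor, computed from the integers to avoid
# accumulating floating-point rounding.
editor_share_usd = daily_ad_revenue_usd * editor_edits / total_wikimedia_edits

print(f"Per edit: ${per_edit_usd:.2f}")
print(f"Editor with {editor_edits:,} edits: ${editor_share_usd:,.0f}")
```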

And while Google is the biggest player leveraging Wikimedia content to get eyeballs on paying ads, they're by no means the only one. Bing, Facebook (which contains a complete copy of Wikipedia) and others do the same.

That's not to say that I couldn't imagine other problems arising, along the lines you are describing, but given the scale of the economies involved, the problem isn't that there isn't enough money in the system. The people who profit most from "free knowledge" are mega-rich corporations, and they're keeping their profits to themselves.

It's Wikipedia mythbuster time: 8 of the best on your 15th birthday


Re: Myth busting a Mythbuster.

You know how to click links, right?


Re: There was always a near monopoly on encyclopedic knowledge

As I recall, the Nature piece also found that the writing on Wikipedia was poor, lacked structure and veered off into inappropriate tangents. However, as the final and widely publicised count was only about actual inaccuracies they found, that aspect didn't make it into the final assessment, which would have rated an article with all the right facts in the wrong order the same as an article with the right facts in the right order.

Another thing that c|net didn't report was that Nature only looked at science articles -- topics like "Meliaceae", which don't tend to get a lot of drive-by edits on Wikipedia. They did not look at sociology, literature, art, fashion, politics, history, current affairs etc.


Shouldn't a project like Wikipedia be held to higher standards than that? Its whole identity is about being a non-profit serving the public interest, being transparent and so forth.

Those are all good goals on paper, but if they don't live up to them, they should be clobbered until they do.

Wikipedia is a public resource. If you don't expect and demand better, you won't get anything better.

Wikimedia Foundation bins community-elected trustee


You're correct.

First, the candidates the community votes for are only ever recommendations; even the successful candidates in the community votes are appointees (= community recommendations accepted by the sitting board members, at their discretion), rather than being directly elected to the board (which would give them a different legal standing according to the law of Florida, where the Foundation was originally set up).

And according to the board rules, the majority can at any time dismiss any board member, for cause or without cause (something which hitherto had not happened).

You can find more information in the current mailing list threads related to this:


As someone has mentioned there recently, the conflict of interest policy for board members is very strange, too. Board members are assumed not to have a conflict of interest as long as their ownership in a company is less than 10%. So, for example, you could own 9% of Google, and this would not rise to the level of a conflict of interest on any board business related to Google. (However, if you had an 11% investment in your kid's window cleaning business, that would.)

The Geshuri business also casts a bad light on board proceedings because Jimmy Wales, for example, has recently said he had some awareness of the lawsuit, while Dariusz Jemielniak, another community-selected board member (who voted against Heilman's dismissal), said he only found out about Geshuri's involvement in the scandal after the appointment. So it seems that Wales did not share the information with Jemielniak. There was a general lack of due diligence.

Unsourced, unreliable, and in your face forever: Wikidata, the future of online nonsense


Re: I'm puzzled by the example

The image relates to the articles by Mark Graham in Slate and on the Oxford Internet Institute website, linked in the text. Graham argues that there are disparate views of Jerusalem's status, and the Knowledge Graph only represents one. There are other examples like that where the Knowledge Graph picks one view alone, without mentioning the others; for example (quoted from Slate), "the search engine lists Northern Cyprus as a state, despite only one other country recognizing it as such. But it lists Kosovo as a territory, even though it’s formally recognized by 112 other countries." That's the sort of thing Wikidata could theoretically influence (looking at the relevant Wikidata items, I'm not sure it does in these two cases).

Shapps launches probe into Wikimedia UK over self-pluggery allegs


One scandal a year?

Wikimedia UK seems to average one scandal a year. Bamkin's Gibraltarpedia, the van Haeften ban, the Compass enquiry, the Symonds desysop in the wake of the Shapps allegations ...

Significantly, the types of problems coming up generally seem consistent with the problems marring Wikipedia as a whole: amateurism and people misbehaving, driven by self-interest, bias and contempt for the people they write about in Wikipedia.

EU squashes bogus copyright scare as red-faced Guardian slaps down Wiki's Wales


Re: Copyright extension sucks

The downvotes are probably because of the phrase "I don't care what tactics the Pirate Party uses." Even if you agreed with the end, it doesn't justify the means. Wikipedia and Wales in particular should not give the public information that is substantially misleading. (This is hardly the first time that such accusations have been raised.)

The Wikipedia movement professes to be all about education. But when both Wikimedia management communications and Wikipedia content are so full of spin (to the extent that it requires The Guardian to add a substantial correction of fact to an *op-ed*, something that really does not happen very often), you are seeing the opposite of education: an attempt to manipulate the dumb masses.


Re: Wrong!

Wales actually said,

"Freedom of panorama is the unrestricted right to use photographs of public spaces, without infringing the rights of the architect or the visual artist. Wikipedia only uses freely licensed images. Therefore, this valuable exception to copyright is necessary in order to allow Wikipedia to freely depict public spaces on relevant articles."

The qualifier you mention ("images of architecture") is not in fact there.

And the French Wikipedia (remember, there is no freedom of panorama in France today) has a special and well-stocked category of "non-free images of recent buildings" – located at fr.wikipedia.org/wiki/Catégorie:Image_non_libre_de_bâtiment_récent – exactly what Wales claims would be impossible for Wikipedia.

It is the same in other Wikipedias: fair use images stay out of Wikimedia Commons, and are instead uploaded in the relevant Wikipedia itself, in line with that project's non-free content provisions.

The one point I do agree with you on is that the English Wikipedia showing an album cover in an article on that album isn't copyright infringement, but fair use (at least according to US law; in Europe it may be different, and some Wikipedia language versions avoid showing album covers).

But the Guardian article should indeed receive a second correction. As it stands, Wales' statement is grossly misleading.

Conflict-of-interest scandal could imperil Wikimedia charity status


Presentation on the SEO value of being on Wikipedia's front page

Here is a presentation that openly hawks the SEO value of being on Wikipedia's front page:


"The more links you've got, the higher you go up the ratings."

"It's a phenomenally cheap, very imaginative way to absolutely energize a city and put a city on the map. And there you go..."


Where are the British media

One thing that surprises me is that so far, there has been no report of this in the British media. The story has been reported all over Europe these past few days, with articles in El País, Le Monde, Frankfurter Rundschau, as well as Fox News in the US. But nothing in The Guardian, Times, Telegraph, BBC, etc.

Wikipedia awash in 'frothy by-product' of US sexual politics


This was more than just reporting. This was a Wikibomb.

Three new navigation templates were created, and added to hundreds of unrelated articles on Wikipedia, adding hundreds of in-bound links.

Some of the sources the article cited were ridiculous: an alternative crossword puzzle, a geek limerick contest, and free erotic e-books. Even the book Cade cites in his article is self-published -- Broken Science Press has published exactly two books, both by the same author.

Seven new articles on Dan Savage were created and nominated for the Wikipedia main page, to try to get Dan Savage on the main page seven times within one week.

The one good source in the article, The New Partridge Dictionary of Slang and Unconventional English, was misrepresented. It mentioned the term in its introduction, and explained why it did not list the term in the dictionary:

"As we drew from written sources, we were also mindful of the possibility of hoax or intentional coinings without widespread usage. ... An example of deliberate coining is the word 'santorum', ... In point of fact, the term is the child of a one-man campaign by syndicated sex columnist Dan Savage to place the term in wide usage. From its appearance in print and especially on the Internet, one would assume, incorrectly, that the term has gained wide usage."

Instead of reflecting that, the article said the exact opposite, proudly reporting that the word was listed in the dictionary as a "deliberate coining".

Most of this was the work of one editor, who has not been sanctioned in any way and is free to carry on as before.

'Lord of the Universe' disciple exits Wikipedia


Cult wars on Wikipedia

You need to have a look at the main Scientology article on Wikipedia. It is currently locked so nobody can edit it, and just the first few paragraphs will make you forget any notion you might have had that it is controlled by Scientology. They wish!

If people are worried that Wikipedia is too soft on cults, check Wikipedia's Scientology article against the Encyclopedia Britannica one on Scientology. You will find that Britannica is MUCH kinder to Scientology.

