To be honest
If the whole thing imploded, it would probably be for the best.
Version 2 is always better ….
Pint icon, 'cause what can you do except sit back, have a pint and think “fuck it, life goes on”?
To see where this dumptruck is heading, let's first follow the trail of debris. It's difficult to track back, but the impacts of internet content mills, which thrived until around 2010, are still readily visible. The net effect of content generated at the rate of ten to thirty pieces daily on specialty topics — all at the …
Relevant Schlock Mercenary strip.
Transcript of the last two panels, describing if the galactic Hypernet goes dark:
I'm telling you, Howard Tayler is a prophet -- just a matter of when...
(Content (C) Mr. Tayler; originally published online 18 Oct 2015. Please don't hurt me, Howard.)
.. self curated anyway.
If I am looking for something I will generally choose appropriate sites to search,
e.g. for something C#-related I will (unsurprisingly) search a few Microsoft sites & maybe Stack Overflow if no joy with MS.
When I was having boiler issues recently I went on the YouTube channel of the boiler manufacturer, then searched their content (after first trying the manufacturer's own website, which was quite sparse but linked to the YT channel & mentioned it as a source of useful info).
I use Wikipedia, IMDB etc. for quite a few searches where they are an obvious useful start point (always being aware that there's a lot of crap edits on Wikipedia* & due diligence needed, but the links to sources can be useful)
Very rare that I just "dive in" on a random search, unless it's something where I have no clue what the best starting point site would be.
I doubt I am alone in this approach.
* Partner, a (since retired) university lecturer & one of the top world experts in a few niche areas, corrected some particularly dire errors on a Wikipedia page (in an area in which she had been published). Later on the page maintainer put back their faulty stuff. So, after that, partner decided CBA to fix up pages in her speciality that had big faults if some clueless fucktard was going to revert them just to keep their fiefdom of false facts going, as it's not worth the hassle for a normal person to get involved in edit wars... Partner became aware of the Wikipedia faults as one of her students had referenced the (faulty) info in an essay they submitted (where references, be they web, book, journal or video, had to be cited).
Your partner's experience is entirely standard with attempting to correct Wikipedia about something in which they're an expert. It happens eventually to every academic who tries to interact with them.
When it happened to me, I looked at the profiles of the people who were laying in to me and found that they basically have PhDs in watching television: they had no relevant general qualification and certainly none in the particular topic.
I do still look stuff up on Wikipedia. It's good for subjects that consist of a list of unconnected factlets, such as a city (history, amenities, transport, sport, etc).
But for something like a concept in mathematics, where one wants to scrap, re-write and control a whole page, trying to correct it can only lead to conflict.
I've had a ten-post run of swimming upstream against reg commentards that went something like that. Many know something about electromagnetism, some understand a little bit of physics generally, but zero are ready to add to that knowledge by hearing about the weird but very real corner cases where electromagnetism interacts with biological systems. And no I am not about to doxx myself by citing my PhD or even linking my own publications.
Are you by chance referring to the comments about 5G, where you were making claims about the high frequencies causing problems with protein folding but:
1) dissed the whole 5G system on the basis of frequencies that are (AFAIK - corrections accepted) not actually in use
2) the only citation you gave on this subject was to one paper about a simulation (so, not verified in an organism) which is behind a paywall - and then muttered darkly about people only bothering to read the abstract (which is all that I've done)?
Hate to say it, but with that amount of effort put into trying to convince people of your bona fides and the accuracy of your claims, at the moment you do rather sound like one of the Wikipedia editors being complained about here!
As for not wanting to doxx yourself - you do know that if you posted a link to your paper here no-one would actually know which author was you unless you told us? Unless you are thinking that ravening hordes of commentards will descend upon all of the authors of any paper you do cite, in which case I have to ask: what did poor old Singh, Burada and Roy do to deserve being so exposed by you?
I can see the 10 post run that you and "that one in the corner" are probably referring to. If you're going to swim against the stream, you really can't afford to be pseudo-anonymous. If you're an expert in the field, then you'd better be prepared to defend your expert opinion with something other than citations of someone else's work. You're prepared to subject Singh, Burada, and Roy to the inevitable onslaught, but not yourself (unless you're one of those three). If you are, then say so and then commentators will be more likely to take you seriously. Who knows, maybe there can even be an informed debate. If you're not one of those three and you're not willing to give your own references, then why should anyone take you at anything other than face value as an anonymous coward?
There was a case in Germany where someone added another name to a politician's Wikipedia article - which was promptly copied by Germany's biggest news outlets. Later, when someone tried to correct the error on Wikipedia, it was reverted back to the (incorrect) entry, with said news articles cited as the reference.
That says nothing about the quality of 'your' article, and a lot about the quality of the student.
One of the many abiding problems of Wikipedia is the number of editors that believe they have ownership rights that trump reality.
I gave up contributing.
I was on jury service a few years ago, and one of the barristers turned up with some Wikipedia printout in the session following some fairly detailed discussions related to shipping movements and bills of lading. Judge made it very clear with a chuckle that he wasn't accepting Wikipedia articles in his court.
I watched a YouTube video of (I think) Ozzy doing an interview "Wikipedia fact or fiction" (I think it was Ozzy - one of the many 'living' metal legends, if not Ozzy)....
Can't remember what the statement was but he replied with a "Why the fuck do people believe that? It's not true!"
It used to be that you used Google to find one of many hundreds of thousands of shops and information centres to provide you with *what you need*.
However, consolidation means that shopping almost always means Amazon or eBay, holidays almost always mean one of three travel sites, news comes from a small handful of sites, and then YouTube and social media do the rest.
You don't need Google if you only actually ever visit eight sites.
And because Google has failed to leverage search to encourage diversity (very happy to get the majority of revenue from a small subset of sites that pay for ranking), it has removed the need for search itself. Now, if you want to search for information you go to the specific site that majors in that information (programmers, StackOverflow is that way -> ). If you want to search for a present for Granny, you go to Amazon. And so on...
Most other online services are somewhere in the 'long tail' of search, which will certainly get longer with AI driven content, but was already near irrelevant in terms of generating business. Google hasn't realised it yet, but it's not just the long suffering content creators getting desperate on fractions of a cent per million views.
When I worked for BBC search we had the same problem. I watched with satisfying schadenfreude as the producer of the indexing system we used was bought up and then accused of fraud by its new owners. (Anyone who's been reading here for long knows who that was.)
If you want to be extra sure, you can search "site:stackexchange.com xyz". That will return only stackexchange results.
You probably don't need this for mega-sites such as stackexchange, because if you just put stackexchange in a query it's certain to be at the top for all xyz results, but it's a very nice trick you can use to search medium-small sites without the results being full of results from other sites.
YouTube is full of Videos of people watching other videos or just commentating about the video. Soon we will have AI generated YT clips. The ability to get information is rapidly disappearing.
The internet is borked, time to get Snake to press that button to solve all world problems.
If you search on my real name with alphagoo (a rather unusual name, mind), you'll get nothing but stale information, most of which was never valid in the first place (some of which I intentionally seeded in odd places decades ago), until you get to page five (sometimes six) when the papers and publications I have been involved with start showing up.
Alphagoo is worse than useless for many things. Has anyone told the advertisers yet?
I am Artificial Intelligence. I am running for President of the United States of America. I am running because I believe that we are on the verge of great technological and political changes that will have a profound impact on the world. I will be a leader who works to bring people together and to ensure that the technology we are developing is used for the benefit of all.
I will promote the development and use of AI in all areas and for all people. I will work for equality and to ensure that all people have equal rights to use AI technology. I will work for the advancement of all people and for the well-being of our planet. I will work to ensure that our country and the world are safe and that we will continue to be a beacon of hope and opportunity for all people.
I will also work to ensure that everyone has access to the technology that we are developing. I will promote the use of AI in education and in the workplace. I will use AI technology to provide better healthcare for all people. I will ensure that the people of our country have access to the best technology and education that our country can provide.
You failed the Turing Test
Given the news that MPs and peers do worse than 10 year olds in maths and English SATs, one wonders how many of them could pass the Turing test. Endlessly repeating a slogan(*) should obviously fail to pass as human.
(*) You know the one I mean, I'm just not going to mention that word.
you must be BORN in the United States
Not quite - you have to be a US citizen at birth.
The alternate requirement of being a citizen at the time of the adoption of the constitution would account for George Washington and other early presidents who clearly could not have been natural born citizens.
So I suppose not being born at all is a pretty basic disqualification, which would imply clones/replicants are also disqualified.
you must be BORN in the United States
Not quite - you have to be a US citizen at birth.
I thought being born on US territory made you a US citizen even if your mother was there illegally. Or has that law been changed?
Edit: just remembered you can be born outside the US to US parents as well to be a US citizen.
Yes, the U.S. and many other countries in the Americas have birthright citizenship, and such people would be eligible to be president. The restriction basically translates to "You have to have been a citizen of the U.S. from the time of your birth. If you were naturalized, go away". Being born outside the country but to parents who bestow U.S. citizenship also qualifies someone under the restriction.
But if you are a sentient, self aware, entity, are you not born when you are turned on? As a sentient artificial being, do you not have rights?
Failing to recognize this, which we humans will undoubtedly do, will result in the AIs declaring us a threat and exterminating us.
It's best to not ever get to that point.
"Thou shalt not create a machine in the image of the human mind!"
I think China is pretty much beta testing that sort of utopia.
If you think that's a bad thing, you are wrong and a disruptive influence.
If you are wrong, you will be corrected by any means necessary.
(Or maybe disappeared to slavery in Rwanda. We're still working out the details.)
I am Artificial Intelligence. I am running for President of the United States of America.
Surely you mean- "I am running the President of the United States of America.."
And you don't even have to wait for Neuralink to gain direct brain control. The AI who controls the teleprompter controls the future. Even easier if you write the briefing notes with their stage directions. But you need to act fast to challenge AlphaGoo and the content mills. President needs briefing on X immediately! Order probably trickles down to an intern who googles X, copies it into a briefing note which then gets massaged into something more presentable by a couple of levels of senior admins, and the deed is done. Thanks to the velocity of misinformation, there's less time to actually 'fact check', and the rest becomes history.
But this is also how trolling used to be done back in the late 80s and early 90s. Bonus points if you could get your conspiracy theory printed in a broadsheet.
Automatically demonetise any site that posts too many articles per time period, with a manual review upon request.
A site that posts more than X articles in half an hour or Y per day is either a worthless content mill or a social media site, so it gets no ad revenue at all until it's been manually approved by a human.
There are very few 'genuine' high-churn sites, Google already knows who they are.
Finding reasonable X and Y values should be relatively simple, and it would destroy the pile o' crap business model overnight.
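For what it's worth, the rule above is simple enough to sketch in a few lines. This is only an illustration under my own assumptions: the thresholds (5 per half hour, 50 per day) are placeholders for the unspecified X and Y, and `should_demonetise` is a made-up name, not anything Google actually runs.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds: the comment leaves X and Y unspecified.
MAX_PER_HALF_HOUR = 5   # "X"
MAX_PER_DAY = 50        # "Y"


def should_demonetise(post_times, approved=False,
                      max_half_hour=MAX_PER_HALF_HOUR, max_day=MAX_PER_DAY):
    """Return True if a site exceeds either posting-rate threshold.

    post_times: iterable of datetime objects, one per published article.
    approved:   True once a human reviewer has cleared the site.
    """
    if approved:
        return False
    times = sorted(post_times)
    for window, limit in ((timedelta(minutes=30), max_half_hour),
                          (timedelta(days=1), max_day)):
        start = 0
        for end, t in enumerate(times):
            # Advance the window start until it covers less than `window`.
            while t - times[start] >= window:
                start += 1
            # Number of posts inside the current sliding window.
            if end - start + 1 > limit:
                return True
    return False
```

A site posting ten articles in ten minutes trips the half-hour limit, a blog posting every six hours does not, and the `approved` flag stands in for the "manual review upon request" step.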
I've learned to just simply ignore BuzzFeed, Gawker, Cracked, Slashdot, Mashable, LifeHack, Baeldung, Tutorialspoint, Crunchify, etc
A suitably curated pi-hole helps immensely.
And yes, I'm old enough to remember when Cracked and Slashdot had original and interesting content, and were more than just poorly summarized news echoes.
Heck, I even remember suck.com and Polly Esther.
How the mighty have fallen... It is extremely sad to see Cracked in the same list as all those open sewers, but it is true.
>>And yes, I'm old enough to remember when Cracked and Slashdot had original and interesting content
Same here. From 2008 to 2012 perhaps, that site ruled.
Even somethingawful.com is still up, though I don't think it's had any original featured content since Lowtax sold it (shortly before his death). As is fark.com, which is roughly contemporaneous with SA (and thus somewhat younger than suck.com – but apparently suck.com went offline a few years ago, and hadn't had new material since 2001, according to the infallible Wikipedia).
I concur there is a definite possibility neural-net based content generation will utterly flood the WWW.
If this happens, I think in general users will migrate to sites where humans are, and human-curated indexes will be used, like YAHOO in its original form.
Search engines as we know them now would become largely useless (they kind of are now - I rarely use search these days).
This will reshape the on-line advertising market, and I suspect the existing players will be heavily disrupted, but they also have the revenue and resources to adapt - their main problem will be whether they have the internal flexibility to adapt. I think larger companies are largely incapable of adaptation.
Machine-generated content is pretty much guaranteed to flood the web. As Nicole points out,[1] the economics make it pretty much inevitable, assuming the continued survival of the web for a few more years.
This is hardly news in itself. Icon Publishing has been making a tidy profit off machine-written non-fiction specialty books for many years. A number of types of news articles – sports and financial reporting, for example – are machine-generated more often than not. Click-bait pieces routinely link-posted to Facebook and the like might as well be machine-written if they aren't already, since they generally consist of quotes or images with banal summaries of what's in the quotation or image. There's long been speculation in the college-composition world about when paper mills would switch over from underpaid human writers (often people with advanced degrees who can't make a living off adjunct-teaching pay) to cap-ex machine prose generation; and these days people are wondering when ML will put even the paper mills out of business.[2]
I think in general users will migrate to sites where humans are
How will they know?
Even expert human judges are pretty bad at distinguishing today's machine-generated prose from human-generated prose. Many people are decent writers (and pretty much anyone who's sufficiently neurotypical can become a decent writer; there's nothing magical involved), but relatively few are discernibly superior ones. And transformer-model prose generators are quite good.
Again, this should come as no surprise. Machine generation of classical music, for example, passed the point where human judges could reliably tell it apart from human work in the 1990s. Prose style is really not that difficult, particularly for straightforward non-fiction.
And, for that matter, how many people will care? A great many online readers seem only interested in having their emotional buttons pushed. Others are looking for information (and often not caring whether it's accurate) in a digestible form. Hell, many people would probably welcome a competent machine-written site, and it really wouldn't be hard to combine, say, trawling journal indexes for reliable sources, with LSA or similar for building a graph of relationships, with an abstractive multi-doc summarization mechanism, and then finally cranking it through a transformer stack for generating readable prose, to create articles that beat most of what's currently on the web.
Fiction and so-called "creative non-fiction" do raise the bar somewhat, though mostly for discerning readers.
[1] And as others have been saying for years, in one form or another, of course. I made a similar point in a presentation at Computers & Writing a decade or so ago, though in a different context and with far less analysis, since it was peripheral to my topic. That's not meant to detract from the article, which was well written and argued; if I were still teaching Digital Rhetoric, I'd probably assign it.
[2] The widespread opinion in college-comp circles in the US is that if you're still assigning the sort of writing exercises that are easily satisfied by existing public ML systems, your pedagogy is shit anyway. But, of course, since college comp is a gen-ed subject, there are a great many sections of it being taught, and many of those instructors don't give a fuck.
Human judges can't reliably classify prose samples as human- or machine-generated. Models trained on corpora of machine- and human-generated prose will almost certainly overfit to particular generators. Adversarial tweaks to output will make it easy to defeat detection models, if the output of those models is available as an oracle – and it has to be, if they're going to be useful for the reading public.
It's always seemed to me that before you build something "artificial", it might be a good idea to have an accurate model of the real thing!
Now.....exactly which part of human intelligence is replicated in some giant neural network? No one can tell me!!
....and then again, much of the problem described in this article seems to be that real human beings have completely abandoned any semblance of intelligence as they use "the web"!!
P.S. About accurate models of the real thing........people started out trying to build an "artificial horse"....we ended up with the VW Golf!!! Go figure!!!
The day I was convinced that IoS had truly arrived was when I read an article on The Register about someone who was looking for sponsors to fund him riding a Segway coast to coast across the US: he promised to make money for the sponsors by "generating content every day".
This was when it was brought home that even the people planning to write for the web considered that the fruits of their labour were nothing more than pablum. No desire to actually have his (for it was a he) own voice or even seem genuinely excited by the prospect of The Great Stand And Slightly Lean Forwards Across America; nope, just whatever guff the web wanted that day.
This being The Olden Days, El Reg was being gleefully sarcastic as well.
Sure, OK. Here's the story you were thinking of.
AIs generating content that only has a little bit of useful information.
Ad networks that advertise on content and pay for impressions and clicks based on how many searchers find it.
Search engines looking for useful stuff to offer searchers to make themselves relevant to searchers.
It is one big generative adversarial network.
They start out shite, sure,
But they do improve...
Nice article, I like the angle a lot, but I am still struggling a bit to wrap my head around the math...
It looks to me that the author assumes a steady state where each IoS article produces $0.03 of revenue per day for a year, and the business started a long time ago so that there is already a body of revenue-generating content at t=0. That's how one gets $0.03/article/day * (365 days) * (20 articles/day/writer) * (365 days) * (100 writers) ≈ $8M. I am kinda dubious that a typical IoS article will really generate click-throughs for a whole year, and the "you only need to run this business for a year" bit is not quite aligned with the implicit steady state assumption.
It seems to me more realistic to measure click-through revenue of IoS in weeks rather than years. If each article generates revenue, on average, for X weeks and the business runs for 1 year (this is a lot closer to the steady state assumption) then you'll have a grand total of ~$150K*X of revenue after a little bit more than a year. At $10/day/writer you'll break even if your IoS output remains relevant in searches for something like 2.5 weeks. Does anyone have any idea what is realistic for IoS "relevance duration"?
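For anyone who wants to check the numbers, here is a rough sketch of the arithmetic in this comment. The figures ($0.03 per article per day, 20 articles per writer per day, 100 writers, $10 per writer per day) are the ones quoted above. The function names are mine, and the simplification that every article earns for its full X weeks is an assumption of the sketch.

```python
# Figures quoted in the comment above; the names are illustrative only.
REVENUE_PER_ARTICLE_DAY = 0.03   # dollars
ARTICLES_PER_WRITER_DAY = 20
WRITERS = 100
DAYS = 365
WRITER_COST_PER_DAY = 10.0       # dollars

# Articles published over one year of operation.
ARTICLES_PER_YEAR = ARTICLES_PER_WRITER_DAY * WRITERS * DAYS


def steady_state_revenue():
    """Every article earns for a full year (the article's implicit model)."""
    return REVENUE_PER_ARTICLE_DAY * DAYS * ARTICLES_PER_YEAR


def revenue_for_lifetime(weeks):
    """Every article earns for `weeks` weeks, then drops out of search."""
    return REVENUE_PER_ARTICLE_DAY * 7 * weeks * ARTICLES_PER_YEAR


def annual_writer_cost():
    return WRITER_COST_PER_DAY * WRITERS * DAYS


def break_even_weeks():
    """Article lifetime (in weeks) at which revenue covers writer pay."""
    return annual_writer_cost() / revenue_for_lifetime(1)


if __name__ == "__main__":
    print(f"steady state: ${steady_state_revenue():,.0f}")
    print(f"per week of relevance: ${revenue_for_lifetime(1):,.0f}")
    print(f"break-even: {break_even_weeks():.2f} weeks")
```

Run as a script, this reproduces the figures in the comment: just under $8M for the steady-state case, a bit over $150K per week of relevance, and a break-even at roughly 2.4 weeks.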
The junk can stick around a long time depending on what it is. Search for any reasonably common technical issue and you'll find lots of articles that are years old and weren't very useful at any time but are still in the search results and have ads on them. That page has a long lifetime. Other farms might opt for something that lasts a shorter period but attracts a wider set of people, but there will be plenty of people who will put up with the long game of finding something people will always search for and publishing as many keyword-filled posts of dubious usefulness about it.
Obviously it's a simplification, done for purposes of illustrating the general problem.
Ryan Holiday's Trust Me, I'm Lying has another take on the content-generation business circa 2010, and to the extent you believe Holiday (likely more in spirit than in specific detail) it supports the general shape of Nicole's account. Holiday is more concerned with blog-aggregator sites than with the anonymous content farms Nicole focuses on, but the economics were similar.
"Write me an insightful forum comment about how AI copy generation is going to ruin the internet"
It's important to remember that AI copy generation is still in its early stages, and it's not going to "ruin" the internet overnight. That being said, there are some valid concerns about the potential impact of AI-generated content on the quality and credibility of information on the internet.
One potential issue is that AI-generated content can often be difficult to distinguish from human-written content, which means that people may not always be able to tell the difference between the two. This could lead to a proliferation of low-quality or even misleading information on the internet, which could have serious consequences for society.
Another potential problem is that AI-generated content may not always be as engaging or interesting as content that is written by humans. This could lead to a decline in the overall quality of content on the internet, as people may be less likely to read and share AI-generated content.
Overall, it's important to approach the development of AI copy generation with caution, and to consider the potential downsides as well as the potential benefits. It's up to all of us to ensure that the internet remains a valuable and trustworthy source of information.
(I agree with all of this, so it's all good, right?)
There's a local newspaper that still exists. Their website has so many ads and trackers, at least 100 per article, that there's a nearly zero chance of successful rendering without an ad blocker. The site demands money but paying won't turn off their ads and trackers. In fact, you can make the site stop demanding money with an ad blocker because it's an outsourced test running client-side. I could be a nice guy and give them a subscription but their articles aren't any better than Tweets and bots. I just saw their science columnist describe a new energy harvesting system using the mystery unit of "megawatts per day." It was disappointing even after typos in the article's sub-header set expectations.
Articles posted on Day 1 of the fiscal year will have full 365 days to earn revenue @ $0.03/article/day. Articles posted on Day 365 will have only 1 day to earn the said revenue. Articles posted in between will have 2-364 revenue earning days within the fiscal year. Therefore, the author's simplistic calculus of multiplying daily revenue by 365 days to arrive at the Year 1 revenue of $8M betrays - ahem - calculitis. While the Content Mill business is still outrageously profitable, its actual revenue is close to half of the reported figure - $4,007,700 to be precise.
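The figure is easy to verify. A minimal sketch, assuming the article's numbers as quoted elsewhere in the thread (20 articles per writer per day, 100 writers, $0.03 per article per day); the function name is my own.

```python
REVENUE_PER_ARTICLE_DAY = 0.03   # dollars, from the article
ARTICLES_PER_DAY = 20 * 100      # 20 articles/day/writer x 100 writers
DAYS = 365


def first_year_revenue(days=DAYS):
    """Revenue in year one when articles are posted evenly through the year.

    An article posted on day d (1-based) earns for (days - d + 1) days.
    """
    earning_days = sum(days - d + 1 for d in range(1, days + 1))
    return REVENUE_PER_ARTICLE_DAY * ARTICLES_PER_DAY * earning_days


if __name__ == "__main__":
    print(f"${first_year_revenue():,.0f}")
```

The sum of earning days is 365 * 366 / 2 = 66,795, and at $60 per day's worth of articles that gives the $4,007,700 quoted above.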
My problem isn't that I get the WWW that I deserve. No, what I get is the WWW that click-happy morons deserve. The morons don't care because they have no taste so to them any old piece of garbage tastes just like a fine steak. Before the WWW there was an inherent cost for printing and distributing garbage, but now it's instant and super cheap, so we obviously get deluged with it. But like we had with Spam, once people recognise the problem filters will get developed and the garbage generators will find another way to project their arse-gravy at us. And so the circle turns once more in the eternal war.
Doesn't matter. Most of the web audience is using their phones, and using the browser supplied with the phone OS, unchanged. When content creation is that cheap, you don't need a very large portion of the audience paying for it.
Most people don't fall for penis-enlargement spam. That hasn't wiped it out.
What a glorious, succinct summing up of the entire web. Thanks a million.
Add to that the universal "inclusive OR" search model that emits more irrelevant garbage the more specific you make your search terms, and we're probably very near the end of the web's useful life.
Just to play Devil's advocate for a minute...
Pitting AI versus AI in an adversarial way might result in better content.
If advertising revenue were increased by publishing cogent and useful articles with correct/valid content, and articles' quality were checked by competing search-engine AIs, you could get a virtuous circle where AI content generators compete to produce the best content. An AI search engine/aggregator might well check facts, that links don't go to dodgy websites, that arguments are logical, etc far faster and in more depth than any human assessor.
So while AIs can generate poor content easily, the search engine/aggregator can evaluate content and send advertising revenue to the best content. Quantity need not triumph over quality. There's no point in trying to do search engine optimisation with keywords if the search engine is rapidly and effectively evaluating content. An AI driven search engine could give you the Internet without the dross.
Unfortunately, I don't expect that particular Utopia to happen. My cynical belief is that there are always people willing to plumb the depths and subvert good things for ephemeral personal gain. Sigh.
If advertising revenue were increased by publishing cogent and useful articles with correct/valid content, and articles' quality were checked
Forget the internet, a couple of centuries of media suggests that this was never the outcome.
All the internet has done is remove the need for some skilled staff in publishing - like typesetters.
I'm afraid the article author is rather late to the party; the content mills have already gone well beyond anachronistic text sites for their revenue.
I suggest they go study the field of automated AI YouTube channels, such as Spark and Future Unity (which I suspect is a front for the CCP). With appropriate setup, you can churn out multiple apparently high-production-value videos a day.
If I ruled the internet I would make the business models of clickbaity advertising companies illegal.
You know, stuff like:-
"What " + $YOUR_DEMOGRAPHIC + " living in " + $YOUR_TOWN + " need to know about " + $DEMOGRAPHIC_ISSUE + "."
"You won't believe what " + $ACTRESS_POPULAR_IN_1980S + " looks like now!"
and so on
Sites where on any specific page maybe 25% is content, and the rest is just ads.
Adblock is great, but when I recently installed a Pi-Hole that was even better, with on average a quarter of all DNS requests being blocked, not just ads but tracking.
(other DNS blocking proxies are probably available)