If the crawling is so problematic, and easily attributable, set up a robots.txt and take a DMCA case if they continue to crawl.
Given what the courts have ruled counts as an "effective" security measure, this should be a fairly straightforward case.
Google AI Overviews and other AI search services appear to be starving the hand that fed them. Google's AI-generated summaries of web pages, officially released in May 2024, show up atop its search results pages so search users don't have to click through to the source website. A year later, enterprise AI analytics biz …
"If the crawling is so problematic, and easily attributable, set up a robots.txt and take a DMCA case if they continue to crawl."
This. But I'd also be looking to add something akin to NOINDEX - NOAI maybe - to the headers, and a clear, human readable "not for use by AI" in every page footer.
That should make your position clear, and DMCA cases much easier. Some kind of cease and desist injunction might also be in order.
Personally, I don't care if the bots crawl my sites for training purposes. But this particular application of the resulting AI is a bit much. Note it's the application rather than the AI per se that I object to.
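For what it's worth, that opt-out might look something like this. The bot tokens are examples of ones the vendors actually document (GPTBot, Google-Extended), and "noai" is exactly the kind of non-standard marker being suggested here - nothing is obliged to honour it:

```
# robots.txt - opt out the AI crawlers that actually honour it
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

# And per-response, e.g. in an nginx server block:
# add_header X-Robots-Tag "noindex, noai" always;
```

The X-Robots-Tag "noindex" part is a real, documented directive; "noai" is purely declarative, there to make your position explicit for any later legal argument.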
The obvious solution is for everyone to immediately add NOINDEX to every page of their site; Google will have to remove every page on the web from their index, and become irrelevant overnight. But nobody is going to do that, because they still need traffic from Google.
Looks like the days of making web sites for fun and profit are nearly over.
robots.txt, nofollow, noindex... All ignored by the majority of these robots these days. They generally spoof the user agent to look plausible, and use various different IP addresses from all over the world.
Fine for static sites, but problematic when multiple agents are trying to grab every git commit on a repo.
And some of them I've blocked (by ASN) months ago, and they still try. This is the last entry in my "blocked" log:
Sun Jun 22 19:03:44 2025 | 94.74.125.251 | ecs-94-74-125-251.compute.hwclouds-dns.com | 42794 | HK | Hong Kong | Hong Kong | AS136907 | HUAWEI CLOUDS | 94.74.120.0/21 | 94.74.64.0/18 | git.freebsd.catflap.org | GET /src/diff/?id=d86f022e79862e9d91be80a806a4b011359d6cc8 HTTP/2.0 | 403/403 | text/html | 330 | 1425 | - / 1 | TLSv1.3 | Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36 | zh-CN | SG | Singapore
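If anyone wants to post-process entries like that, the pipe-delimited format splits cleanly. A minimal sketch - the field names here are my guesses from the sample line, not an official schema:

```python
# Minimal parser for a pipe-delimited blocked-log line like the one above.
# Field names are inferred from the sample entry, not from any documented format.
FIELDS = [
    "timestamp", "ip", "rdns", "port", "cc", "region", "city",
    "asn", "as_name", "prefix", "aggregate", "vhost", "request",
    "status", "content_type", "req_bytes", "resp_bytes", "misc",
    "tls", "user_agent", "accept_language", "geo_cc", "geo_name",
]

def parse_blocked_line(line: str) -> dict:
    """Split one log line on '|' and label the fields."""
    parts = [p.strip() for p in line.split("|")]
    return dict(zip(FIELDS, parts))
```

From there it's trivial to aggregate by ASN or user agent and feed the results back into a block list.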
How would you suggest dealing with that in a proper way?
The first one I remember was a sort of "digital stamp" for email that cost a fair amount of CPU time to compute and attach.
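That sounds like Hashcash. The idea is simply: find a nonce whose hash has N leading zero bits, so minting a stamp costs CPU time but checking one costs a single hash. A toy sketch of the idea (not the real Hashcash v1 stamp format):

```python
import hashlib
from itertools import count

def mint_stamp(resource: str, bits: int = 8) -> str:
    """Burn CPU finding a nonce so sha256(resource:nonce) has `bits` leading zero bits."""
    target = 1 << (256 - bits)  # digest must fall below this value
    for nonce in count():
        stamp = f"{resource}:{nonce}"
        digest = int.from_bytes(hashlib.sha256(stamp.encode()).digest(), "big")
        if digest < target:
            return stamp  # expensive to find...

def verify_stamp(stamp: str, bits: int = 8) -> bool:
    """...but cheap to verify: one hash."""
    digest = int.from_bytes(hashlib.sha256(stamp.encode()).digest(), "big")
    return digest < (1 << (256 - bits))
```

The asymmetry is the whole point: a spammer (or crawler) has to pay per request, while the receiver's check is nearly free.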
The reddit post you link to is by a deleted account (!), and the replies have many explanations of why this idea isn't workable in real world implementations.
Simplifying: Won't solve the problem and makes the user experience worse, well done.
Well yeah, that becomes the solution many go for, but I was replying to the post that said to just use 'noindex' and robots.txt etc. and everything will be OK.
I don't blame people for resorting to using these types of filters, but it is annoying when you're working in a console and want to Google something with "w3m", only for the destination site to fail because you don't have JavaScript.
AppleWebKit/537.36 (KHTML, like Gecko) Chrome
wait ... Apple WebKit and Chrome are indeed based on KDE's KHTML, but Gecko is Firefox's HTML engine, so AppleWebKit is not "like Gecko" at all
How would you suggest dealing with that in a proper way?
you should contact Huawei to tell them to correct such unforgivable mis-information.
That's how user agents work. It's a mess of annoying kludges from history. That's why pretty much everybody's user agent string starts with "Mozilla/5.0". Sometimes I think we'd be better off if we decided to simply discontinue using the user agent altogether. It had a point, the point is broken; just hand out the document for the URI requested.
Their point, however, was that the bot did not decide to identify itself as one. The bot is masquerading as a browser, at least by user agent.
Google has a proposal to replace User-Agent... with extra headers! Client Hints (the Sec-CH-* family).
Chrome already sends them, in addition to the User-Agent.
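For illustration, the low-entropy client hints a current Chrome sends alongside the User-Agent look roughly like this (the values here are typical rather than exact; they vary by version and platform):

```
Sec-CH-UA: "Chromium";v="138", "Google Chrome";v="138", "Not.A/Brand";v="99"
Sec-CH-UA-Mobile: ?0
Sec-CH-UA-Platform: "Linux"
```

Higher-entropy hints (full version, architecture, model) are only sent if the server opts in via Accept-CH, which was meant to reduce passive fingerprinting. Of course, a bot can spoof these just as easily as a User-Agent string.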
"you should contact Huawei to tell them to correct such unforgivable mis-information"
As "doublelayer" pointed out, that abomination is used by "valid" browsers.
I just checked chrome, and it "legitimately" reports the tag-soup "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
Still, complaining to a bot company that their bot doesn't spoof a real browser accurately enough is a novel approach! :-)
Not my downvote!
"The obvious solution is for everyone to immediately add NOINDEX to every page of their site; Google will have to remove every page on the web from their index, and become irrelevant overnight. But nobody is going to do that, because they still need traffic from Google."
I like that very much, but as you aptly pointed out, a majority of businesses still rely on Google traffic. But then there are those that don't. Those would be consultants or professional businesses who have detailed websites for potential new clients who have been referred, often by colleagues, and use the website for confirmation. They did NOT arrive at said site via a Google search but, believe it or not, via a URL on a business card. Out of the last 50 clients I've created and designed sites for, around 30 or so were like that. They just didn't give a shit about 'Google search placement' or Google itself. Most of their business connections and activities run behind the curtain, out of sight of the internet. I suppose if these people were THAT pissed about the AI scraping, they could easily opt for NOINDEX.
"apparently people like it"
There is no "apparently" about it. Judging from the article, people like the AI summary and it satisfies them, so they don't search further.
Maybe Google has made an unforced error here - a case of the cannibals eating their cook.
Unfortunately, whenever I try a Google search on a subject that I already know about, I find the AI summary is invariably full of errors. Or just plain wrong in its entirety.
Either way it isn't reliable enough to be worth a fig tbh. The clueless and the gullible lap it up, and will then wonder why they are getting everything wrong.
"full of errors. Or just plain wrong in its entirety."
Recently a 1930s UK crime novel, digitised on entering the public domain, contained the phrase "spin a yam" (repeatedly, with variations).
A fairly obvious OCR error, YAM for YARN I supposed, as r+n resembles m in many typefaces.
Still, it might have been a common humorous phrase when the book was written, based on the deliberate misreading. Foolishly I tried searching from the e-reader application, which defaults to Google, which gave this "authoritative" AI† response without any substantiating evidence. (No relevant non-AI responses.)
"Spin a yam" is an idiom that means to tell a long, elaborate, and perhaps unbelievable story, often to entertain or deceive.
With the small footnote disclaimer AI responses may include mistakes.
"Caution: All AI responses are a confabulation of your queries and randomly selected text sequences trawled from the internet " would be more accurate.
† Authoritative AI would have to be the ultimate oxymoron with bugger all oxy.
The number of trade vans I see where the sign writer has made no effort to consider kerning is huge. There used to be one round the corner where not only did the name feature an rn pairing set too close together, they had put silver lettering on a gray vehicle. So if you could make out the writing, then you probably got the name wrong.
I always thought that the purpose of having a sign written van was so that people would see it and then call you up and offer you work?
I always thought that the point of writing on the front of an appliance was to help you operate it. But then I bought an LG washing machine that has a dark brown body with the programmes written in black ink. I've always imagined it to have been the brainchild of some artistic type in the marketing department. 'Let's not sully our lovely brown design with anything crass like text'.
What is scary is that people just accept the AI summary as the definitive answer.
I don't use it as a matter of principle, but it is a laugh reading it and seeing how close to (or removed from) reality it is.
Too many people just don't get the basic principle that AI is not some magical solution that "knows" the answer. It has got the answer from scraping web sites for material, then simply applying some filters to decide what is the most likely.
Everything to do with AI should carry a warning like tobacco: "This result was generated by AI and as such is a load of bollocks"
What is scary is that people just accept the AI summary as the definitive answer
They long ago learnt that if it's on the internet it must be true ... provided it supports their prejudices. The real fun starts when the AI gets "better" and tailors each answer to match the viewer's personal prejudices.
because apparently people like it
I don't think that 99% of people even noticed, and simply leave the defaults. So no, probably nobody actively likes it; people think this is the new "normal" and leave it at that. Heck, I see many people with the default desktop wallpaper! They don't seem to know that it can be changed.
In Mac FF you can just create a custom search in FF/settings/search with the following string
https://www.google.com/search?q=%s&udm=14
In Mac Safari you need an extension. I use one called CSE == customize search engine and use the same string above.
All works fine so far. I've not tried to do it with mobile but I don't do much browsing on my phone.
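If anyone wants to script it, the same udm=14 trick (which requests the plain "Web" results tab, skipping the AI Overview) is easy to build with the standard library. A sketch:

```python
from urllib.parse import urlencode

def web_only_search_url(query: str) -> str:
    """Build a Google search URL using udm=14, the 'Web' tab without AI Overview."""
    return "https://www.google.com/search?" + urlencode({"q": query, "udm": 14})
```

Handy for generating bookmarklets or a keyword search where the browser only lets you paste a literal URL template.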
Thing is, for quite a few open source libraries the AI summary is actually a lot better than whatever docs the devs have time to knock together. Not least because it also references open source projects that use the library and works out what they're doing with it.
Pretty much the only time I use an LLM but I couldn't do my job without some of the docs it created from practically nothing.
... where part of stage 1 was websites trying to suppress even the most trivial bits of content being spelled out in search results so that they can harvest more clicks. Like they also stopped linking to other websites just because they might lose visitors before they otherwise would be done clicking around. Both were already the beginning of the end of the web as we (well, some of us) once knew it...
It's funny that search engine AI is now attacking the state of the web after the ad-commercialisation which was the first great wave set out to kill web functionality and usability, i.e. the core of the web.
Of course, because this world is built on commerce and commerce alone, this attack in itself is just a next stage commercialisation attempt, another fun part of which is that its principle is to damage what it lives off. A classic parasitisation process.
Yeah, sounds like search is engaging in a classic short-sighted evolution towards optimal virulence by castrating parasites (metaphorically) that could turn it right into a living dead IT version of Chesapeake Bay zombie crabs ...
Gotta pick and choose 'em beans just right you know ... with "cocoa" we get the delicious aromas and flavors of chocolate, butt "cacao" yields naught but the stinkers of stage 2+ enshittification, iiuc!
needs to make like the Oozlum Bird ( https://en.wikipedia.org/wiki/Oozlum_bird )
AI is already flying in ever-decreasing circles as it consumes more and more AI-generated crap, treating it as valid input
"The oozlum bird, also spelled ouzelum, is a legendary creature found in Australian and British folk tales and legends. Some versions have it that, when startled, the bird will take off and fly around in ever-decreasing circles until it manages to fly up its own backside, disappearing completely, which adds to its rarity"
This is not as bad as it seems.
Web sites cost money per user access (bandwidth). Users who only need to see a summary are now not going further than Google Search. If they did, their access would be superficial, just to check something, with no benefit to the website. So Google is paying for the bandwidth use of superficial users, but those who want more are still going to the website. This may be one of the few benefits of 'AI'.
Superficial users will also not look at adverts. They want one piece of info. So advertisers will now only be paying for visits by users who dwell on the site.
In 2023, Canada enacted the Online News Act. If it sounds Orwellian, good. The government decreed that major search engines should pay the news sites that they link to.
To no one's surprise, except the Canadian government and the news sites themselves, Google wasn't interested in paying sites that it was giving publicity to; if it had to pay, it would just not link to them. So, they simply removed the links to those news sites.
Even less surprising, the smaller news outlets suddenly found their traffic drop by as much as 90%. It turns out demanding that big search engines pay for the right to link to you isn't really a compelling argument when there are literally millions of other news sites that they can just as easily link to for free.
> Google wasn't interested in paying sites that they were giving publicity to
That's an interesting take on what happened. This particular wording makes it seem like we should all be happy to merely "work for exposure" rather than for currency... that everyone should be delighted to be featured on google's front page because of the infinite benevolence of that company. Observe how well this feudal lord treats its peasants, it gives them 'exposure', free publicity. Publicity doesn't pay the bills, buckaroo!
The reason for that act was not about _linking_ to news sites. It was about google demanding that it could use content from those news sites on its own page, so that click-throughs to the news site didn't even happen. It was about google asking for free stuff.
_Another_ interpretation one could come up with is that google deliberately de-prioritized those sites to 'teach them a lesson'.
But now I'm being "mean", aren't I? Google would never pull dirty tricks like that. Their search results are totally, completely organic and objective and not influenced by humans _at all_.
The point here is not that news sites need google. The point here is that google is your new daddy, and you'll bloody well do what it says you'll do, or else...
What this tells you that there is too much power in the hands of a single entity. Canada should have gone further and blocked google wholesale.
And you've gone too far the other way. The law and the politicians who called for it, both the Canadian version and the Australian version which was almost the same, specifically said that they had to do something about the problems of linking and quoting. If they had just stopped at quoting, maybe they'd have had a point, but they called out linking from Google and large social media as a problem requiring compensation. They were surprised when the response to a law that was passed because linking is bad was that the people they spent months saying should stop linking or pay for doing so decided to stop linking.
The problem was stronger on the social media side, where people, including the social media companies themselves, had an incentive to copy the entire text of a news article and put it on that network so people stayed there instead of going to the paper's website. Sometimes, to try to make their act less obviously illegal, they'd only copy the first four paragraphs, thus removing important context without cutting it so short that anyone was motivated to leave. Obvious copyright violation though that is, I can see why someone might decide another law is needed to do something about it. It is, as you will have noticed, entirely separate and opposite to linking, which would be the good thing they could do instead, and that's what many search engines were already doing, quoting a couple, possibly cut off, sentences for context but requiring people go to the site to read the whole thing.
And yet, the laws were written so badly that the only way to interpret them was that linking was the problem. What the backers of the laws wanted was to take money from tech and give it to newspapers, and that's why they included linking, but the result is that the law effectively tells you that including large newspapers in search results is a bad thing and you shouldn't do it.
You're absolutely right. The legislation was framed so poorly that it treated linking, arguably the most constructive and least intrusive aspect of content sharing, as something that required control or compensation. Unsurprisingly, the result hasn't been improved funding for journalism or better copyright protection, but rather a significant reduction in access to diverse sources.
In my own case, the impact was immediate. YouTube on my Nvidia Shield effectively stopped showing international news content. What now dominates the feed is over 90% CBC, a state-funded broadcaster whose editorial tone often leans heavily into “The Government is great” and “Scandal? What scandal?” territory. That might be tolerable if it were one voice among many, but it's increasingly the only voice presented.
The legislation didn’t address the actual issue, full content copying on social platforms. Instead, it targeted linking, which is how most users discover and access journalism in the first place. The result has been a narrowing of public discourse and the emergence of a curated, one-sided news environment, which is the opposite of what a healthy democratic media landscape should look like. And the tinfoil hat wearing part of me suspects that was by design by the Canadian Government.
<...."It was about google asking for free stuff.".....>
Google didn't ask. They simply took, with neither a please nor a thank-you (hardly surprising, as I am pretty convinced that neither term exists in the vocabulary of anyone at Google... or indeed at most large tech companies).
I remember trying to figure out why all the news agencies were so mad that they were being linked to by all the big tech companies.
To me it sounded like they were getting free marketing, at least from what was being reported at the time.
I was thinking maybe they (Google, Facebook et al) were stealing and posting the content from the news sites, but that was not the case?
Maybe they were being charged for the links?
EDIT: OK, after reading some more posts, it was a blend of stealing content and linking. It appears the legislation should have left the linking alone and blocked the content theft, but instead it killed both.
sans-serif fonts are all the rage these days and for the life of me I still cannot figure out why. I've had to do multiple "cyber security" trainings (and those stupid phishing test emails) for work (both my employer and our customers) where they explain you have to be really careful when reading email addresses because the l, I and 1 may look very similar, and my immediate thought was: wouldn't have that problem if you chose a frigging serif font to display those email addresses. Makes reading easier too. But no, we're stuck with stupid sans-serif fonts because some numbnuts decided for us that this is what it should be. Can't change it either as far as I can tell (certainly not on the work lappy).
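For what it's worth, the lookalike problem those trainings describe is easy to demonstrate programmatically: collapse the confusable glyphs to one representative each and compare. A toy sketch - the confusable groups here are illustrative, nowhere near a full homoglyph table:

```python
# Map each easily-confused glyph to a single representative character.
# Illustrative only: real confusable data is far larger (Unicode UTS #39).
CONFUSABLE = str.maketrans({"l": "1", "I": "1", "|": "1", "O": "0", "o": "0"})

def skeleton(address: str) -> str:
    """Collapse confusable glyphs, then lowercase, to get a comparison key."""
    return address.translate(CONFUSABLE).lower()

def looks_alike(a: str, b: str) -> bool:
    """True if the two strings could be mistaken for each other in a sans-serif font."""
    return skeleton(a) == skeleton(b)
```

Which is exactly why a serif font (or just a font with distinct l/I/1 glyphs) removes most of the problem at the display layer.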
Research (no citation, sorry) has yielded that serif fonts are easier to read on paper, and sans-serif fonts are easier to read on a screen.
And then of course there is fashion. Nobody in 2025 would take a website using "Times New Roman" as its typeface seriously (unless they use it for historical reasons).
Sans fonts were easier to read on a screen AT LOW RESOLUTION. And that is a very important distinction. With modern 1080p and higher screens, the advantage of sans fonts disappears and serif fonts become easier to read again, even on screens.
"Nobody in 2025 would take a website using "Times New Roman" as its typeface seriously (unless they use it for historical reasons)."
Maybe I'm weird but I see no reason why I shouldn't take it seriously. Times New Roman is still a perfectly fine font.
Nobody in 2025 would take a website using "Times New Roman" as its typeface seriously
Explain, please?
On my admittedly text-heavy site, I generally just use the default font [0] (example here). The default serif font varies among browsers and OSes, but Times New Roman appears to be a common choice.
I suppose there may be those who figure such sites haven't kept up with current trends, and they are arguably correct. But that's not the same as not taking the site seriously.
[0] Come to think of it, I made a rare exception on that page. I knew some users would find the whole thing daunting, so I wanted to show the words "Don't Panic!" in large, friendly letters. The font I chose seemed suitable for that purpose, at least to me.
If the point of AI is to overwhelm the internet and funnel everything through google it's working. Unfortunately, that turns the internet into a sales conference for influencers and "others". The internet is simply not useful for information any more. It has, in fact, been weaponized against the world by design.
>The internet is simply not useful for information any more.
The issue isn't that "The Internet" isn't useful any more, it's just that you now have to battle through layers of rubbish and misdirection unless you know a source and so can go directly to it.
Google is effectively killing itself: if people don't scan at least the first page of search results and visit a few likely-looking websites, then why pay to advertise on the Internet?
The simplest response is often the most effective. Simple actions have a fractal-like response. That is happening here. Even your old forums are now targeting and manipulating you. This is not the old internet. When you enter the internet today, you are entering an actual digitally produced hallucination. One that was "seeded" by entities or persons for a reason.
You are in a feedback loop. Where did this information come from? An influencer? A 15 year old? A North Korean? A Russian? Perhaps just some young man with too much testosterone who wants to start a ruckus for "fun". Where did this information come from that was so reliable?
Then your actions become the target, by very definition. You become targeted. This is even on your old reliable forums and such. The whole thing has collapsed already, and is literally feeding itself. It's eating its own falsities and hallucinations, then spitting them back at you, but with a goal. It has a purpose now: the purpose of the ones doing the targeting. Advertisers and influencers are actually Chinese and Iranian operatives in reality.
Even your job-seeking apps are now spying on and targeting you. This has a HUGE influence on society. An influence on what people think and feel. This "hallucination" is now people's reality. Lots of people, walking around living in fantasy worlds of their own creation.
I wonder if most web searches these days are for things in which bad results are somehow still acceptable to users ("How old is celebrity X?") or not.
I never use AI summaries because when I search the web for facts, I need facts ("What time does the Post Office close today?"), not might-or-might-not-be-facts.
Even the straight (non-AI) web isn't trustable.
The other day (Saturday) I did a straight web search on Post Office hours, and it was reported to me that the nearest branch closed at 14:00. I rushed around, did things, and got to the Post Office at 13:30. Adhesive plastic lettering applied to the inside of the glass door reported that they closed at 15:00 on Saturdays.
Indifferent, lazy, or overworked staff -- all too-often -- fail to update signage when open hours change.
Who/what ought one believe? The possibly out-of-date signage, or the possibly-out-of-date web page?
I poked my head through the door, asked the clerk, who had seen me examining the signage, and got A Look, along with the answer, "3 PM." I didn't bother to explain.
Google Query: "What time does the Post Office close today?"
AI Summary: The Post Office is a service run by the government and...40 paragraphs later...and is still often used by people today.
Ad 1: We have the best What time does the Post Office close today? you can get - sponsored link
Ad 2: Try our new What time does the Post Office close today? and get a free coupon
Ad 3: Top 10 things that What time does the Post Office close today?
Ad 4: See what What time does the Post Office close today? looks like today
The problem with your example is, it's completely false. And it's trivial to test and demonstrate that it's completely false.
Instead of sitting there making up bollocks, just try it yourself. Ask Google "What time does the Post Office close today?" I just did, and I got exactly the answers I would have needed if I actually gave a rat's arse.
I work at a law firm. I had a colleague who bills at $900+ per hour use Google AI to get an address for a regulator we needed to send mail to. Address was wrong. Fortunately I knew that and didn’t let her change it to the made up address, but it is infuriating anyway.
I asked another colleague who bills for almost that much how to cite a particular document in the stupidly complex and obscure citation style for US legal writing. The colleague said “let me ask ChatGPT” -_-
But just because it is bad for the website owners doesn't mean it is good for Google. Their main source of revenue is from those search referral fees that their AI is hurting. If they made a "perfect" AI that simply told you the answer how could they possibly monetize that as well? You aren't clicking on multiple sites and seeing a bunch of ads. You're just getting the answer. Maybe that answer includes a single link you click on - if you asked it what the best refrigerator to buy was given all your criteria and it gives you a link to the place where you can buy it at the lowest price.
But why should that site pay Google for the referral? If it was chosen by the AI because it best met the criteria, not because they paid Google to influence the AI's decision, the AI was just giving "best" answer and if it says an LG refrigerator best meets my criteria why should LG pay Google for telling me what amounts to a "fact"? (the AI would consider it such and if I trusted that AI I would also consider it such) Why should the site that sells it for the lowest price pay Google either? They just have the lowest price, another "fact". LG and the retailer would be getting that referral/sale simply because they made the best product for my criteria and sold it at the lowest price.
Seems to me either Google makes a lot less ad money from search in this new world, or they are charging companies to influence the answers the AI is giving - that I'm not getting the best refrigerator that meets all my criteria but I'm getting an LG because they were willing to pay a lot to Google to steer queries like mine in their direction as much as possible. Of course then I would no longer trust its answers...
The only way it works is if I pay Google to use their perfect AI. It is as if there was some uber expert on refrigerators who had memorized every company's lineup and where the best prices for each model could be found, so he could answer that same question for me. He wouldn't expect LG to pay him for sending me their way. He'd expect ME to pay him for his time/knowledge!
Seems to me you haven't read the article that coined the term "enshittification", because you just described it pretty fairly without using the word. It's worth a read.
"But just because it is bad for the website owners doesn't mean it is good for Google. Their main source of revenue is from those search referral fees that their AI is hurting."
Companies are working hard to invent the self-licking ice lolly. Companies aren't going to buy ad-words or premiums to have their sites featured on the first few pages of search results when they aren't seeing referral counts from the search engines. I stopped using Google ages ago as the search results were obviously paid spam. If they now are optimizing their site to get more clicks rather than click-throughs, there's no point to the business.
I also find the AI summaries to be useless and mentally just filter them out. And no, I'm not looking to buy imported tat from Amazon (there's the first 4 pages of search results I can skip).
Lusers like AI summaries. It has happened, it is here, there is no going back. The TechBros are not going to let go of that, they will learn to monetise it. "Let my AI summarise your marketing puff for you. Cheap at the price!"
Frankly, if the paid clickbait migrates to the AI shite and search engines return to delivering actual search results, I would be one happy bunny. Oh, what, you can't shove your shit in my face any more? My heart bleeds for you - not.
Google's AI summaries are just terrible!
Open Claude and ask the same question and get a 100 times more accurate answer!
Even Gemini, a Google product is better than the search AI summaries.
If I am looking for a work-related answer, I go straight to AI (Claude preferred) as it saves time digging through search results from Stack Overflow or (God forbid) Reddit!
You just have to know when the AI takes a wrong turn. It happens, and if you don't identify it early enough it goes totally bat shit!
As far as news or other types of searches (e.g. when is a store open) I just ignore the summaries.
If you know enough about the subject to be able to spot the errors, you probably don't need to ask the question in the first place.
If you don't know enough about the subject to be able to spot the errors, then the search tool isn't going to be much help to you, because you will be left wondering whether the answer it has given you is entirely correct or not.
I preferred it when we had search tools which would give you an answer (or at least links to an answer) which you could almost always rely on being correct.
"I preferred it when we had search tools which would give you an answer (or at least links to an answer) which you could almost always rely on being correct."
More importantly, a complete answer since many things can't be answered in one sentence if they are very important questions. Some supporting material is required so you know you have an answer you can use.
More importantly, a complete answer since many things can't be answered in one sentence if they are very important questions. Some supporting material is required so you know you have an answer you can use.
Of course, you will have to recognize that the "supporting material" is not "spinning a yam".
Yes, stage 1 enshittification has happened. But for the time being at least the AIs give back to ordinary people the search that they want/need. In complex queries they may invent answers, but in general they provide people with the information they seek. Whereas traditional Google search does nothing of the sort. At best it refers users to relevant web sites, which may or may not contain the required information. More often it will refer us to an irrelevant website, or worse a f****ing YouTube video which has to be waded through to discover whether it has anything of value in there or not - mostly not, but that's several minutes of your life wasted, and you still have to go through it again and again till you find a half decent one. With luck the AI will just give you the damned answer.
> With luck the AI will just give you the damned answer.
In my experience, it's certainly not a correct answer. I search using google daily. I have never seen an AI answer that was correct.
I've seen about a dozen where I was surprised and I THOUGHT were correct, but I did more digging, and they weren't.
Never seen an AI answer that was correct? Really? It's not always perfect, but it's often enough or at least a good start for my kind of query.
I totally prefer AI answers over a list of promoted sites and optimised ad bucket sites. Google didn't put AI at the top because it wants to give you AI answers to demonstrate how wonderful its AI is or something; it did it to stop a migration to alternate AI answer sites. It's a loss of revenue, but that's better than a total loss of relevance. Google wants to remain the default search engine.
There are plenty of problems with AI, and we are in the early days when everything changes every other week, but this is better for me at least.
Try searching for a few things in an area where you have a great deal of expertise.
You'll find that the AI gives subtly to very wrong answers most of the time.
Once you see that in your areas of deep knowledge, consider how you'd tell the accuracy when you do not already know the answer.
This was my thought. I do wonder how many of the downvoters are techie commentards (which is what you'd expect here, of course) using AI queries for fairly obscure stuff that probability-based searches, either Old Skool search or AI, would not find. But "How do I get this damned OXO pepper grinder open to refill it?" type queries work well, mostly, whereas Google search will sideline you to a YouTube video about how to use the thing (you twist the top; the video takes 3 minutes and two adverts to say this) which doesn't tell you anything about refilling it after you've sat through all the obvious stuff that the talking face has been spouting.
When I do searches, I don't know the answer. A wrong answer is not any better than a useless answer. In the case of your pepper grinder example, I don't expect the AI to give me a correct answer tailored to the model of grinder I have, but instead what tends to work on most grinders. Twisting the top would have been my guess to open it, and then I expect there is a hole where I put in the new peppercorns. If that's all the AI summary says, it is no better than my guess. So just to check, I put the prompt "How do I open and refill an OXO pepper grinder?" into some LLMs, and that's what I got, paragraphs to tell me to open it and put the peppercorns into the hole, which is located either on the top, bottom, or side.
Meanwhile, if I ask a question about something more specific, the AI summary is more often wrong than useless. Either way, the answer is so frequently wrong that, even when it is right, I can't count on that. Meaning I have to look at the results to try to figure that out, which is what I was going to do anyway. Maybe you're getting better results, although the people I know who trust LLM results have often found that following those instructions has given them results they don't appreciate.
@Gene Cash
"I've seen about a dozen where I was surprised and I THOUGHT were correct, but I did more digging, and they weren't."
IMHO, those are the worst: when not obviously wrong, they're likely to hoodwink the casual user in a hurry. "No problem if there's the odd error," some say; well, it is a problem if someone was looking up an aspect of legislation.
I know there's all sorts of caveats about AI maybe not being correct (covers the back of Google or whoever), but we also know plenty of people will assume it's correct despite that "here be dragons"* warning.
* though sadly a far less scary warning is given.
In the words of the great Sherman Potter, Horse Hockey!
If you form your query properly you will get a proper answer. In a lot of cases the answer will be correct but may have a misspelled command (an 's' on the end where it should not be).
In my experience, the LLMs are about 80% accurate. You just have to realize the things they are completely clueless about as "they" won't tell you that, they will just make shit up!
If you're warning us that they can always make the AI answer worse, granted. However, it's already annoying for most of us because the answers are often wrong or useless. For example, I recently deliberately ran an AI search through Google and Perplexity because I wanted to know how a certain technology works, but all normal search engines sent me to pages that had no interest in the technical details. What I really needed was to go to the effort of finding the patent for the technology or some introductory material describing the mechanism and get a summary of that, but that wasn't very easy to find in search results (including the word "patent" in the search query just found a lot of pages telling me that the mechanism concerned was in fact patented).
Seems like something that search-equipped AI agents should be able to do. They can find which pages have that information and distill the information for me, right? And I expect that they may get the information wrong, but at least Perplexity cites sources, real ones that actually exist, right? So I run the query and the LLM happily tells me that the mechanism works "electromechanically". Which is true. Yay, the LLM didn't make anything up. Of course, that's hardly any more detail than saying the thing was built out of atoms of some elements mixed together, or in other words, the AI search result was even more useless than the search results I had before, but it was pretending not to be.
In other cases, the answer is simply wrong, but people who assume that, since it looks plausible, it is probably correct end up acting on it anyway. Ideally, you get the correct answer to your query, but you're probably not getting it as often as you think.
Yes, the AI summaries often contradict themselves. This may be a good thing, as it reveals how little intelligence there is behind the scenes. But it may also be a bad thing when people lacking in intelligence use these AI summaries. Examples I've seen include things such as claiming that half of 1.8 is 0.7.
What you hit here is an area that the AI has absolutely no detailed technical knowledge of. IT could not find exactly what YOU could not find!
The fact that many research documents and other such data are behind paywalls on the internet is probably the reason the AI had not ingested the information you needed.
You wanted the technical details of a patented product; I doubt the patent holder would be pleased if you were able to get that info via web search or AI.
Perhaps you are unfamiliar with the concept of a patent, which means there is a public document out there describing how the thing works, in a public database. That's what you have to do to get a patent. I think the company concerned probably didn't want that to be easy or they would have put their patent numbers on their website like others do, but it's still out there and available to me. Eventually, I can find that patent and read it to learn what makes their thing different from others. I was hoping to avoid having to do that manual search by using a tool which is supposed to be better at searching than I am and was probably trained on a bunch of patents including this one (from 2016) since they are public documents. It wasn't, hence why I had to do that search, which took a while, but now I have the details I was looking for.
And guess where I got that patent file eventually? I tried going to a few patent offices first, because they store the databases. Their search systems aren't very good, though. That means I ended up finding the patent number in question at...drum roll please... Google patents. You think they might have trained their own AI on that? The point being that this is yet another case where AI will try to answer a question it has no ability to answer and prove to be useless.
Why would anyone want or need to use Google search?
It's a long time since it offered significantly better search than other sources.
The fact that loads of people use Google out of ignorance or unthinking habit doesn't make using it somehow 'okay'.
Shit is shit, with and without added glitter - why eat it?
BTW, I stopped using Google years ago. It doesn't miss me, and I definitely don't miss it.
While we are on the subject... recently, early in the morning (3am), I needed something to eat in an area I haven't visited in a while.
So I searched Google Maps and I clicked the Restaurants button and specified "Open now" as a filter.
It showed a lot of convenience stores, gas stations, and hotels in the RESULTS LIST (not restaurants) but only one actual restaurant. I was like "wait, I KNOW there are Arby's, McDonald's, Wendy's, IHOP, Denny's, and Waffle House here at the very least, where the hell are they?"
It wasn't until I did an extreme zoom, where the scale bar was 200ft, that I could see them. This is the level of zoom that pretty much shows the restaurant building and not much around it, so it's not very useful. When I zoomed back out to normal, they disappeared.
And yeah, I tried again after clearing my cookies & site info to make sure I wasn't logged in getting "personalized" results. And used the "Open 24hrs" filter as well. Same result.
This enshittification has gone so far that some search engines are active enemies of web site owners. (Some search engines are better than others, so they are worth identifying.) In other words, you don't want them; they give negative value.
Suggested actions by web site owners.
0. Find other ways to get visitors.
1. Evaluate search engines; identify the friends and foes. This article helps. Promote those that are good citizens if you choose.
2. Identify visiting ML-engine scrapers (may be the same as the search crawlers): user agent, IP, redirects, DNS lookup, automatic IP updating... you decide how much.
3. For hostile scrapers, deliver ML/LLM-poisoning content, or a blank page if you choose, and/or robot exclusion instructions. (Hand-written, algorithmic, gradually changing, ...)
4. Manage the system.
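Step 2 above can be sketched in a few lines. This is a minimal illustration only: the crawler names below are examples of commonly reported AI-crawler User-Agents, and the function names are mine, not from any particular server framework. As noted elsewhere in this thread, many scrapers spoof browser User-Agents entirely, so a real deployment would also need IP/ASN checks and reverse DNS verification.

```python
# Sketch: flag likely AI-crawler requests by User-Agent substring.
# The markers below are examples of commonly reported crawler UAs;
# a real deployment would keep the list up to date and cross-check IPs,
# since many scrapers spoof ordinary browser UAs.

AI_CRAWLER_MARKERS = [
    "GPTBot",           # OpenAI's crawler
    "CCBot",            # Common Crawl
    "Google-Extended",  # Google's AI-training opt-out token
    "anthropic-ai",
    "Bytespider",
]

def looks_like_ai_crawler(user_agent: str) -> bool:
    """Return True if the User-Agent contains a known AI-crawler marker."""
    ua = user_agent.lower()
    return any(marker.lower() in ua for marker in AI_CRAWLER_MARKERS)

def choose_response(user_agent: str) -> str:
    """Step 3: decide what to serve (here just 'blocked' vs 'content')."""
    return "blocked" if looks_like_ai_crawler(user_agent) else "content"
```

Matching on substrings rather than whole UA strings is deliberate: crawler UAs carry version numbers and URLs that change, while the product token tends to stay stable.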
Clicks are down, but conversion rates are significantly higher. I don’t recall the figures from the presentation, but it was significant. The discussion quickly turned to how search engines can monetize that and deeper integration into websites and commissioned sales, and that’s far more interesting/concerning.
Google (and Bing) are going to try and capture a percentage of sales driven by AI to offset the losses as traditional keyword advertising spots dry up. Search engine rankings/AI inclusion is going to be predicated on embedding complex, granular tracking tied directly to financial data.
Investors are already searching for companies that are able to help that happen. If you know anyone in the sales tracking sphere, it’s probably worth keeping an eye on them. A bunch of money is coming that way.
A few days ago one of my tech buddies told me to add '-ai' to my search queries to prevent getting 'that shit at the top of the search results', and it appears to work. I know next to bog-all about search engines, so I don't know if this just doesn't show AI results, or prevents the engine from going down the AI path. Any search geeks know the answer to this?
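For what it's worth, the '-ai' trick circulating among users can be combined with another reportedly working one: the `udm=14` URL parameter behind Google's "Web" results tab. Neither is documented by Google, so either could stop working at any time; the sketch below just builds such a URL and makes no promises.

```python
from urllib.parse import urlencode

def web_only_search_url(query: str) -> str:
    """Build a Google search URL that reportedly skips the AI Overview.

    Two undocumented tricks circulating among users:
      * append "-ai" (an ordinary negative search term) to the query,
        which appears to suppress the AI Overview box;
      * add udm=14, the parameter behind Google's "Web" results tab,
        which returns plain links with no AI panel.
    Both could stop working whenever Google changes its behaviour.
    """
    return "https://www.google.com/search?" + urlencode(
        {"q": query + " -ai", "udm": "14"}
    )
```

Both tricks are belt and braces: the negative term changes what the query means (slightly), while `udm=14` changes which results surface serves it.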
Things change as technology develops. Web sites will adapt to sell their stuff or their message. The real danger has existed for some time now and it is the filtering by a small number of companies. Now we will get narrative bias and selection as well as who gets to the top of the search ladder. We don't need more control of the Internet, we need control removed.
This post has been deleted by its author
"A Man and his Wife had the good fortune to possess a Goose which laid a Golden Egg every day. Lucky though they were, they soon began to think they were not getting rich fast enough, and, imagining the bird must be made of gold inside, they decided to kill it in order to secure the whole store of precious metal at once."
Google search used to be my go-to for info (drivers, specs, repair manuals) about computer gear. No more. Even before it showed its AI chops, Google searches returned not the info I wanted, but instead directed me to the home page to sell me something. I now use DuckDuckGo in Firefox to find the info I really want. This is a real pain-in-the-ass, Google!
Google has long since ceased to be a search engine for specific queries. I have initiated searches on the products of a specific manufacturer, and found that it failed to find items from that specific manufacturer in the first two pages of returns, but found many items supplied by manufacturers in the same or peripheral fields. I have, for effect, at times just put in a query using the manufacturer's name, and had it come up well down the listing.
I am increasingly turning to ChatGPT to provide results unbiased by sales push.
Google's AI overviews are, in my experience, almost invariably nonsense.
Small example: I searched for "next Cumbria SRP playing day" (SRP is the Society of Recorder Players). AI Overview told me that there was one on a date, which, had it existed, would have been six months earlier, taking place in a church which does not exist in a town where Cumbria SRP does not meet and followed by a concert given by musicians who do not exist. All in quite plausible English, but that's the most you could say for it.
So, Google AI results have decreased the footfall on the very sites of those who pay Google to INCREASE that footfall? That is a fucking epic example of shooting yourself in the genitals <ROTFPMS>. Well done Google: you created a market and now you are destroying that same market. So what are your AIs gonna scrape for info when everyone blocks your scrapers because you are costing them money and giving nothing in return? The world's biggest Freetards, hoist by their own idiocy.
I'd be interested how this affects companies such as SkyScanner who market the search data - click-throughs, IP addresses/locations, conversions to actual bookings. How much of their direct traffic has been taken away?
Of course, there may be another non-IT related reason why US Tourism and Travel searches are coming down...
People have good reason to avoid clicking through to sites in the search results. Said reason being that most of the results are garbage. These days most results are to sites specifically tailored to hoover up as much traffic as possible whilst having no real value to the reader. They exist to push affiliate links or ads at naive users and nothing more. Commercialising the web has destroyed it because every grifter sees it as easy money and they don't give a shit that they're destroying the commons.
Google needs to grow, and that pressure eventually leads it to try and replace the one thing it feeds on - the rest of the internet.
It's been pointed out repeatedly that Google has a policy of trying to prevent people from clicking out of search and into results, but this is then denied by the company and its workers. They insist that search results are better than ever (subtly not the same thing), and remain wilfully oblivious to the serpent eating its own tail.
This is ultimately unsustainable. People can't make creativity pay, but the companies that dominate the space can't make creativity 'cheap' enough to sustain their business models. In the short term, this is starving us of quality content, product innovation and new entertainment. In the long term - who knows? We'd hope something new, but whilst creators remain passive and dependent on the big players, all we can expect is continued decline.