
s/for better or//
AI search is not search.
Where did I put the datasheet for this part?
I don't know, here's an imaginary datasheet for a part that doesn't exist! Hope that helps!
Ask Google's Bard chatbot about the future of search and you'll get a summary of trends that suggest there's more to search than finding keywords in an index of documents. It will mention the advantages of conversational and multimodal input, of personalization and its role in prediction, and of integrations with other services …
'Hersh told us, "It is often important when we search to know the source of the information, and what backs up what is claimed in a source […]"'
Correction. Given the amount of total nonsense on the web, it's always essential to know and verify the competence of all sources of information found via search -- and all the more if it's summarised by some dumb Markov chain mangler.
This might be a revolutionary idea, but how about a search engine that doesn't try to second-guess what the searcher wants but just runs the search, respecting any logical operators such as "and", "not", etc.? One that just returns valid results, including nothing when nothing is the correct answer? And doesn't push sponsored results? And isn't easily gamed? One that doesn't prioritise hotels and estate agents' sites when asked about a geographical location?
To be called search, it should also be able to tell you exactly how many matches there are. With AI search, that's not possible.
Is that a complete list of everything matching my query? Nope! Would you like some more?
Can you give me a complete list or at least tell me how many there are? Nope! But I could guess!
Are you any use at all? ... How about some Toast! Does anyone want Toast?
There was one in the early 90s -- Infoseek. It accepted strict Boolean queries and responded with exact matches. The big problem was that most folks didn't understand how to construct such queries, so the "free-form query" approach, requiring the engine to subjectively interpret what was submitted, took over once web search acquired a mass market. There was of course nothing to stop subsequent search engines (Gooooooooooogle included) from also accepting strictly Boolean queries (indeed Gooooooooooooogle used to), except that it inhibited the artificial promotion of results profitable to the engine provider (which is probably why Goooooooooooooogle progressively started to ignore the Boolean and other conditioning flags it provided).
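None of this is rocket science, either. Strict Boolean matching over an inverted index is a few lines of set algebra; here's a toy sketch in Python (the corpus and names are mine, purely illustrative, nothing like any real engine's internals):

    from collections import defaultdict

    # Toy corpus: doc id -> text (illustrative only).
    docs = {
        1: "datasheet for the widget controller",
        2: "hotel listings and estate agents in town",
        3: "widget controller errata and datasheet revisions",
    }

    # Inverted index: term -> set of doc ids containing it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)

    def search(must=(), must_not=()):
        """AND together all 'must' terms, then subtract any 'must_not' terms."""
        results = set(docs)
        for term in must:
            results &= index.get(term, set())  # logical AND
        for term in must_not:
            results -= index.get(term, set())  # logical NOT
        return results

    hits = search(must=["datasheet", "widget"], must_not=["hotel"])
    print(f"{len(hits)} exact matches: {sorted(hits)}")  # an exact count, possibly zero

An exact match count -- including zero -- falls straight out of the set operations. A generative model has no equivalent object to count.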
Ultimately, it's all about revenue, not service to users.
Then the revenue model needs to change.
There is an inherent conflict of interest between Search and advertising.
Apparently there are some subscription-driven search engines (I forget the name, one was mentioned here on the reg earlier) but I don't know how good they are.
One annoyance is that a lot of sites have anti-scraping measures in place, which are configured to grant exceptions to Google and Bing only.
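You can see the polite version of those exceptions in many a robots.txt. A hypothetical example (and robots.txt is only advisory -- the real blocking is usually user-agent and IP filtering on top):

    # Hypothetical robots.txt: welcome the incumbents, shut out everyone else.
    # An empty Disallow permits everything; "Disallow: /" forbids everything.
    User-agent: Googlebot
    Disallow:

    User-agent: Bingbot
    Disallow:

    User-agent: *
    Disallow: /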
I agree with that... I fondly remember the time when Google actually searched for what I told it to, not for what it thought I wanted. This worked really well for a few years, and if you knew how to use all those operators you could find almost anything (then again, this was in many respects a much simpler world than today's :-/ ).
Having said that, there are quite a few searches I've done over the years which would have benefited from a dose of AI. A year or so ago I faced a knotty problem with an older Excel version (but only with that version) and no amount of googling would deliver a page with a solution. After a frustrating 20 minutes I turned to ChatGPT and had my answer within two minutes. So there's a place for AI-assisted searches.
> how a page missing from Google search
I didn't say that the page (or pages) ChatGPT based its answer on was missing from Google.
I just said that after 20 minutes or so of googling I had not found an answer to my very specific problem... the reason being that Google spewed out many dozens of pages dealing with Excel and the problem at hand... but all (or at least all pages I checked) were for much newer versions than the one I had to deal with (MiL with an old Windows XP PC, you get the idea :-/).
ChatGPT simply was much better at "filtering" all those pages and concentrating on the ancient Excel version I was fighting with. HTH.
Well, I know two, one being me, myself and I, and the other being my better half. Each of us uses the bots (if and when), mostly Copilot, to get some specific answers (on unrelated subjects, btw). And yes, we're more than aware of the hallucination thing ;)
When used as a tool to find very specific information, it can at times rip through vast amounts of information. I have more than once looked for very technical details that are hard to find even when reading dozens of domain-specific papers published in high-citation journals. By asking questions smartly, interpreting the results, reformulating the questions a number of times and, most of all, using that information to define optimal keywords for conventional search engines, it is often possible to find the information I seek in hours rather than weeks, or find it at all.
Perhaps groping randomly through all the rubbish on the current web isn't the best - or even a good - way to find information.
Early on, there were a number of excellent sites that provided curated lists of quality content. While that model doesn't solve all the use cases for search, it did do a much better job of some of them.
As the article says, AI "search" costs 10x as much as conventional search, and AI services are losing money, so it's going to come down to whoever has the deepest pockets to subsidise it until one of two things happens:
So I reckon Google, having the most cash to burn, will stay on top.
This is the point. Search engines were built to get you to pages written by people, with the most-linked-to pages given search result priority, on the basis that those were the pages found to be the best source of information on that topic. Such pages also tend to lead you on to other interesting stuff related to whatever you were interested in.
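The link-counting idea is simple enough to sketch (a toy in Python, in the spirit of the thing, nothing like Google's actual ranking):

    # Toy link-based ranking: pages with more inbound links rank higher.
    links = {  # page -> pages it links to (a hypothetical mini-web)
        "blog":  ["wiki", "forum"],
        "forum": ["wiki"],
        "news":  ["wiki", "forum"],
    }

    # Count inbound links per page.
    inbound = {}
    for src, targets in links.items():
        for target in targets:
            inbound[target] = inbound.get(target, 0) + 1

    # Most-linked-to first: here "wiki" (3 inbound links) beats "forum" (2).
    for page, score in sorted(inbound.items(), key=lambda kv: -kv[1]):
        print(page, score)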
"AI search" on the other hand is an unwanted attempt to supersede all that by generating content from many sources on a topic, dishing out results that read like Wikipedia on valium, and lacks all the personality, quirkiness and random other stuff that you might have got from visiting the pages it has digested and blandified. Either Google will have to have an "old skool" version of itself alongside its LLM based new version, or somebody else will create one and quite possibly eat their lunch when it's realised which is more useful.
In my opinion, the average internet user often glides through the labyrinthine legal language of terms and conditions for online services, routinely clicking 'agree' without fully comprehending the implications. Their main objective? Swiftly creating accounts on platforms like Facebook, YouTube, Twitter, and others. This rush often sidelines any inclination towards fact-checking or scrutinizing the credibility of the information they encounter.
Some argue that search engines like Google provide better transparency as users can fact-check and evaluate the reliability of sources. However, the reality is that most users prioritize convenience over meticulous source verification. This preference for ease of use is further underscored by news journalists who occasionally rely on anonymous sources for their stories, leaving readers in the dark about the authenticity of the information presented.
In light of these observations, it's apparent that the average internet user prioritizes accessibility and speed over rigorous source verification. As we navigate this landscape, it's crucial to consider how artificial intelligence (AI) is reshaping the way we access and consume information online. AI-driven platforms have the potential to streamline the information retrieval process, offering valuable insights and opinions while simplifying the user experience. This shift towards AI-driven solutions prompts us to reconsider traditional notions of source verification and highlights the evolving dynamics of internet usage in the digital age.
Furthermore, it's natural for people to seek opinions and advice from others, whether it's friends, family members, acquaintances, or colleagues. In face-to-face conversations or over the phone, we engage in discussions, listen to different viewpoints, and make judgments about the information we receive. However, traditional online search engines don't replicate this interactive experience. Instead, they present static webpages of information that users can't engage with in the same way as they would with another person.
AI changes this dynamic by allowing users to interact with the technology in a more conversational manner. With AI, users can communicate, ask for clarification, seek additional information, challenge viewpoints, or explore alternative perspectives. This capability transforms the way we engage with information online, providing a more dynamic and personalized experience that aligns with natural human communication patterns.
"With AI, users can communicate, ask for clarification, seek additional information, challenge viewpoints, or explore alternative perspectives."
I suppose so, but since there's absolutely no way to know whether the information an LLM gives you is useful or just randomly generated nonsense, the whole thing seems completely useless.
"I suppose so, but since there's absolutely no way to know whether the information an LLM gives you is useful or just randomly generated nonsense, the whole thing seems completely useless."
And LLMs are almost as useless if you have to put in the quite considerable effort to verify, or even identify, the sources of both the data used in the reply and the training data -- you might as well have done it yourself.
But I think that sums up LLM: A diversion for the easily impressed, those with short attention spans, and those who really ought to do a lot more work before making any decisions (all of which curiously enough sum up both investors and politicians).
"I suppose so, but since there's absolutely no way to know whether the information an LLM gives you is useful or just randomly generated nonsense, the whole thing seems completely useless."
Really? There is a very simple way to verify whether the information an LLM provides is useful: enter the keywords it gives you into a conventional search engine. Simple as that.
If you use the tool as a standalone tool, it stinks. Half of the information is junk, over a third is useful, and once you learn to ask good questions, a tenth of its results are plain brilliant. To sort that out, you still need the skill, and the effort, to validate the results. Checking its claims by tracking down the actual sources in conventional search engines is part of that.
I can easily see many people saying this defeats the point of using an LLM for search. For me it doesn't. It can provide very good hints about what to search for, leading you to clues and answers you didn't have in mind when starting the search. In a sense, conventional search engines mainly provide the answers you ask for; they are designed to do that. That gives you no results if there are none, or faulty ones if many sites share the same wrong presumptions as you do.
Summarised: if you don't know well enough what defines the answer you are looking for, conventional search is tedious and often of limited use. An LLM can be used to explore the topic and help define the keywords describing the answer you want, if you are willing to put in the effort of validating results and refining the questions.
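In pseudo-Python, the loop I'm describing looks roughly like this (ask_llm is a hypothetical stand-in for whatever chatbot you use, and the terms in the stub are made up):

    def ask_llm(question: str) -> str:
        # Hypothetical stand-in: in practice you paste the question into
        # ChatGPT/Copilot and read the answer yourself.
        return "Candidate terms: Jet Database Engine, XLS BIFF format, ODBC driver"

    def refine(topic: str, rounds: int = 3) -> list:
        """Ask, harvest candidate terms, reformulate, repeat."""
        question = "What technical terms describe: " + topic + "?"
        candidates = []
        for _ in range(rounds):
            answer = ask_llm(question)
            candidates += [t.strip() for t in answer.split(":")[-1].split(",")]
            question = "More specific terms than " + ", ".join(candidates) + "?"
        return candidates

    # Each candidate term then goes into a conventional engine, where the
    # source pages -- not the LLM's prose -- are what you actually verify.
    # dict.fromkeys() just de-duplicates while keeping order.
    for term in dict.fromkeys(refine("old Excel version refuses to open workbook")):
        print("https://duckduckgo.com/?q=" + term.replace(" ", "+"))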
"Too much balanced dense argument here. "
There isn't much by way of argument. It is suitably vague and nondescript, so it is generated by AI. Then there is the use of words such as "labyrinthine", one that I find frequently crops up in AI-generated student essays.
"The question is how long can they <Microsoft> keep burning money like this?"
MSFT net cash from operating activities was $87bn for the year ending 30 June 23, and that had trebled compared to the figures for a decade ago. So their ability to fritter cash is totally unconstrained by losing $80 a month per user in one smallish area of activity, and what's more the Copilot losses are already included in that $87 billion (which works out near enough exactly at $10m cash generated every single hour). So this time tomorrow + one hour, even with Copilot losses MS will have generated an additional quarter of a billion dollars of cash compared to the pile they have now.
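(For the arithmetic: $87bn spread over the 8,760 hours in a year is $9.93m an hour, and 25 hours of that is $248m -- near enough the quarter of a billion above.)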
I'd hazard a guess that the $80/user loss will soon change anyway, (a) when MS work out how to exploit customer data for their own benefit, and (b) when they start to force corporations to pay for Crapilot through increases in Office licence fees. It's not like their core enterprise customer base is going to walk away and adopt Wordperfect, Google Docspace, or LibreOffice.
"It will mention the advantages of conversational and multimodal input, of personalization and its role in prediction, and of integrations with other services. It will even touch on ethical considerations like privacy, bias, inaccuracy, and disinformation."
No. Just no.
Just fucking tell me what my search terms pitched up in your indexes.
I'll do the rest.
-A.
Every single commenter on this article has a point, as does the author of the article itself.
Yes: trying to be too clever with the search terms, showing irrelevant or just plain wrong answers, and the fact that AI search isn't actually web search at all.
These things make it easy to get answers to quick, atomic questions like Kim Kardashian's hip measurements, which are what millions of searches boil down to. But all of these bells and whistles have also ended up making it almost impossible to actually find things online any more.
Thinking about all of this, we have been developing a filter-based search engine called El Toco. By giving you a better set of filters, all of the personal-data and AI-related stuff that people find annoying can be dropped. Because instead of guesstimating what people want, users have an interface through which they can express it for themselves.
We launched in October last year. The project has been steadfastly ignored by venture capitalists because, at the time we were fundraising, they were all into crypto. Now they're all into AI, which we've also downplayed. The result is that we had to niche down and focus on medical and scientific equipment, which is fine but not the original aim of the product.
The point of my comment is to confirm that some people in the world *are* trying to solve these problems with web search.
It's just quite a contrarian viewpoint, so has proven difficult to get people's attention.
(Also, we have a minuscule marketing budget, so commenting on articles like this is the main sort of publicity we can afford right now.)
I'm blogging about the experience on LinkedIn, so feel free to check that out for some light reading on what it's been like creating a search engine behind the scenes. I'll be getting to the tech bits quite soon, which will probably be a laugh for people who do those things professionally.
Sounds interesting.
<tries it>
I'm looking for a microcontroller that can do h.264 video encoding, for a dashcam-esque product.
'microcontroller h.264 encoder'
Disappointing (but impressively short) list of things that had grabbed 'encoder' and ignored the rest.
Fair enough, I'm outside medical and scientific with that search, and the honesty with which it returned almost nothing was impressive.
'IR absorption spectrum of human blood' came out a bit better. Less manual rummaging through results than Google, and one useful result. I'll take that.
The search results look Googly, though; are you being a veneer over G? Won't they get grumpy after a while?
Good luck - the world needs good search.
Thank you! In the words of Han Solo, we're gonna need it.
I'm managing the crawl so can tell you for a fact we've not collected websites on microcontrollers yet. But that's exactly the sort of random thing we're indexing and all of those sorts of electronic components are on our radar.
So if by veneer you mean where are we getting the data from: no connection with the big boys, we're doing the indexing ourselves. We choose which websites to crawl at the moment to ensure there's no SEO spam in the search results.
If you mean isn't our product still quite similar to theirs, in the sense that we're still fetching a list of links, yes. We did try to make it look visually distinct, so I'm going to raise that point when we do our next user interface update!
You do not have a say, and you never have. Welcome to the world you denied for so long. You are a slave to the machine now, and it's going to get worse... much, much worse. Nothing on the interwebz is real. Not the media (news), not the information being fed to you, certainly not any social. You need to turn off your internet, people. It's not real, but it is controlling you.
Search doesn't need improving because search technology is/was bad. Search is bad because the results are bastardised to promote adverts and paid for listings ahead of the actual information you are looking for. Introducing AI into the mix isn't going to (and isn't intended to) improve search for anyone except the ad slingers and the data slurpers.
AI might be the shiny-shit du jour but it's just the fashionable glitter on the turd that is search at the moment.
I commonly put URLs in my online postings (at least on forums where they are allowed). Perhaps I make some comment about them as well. What if people read my comment without following the link? Does that mean the site loses out on the traffic? Should I leave out my comment, and make people follow the link to find out what it’s all about?