Re: if they are using tools to do the job anyway, then the job is being done
Provided the information supplied is accurate. At present we can 'triangulate' our results against other sources, but once those other sources are submerged in a soup of dubious ones, there is no concrete reference left.
Let's take genealogy as an example. I can go onto a website and get census results for a particular family in a particular house in 1901. This information is primary data, in a way. It was collected by an 'official' visitor to the house in 1901, who recorded it. The person giving that information may not have been in a position to give it coherently: they may not have understood the question, they may not have known the answer, they may not have been able to read what the census official wrote down, and they may even have evaded giving the correct information. But it is up to us to evaluate its likelihood of accuracy and to delve deeper in other ways. Then, over 100 years later, that information has been transcribed, with further errors made in the transcription: ages miscalculated, names misspelt. In most instances at present it is possible to go in and look at the original scanned record and try to interpret it, to see whether it has been properly transcribed. Now imagine all of the original scans of those records being discarded. That data is lost forever, unless a researcher has microfilms of the originals and has recorded and uploaded them somewhere for others to find.
Now a lot of fair-weather genealogists would accept the results they get from a quasi-primary source such as this as gospel, and continue to build their family tree on inaccurate information. More diligent researchers will 'triangulate' that data against birth records (which can be inaccurate), baptismal records (which can be inaccurate), marriage records (which can be inaccurate), death records (ditto), gravestone markings (ditto)... you get the picture, I hope.
The point is that the 'primary data' amongst that lot is vital to keep on the surface somewhere, in libraries, churches, etc., in its original recorded form. It occupies space and is relatively difficult to access, but researchers need it so they can re-evaluate as new facts come to light. I suspect, however, that it will eventually all be destroyed because "why do we need it?" When that happens we will have closed the trap-door of history down upon ourselves and will be reliant on nebulous inferences of it instead. Those nebulous inferences are likely to be extremely voluminous to store and of questionable veracity.
People do say that this type of search is not what AI is about, and that it should not be used to produce inferences of this sort, but does the tool tell you that when you run such a search? Does the person using these tools know how to use them? There is a very real danger that they don't. What happened to the expression GIGO (Garbage In, Garbage Out)? I don't think I've heard it uttered of late. Time to revisit it, I feel.
===
There is the efficiency angle too, which nobody considers because it is hidden away in a datacenter somewhere, out of sight, out of mind. The fact is that datacenters consume heavy resources, and that is bad for the earth.