> "principal components analysis" and similar multi-dimensional pattern analysis techniques were things I was being taught about at university in the mid '90s
Absolutely. PCA is a good example of
>> alongside all the other data reduction methods
and is still holding up its end of the job after, um, nearly one and a quarter centuries, depending upon which named variant you decide to go with.
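(For anyone who hasn't met it, a minimal sketch of PCA doing that data-reduction job, assuming scikit-learn; the toy data below is invented purely for illustration.)

```python
# PCA as data reduction: 10 measured dimensions that are really driven
# by 2 underlying factors, recovered from the data alone.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = (rng.normal(size=(200, 2)) @ rng.normal(size=(2, 10))
     + 0.05 * rng.normal(size=(200, 10)))

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)        # shape (200, 2): the useful part
print(pca.explained_variance_ratio_)    # nearly all the variance in 2 PCs
```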
> decades before someone in marketing decided to brand large statistical models as "AI"
Yeah, well, the current marketing hype is pissing on both the good parts of Neural Nets *and* the term "AI" itself. Look how many responses here are based on the assumption that "AI" only means LLMs and similarly ill-defined systems, and from that the cry that "I'd never want AI anywhere near Fusion Power"! When this latest AI Bubble bursts there will, once again, be a backlash from the funding guys against 'Nets and anything else tainted by "AI".
If we try our best to ignore those shouty people, there are lots of discussions to be had about where methods such as PCA sit in relation to the general field of AI[1]. If you just go to the Wikipedia page, it lumps PCA into "a series on Machine learning and data mining" (sic), when you can be *pretty* sure that its use in those fields is more recent than its use in mechanics. On a similar note, Bayes did all his finest work before 1763, and his Theorem was taught in school[2] well before the headmasters ever learnt of Electronic Brains, yet chances are that when you hear his name mentioned now it'll be in relation to its use in works coming from "the AI labs" (and confusingly not always when his particular Theorem is actually being used, sigh). Or in filtering email (which is ML, btw). Or not at all, as Expert Systems were the victims of an earlier AI Bubble and are now all very infra dig.
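(Since the email filtering got a mention: a toy sketch of Bayes' Theorem doing that job on a single word. Every probability below is invented to keep the arithmetic visible; real filters combine many words and smooth their estimates.)

```python
# One-word spam filter via Bayes' Theorem:
#   P(spam | word) = P(word | spam) * P(spam) / P(word)
p_spam = 0.5              # prior: half of incoming mail is spam (made up)
p_word_spam = 0.8         # P("free" appears | spam)        (made up)
p_word_ham = 0.1          # P("free" appears | not spam)    (made up)

p_word = p_word_spam * p_spam + p_word_ham * (1 - p_spam)
p_spam_word = p_word_spam * p_spam / p_word
print(f"P(spam | saw 'free') = {p_spam_word:.2f}")   # ~0.89
```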
[1] at the (very great) risk of trivialising, such as: on the one hand, having done the calculations on a particular data set, we have improved visibility of what is interesting in *that* data, and can go off and make use of that to determine things about the specific situation (e.g. experimental setup) that generated the data. On the other hand, we can look to see whether we can borrow those results and use them to look at this *other* data set, without repeating all of the analytics, and fingers crossed we'll be able to spot if there is anything interesting here that warrants a closer look (i.e. actually doing all the sums to get a properly defined result). The further you stray from the (situation that generated the) original data set, the less, um, demonstrably correct that quick answer is. But if it is still correct enough to guide a decision, say which is the better next step to take whilst walking this graph... (there's a rough sketch of this borrowing idea in code after these footnotes)
[2] even if only to really put the wind up students who were getting cocky about "understanding probability"! That'll learn them.
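To make footnote [1] concrete, a rough sketch of doing the sums once and then borrowing the components as a cheap screen on other data. It assumes scikit-learn's PCA, all data sets are invented for illustration, and reconstruction error stands in for "is anything interesting here?":

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
# The data set we actually analysed: 8 measured channels that are
# really driven by 3 underlying factors.
mixing = rng.normal(size=(3, 8))
X_original = (rng.normal(size=(500, 3)) @ mixing
              + 0.05 * rng.normal(size=(500, 8)))
pca = PCA(n_components=3).fit(X_original)   # "all the sums", done once

def quick_screen(X_new):
    # Project the new data onto the borrowed components and measure
    # what's left over; a big residual hints this data set warrants a
    # closer look, i.e. doing its own full analysis.
    residual = X_new - pca.inverse_transform(pca.transform(X_new))
    return float(np.mean(residual ** 2))

# Data from (nearly) the same situation: residual stays small.
print(quick_screen(rng.normal(size=(500, 3)) @ mixing))
# Data from a quite different situation: residual jumps, flagging it.
print(quick_screen(rng.normal(size=(500, 8))))
```

The further the new data strays from the original situation, the less that single number means, but it may still be good enough to pick the better next step.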