AI: a misnamed, yet important technological advance?
Is AI akin to a pulsing brain in a jar of fluid, as sometimes depicted in sci-fi B-movies from many decades ago: outwardly impressive, scary, and, unless thwarted, destined to rule the world?
In reality, AI is both mundane and exciting, depending upon one's perspective.
I suggest that the strengths and weaknesses of AI are most easily grasped by comparison with a much simpler tool of empirical modelling, which remains in wide use. I refer to multiple linear regression, an extension of the simple linear regression dating back to at least the 19th century. The advent of increasingly easy means to do arithmetic led to multiple linear regression's use (and misuse) in many academic disciplines, and in commerce.
In essence, from a set of data comprising numbers and categories, a 'best fitting' linear combination of the recorded variables ('independent variables') is obtained as a predictor of a variable of particular interest (the 'dependent variable'). 'Fit' is commonly decided using 'least squares', i.e. by minimising the sum of squared differences between the observed and predicted values of the dependent variable.
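As a minimal sketch of the least-squares idea, assuming NumPy is available: the function name fit_least_squares and the small noise-free data set are hypothetical, chosen only so that the recovered coefficients can be checked by eye.

```python
import numpy as np


def fit_least_squares(X, y):
    """Return (intercept, coefficients) minimising the sum of squared residuals."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that the model includes an intercept term.
    design = np.column_stack([np.ones(len(X)), X])
    # lstsq solves min ||design @ beta - y||^2 for beta.
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[0], beta[1:]


# Hypothetical, noise-free data generated from y = 1 + 2*x1 - 3*x2.
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]])
y = 1 + 2 * X[:, 0] - 3 * X[:, 1]

intercept, coefs = fit_least_squares(X, y)
print(intercept, coefs)
```

With real, noisy data the fitted values would only approximate any underlying relationship; here the data are exact, so the fit recovers the generating coefficients to within rounding.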
An example is studying the relationship between the onset of dementia (dependent variable) and hypothesised predictive (or potentially 'confounding') variables such as age, sex, educational attainment level, occupationally defined social class, and categories of ethnicity. This study may have the intent of elucidating causal relationships. In the hands of a competent researcher, various combinations of independent variables (perhaps interactions among them too) will be explored; the selection of the final model from which inferences will be drawn is not delegated to the software.
Alternatively, the motivation for the study may be to inform decisions on the allocation of health and social care services to populations of varying structures. The utility of such a model depends upon confidence in its overall ability to predict the need for financing services; insight into the causal relationships of particular independent variables to the outcome is not sought.
Key points for consideration when making the comparison with AI are as follows.
1. Whatever the motivation for a study, variables of plausible relevance are decided by the investigators. Depending upon the reason for a study, values of variables can be sought either from routinely collected aggregate sets of data, or by interrogating individuals drawn from the population by random sampling.
2. Depending upon the face validity of the gathered data, and how well they represent the 'population of data' from which they are drawn, the influence of chance upon the magnitudes of the resulting model coefficients may be assessed, e.g. by constructing confidence intervals.
3. Model coefficients are weights attached to the independent variables. This is explicitly so when the variables are standardised to a common scale.
4. Correlation and regression techniques cannot establish causal relationships. Carefully conducted, these may suggest patterns worthy of further investigation by appropriate study designs.
5. Studies intending to identify the possibility, and magnitude, of causal relationships take place within a theoretical context adopted by the researchers.
6. The calculations made by regression analysis software are simple to understand and, for a given set of data, exactly reproducible.
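Points 2, 3 and 6 above can be sketched in a few lines of Python, assuming NumPy. Everything here is illustrative: the simulated data, the function names ols_with_intervals and standardise, and the use of 1.96 as a large-sample normal approximation for 95% intervals (a t-distribution quantile would be usual for small samples).

```python
import numpy as np


def ols_with_intervals(X, y, z=1.96):
    """Fit by least squares; return coefficients with approximate 95% limits."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # intercept column first
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    dof = len(y) - design.shape[1]                  # residual degrees of freedom
    sigma2 = residuals @ residuals / dof            # estimated error variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(design.T @ design)))
    return beta, beta - z * se, beta + z * se


def standardise(a):
    """Rescale each variable to mean 0 and sample standard deviation 1."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)


# Simulated (hypothetical) data; the fixed seed makes the example repeatable.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 0.5 + 1.5 * X[:, 0] - 0.7 * X[:, 1] + rng.normal(scale=0.5, size=200)

beta, lower, upper = ols_with_intervals(X, y)        # point 2: role of chance
beta_std, _, _ = ols_with_intervals(standardise(X), standardise(y))  # point 3
beta_again, _, _ = ols_with_intervals(X, y)          # point 6: same data, same answer
```

After standardisation the coefficients are expressed on a common scale, so their relative sizes can be read as weights; refitting the same data returns identical numbers, which is the sense in which the calculation is exactly reproducible.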
AI technology is problematic in the following respects.
1. At present, it is hard (impossible?) to make an AI produce its chain of reasoning other than when drawing abstract logical and mathematical inferences according to a predefined calculus. In fact, special-purpose software, run in the conventional manner, appears able to achieve the same.
2. Current AI configurations don't discriminate on the basis of the provenance of information used to determine their internal model weightings.
3. Related to the above, AI 'training' based upon general Internet 'content', with books and music thrown in, lacks rationale beyond 'the more, the better'. Training differs greatly from the mentored instruction provided to children: AIs do not mimic curiosity, nor does training for a general purpose, as opposed to a specific domain of knowledge or application, involve human judgement over the relative merits of information sources. 'Rubbish in, rubbish out' is an apt description.
4. Just as naively used regression analysis produces nonsense, so does AI. For example, university students across a range of disciplines are shown how to use statistical software packages such as SPSS. These resources are well-constructed, and invaluable in the right hands; however, an inadequately taught or supervised student may choose inappropriate analysis options, and may place undue emphasis on particular resulting statistics, e.g. significance tests. These failings are identifiable by competent instructors: the nature of faulty decisions is evident. Wrongness emanating from an AI will remain opaque unless the AI's assertions are contradicted by its interlocutor's prior knowledge. How the AI strayed from reasonable summation of knowledge and deductions therefrom may never be known.
Despite grounds for strong reservations, AI technology is amazing and may offer much more in the right hands. Its capacity for interacting using human language is truly impressive, despite the reservations, exemplified in the paper quoted in the Register report, over what epistemological status AI merits.
AI appears to complement and advance pre-existing techniques for image manipulation. Swathes of activity by business (e.g. film making) look set for fascinating development.
The 'hype', misunderstanding, and potential for nefarious use are already evident, but that can be true of any innovation.