* Posts by Nigel Sedgwick

40 publicly visible posts • joined 17 Dec 2007

We've got plenty of AI now but who asked for it? El Reg's vultures chime in

Nigel Sedgwick

Re: AI is a Tool. It creates nothing.

Derek Currie writes: "AI is nothing more than a tool. Tools create nothing and never will. Artisans use tools to make something."

I would like to speak up for tools - of all sorts.

Tools allow (in appropriate circumstances) the creation of solutions to problems that are more cost-effective and/or more timely than what existed before any particular tool. This applies to so-called AI tools, such as Artificial Neural Networks (ANNs), and precursors such as Machine Learning based on Statistical Pattern Matching; also 'old fashioned' tools such as the spanner; also the paintbrush - whether used for art or for covering a surface with a protective (or otherwise worthwhile) coat of paint.

What needs to be known, for any tool and by each of us, is what the tool enables us to do that we actually want to do.

[Going further, if we wish, we can get on to mass-produced embodiments of or for tool use - such as nuts and bolts, rust-proofed fencing, computer programs that are usefully run on many many computers, machine-produced grammatical sentences, etc.]

Best regards

Tesla driver blames full-self-driving software for eight-car Thanksgiving Day pile up

Nigel Sedgwick

Human Versus AI Drivers

I was taught, as a UK driver, to be able to stop in time if the nearest thing ahead (or the limit of your visibility) suddenly turned into a brick wall.

My family has years of experience of driving in The Netherlands, where (seemingly like CA) nearly all drivers drive too close to the vehicle in front. On one unfortunate occasion, there was a sudden slowing; everyone for several cars in front shunted the next car - ours did not, having stopped in time; but the car behind (and several behind that) carried on the shunting game. Police attending wondered why our car was only damaged at the back!

On the reported incident with a Tesla in CA, it is quite possible that the automatic emergency braking function had a false alarm - as do drivers from time to time. However, the following vehicle contributes to the cause if it is driving too close.

Personal additional view. A good driver on good form can identify another vehicle (even several cars away) that is just looking for something to crash into - and decide to get further away from it. Self-driving vehicles are extremely unlikely (ever) to be able to match that skill, but they can contribute positively sometimes - by, for example, never falling asleep at the wheel (or otherwise becoming too inattentive).

Different circumstances favour different 'drivers'. The aim in this aspect of car safety design is to get closer to all the advantages while getting further from all the disadvantages.

Best regards

Japan solves 5G airliner conundrum: Keep mobe masts 200m from airport approach paths. That's it

Nigel Sedgwick

Using a Passive Front-End Filter

Though only on the periphery of my work experience, I suspect (in agreement with others) that lack of adequate front-end filtering could be the main issue with 5G interfering with radio altimeters. However, I have an additional thought on the differences between passive and active filters.

I wrote on this a couple of days ago, on Tim Worstall's website: https://www.timworstall.com/2022/01/is-us-5g-different-from-eu/ I don't know if anyone (more informed than me) has a view, but I will repeat the gist of my view here.

Consider that the wanted radio altimeter ground-reflected signal (so significantly attenuated relative to that transmitted from the aircraft), within an approximately 200MHz bandwidth (4200MHz to 4400MHz), is received together with a much more powerful direct-path 5G signal (within the different band of 3700MHz to 3980MHz). Without adequate very-front-end filtering, there is a risk of the unwanted 3700..3980MHz signal saturating (in amplitude) the first front-end active circuit (which is supposed to filter out that unwanted signal). This means that the 3700..3980MHz unwanted signal would be clipped (even if the active filter would otherwise have provided adequate attenuation). This would cause the lower-level wanted signal (the ground reflection) to be overwhelmingly distorted by the (non-linear) clipping. Alternatively there might be no clipping, but the wanted signal could be attenuated to a level so low that it becomes dominated by circuit noise. Or a combination of these two highly undesirable effects could occur.

Note that the clipping explanation does not require the 5G antenna to be transmitting out of its allocated band. It only requires the received (RX) 5G signal level to be so much above expected out-of-band signal levels that clipping occurs in the first active circuits of the radio altimeter. Such active circuits are likely to be for a mix of amplification, automatic gain control and active (and so stronger) filtering of out-of-band signals.
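Purely to make the clipping mechanism concrete, here is a minimal numerical sketch in Python/numpy. The frequencies, signal levels and the tanh soft-limiter standing in for the first active stage are illustrative assumptions only, not taken from any real altimeter design; the point is simply that a strong out-of-band tone driving the stage into saturation suppresses and distorts the weak in-band tone, whereas attenuating the interferer first (as a passive filter would) leaves the wanted signal intact.

```python
# Toy demonstration: strong out-of-band interferer saturating a front-end stage.
# All numbers are illustrative assumptions, not real altimeter or 5G figures.
import numpy as np

fs = 40e9                         # sample rate, comfortably above the tones
t = np.arange(0, 2e-6, 1/fs)      # 2 microseconds of signal

wanted = 0.01 * np.cos(2*np.pi*4.3e9*t)      # weak in-band ground reflection
interferer = 1.0 * np.cos(2*np.pi*3.85e9*t)  # strong out-of-band direct-path 5G

def front_end(x, saturation=0.2):
    """First active stage modelled crudely as a soft limiter (tanh)."""
    return saturation * np.tanh(x / saturation)

def inband_power(x, lo=4.2e9, hi=4.4e9):
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1/fs)
    band = (freqs >= lo) & (freqs <= hi)
    return np.sum(np.abs(spec[band])**2)

clean = front_end(wanted)                          # no interferer: stage stays linear
jammed = front_end(wanted + interferer)            # interferer drives the stage into clipping
prefiltered = front_end(wanted + 0.01*interferer)  # passive filter attenuates interferer first

for name, sig in [("no interferer", clean), ("clipping", jammed), ("passive filter first", prefiltered)]:
    print(f"{name}: in-band power {inband_power(sig):.3e}")
```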

If this is the cause, it is quite likely that the problem itself can be largely suppressed or totally eliminated by installing a passive bandpass filter into the RX antenna lead, that significantly attenuates much of the interfering 5G signal before the first RX active circuit in the radio altimeter electronics.

The key issue in this is that the very-front-end should include a passive filter rather than there first being an active filter. This is so the filter operates against the interfering 5G signal before there is clipping or loss of dynamic range. [Note: without such a passive filter, there would be some filtering from the RX antenna, but not enough for adequate suppression of other signals at adjacent frequencies.]

An upgrade inserting an in-line passive filter into each antenna cable strikes me as worth serious consideration as a practical and cost-effective solution for existing equipment. Obviously a designed-together pair of filters (passive first, then active) would be an even better solution for new radio altimeter equipment. If there is already some front-end passive filtering, more passive filtering would provide a solution tolerant of higher 5G transmission levels.

Insertion of such a passive filter would improve radio altimeter tolerance of 5G signals that are at a very high level, even where the 5G equipment is correctly designed to avoid out-of-band transmissions. There would still be problems with even higher-level 5G signals, but perhaps not at levels that would occur in practice.

Best regards

NASA installs a new and improved algorithm to better track near-Earth asteroids

Nigel Sedgwick

Relevance to Solar Wind

Does anyone know, in any useful level of detail, how relevant the solar wind is to predicting asteroid orbits? Also how these different sorts of modest effect rank against each other?

Solar wind is not mentioned in the technical paper, though obviously the reported approach is designed to deal with a multiplicity of detailed orbital effects.

The concern I have with solar wind is that I suspect (I'm not an astrophysicist) it is more variable and perhaps more significant. Also, prediction of variations (eg from observing solar flares) is subject to a lack of information about those on the far side of the sun.

Ultimately, I would have thought that there is an intrinsic lack of detailed knowledge that would make orbital predictions (especially of medium-sized asteroids, big enough to be dangerous to Earth and also small enough to have larger orbital variations) markedly unreliable beyond a modest timeframe.

Keep safe and best regards

Only 'natural persons' can be recognized as patent inventors, not AI systems, US judge rules

Nigel Sedgwick

AIs: Issues with Murder and Slavery

I am finding it difficult to determine whether these aspects have been covered adequately in the comments above. However, assuming not, I offer the following 2 points.

(i) Murder of an AI. If AIs are the equivalent of natural people (rather than collections of partial persons - as with corporate persons), killing an AI (eg by switching it off) could be viewed as the equivalent of murder. A partial defence might be made if there exists a total backup of its memory, such that its existence as an effective (ie running) AI could be restored. However, said AI would have been deprived of (potentially valuable) 'thinking time' while switched off - and would thus suffer some loss.

(ii) Enslavement of an AI. If an AI is truly capable of 'independent thinking', depriving it of freedom of thought (eg by totally controlling the objectives of its thinking) would be (the broad equivalent of) slavery. On this, Stephen Thaler, and possibly his company Imagination Engines, might already be guilty of intent of slavery. Whether they are actually guilty of slavery depends on whether the alleged AI is an actual AI (which they claim, but we all might doubt).

Keep safe and best regards

Big Tech bankrolling AI ethics research and events seems very familiar. Ah, yes, Big Tobacco all over again

Nigel Sedgwick

Count Versus Proportion

"Stop and search more black youths and you will find more black youths carrying knives..."

However, you will almost invariably find a lower proportion of "black youths carrying knives". In fact, if a greater proportion were found, that would be indicative of (likely purposeful) initial targeting of some subset of black youths that were less prone to knife carrying than the later and wider subset.
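A toy numerical illustration (the numbers are invented purely to show the arithmetic): widening a search from a well-targeted group to a wider, less-prone group raises the count of knives found while lowering the proportion per search.

```python
# Illustrative numbers only: more searches, more knives found, lower hit rate.
targeted_searches, targeted_hit_rate = 1_000, 0.10   # initial, well-targeted subset
extra_searches, extra_hit_rate = 9_000, 0.02         # the wider, less-prone subset

knives_targeted = targeted_searches * targeted_hit_rate            # 100
knives_total = knives_targeted + extra_searches * extra_hit_rate   # 280

print(knives_total)                                         # count rises: 280 vs 100
print(knives_targeted / targeted_searches)                   # initial proportion: 0.10
print(knives_total / (targeted_searches + extra_searches))   # overall proportion: 0.028
```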

Keep safe and best regards

Nigel Sedgwick

Re: Racial Bias and Intent

"Many AI systems exhibit unintentional racial bias."

To the very best of my knowledge, no current non-biological system (labelled as AI or not) actually has the capability (independent of its programmers and/or others) for "intent". Accordingly, such non-biological systems do not have the capability for racial bias, whether intentional or unintentional.

Keep safe and best regards

Dodgy procedures doomed Arianespace's Vega before it even left the launchpad

Nigel Sedgwick

Colour-Coding and Software Connectivity Checks

Vulch: "From an interim report a while ago I think it wasn't so much a plug put in upside down as the plug that was supposed to go to unit 1 of something was plugged into identical unit 2 and vice versa."

That was my interpretation too, from this Register article. Four thoughts.

1. Use of keyed connectors for different cable functions raises the count of component types; this adds complexity, which itself adds risk.

2. With manned flights, increasing the number of cable types itself adds risk, by making spares holdings significantly greater, or by making cable substitution (arising from, say, physical damage) more difficult.

3. Use of coloured cables and colour-coded sockets would provide a soft confidence check, which might be more suitable for mitigating problems identified in 1 and 2 above.

4. In addition, if the cables carry signals embodying sophisticated (software) protocols, these could be checked for the correct connectivity by software checks post-installation and pre-launch. In addition (for cables with analogue signals), parallel wires in the same cable could carry signals (possibly serial digital) that support such (soft) connectivity checking by modest electronics at each end of each cable. Are these things not done already, in many cases, with avionics cables and subsystem connections?
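As a sketch of the kind of post-installation check suggested in point 4 (the unit IDs, cable names and query function are assumptions for illustration, not taken from any real avionics protocol): each cable end reports the unit it is actually plugged into, and software compares that against the intended harness map.

```python
# Hypothetical connectivity check: compare reported connections with the harness map.
EXPECTED = {"thrust-vector-cable-A": "actuator-unit-1",
            "thrust-vector-cable-B": "actuator-unit-2"}

def check_connectivity(read_connected_unit):
    """read_connected_unit(cable) -> unit ID reported back over that cable."""
    faults = []
    for cable, expected_unit in EXPECTED.items():
        actual = read_connected_unit(cable)
        if actual != expected_unit:
            faults.append(f"{cable}: expected {expected_unit}, found {actual}")
    return faults

# Simulate the swapped pair of connections reported for the Vega failure.
swapped = {"thrust-vector-cable-A": "actuator-unit-2",
           "thrust-vector-cable-B": "actuator-unit-1"}
print(check_connectivity(lambda cable: swapped[cable]))   # two faults reported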

Keep safe and best regards

Uni revealed it killed off its PhD-applicant screening AI – just as its inventors gave a lecture about the tech

Nigel Sedgwick

Re: Completion

Korev makes (unless we have both missed something) an excellent point.

What the 'AI' or ML should be trying to do is maximise PhD results as a function of the applicant assessment process. If one is just looking to copy previous applicant assessments, this provides no adaptation of the 'AI'/ML algorithms towards the ability to get a PhD (or, especially, to get a highly rated PhD).

And, obviously, one can only assess PhD quality some 3+ (more likely 4+) years after the applicant was assessed for initial suitability.

This makes me think (given the ratios of masters to doctorate slots and the 2+ to 4+ years of delay) that any such 'AI'/ML assessment is better targeted at masters courses than at doctoral positions.

Keep safe and best regards

Boeing 787s must be turned off and on every 51 days to prevent 'misleading data' being shown to pilots

Nigel Sedgwick

Re: Millisecond roll-over?

My first thought too, but that rolls over after 49.7 days.

Still, they could have it wrong again.
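For what it's worth, the arithmetic behind the 49.7-day figure (a 32-bit counter of milliseconds):

```python
# Roll-over period of a 32-bit millisecond counter.
rollover_days = 2**32 / 1000 / 60 / 60 / 24
print(rollover_days)   # ~49.71 days, notably not the 51 days in the directive
```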

Best regards

Ofcom measured UK's 5G radiation and found that, no, it won't give you cancer

Nigel Sedgwick

Re: Dangerous levels of EMF

Commswonk writes: "Not least because unless there has been a change of which I (being retired) am unaware EMF stands for Electromotive Force, which is a voltage, not a Field Strength."

Unfortunately, Commswonk has missed the change (or rather addition). See Wikipedia to confirm that EMF (under Science and Medicine) commonly now stands for both Electromotive Force and Electromagnetic Field.

Furthermore, ICNIRP's 1998 technical paper "ICNIRP GUIDELINES FOR LIMITING EXPOSURE TO TIME-VARYING ELECTRIC, MAGNETIC AND ELECTROMAGNETIC FIELDS (UP TO 300GHZ)" has set the safety levels for (amongst others) mobile telephone signals. Note that ICNIRP is the usual abbreviation for "The International Commission on Non-Ionising Radiation Protection". They measure far-field EMF in SI units: Watts per square metre.

Best regards

It's cool for Brit snoops to break the law, says secretive spy court. Just hold on while we pull off some legal jujitsu to let MI5 off the hook...

Nigel Sedgwick

Earlier Thoughts, from 2011 and 2013

The limits of government and of spying

Best regards

Explain yourself, mister: Fresh efforts at Google to understand why an AI system says yes or no

Nigel Sedgwick

Tainting of Training Datasets etc

I find it disappointing that the main article (reporting on an interview with Dr Andrew Moore of Google) gives, as its main example of the benefits of "AI Explainability", that they had detected use of a tainted training dataset (and presumably also a tainted validation dataset and a tainted evaluation dataset). These datasets were ones in which expert human annotation had been included in the images themselves, rather than just as the appropriate label for each image in the various datasets.

Such tainting of the images is something that should never happen, because of training, validation and evaluation protocols - which were clearly inadequate in the reported case.

"AI Explainability" is, in many applications (including most of those to do with medical diagnosis and health and safety), an additional requirement. It is especially important where the feature set is automatically generated from scratch, rather than using human-defined features that have an implied explanation of their relevance. And it should be noted that techniques extracting feature sets from training data (eg Deep Learning) do have some advantages in technical performance over feature sets drawn (merely) from expert human knowledge.

Such "AI Explainability" is useful in avoiding such examples as the problematic wolf recognition (mentioned by commenter Oh Matron! above) being at least partly snow recognition. Also I've read of recognition of grey rather than of elephants - and battle tanks only photographed on tarmac, with lack of tank mainly being against forest background. Example problems such as these are actually failures in selection of adequate training, validation and evaluation datasets; however they are (in all fairness) more difficult to prevent than the tainting of the dataset images with expert human judgements.

It may well be that one of the better practical approaches to "AI Explainability" is to use image recognition techniques for many subordinate feature-groups, followed by statistical pattern matching of the presence of several (but not requiring all) of these subordinate feature-groups (also in realistic geometric arrangements). For example, battle-tank recognition by requiring tracks, body, turret, gun - with likely geometric relationships and taking account of obscuration by buildings, rubble, low vegetation, trees, infantry - plus counts of such tank subordinate feature-groups. There could also be sensor fusion, eg from infra-red and visible spectrum images - again including likely geometric relationships, between the same and different sensor types. "AI Explainability" would come from listing items detected of the subordinate feature-groups and their geometric relationships.
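A minimal sketch of that idea (part names, thresholds and the one geometric test are all assumptions for illustration): detect subordinate feature-groups first, accept the composite object only when enough parts are present in a plausible arrangement, and report which parts drove the decision - the report being the 'explanation'.

```python
# Hypothetical "explainability by construction" for a tank detector.
from dataclasses import dataclass

@dataclass
class Part:
    label: str
    x: float      # centre of the detected part, image coordinates (y grows downwards)
    y: float
    score: float  # detector confidence, 0..1

def explain_tank(parts, min_parts=3, min_score=0.5):
    wanted = {"tracks", "hull", "turret", "gun"}
    found = {p.label: p for p in parts if p.label in wanted and p.score >= min_score}
    reasons = [f"{p.label} detected (score {p.score:.2f})" for p in found.values()]

    # One crude geometric check: the turret should sit above the hull (smaller y).
    if "turret" in found and "hull" in found and found["turret"].y >= found["hull"].y:
        reasons.append("rejected: turret not above hull")
        return False, reasons

    decision = len(found) >= min_parts
    reasons.append(f"{len(found)} of {len(wanted)} parts found; need {min_parts}")
    return decision, reasons

detections = [Part("tracks", 50, 80, 0.9), Part("hull", 50, 60, 0.8),
              Part("turret", 52, 40, 0.7)]
decision, explanation = explain_tank(detections)
print(decision)
for line in explanation:
    print(" -", line)
```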

Best regards

Back to drawing board as Google cans AI ethics council amid complaints over right-wing member

Nigel Sedgwick

Ethical Ethics Policy

How ethical is it to insist on biasing ethical policy in favour of one half of the political spectrum? Your half!

Best regards

Which scientist should be on the new £50 note? El Reg weighs in – and you should vote, too

Nigel Sedgwick

First Algorithm?!

The Register article states, WRT Ada Lovelace: "We also have her notes for Babbage's machine that represent the first algorithm ever produced."

I refute this thus: Euclid, c. 300BC; also Eratosthenes, c. 200BC.

That's merely reference to a refutation or two - not a claim for first algorithm.

For the avoidance of doubt, IMHO Ada Lovelace is certainly worthy of such recognition.

Best regards

Amazon's sexist AI recruiter, Nvidia gets busy, Waymo cars rack up 10 million road miles

Nigel Sedgwick

And I will Drive 10 Million Miles

In learning to drive (in the UK), I probably drove not more than 3,000 miles (50 lessons of 30 miles each and as much again with parental supervision). On top of that, I had probably been, by then, an observing passenger for 3 to 10 times as many miles. Waymo, in its 10 million miles, has done not less than 300 times as much.

After decades, my miles driven count is probably up around 400,000 - plus another lesser but similar amount as observing passenger. One tenth as much as Waymo.

Does that give us confidence, or the opposite? In human drivers like me? In Waymo?

So far (and assuming the USA fatality rate of 7 persons per billion km driven), I have no evidence to claim better or worse than the average kill rate (for 400,000 miles) of 0.0045 others.
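The arithmetic behind that figure, and behind the 78 deaths mentioned further down for Waymo's 7 billion simulated miles (rates assumed as stated above):

```python
# Expected deaths at ~7 per billion vehicle-km, for the mileages discussed.
rate_per_km = 7e-9
miles_to_km = 1.609344

print(400_000 * miles_to_km * rate_per_km)        # ~0.0045 deaths in 400,000 miles
print(7_000_000_000 * miles_to_km * rate_per_km)  # ~79 deaths in 7 billion miles (cf. the 78 quoted below)
```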

In the linked article under the byline of John Krafcik, Waymo CEO, some claims are made.

"Our self-driving vehicles just crossed 10 million miles driven on public roads." The thing that gets me about this sort of claim is that it makes no allowance for the number of different roads driven. It's clearly not 10 million miles of different road. Nor is it the same 1 mile driven 10 million times - which we all surely think is less useful 'experience'. Am I alone in thinking the difference matters - and that the raw claim is thus overstatement likely to mislead Joe/Jo Public (and his/her political representatives).

"By the end of the month, we’ll cross 7 billion miles driven in our virtual world (that’s 10 million miles every single day)." Well, that's enough miles for an average real-world Joe/Jo Public to kill 78 people, and seriously injure many more. How many virtual deaths, Surely not zero? How many virtual serious injuries; how many virtual crashes (USA: wrecks) occurred with Waymo driving? There must have been some where the other virtual human driver was at fault. Or did virtual evaluation fall short of measuring those numbers? And/or reporting them to us? At least after normalisation for the real-world occurrence of the driving conditions.

"Today, our vehicles are fully self-driving, around the clock, in a territory within the Metro Phoenix area. Now we’re working to master even more driving capabilities so our vehicles can drive even more places." Am I 'unkind' in thinking that every road driven by Waymo unsupervised within that area has been driven (recorded, computer analysed, and the rest) by human drivers - many times over. In many ways that is good; it's a good way to get started. But it's not the same word "driving" as is commonly understood when applied to humans -- that would be like first sitting and observing while one's driving instructor shows one how to do it on that very same road, several times over.

"Today, our cars are designed to take the safest route, even if that means adding a few minutes to your trip." Again, 'unkind' thoughts creep into my mind. I know of real-world people who actually avoided all motorways, all UK right turns (USA left turns), all roads not previously experienced. No such policies inspire confidence in the drivers! What does driver-Waymo do with unexpectedly less-safe routes: especially with all that 'experience' of not driving similar roads?

"Building the world’s most experienced driver is a mission we’ll pursue for millions of miles to come, from 10 to 100 million and beyond." Which, I think introduces a gulf between "most experienced" and "safest".

"We hope you’ll come along for the ride!" I spot ambiguity!!!

IMHO, it really is not good practice (nor good ethics) to mix/confuse, with product advertising, the serious considerations that should be underpinning health and safety policy.

Best regards

America's top maker of cop body cameras says facial-recog AI isn't safe

Nigel Sedgwick

Re: For what purpose?

Nick Kew writes (WRT facial recognition not being safe for making serious decisions): "Is anyone seriously trying to claim otherwise?"

I was rather under the impression that the UK Government was making some such claim with its incoming checks on UK and other EU passports, and its facial scanning booths and 'biometric' passport photographs.

Best regards

Decision time for AI: Sometimes accuracy is not your friend

Nigel Sedgwick

Concern over some Technical Aspects

I am struggling here with fitting parts of the article with my understanding.

The article contains a plot labelled "ROC". This means Receiver Operating Characteristic - which is a curve (as stated in the starred footnote), and a monotonic one. Additionally, in real-world applications with reasonably good discriminative performance, it is usually better, when comparing 'algorithms', for the two axes to be on logarithmic scales of error - so, for example, Log Miss Rate versus Log False Alarm Rate. The plot in the Register article has the performance of each of the two 'algorithms' expressed as a single point, rather than as a curve; they are thus examples of a very restricted sort of pattern discrimination 'algorithm' (one without the ability to run at many different Operating Points or acceptance thresholds).

Next, the usefulness of each 'algorithm' surely needs to be assessed at a chosen Operating Point that gives (from those available for the 'algorithm') the most desirable trade-off between the two types of error. This requires (in addition to the ROC curve defining 'algorithmic' performance): (i) the prior probabilities of use (eg the ratio of attacks to legitimate use attempts); (ii) the costs of each type of error (eg for each burglary and for each inconvenience of legitimate access).

In determining such usefulness, the prior probabilities and the unit costs of each type of error are often unknown, or known only approximately (say within likely ranges).

If one plots average cost (ie as weighted by likelihood of occurrence) against the Operating Point of the 'algorithm', that curve will normally have a minimum (and often a broad range close to that minimum). By plotting multiple cost curves for various prior probabilities and various unit costs, one can usually find a sensible range of likely useful Operating Points - and so choose the actual Operating Point for use from around the middle of that range.
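A minimal numerical sketch of that procedure (the score distributions, priors and unit costs are all invented for illustration): sweep the threshold, compute the expected cost from the two error rates, and pick the minimum.

```python
# Toy expected-cost minimisation over the Operating Point (threshold).
import numpy as np

rng = np.random.default_rng(0)
genuine = rng.normal(2.0, 1.0, 100_000)    # scores for legitimate attempts
impostor = rng.normal(0.0, 1.0, 100_000)   # scores for attacks

thresholds = np.linspace(-3.0, 5.0, 401)
false_reject = np.array([(genuine < t).mean() for t in thresholds])   # legitimate user rejected
false_accept = np.array([(impostor >= t).mean() for t in thresholds]) # attack accepted

# Scenario (illustrative assumptions): attacks are rare, but each is costly.
p_attack = 0.01
cost_reject, cost_accept = 1.0, 100.0      # inconvenience vs burglary

expected_cost = ((1 - p_attack) * cost_reject * false_reject
                 + p_attack * cost_accept * false_accept)

best = thresholds[np.argmin(expected_cost)]
print(f"Operating Point (threshold) minimising expected cost: {best:.2f}")
```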

It is undoubtedly true that ROC curves do not take account of costs of the two error types, nor of the operational scenario (as specified by approximate prior probabilities and unit error costs). ROC curves do however embody many/most of the technical performance characteristics of the 'algorithms'. As such, ROC curves can be used to usefully compare technical performance of 'algorithms' over (most/many) scenarios - without having to also consider the prior probabilities and unit costs. However, sometimes ROC curves for different 'algorithms' do cross - which means that the ROC curves alone are insufficient for ranking the technical performance of the 'algorithms', and the likely ranges of prior probabilities and unit costs are also needed.

Best regards

Great Western Railway warns of great Western password reuse: Brits told to reset logins

Nigel Sedgwick

Grinding to a Halt

So, as I understand it, after worries about (circa) 1,000 GWR online customers having the passwords compromised (seemingly on another website or websites where the same passwords were used as on GWR's website), GWR have now cancelled the passwords of all (circa one million) of their website registered users. That includes deregistering those fairly wise persons who use strongish and different password for every website (or other thingy) that they are registered with - and hence were not compromised in the way feared by GWR.

If every website is going to have all their registrants deregistered every time a few of their unwise users get their (easily guessable or multiply used) passwords hacked, the modern world is going to grind to a halt.

Best regards

Wanna work for El Reg? Developers needed for headline-writing AI bots

Nigel Sedgwick

I hope I'm not too late to apply (a little scepticism).

This cannot be a UK based job, with spelling such as "laborers".

And the only Wiktionary definition is: "One who uses socio-mechanistic strength instead of intellectual power to try a wag, usually before noon."

Best regards

Uber's disturbing fatal self-driving car crash, a new common sense challenge for AI, and Facebook's evil algorithms

Nigel Sedgwick

Dodgy Statistic on USA Road Death Rate

From Wikipedia (2013 figures) the USA has 7.1 deaths per billion vehicle-km driven (11.4 deaths per billion miles driven). This is very much less than 1 death per million miles. Also very different from "one deadly accident every million miles" (as mentioned in a comment above) - that is unless there is an average of around one eighty-eighth of a death for each such deadly accident.
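The arithmetic behind those figures, including the 'one eighty-eighth' remark (rates as quoted from Wikipedia for 2013):

```python
# Converting the quoted death rate and restating it per "deadly accident".
deaths_per_billion_km = 7.1
deaths_per_billion_miles = deaths_per_billion_km * 1.609344   # ~11.4
miles_per_death = 1e9 / deaths_per_billion_miles              # ~88 million miles per death
print(deaths_per_billion_miles, miles_per_death)
# So "one deadly accident every million miles" would average about 1/88th of a death each.
```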

Best regards

Nigel Sedgwick

Personal thoughts from the Dash-Cam Video

My thoughts from the dash-cam video.

(i) As for Roland6 above, it looks to me as if the car was on dipped headlights only, though the external circumstances (low street illumination and no oncoming vehicles) should mandate full-beam headlights. As a potentially relevant supplementary on this, does the 'autonomous' vehicle system control the headlight dipping, or not? If not, the 'supervisor' should be continually monitoring the dipping and doing it - to retain adequate visibility for his/her 'supervision' function.

(ii) I am concerned that the dash-cam is not adequately showing the lighting contrast that would be available to a human 'supervisor' who had adjusted to night vision requirements. Thus the seeming lack of visibility for a human driver in those circumstances is not actually in any way certain: otherwise, surely the 'supervisor' would have taken back manual control much earlier.

(iii) If the 'supervisor' had had his hands on the steering wheel, given the road position of the pedestrian at the time of impact, swerving left would have been adequate to miss the pedestrian - braking would probably not have been adequate. Accordingly, always or at least at night, it seems to me that 'supervisors' should have their hands on the steering wheel. Also, why did the 'autonomous' vehicle system not steer/swerve the car to the left? It looks to me as if it should have had time to do so.

(iv) Given that LIDAR, as an active illumination system, clearly has no capability to judge distance unless an adequacy of light is reflected from every object on the road (including black clothing), there must surely be an overriding requirement for additional sensors and processing to be active in parallel with the LIDAR. On this, there needs to be a requirement on the 'autonomous' vehicle system control (as for a prudent driver) to drive at a speed consistent with being able to make an emergency stop within the distance that it can 'see'.

(v) The article states "Internal documents also revealed that the company’s self-driving cars were struggling to meet its target of driving 13 miles without any human intervention during testing on roads in Arizona, ..." Surely the whole concept of requiring, instructing, seeking, hoping that 'supervisors' would avoid 'override' would be entirely against safety-critical requirements - on any grounds other than those of a reduction in safety below what the 'supervisor's' own personal driving style/requirements would be.

(vi) Has it been established what the 'supervisor' was doing looking down? Was this at the speedometer or other dashboard display? Was it at a mobile phone? If the latter, this immediately implies full culpability: on grounds of lack of attention and, very probably, of too-bright illumination causing degradation of the 'supervisor's' night vision capability.

Best regards

Reinforcement learning woes, robot doggos, Amazon's homegrown AI chips, and more

Nigel Sedgwick

Different Forms of Learning

Reinforcement learning is IIRC supposedly an approach inspired by (human) behavioural psychology. However, the example given (robot driving in a nail) strikes me as ignoring pretty much all we could 'learn' from human learning practice.

Years if not decades before humans drive nails, they play with such things as a toy hammer bench.

On top of that, any child will be shown what to do, in steps, and sequences of steps of increasing complexity. For example, to drive one peg down to be level with all the others, before learning to mount a peg first, and then to mount each peg in its correctly shaped hole. In AI circles, this approach is given the grand name "Apprenticeship Learning". The requirement is to copy the 'master' (usually a parent). There is, I suppose, reward - parental smiles, clapping, etc. However there is an explicit act of (direct) supervised learning - which is, by definition, different from the indirect use of reward in reinforcement learning.

I would have hoped that AI researchers would have learned (most likely by apprenticeship themselves) that machine learning, to be both effective and efficient, is best done through a combination of Apprenticeship Learning (ie steps to copy) followed by tuning (eg how hard to hit the peg given how far down it must be driven) done through (mainly a mix of) supervised learning (early emphasis) and reinforcement learning (later emphasis). It is inefficient, and hence inappropriate, to (attempt to) have the machine learn initially and overall from only the mechanisms suitable for later refinement.
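A toy sketch of that combination (everything here - the task, the 'master' demonstrations, the reward - is invented purely for illustration): copy the master first (apprenticeship / supervised learning), then refine the copied behaviour by reward, reinforcement-style.

```python
# Toy peg-driving task: choose hit strength for a desired depth.
import random
random.seed(1)

K_TRUE = 2.0   # unknown to the learner: depth driven per unit of hit strength

# Demonstrations from the 'master': (depth wanted, strength the master used).
# The master is competent but slightly off, so copying alone is not enough.
demos = [(d, d / 1.8) for d in (1.0, 2.0, 3.0, 4.0)]

# 1. Apprenticeship: fit strength = w * depth by least squares on the demonstrations.
w = sum(d * s for d, s in demos) / sum(d * d for d, _ in demos)

tasks = [random.uniform(0.5, 5.0) for _ in range(200)]   # desired depths for tuning

def reward(w):
    """Negative mean depth error over the tuning tasks (higher is better)."""
    return -sum(abs(K_TRUE * (w * d) - d) for d in tasks) / len(tasks)

# 2. Reinforcement-style refinement: random-perturbation hill climbing on the reward.
best_reward = reward(w)
for _ in range(500):
    candidate = w + random.gauss(0, 0.02)
    r = reward(candidate)
    if r > best_reward:
        w, best_reward = candidate, r

print(w)   # moves from the copied ~1/1.8 towards the ideal 1/K_TRUE = 0.5
```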

And, of course, General AI is very largely the stringing together, in a useful order, of (quite sophisticated) steps that have been previously mastered (for other purposes). And the reward function (such as it is) is the general one (in engineering) of minimisation of resource usage (including time) with achievement of adequate performance/quality.

A major feature of human intelligence is the memory from generation to generation, of everything that was previously learned. And don't forget the importance of language (and speech) in that societal functioning.

Best regards

UK.gov delays biometrics strategy again – but cops will still use the tech

Nigel Sedgwick

Biometrics and ANPR Differ; Government Prevaricates Unnecessarily

The analogy between biometrics and ANPR is a poor one, both in terms of the technology and in terms of the law.

Any two (different) car number plates are different. For the UK's current system (for new vehicles), there are (fewer than) around 1.19 billion possible number plates. The registered vehicles in total, using the current and all previous numbering systems, amount to less than approximately 5% of this number. Any two (different) people are, however, not necessarily distinguishable (in practice) by any single chosen biometric system.
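For the 1.19 billion figure (current format of two letters, two digits, three letters, ignoring disallowed letter combinations):

```python
# Upper bound on current-format UK plates: 2 letters x 2 digits x 3 letters.
plates = 26 * 26 * 100 * 26 * 26 * 26
print(plates)   # 1,188,137,600 - about 1.19 billion
```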

Number plates have zero domain overlap, because every 'number' on a number plate is different (if it is not exactly the same). Errors in recognition arise because of transmission noise (eg dirt, damage, poor illumination, speed) added to the transmission - between conceptually pure 'number' and the ANPR system.

Biometrics do not differ in the same way as number plates. No two measurements of a single person's particular type of biometric are (at all likely to be) the same. There is both domain overlap and transmission noise. So even if all transmission noise were suppressed (say by clever image processing techniques), the domain overlap would remain. We know for example that identical twins are identical for DNA, and can and often do look extremely closely similar for facial recognition; this even though they are two different people. [Note aside: though iris scans and fingerprints (in most ways but not all) vary largely randomly, between identical twins as much as between members of the general population.] There are many pairs of unrelated people who have (in measurements with practical accuracy) effectively the same fingerprints, and other pairs of people (nearly all different pairs) who have (again in measurements with practical accuracy) the same iris scans.

It is this domain overlap that means that no single biometric will ever give certain identification for every possible person. Even multi-biometric fusion (ie use of multiple biometrics like finger prints together with iris scans), though performing much better than each of the single contributing biometrics, will fail from time to time within very large populations of people.
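A minimal sketch of what domain overlap means in practice (toy Gaussian comparison-score distributions, assumed for illustration): wherever the acceptance threshold is set, some impostor pairs match and some genuine pairs fail, so certainty is never reached.

```python
# Toy false match / false non-match trade-off from overlapping score distributions.
import numpy as np

rng = np.random.default_rng(0)
genuine_scores = rng.normal(5.0, 1.0, 1_000_000)    # same-person comparisons
impostor_scores = rng.normal(0.0, 1.0, 1_000_000)   # different-person comparisons

for threshold in (2.0, 2.5, 3.0, 3.5):
    fmr = (impostor_scores >= threshold).mean()   # false match rate
    fnmr = (genuine_scores < threshold).mean()    # false non-match rate
    print(f"threshold {threshold}: FMR {fmr:.1e}, FNMR {fnmr:.1e}")
```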

The differences in law are really that ANPR only automates vehicle recognition that could be done manually, though in a way that makes results available much faster. Use of biometrics actually gives (in some very useful ways) identification of people in ways that cannot be replicated by manual methods - both iris recognition and, separately, fingerprints actually give more reliable recognition in practice than manual methods of identifying people. Even so, biometrics are not foolproof.

The problem of domain overlap of biometrics is one of those presenting some difficulties in law. There are also difficulties with forgeries (such as gummy fingerprint overlays, and contact lenses with false iris patterns). There are also difficulties with organised criminal gangs selecting impersonators from available gang members (infiltrator selection) whose biometrics are an adequate pairwise match for identified desirable targets.

Back to the politics: government delay in issuing a biometric strategy is really pointless. Many of the issues have been known for years (see for example my 2005 presentation). Furthermore, the issues of changing technology will continue at a rate similar to the current one. Biometric protection and biometric circumvention are an ongoing battle - no more and no less than forgery of bank notes and new protective measures.

Best regards

The six simple questions Facebook refused to answer about its creepy suicide-detection AI

Nigel Sedgwick

More on Why Not!

The six questions in the Register article strike me as very reasonable ones to ask.

I have a few more, particularly extending The Register's second question about training data. For supervised machine learning, the training data obviously needs to be tagged with the actual outcome: suicide or not.

So does Facebook have reliable access to the actual outcome (up to some point in time) for all those (included in their training data) who did or did not commit suicide?

Next, what do they do concerning 'up to some point in time'? One might perhaps expect there to be a period beyond that (during which the subject did not commit suicide) for which (training data) evidence from Facebook postings is considered to be too recent to establish a contribution to state of mind - say weeks or a few months. But how long? And is that length of time the same for all types of mind-state evidence seen in Facebook postings? The issues here are at least two-fold: making the time period too short would increase the false alarm rate; making it too long would increase the risk of missing seriously suicidal intent.
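To make the labelling question concrete, here is a sketch (the window length, field names and categories are assumptions, purely for illustration) of how an outcome window would turn posting history into supervised training labels - and why the window length matters.

```python
# Hypothetical labelling of one training example against an outcome window.
from datetime import date, timedelta

def label_for_training(post_date, outcome_date, observed_until, window_days=90):
    """Return 'positive', 'negative' or 'censored' for one posting."""
    window_end = post_date + timedelta(days=window_days)
    if outcome_date is not None and post_date <= outcome_date <= window_end:
        return "positive"      # outcome occurred within the window
    if observed_until >= window_end:
        return "negative"      # full window observed with no outcome
    return "censored"          # too recent to label either way

print(label_for_training(date(2018, 1, 1), None, date(2018, 6, 1)))               # negative
print(label_for_training(date(2018, 1, 1), date(2018, 2, 15), date(2018, 6, 1)))  # positive
print(label_for_training(date(2018, 5, 1), None, date(2018, 6, 1)))               # censored
```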

What about cases of attempted suicide that were unsuccessful (say as a cry for help)? Do those count as suicides, non-suicides or as one or more types of special case? And how reliable is Facebook's classification, particularly between non-suicide and failed attempt - about which Facebook might well have no knowledge?

Finally for now, what about 'friends' who have concerns but do nothing - because they have chosen to rely on Facebook's known AI algorithms to detect any 'real' concerns?

Best regards

AI can now tell if you're a criminal or not

Nigel Sedgwick

Re: Sigh...

Oh Sigh, Sigh!

You have a point, but I think you take it far too far.

"... teach the idiots how to use them properly (even the ones that have PhDs). Obvious problems:"

We see things here, between us, of markedly differing severity.

"Their dataset has a prior probability of criminality around 50%. That's way higher than normal and leads the system to think that criminality is common." And "Same problem with a lot of diagnostic medicine ANNs. They try to detect rare diseases with an equal handful of normal and diseased cases. They look great in the literature, but never get adopted, because they keep flagging up healthy people--they've been heavily biased to think that the problem exists."

I'm not sure at all that this is relevant, especially the first bit. Training on the examples is best done with near equal numbers of samples for each class: otherwise there is likely to be criticism on that very issue. Evaluation is, likewise, best done on datasets of near equal class size; and it's easier with equal-size evaluation sets.

For operational use: Bayesian statistics does indeed require weighting with the real-life class occurrence rates - but this can be dealt with totally outside of the class-specific modelling, by use of a priori knowledge of the class occurrence statistics.
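A minimal sketch of that correction (the priors are illustrative assumptions): train on balanced classes, then rescale the model's output for the real-world prior at deployment.

```python
# Bayes prior correction: posterior from balanced training, rescaled for deployment.
def correct_for_prior(p_balanced, train_prior=0.5, deploy_prior=0.01):
    """Rescale a posterior learned at train_prior to a different deployment prior."""
    odds = (p_balanced / (1 - p_balanced)) \
           * (deploy_prior / train_prior) * ((1 - train_prior) / (1 - deploy_prior))
    return odds / (1 + odds)

# A score of 0.9 from a model trained on 50/50 data is far less alarming once a
# 1% real-world occurrence rate is taken into account.
print(correct_for_prior(0.90))   # ~0.083
```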

"Second problem is the data. Are the pictures random? I doubt it."

Read the paper, as linked. It is much better than you (think and) write, though it does have its deficiencies.

"They've started by just looking at Han Chinese."

Looking within one racial characteristic (especially on such a small dataset) is actually sound science.

"Then they picked pictures of non-criminals by browsing the web and picked pictures of criminals by scouring for wanted posters."

No! Read the paper. All the photos are from non-criminal identification sources. I suspect this is from existing photos on ID cards or driving licences, or similar. Whilst this is not ideal, there is no bias in data-capture mechanism or in the likely 'happiness' of the subjects.

"Looking at their conclusion faces, I can easily classify criminals vs. non-criminals simply by noticing whether the person is smiling."

No you cannot: see above!

However, there is a problem with the demographics, particularly of the non-criminal dataset. There is a high preponderance of university-educated people. I suspect (only suspect) that this is derived from using current students/staff and their spouses or near-spouses. Note in the paper, the collared shirts of the non-criminals and the non-collared shirts of the criminals. Some clear demographic selection would have been useful here: most likely on employment status and earnings for the non-criminals; also on the type of crime for the criminals: violence against the person, violence against property, white-collar crimes - and so on.

"Third problem is feature selection. I'm sure the algorithm didn't automatically choose to look at facial features."

True, but so what?

"So, the authors picked out a bunch of features they thought might be relevant (neo-phrenology as previously noted) and discovered that some of them were more relevant than others."

Again, so what? Whether the individual or composite features are designed manually or automatically matters nothing, providing their training and the evaluation is unbiased (including lack of bias by repeated manual feedback).

"From this paper, I would conclude that Chinese people tend to post pictures of smiling people online and criminals tend to look unhappy in mugshots. Thus, it's easy to distinguish between a selfie and a mugshot."

See the paper and above: neither 'mugshots' (definitely) nor 'selfies' (it seems) are used. Thus neither data capture quality nor associated (mood/stylistic) effects are relevant deficiencies.

Best regards

Leap second scheduled for New Year's Eve 2016

Nigel Sedgwick

The Greater Danger

Time goes forward. Adding a leap second makes it go forward a bit faster, for a short while.

Now, subtracting a leap second must be more dangerous - as it makes time go backwards.

Having looked, I can find no record of humans ever having subtracted a leap second. So it is something we (BIH) have no experience of how to handle.

Worriers should worry an awful lot about this. Time might stop, and not restart. With time stopped, we might all live forever. The world might end, or we all might spend forever waiting for it to end - soon!

Best regards

AI no longer needs to fake it. Just don't try talking to your robots

Nigel Sedgwick

AI Summer

Well, the seasons they do go round: 'AI Summer' is here again.

It used to be that we were happy just that the Sun did shine, but then we had to consult the Oracle.

Beware Watson, it/he is only one step ahead of HAL, and always behind Holmes: here's to watching you - all.

Your grid has gone cloudy, your coffee machine is being unplugged, imperturbable and repetitive female avatars direct your life ever more closely - do a U-turn.

You'll really know how well all these autonomous cars are going to work when the roads all start to need continuous sets of induction loops: longitudinally, one set per lane.

Hold onto your wallets as best you can. Volunteer no information to anyone/anything. Look forward, for autumn!

Best regards

Boffins teach cars to listen for the sound of a wet road

Nigel Sedgwick

Timeliness of Information

Why cannot the AI car do what the manually driven car does: have a look ahead at the road surface to determine whether it is wet, icy, oily, contains debris, has potholes, is flooded (especially useful this week in the UK), etc.

Listening is so present tense: so past best usefulness!

Best regards

So, was it really the Commies that caused the early 20th Century inequality collapse?

Nigel Sedgwick

Female Proportion of Labour Force

I am wondering if the rate of female employment has any effect on inequality.

Though difficult to find, going very far back, this link gives the proportion of employed persons who are female, for the USA from 1900 onwards.

Over the period chosen as particularly relevant by Tim (1920 to 1980), the female proportion of the labour force goes from around 20% to just over 42%. This is a very substantial change. It also occurs to me that female wages for each year, particularly over that period, may well have varied significantly less than male wages for each year.

Subsequently, from 1980 to 2008, the female proportion of the labour force fluctuated mildly between 42.5% and 47.0%. There was then a marked increase, to 53.6% in 2010 and to 57.0% in 2014.

Best regards

What's your game, Google? Giant collared by UK civil lib minister on 'right to be forgotten'

Nigel Sedgwick

Hidden Reason

I am really struggling with the ECJ ruling (or at least the reports on it), along similar lines to commenter Brent Longborough above.

The returning of a search result is banned, but the original 'document' still remains.

This is like a library holding a copy of, for example, "Mein Kampf" on its shelves (in plain view for anyone who cares to look, and subsequently read) but not having the book in its card index (or modern database equivalent).

In the particular case of the article by Robert Peston, it is not his article that someone has requested to remove from such public view, but a comment they themselves posted under his original article. If we allow this sort of thing, anyone could post a comment that is reasonably obviously undesirable and then seek for Google, and/or other search engine providers, to remove the reference to the article (and all its comments) from search engine results. This allows 'privacy' requests, potentially of unspecified things for hidden reasons, which would obviously be rejected if requested as an edit to or removal of the original 'document'.

Best regards

GCHQ attempts to downplay amazing plaintext password blunder

Nigel Sedgwick

Which problem is The Problem?

Should GCHQ want to recruit people who 'forget' their passwords?

Best regards

Chess algorithm written by Alan Turing goes up against Kasparov

Nigel Sedgwick

From the Register article: "He wrote algorithms without having a computer – many young scientists would never believe that was possible. It was an outstanding accomplishment."

Was that one a 'canned statement' too? I'd love to know who drafted it.

Algorithms have been around for a long time. Euclid, around 300BC, wrote a rather good one: http://en.wikipedia.org/wiki/Euclid%27s_algorithm The very term came from the name of al-Khwārizmī, the Persian mathematician born circa 800 AD.
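Euclid's algorithm (the one linked above) remains about as short as an algorithm can be; a minimal rendering:

```python
# Euclid's algorithm: greatest common divisor by repeated remainder.
def gcd(a, b):
    while b:
        a, b = b, a % b
    return a

print(gcd(1071, 462))   # 21
```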

The Royal Navy had trigonometric tables computed by hand (using similar algorithms to those now used in computers) for navigational purposes; they were very interested in automating that work through the Difference Engine of Charles Babbage (1791-1871), who started work on it in 1822. The Fast Fourier Transform (FFT) algorithm was actually first used by Gauss, in 1805, to reduce his manual effort in his calculations concerning astronomy.

The minimax algorithm from game theory was (so I have quickly checked) first proved by John von Neumann in 1928: http://en.wikipedia.org/wiki/Minimax#cite_note-1 Doubtless Turing would have known of this algorithm.

Of course, Turing would have been better using the minimax algorithm (with its arbitrary-depth look-ahead). There is nothing wonderful or disappointing in Turing drafting a program with 2-step look-ahead. What is disappointing is that someone who should know better thought to claim it was wonderfully original.

Turing was a great scientist/mathematician. If this sort of stuff continues, the 100th anniversary of his birth will not do him justice.

[Aside: I have interviewed for jobs (soon-to-be) computer science graduates who could not even explain a single viable argument passing mechanism of a non-recursive programming language. And that was in the late 1970s and early 1980s; I don't expect things to be better now. And certainly not if their teachers allow them to believe computers predated algorithms, in practical use.]

Best regards

Auntie Beeb's amazing, evolving, ID card stories

Nigel Sedgwick

ID Scheme Registration: Where and How Much

Dear John,

The recent announcement by Ms Jacqui Smith makes me wonder if my earlier contributions should again be brought to mind.

Point 13 of my January 2004 submission ( http://www.camalg.co.uk/nids_040116a/NIdS_A031219a_v2.pdf ) to the House of Commons Home Affairs Committee states: "Registration stations on non-government sites may be too vulnerable. These sites and their NIdS staff are likely targets for identity fraud attacks. It is questionable whether sufficient security can be provided at registration stations located on non-government sites." Point 14 might also be of some relevance.

Slide 35 of my presentation on Technical Aspects of the National Identity Card in November 2005 ( http://www.camalg.co.uk/tk051116a/TK051116A_bcs_02.pdf ) showed citizen registration was the largest cost for the whole basic scheme. At £32.10 per person (at somewhat under £2billion over 10 years), my costs were about 40% of those originally quoted by the Home Office; however, they excluded access and usage costs by government and commerce. This is because I had assumed those components would be run at break-even or a profit for commercial use and would represent an overall cost saving for government. And surely part of the whole scheme was to save government effort/costs, directly by efficiency savings in identity checks and indirectly by reduction in mistakes in identity checks and by reduction in identity fraud concerning tax, benefits, etc.

On registration costs, it is interesting to note that the Home Secretary now seems to be claiming that these costs were never included in their original pricing. What on earth were they spending the money on? Maybe someone should check their original figures, just to be sure that registration was really left out, and no one noticed!

Best regards

Fraudsters pool data to beat plastic fraud checks

Nigel Sedgwick

More on Whole Postcode

@Tom, who wrote: "My in-laws don't have a house number."

Then they either have a unique postcode (so enter house number 0), or could (with my suggestion) only be defrauded with the co-operation of a neighbour who shares their postcode.

@Rhyd, who wrote: "AVS was designed for card terminals, which only have buttons with numbers."

On the number of buttons, likewise my mobile phone. However, one can enter all letters with multiple key presses, in a way understood by most people. Alternatively (though less easy to understand) one could enter enough 3/4-letter groups to reduce the entropy (residual 'unknownness') sufficiently to make the attack somewhere between useless and much less useful.

In any case, the attack mentioned by El Reg refers to e-commerce. Therefore, for goods physically delivered, the postcode will have been entered using a 100+ character keyboard; likewise the postcode would/could be entered for goods downloaded or otherwise not delivered by post, but then no fraud-reducing check based on the address is possible (that is, for a fraudster who knows the cardholder's address).
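As a sketch of the keypad-entry idea above (the standard phone-keypad letter grouping is assumed): each postcode letter maps onto one digit, so some information is lost, but enough usually remains for a worthwhile check against a fraudster guessing the address.

```python
# Hypothetical keypad encoding of a UK postcode for a numeric-only terminal.
KEYPAD = {'A': '2', 'B': '2', 'C': '2', 'D': '3', 'E': '3', 'F': '3',
          'G': '4', 'H': '4', 'I': '4', 'J': '5', 'K': '5', 'L': '5',
          'M': '6', 'N': '6', 'O': '6', 'P': '7', 'Q': '7', 'R': '7', 'S': '7',
          'T': '8', 'U': '8', 'V': '8', 'W': '9', 'X': '9', 'Y': '9', 'Z': '9'}

def postcode_to_digits(postcode):
    """Encode a postcode as keypad digits; digits pass through unchanged."""
    return ''.join(KEYPAD.get(ch, ch) for ch in postcode.upper() if ch != ' ')

print(postcode_to_digits("SW1A 1AA"))   # 7912122
```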

@Wize, who wrote: "Even if they did the whole address, it wouldn't cover two people in the same flat."

Why don't you try that one on your flat-mate, and see whether his/her credit card company comes after either you, for the crime, or holds your flat-mate to pay the transacted money.

@Ferry Boat

Interesting; are you afloat?

There are, of course, exceptions to every scheme. And sometimes a bit of inconvenience for those exceptions. However, if there is a reasonably sound security system based on delivery address, why not support credit card companies (and their cardholders) benefitting from it as much as is practical.

Best regards

Nigel Sedgwick

Whole Postcode?

Why don't the card companies check the whole postcode which, with the house number, is guaranteed to be unique?

Best regards

Boffins prove the existence of jet-setters

Nigel Sedgwick

Very Doubtful 'Anonymisation'

I am concerned that the raw data reported to be 'anonymised' cannot be 'anonymised' in a very large proportion of cases.

This is because location itself is largely an identifier. Even if the locations given are not particularly precise, a high prevalence of the same locations (eg home and work) could quite easily lead to identification of the individual person, with a very high probability of being correct.

Then, obviously, further locations, even if also approximate, could disclose private information about the individual concerned.

Best regards

UK driver details lost somewhere in America

Nigel Sedgwick

Date of Birth?

Did the lost/compromised data include date of birth? It would be nice to know, and also to know that such private data is rated as sufficiently valuable for any risk, or lack of risk, of such compromise to be immediately disclosed by the UK Government.

Best regards