
Anonymisation of data
Wow. Never heard of that before. It's amazing what novel ideas can come up reading el Reg.
Imagine you’re in charge of technology and data for part of the UK’s chronically cash-squeezed National Health Service. A world-famous technology firm offers you a cool new service, either free or for very little money. All it wants in return is access to the patient data that will make the service work. What are you going to do …
Anonymisation of data is not sufficient. Pseudonymised data (i.e. any dataset where a 1:1 mapping back to the individual still exists) must be treated as personal data, as if it were not masked at all, because of the proven ease of reconstructing an identity from the remaining metadata.
This is one of the key reasons both the ICO and Caldicott condemned the programme: just anonymising the data is not enough. It can be a positive step, but in isolation it is never enough to treat the data as if it weren't personal information. Consent *plus* anonymisation, with the anonymisation process explained to patients in order to obtain their positive consent, might have been sufficient.
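To make that concrete, here's a minimal sketch of a linkage attack in Python (pandas assumed; all records and column names made up): date of birth, full postcode and sex are famously enough to single out most of the population, so one join against any public dataset carrying the same fields undoes the pseudonymisation.

    import pandas as pd

    # "Pseudonymised" extract: NHS number swapped for a random token,
    # but the quasi-identifiers (DOB, postcode, sex) left intact.
    clinical = pd.DataFrame({
        "token":     ["a9f3", "c77e", "d102"],
        "dob":       ["1969-04-12", "1980-11-02", "1954-06-30"],
        "postcode":  ["NW3 2QG", "E1 6AN", "SW1A 1AA"],
        "sex":       ["F", "M", "F"],
        "diagnosis": ["CKD stage 3", "AKI", "hypertension"],
    })

    # Any public or purchasable dataset with the same fields will do,
    # e.g. an edited electoral roll enriched with marketing data.
    register = pd.DataFrame({
        "name":     ["Jane Doe", "John Smith"],
        "dob":      ["1969-04-12", "1980-11-02"],
        "postcode": ["NW3 2QG", "E1 6AN"],
        "sex":      ["F", "M"],
    })

    # One merge and the "anonymous" diagnoses have names again.
    reidentified = clinical.merge(register, on=["dob", "postcode", "sex"])
    print(reidentified[["name", "diagnosis"]])

No machine learning, no metadata wizardry: a single join.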
The approach taken by Genomics England is probably the way forward. Data never leaves your system. You buy in services and software, you don't give away the data. Ever. This is also the approach taken by the ONS for their new "data campus", enabling much broader research access to government data sets.
"Smith says that such work should go through review processes akin to those for drug trials – which would probably have rejected the handover of 1.6 million patient records. "
Strange how shiny IT makes pretty standard ethics procedures fly out of the window. Special case of Jaron Lanier's 'siren servers' perhaps??
Coat: glad they didn't get fined. Mine's the one with the appointment card in the pocket.
Actually yes it should. That owner has been shown time and again to slurp any and all data available to it first, and worry about the legality of it later. When caught, they blame "a rogue engineer who added the code" and other ridiculous excuses. Who owns DeepMind should definitely be a concern for anyone whose most personal data is ultimately being shared with a company that cannot be trusted.
Actually no it shouldn't. The contract should be built with safeguards that explicitly state that the data is for the use of the named company only, and not any subsidiaries or parent owners, so that there is no grey area or wiggle room.
Also, wasn't the rogue engineer story to do with the VW emissions thing? Haven't heard that one from Google yet.
DeepThroat or LongTongue preferred.
On a serious note... properly anonymised data, please. Strip any and all data that can potentially be used to identify the OWNER of that data and we have a positive. Of course, if we are looking at regional variations, some location data may be required. However, a full postcode/zip code is going too far.
The word owner is capitalised because corporate entities forget just who owns the data. Being an aggregator of data does not equate to ownership.
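As a rough sketch of what that minimum bar looks like (hypothetical column names, pandas assumed): drop the direct identifiers outright, and coarsen the quasi-identifiers, e.g. full postcode down to postcode area and exact date of birth down to year.

    import pandas as pd

    DIRECT_IDENTIFIERS = ["name", "address", "nhs_number", "phone", "email"]

    def strip_and_generalise(df: pd.DataFrame) -> pd.DataFrame:
        # Direct identifiers go entirely.
        out = df.drop(columns=[c for c in DIRECT_IDENTIFIERS if c in df.columns])
        # A full postcode pins the OWNER to a handful of households;
        # keep only the area letters, e.g. "NW3 2QG" -> "NW".
        if "postcode" in out.columns:
            out["region"] = out["postcode"].str.extract(r"^([A-Z]{1,2})", expand=False)
            out = out.drop(columns=["postcode"])
        # Exact date of birth is a classic quasi-identifier; keep the year only.
        if "dob" in out.columns:
            out["birth_year"] = pd.to_datetime(out["dob"]).dt.year
            out = out.drop(columns=["dob"])
        return out

As the earlier comments point out, though, even this on its own doesn't make the data non-personal; it just raises the cost of re-identification.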
I know of an attempt to anonymise voting data from previous years to test a proposed new nomination and voting method for an annual literary award. Despite initial statements by the people writing the code, it turned out the historical voting data couldn't be anonymised well enough to prevent reconstruction of personal identities without also destroying the information the test software needed to see what effects the new method would have on nominations and voting. Just removing the basic ID fields (name, address, etc.) wasn't enough to stop someone willing to put in the effort from determining who had voted in previous years, and what they had voted for, if they got hold of the raw data.
It could be that health records are in the same boat: either they are processed by experimental software in a secure, trusted and licensed sandbox, with only the aggregate results made available to researchers, or they cannot safely be used as research material at all.
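If so, the release step of such a sandbox is not exotic. A minimal sketch (hypothetical field names; the threshold mirrors the small-cell suppression rule commonly applied to official statistics):

    import pandas as pd

    MIN_CELL_SIZE = 5  # never release a statistic describing a handful of people

    def release_aggregates(records: pd.DataFrame, by: str, measure: str) -> pd.DataFrame:
        # Runs inside the sandbox; only these summary rows ever leave it.
        summary = records.groupby(by)[measure].agg(["count", "mean"]).reset_index()
        # Small-cell suppression: drop any group too small to hide an individual.
        return summary[summary["count"] >= MIN_CELL_SIZE]

Raw records never cross the boundary; researchers get counts and means per group, and any group small enough to single someone out is suppressed.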
If so, a few thousand or more requests should tie up DeepMind for an hour or two.
Would not the Trust having a small DeepMind datacentre on premises mitigate this issue? If the data cannot be taken off Trust premises, and therefore cannot be de-anonymised by DeepMind, the Trust would be more trustworthy. I refuse to accord DeepMind, or any other Google-related entity, any trust at all.
"If this was a new drug, the approach Google took was to synthesise a new molecule and inject it into a random bunch of people who walked into Accident and Emergency and see what happens."
Surely people know the difference between analysing a person's data and injecting an actual drug into a person? Privacy is important, but so is perspective.
And usually in a poorer light than this.
Yes, strong contracts and penalty clauses (and enforcing those penalty clauses) will help.
But.
Google is a data fetishist with a high recidivism rate (like other kinds of socially unacceptable fetishists).
IOW, they can no more keep their hands off patients' data (and it's the patients' data, not the NHS's to sell or give away) than Uncle Ernie can keep his hands off Tommy (nice work by Phil Collins, really channelling his inner nonce).