Digital pathology and the big Cs (that’s ‘cancer’ and ‘cloud’)

Have I got cancer? “Maybe,” says my oncologist, “so I’m going to take a biopsy and we’ll have a look.” A small piece of my body, tissue from the potentially cancerous organ, is obtained through an incision, and sent to a pathology lab. A thin, thin slice is cut off, stained with revealing chemicals, and then checked by a …

  1. elDog

    Hard to argue with the good use of technology in this article

    It would be nice to think that in 5-10 years all radiology and pathology samples will be digitised and available for medical and research use. I've already lost most of my privacy so have at it.

    I don't think "It’s competing with IBM’s heavy-hitting Watson initiatives..." is accurate, as the rest of the paragraphs say that there's good room for collaboration.

  2. Anonymous Coward

    Slides are fine; but how much patient data are they intending to bundle with said slides?

    The basic idea is good; but it could very easily go horribly wrong.

    1. John 110

      As an NHS Biomedical Scientist, I can state that we need as much (or little) data as is necessary to uniquely identify the patient. Currently where I work, the minimum would be a 10-digit number which includes the date of birth, but we actually like the Patient's name as well (because patients are people, not numbers). As the NHS has yet to standardize on a patient identifier across the UK, we would probably want the actual date of birth and current address as well, especially if the stored info is to be shared across regional borders. If you are using this stuff for diagnostic purposes, we would probably like some clinical details as well so that we can decide what to test for and interpret the results to your GP/Consultant when we get them.

      Why? Just how little (or how much) do you think is important?

      1. Anonymous Coward

        What I'm envisioning here is that someone goes for a biopsy and all of a sudden their insurance goes up and banks start trying to claw back all their long-term stuff before the patient dies; possibly before the patient has got the results back themselves. That's just two of the most obvious things that can happen; and is not what you really want if you've been diagnosed with something horrible.

        From the sound of it, this information would be shared on a fairly broad basis and it's a statistical certainty that there will be an ethically-bankrupt money-grubbing bastard or two amongst the sharees; if not now then next week. Look at the shit Uber get up to with their data and they are just a taxi company. Extrapolate that attitude to serious stuff like life-threatening illnesses and think of who has a financial finger in the pie (insurance, banks, people the patient does business with; people who stand to inherit, medical profession etc etc) and things could get nasty very quickly.

        For the path:

        Patient --> GP --> Testing centre --> GP --> Patient

        ...you do need a unique identifier, name and history.

        For the collaborative, analytic and statistical work you'll need a unique identifier (because that's how databases work); and I should imagine that some clinical details would be helpful too. The unique identifier doesn't have to have anything to do with the patient though in this context. You'd also have to be very careful with the clinical data too...it's *amazing* what people can fish out of databases. Off the top of my head a 2-system design; one with the patient's actual details that is rabidly secure, that also has an extra field for random numbers that you use as the unique identifier for wider dissemination. That way the patient could be identified/contacted; but you'd need a bloody good reason to gain access to the 'real names' system.

        Not sure what you'd do for the clinical details...I'm not a Biomedical anything so am unsure what 'clinical data' is made of, exactly; so can't really offer suggestions on improving the safety/anonymity.

        The business -as described in the article- is basically 2 distinct fields of operation with diametrically opposed data needs. Field 1 (diagnosing individuals and reporting results back) does need personal information. Field 2 (collaborating and letting Big Data munch on the numbers) needs to be as anonymised as humanly possible while still including *necessary* information.
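The 2-system design suggested above can be sketched in a few lines of Python. This is a minimal illustration, not anything the article or the vendor describes: `IdentityVault`, `research_record` and all field names are hypothetical, and a real system would add access control, auditing and persistent storage. The point is only that the identifier leaving the secure system is random and carries no patient information.

```python
import secrets


class IdentityVault:
    """Hypothetical 'real names' system: the only place a pseudonym maps to a patient."""

    def __init__(self):
        self._by_pseudonym = {}  # pseudonym -> patient details
        self._by_patient = {}    # local patient key -> pseudonym

    def pseudonym_for(self, patient_key, details):
        """Return the stable random pseudonym for a patient, creating one if needed."""
        if patient_key not in self._by_patient:
            token = secrets.token_hex(16)  # random, derived from nothing about the patient
            self._by_patient[patient_key] = token
            self._by_pseudonym[token] = dict(details)
        return self._by_patient[patient_key]

    def reidentify(self, pseudonym, authorised):
        """Look a patient up again; the boolean stands in for a real authorisation check."""
        if not authorised:
            raise PermissionError("re-identification requires explicit authorisation")
        return self._by_pseudonym[pseudonym]


def research_record(vault, patient_key, details, clinical):
    """Build the record that leaves the secure system: pseudonym plus clinical fields only."""
    return {"subject": vault.pseudonym_for(patient_key, details), **clinical}


vault = IdentityVault()
shared = research_record(
    vault,
    "LAB-0001",                        # hypothetical local lab number
    {"name": "Example Patient"},       # stays inside the vault
    {"finding": "benign"},             # hypothetical clinical field
)
```

The same patient always gets the same pseudonym, so records can still be linked for analysis. As the comment warns, the clinical fields themselves can still give a person away, and this sketch does nothing about that.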

        1. djhu

          Correct me if I'm wrong but isn't that highly illegal for an insurance company to do, at least in the US (HIPAA)?

        2. TivoExPat

          Definitive listing of Meta-data

          See DICOM PS 3.3 (found at medical.nema.org), section A.32.8.1, "VL Whole Slide Microscopy Image IOD Description", for a complete description of the international standard for encoding all of the information (including meta-data) that was considered relevant by those involved. Note that there are mechanisms identified within the standard for clinical trial information, anonymization (de-identification), and digital signatures.

          The Patient Information Entity contains the patient identification.

          Patient ID is not universally unique as defined by most clinical sites, so the cloud service provider would probably need to intervene in de-identifying prior to storage in the cloud with a mechanism for providing a secure re-identification method to the local clinician (and away from cloud storage).

          Not sure why storage in the cloud would be so much better than local storage, and for images as big as they say, wouldn't upload bandwidth be an issue?

          Machine learning part would certainly be facilitated by having centralized storage (and centralized processing... not sure the clinicians want to foot the bill for the machine learning).
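One way to picture the de-identify-before-upload step with local re-identification is a keyed hash over a site-scoped patient ID. The sketch below is an assumption for illustration only: it is not the mechanism defined in the DICOM standard, and `cloud_pseudonym`, `LocalReidentifier` and the site and patient values are hypothetical names. Scoping by site addresses the point that a Patient ID is not unique across clinical sites.

```python
import hashlib
import hmac


def cloud_pseudonym(secret_key: bytes, site_id: str, patient_id: str) -> str:
    """Derive a cloud-side identifier from a site-scoped patient ID with HMAC-SHA256."""
    # Length-prefix each part so ("AB", "C") and ("A", "BC") cannot produce the same input.
    message = b"".join(
        len(part).to_bytes(4, "big") + part
        for part in (site_id.encode("utf-8"), patient_id.encode("utf-8"))
    )
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


class LocalReidentifier:
    """Hypothetical lookup table kept at the clinical site and never uploaded."""

    def __init__(self, secret_key: bytes, site_id: str):
        self._key = secret_key
        self._site = site_id
        self._lookup = {}  # pseudonym -> local patient ID

    def register(self, patient_id: str) -> str:
        """Compute the pseudonym to send to the cloud and remember the mapping locally."""
        pseudonym = cloud_pseudonym(self._key, self._site, patient_id)
        self._lookup[pseudonym] = patient_id
        return pseudonym

    def resolve(self, pseudonym: str) -> str:
        """Map a pseudonym coming back from the cloud to the local patient ID."""
        return self._lookup[pseudonym]
```

Without the secret key, the cloud side cannot recompute a pseudonym from a guessed patient ID, which is the reason for a keyed hash rather than a plain one. The key and the lookup table would both stay with the local clinician.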

      2. Arachnoid

        Put it another way

        How much personal data is presently included on the system [slides?] that is used at the moment for diagnosis?

        1. John 110

          Re: Put it another way

          The main aim is not to give you the wrong (i.e. someone else's) result. So the basic data would be (as I said above) enough to uniquely identify an individual.

          The secondary consideration is to use any details supplied by the clinician to inform the result. In the case of a tissue biopsy, the clinician might tell us "wobbly lump on bottom" and we'd have to give consideration to the observed changes based on that. If in addition they tell us "has a habit of lying face down on a sunbed for hours", then we would take that into consideration when the result is interpreted.

          Something to bear in mind is that with the current systems where I work, the images are stored on one server, referenced by a unique identifier (Lab number) and the patient data and the results, including the interpretation are stored on the Lab system on a server under our control. So in my world, the patient information would not be "cloudy" even though the raw image data might be. Insurance companies would then have to breach the firewalls to grab patient identifying information and data. Or the evil government of your choice could sell it to them, I suppose.

          1. This post has been deleted by its author

            1. Anonymous Coward

              Re: Put it another way

              Sorry. I didn't phrase it well. You're thinking about how to make your bit run better; but how do you make the whole thing run smoothly? All of it.

  3. Anonymous Coward

    Secure storage linked to insecure storage with added feature to enable higher use of insecure storage, what could possibly go wrong?

  4. phil dude

    personal data, encryption etc...

    I have had to work on a few of these slides for writing image segmentation software (strictly academic, but informative). Very interesting use of mathematics for boundary detection.

    It would be a great advance to have the meta information available as more intense diagnostics become available (e.g. genomic, blood analysis). Then, as models of diseases for each phenotype are determined, it may be possible to "predict" outcomes based upon other, more common diagnostics.

    I put "predict" in quotes because I have read/heard discussion in the biomedical community on the value of prevention as precursor to prediction (i.e. recognise symptoms early).

    Back on the IT, perhaps patient records need to be hierarchically encrypted. Say, the HIPAA data is one key, the clinical data another key, with special permission needed for personal features (e.g. X-rays, MRIs).

    Anyone know if there is a mechanism to achieve this?

    P.
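One possible answer to the question above is a key hierarchy: derive a per-patient key from a master key, then a separate key per category of data from that. The sketch below shows only the derivation, using HMAC-SHA256 from the Python standard library. The labels, the `record-key:` prefix and the function names are assumptions made for illustration. The encryption itself would be done with an authenticated cipher from a vetted library, which is not shown here.

```python
import hashlib
import hmac
import secrets


def derive_key(parent_key: bytes, label: str) -> bytes:
    """Derive a 32-byte child key from a parent key and a label (one-way)."""
    message = b"record-key:" + label.encode("utf-8")
    return hmac.new(parent_key, message, hashlib.sha256).digest()


def record_keys(master_key: bytes, patient_ref: str) -> dict:
    """Derive a per-patient key, then one key per category of data beneath it."""
    patient_key = derive_key(master_key, "patient/" + patient_ref)
    return {
        "patient": patient_key,
        "identity": derive_key(patient_key, "identity"),  # names, addresses
        "clinical": derive_key(patient_key, "clinical"),  # clinical details
        "imaging": derive_key(patient_key, "imaging"),    # slides, X-rays, MRIs
    }


master = secrets.token_bytes(32)  # in practice held in a key-management system
keys = record_keys(master, "subject-0001")  # hypothetical patient reference
```

Whoever holds the per-patient key can recompute every category key under it, while someone handed only the clinical key cannot work back to the identity or imaging keys. That gives the "one key per kind of data" arrangement, with the parent key acting as the special permission.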

  5. P. Lee

    Why is it cloud-based?

    What advantages does that provide?

    1. jonathanb Silver badge

      Re: Why is it cloud-based?

      The advantage is that it allows them to say it is cloud based, a very important word in the buzzword bingo, and nobody will take you seriously if you don't use clouds.

    2. Michael Wojcik Silver badge

      Re: Why is it cloud-based?

      Capacity can be expanded immediately to meet demand. That's the whole damn point of utility IT.
