Data is the new uranium – incredibly powerful and amazingly dangerous

I recently got to play a 'fly on the wall' at a roundtable of chief information security officers. Beyond the expected griping and moaning about funding shortfalls and always-too-gullible users, I began to hear a new note: data has become a problem. A generation ago we had hardly any data at all. In 2003 I took a tour of a new …

  1. Headley_Grange Silver badge

    It's a poor analogy. Uranium is intrinsically dangerous. Data is harmless until people do bad things with it.

    1. hh121

      Refined data might be (relatively) harmless when held by the white hat that refined it ("don't be evil" anyone?), but if it gets into the wrong hands... this is what CISOs are bricking themselves about. Seems like a pretty apt analogy to me.

      Nothing's going to stop most of these data gathering and refining exercises though, unless they are made illegal or highly regulated.

    2. vtcodger Silver badge

      why I downvoted

      I hate it when folks downvote my comments without explanation. So here's my explanation for downvoting yours.

      I don't think your post is moronic or anything like that. But I do disagree. The current fad in IT seems to be that data is great. One can't get enough of it. Never can tell when some trivial item you've collected will come in handy. The problem is that your operation is undoubtedly accessible online. If nothing else, purchasing and accounts payable are probably dependent on the Internet. And the Internet is wildly insecure. But, you say, our operation is secure. Not likely. In point of fact, the number of CVEs (potential vulnerabilities) has increased pretty steadily from 1438 in 2000 to 28961 in 2023. Are all your potential vulnerabilities mitigated? Or even identified? Almost certainly not. Sure looks to me like the Internet is an ongoing threat that is not remotely under control. AI will, I think, only make the situation worse.

      So, I think it likely that attitudes will slowly shift from "data is great, can't get enough" toward "data is toxic, save only what you need and get rid of it when you no longer need it".
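For a sense of scale, the two CVE counts quoted above can be turned into an average yearly growth rate. This is only a back-of-envelope sketch: it assumes smooth compound growth between 2000 and 2023, which the comment does not claim, and the function name is made up for illustration.

```python
import math


def annual_growth_rate(start_count: float, end_count: float, years: int) -> float:
    """Compound annual growth rate between two counts (assumes smooth growth)."""
    return (end_count / start_count) ** (1 / years) - 1


# Figures as quoted in the comment above.
cve_2000 = 1438
cve_2023 = 28961

rate = annual_growth_rate(cve_2000, cve_2023, 2023 - 2000)
# Years needed for the count to double at that constant rate.
doubling_years = math.log(2) / math.log(1 + rate)

print(f"{rate:.1%} per year, doubling about every {doubling_years:.1f} years")
```

Under that assumption the count grows by roughly 14% a year, which means it doubles every five years or so.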

      1. Anonymous Coward

        Data has a cost...

        Hit with staggering charges for several TB of info storage, I proposed archiving. Anything > 7 years (our local legal requirement) was archived off production storage to a NAS.

        Within minutes of that going into effect, we had a request for Baby Shower photos from 2012 from HR for a retirement dinner, the actuarials needed spreadsheets from 2007 onwards and another department demanded all of their files be restored (earliest date was 1997). We brought all of them back and started charging each department on their budget based on every 100GB chunk. Within two months every department had deleted roughly 1/3 of the old crap I had archived.

        Users...
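The two rules in this story (archive anything older than seven years, then bill restored data per 100GB chunk) are simple enough to sketch. The price per chunk and the choice to round partial chunks up are assumptions for illustration; the comment gives neither.

```python
import math
from datetime import date

RETENTION_YEARS = 7     # the commenter's local legal requirement
CHUNK_GB = 100          # billing unit mentioned in the comment
RATE_PER_CHUNK = 25.0   # hypothetical price per chunk, not from the thread


def retention_cutoff(today: date, years: int = RETENTION_YEARS) -> date:
    """Date before which a file counts as archivable."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        # today is 29 February and the target year is not a leap year
        return today.replace(year=today.year - years, day=28)


def should_archive(last_modified: date, today: date) -> bool:
    """True if the file was last modified before the retention cutoff."""
    return last_modified < retention_cutoff(today)


def chargeback(restored_gb: float) -> float:
    """Bill for restored data; partial chunks are rounded up (an assumption)."""
    chunks = math.ceil(restored_gb / CHUNK_GB)
    return chunks * RATE_PER_CHUNK


today = date(2024, 11, 1)
print(should_archive(date(2012, 6, 1), today))  # the 2012 baby shower photos
print(chargeback(250))                          # a department restoring 250GB
```

Rounding up is the choice that makes a department think twice about restoring "just a few" files, which seems to be the effect described.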

        1. Doctor Syntax Silver badge

          Re: Data has a cost...

          Somebody else's budget is always cheaper.

        2. AVR Bronze badge

          Re: Data has a cost...

          The older the data, the less you save (in dollars or potential pain) by deleting it though. Older data is less likely to be sensitive and tends to take up less storage space than current data: fewer videos, fewer bloated PDFs that someone has decided to keep in ten marginally different versions, and any stray vast spreadsheets or even databases tend to be smaller. Very old data certainly should be tidied up, but even finding the people knowledgeable enough to do so (if they're still employed) and getting them to spend their time doing so may be more pain than it's worth.

      2. Ian Johnston Silver badge

        Re: why I downvoted

        I downvoted because depleted uranium is about as harmless as stuff gets, unless someone is firing it at you. But then few things are harmless when fired at you.

        1. Anonymous Coward

          Re: why I downvoted

          > I downvoted because depleted uranium is about as harmless as stuff gets

          A pedant notes: TFA refers to uranium, not depleted uranium.

    3. Blazde Silver badge

      Data is harmless until people do bad things with it

      There are frequent data 'breaches' that don't involve bad people doing bad things. An S3 bucket was accidentally left exposed.. it probably didn't actually leak but there aren't logs to prove that, and so legal, financial and reputational consequences follow.

      Not only that but there are an abundance of people wanting to do bad things with data, so whatever intrinsic difference in dangers there is seems moot.

      It's actually a good analogy precisely because it challenges your perspective.

  2. Anonymous Coward

    I hope this catches on. Less data gathering could be a very positive move for cost control and everyone's happiness.

    1. Harry Kiri

      Data is worthless

      Data has no value. Information extracted from data has value. Just storing data and hoping for the best is bottom-up nonsense. You should understand what problem you're trying to solve, and from this derive an approach and the data (and its quality) you need to support that. Data supports something you specifically want to do; it's not an end in itself.

      1. Cris E

        Re: Data is worthless

        Truth. So much of that historical raw data is never touched. When was the last time an information harvesting journey went all the way back into history to find nuggets? Usually important fields are missing or incomplete so the search starts today, or only a few months back, and all that tasty looking 2020 log and audit data turns out to have spoiled when no one was looking. If you can't explain why you're keeping it, most historical stuff is risk without corresponding value.

      2. Ken Moorhouse Silver badge

        Re: Information extracted from data has value

        I would add...

        If it is correctly interpreted.

      3. Bilby

        Re: Data is worthless

        Neither data, nor information derived therefrom, is of any value, unless the original data is true, correct, and accurate.

        Business decisions are made on the basis of data that is believed to be all of these things, but almost never is any of them.

        Data is almost always wrong. The small amount that is not is rapidly mixed in with the data that is; and while two wrongs don't make a right, nor do a right plus a wrong.

        We have a system at my place of employment where workers can notify extra minutes worked. This is used to pay them for their actual time; but when submitting an adjustment, they are required to select a reason from a drop down, which has seven possible causes of delays, and no "none of the above" option.

        I am not sure what happens to this "reason" data. The best case is that it is discarded/ignored, and exists only to waste time for people who are already under time pressure.

        My fear is that it is used to formulate policy in an attempt to reduce lateness; for which, as it consists of essentially random tallies of non-reasons, selected solely because the actual reason wasn't an allowed input, it will be spectacularly awful.

        But it's in the computer, and it appears on reports and graphs. So it is believed.

        1. deadlockvictim

          Re: Data is worthless

          Bilby» Neither data, nor information derived therefrom, is of any value, unless the original data is true, correct, and accurate.

          The relatives of the late Mr. Archibald Buttle would disagree with you.

        2. wimton@yahoo.com

          Re: Data is worthless

          A long time ago, I had a manager who required extremely detailed project breakdown reports.

          I knew that this data was not used afterwards, so I wrote a spreadsheet generating these randomly, with the total amount of hours spent and the weight of the individual items.

          Everybody loved it, but the manager was a bit miffed when I told the story at my goodbye party.

  3. Pascal Monett Silver badge

    "We don't know what a 'data Chernobyl' might look like"

    I imagine that the tens of millions of people who've already had their identity stolen and their bank accounts abused (not to mention their credit rating) might have a slight notion of what that might look like . . .

    1. Al fazed
      IT Angle

      Re: "We don't know what a 'data Chernobyl' might look like"

      However, many as yet undiscovered organisations are busy stuffing unvalidated, incorrect, inaccurate, highly misleading data into their internal data silos.

      Why ?

      Because the muppets taking a wage to supply this data think they are not paid enough to ensure the data's integrity before submitting this shite into the company database.

      In the instance I am thinking of, it's going to be as big a concern to the organisation as the Fujitsu/Oracle/Post Orifice scandal.............when it finally surfaces, with teeth enough to bite the organisation's CEO and senior managers for Professional Negligence.

      In truth, their employees are charged with making accurate records of meetings with their clients. However, owing to lack of proper oversight, all sorts of utter rubbish is being entered into the data trawl and then it's regurgitated for use in County Court prosecutions, removing disabled vulnerable people from their homes, even sending the poor unsupported defendants to prison for things like Anti Social Behaviour which is in fact noted as NOT ASB in company documents like Tenancy Agreements and the Anti Social Behaviour Policy Handbook.

      Not to mention breaches of GDPR (UK), the Disabilities Discrimination Act, The Equalities Act, to name a few.

      It really beggars belief when a complaint about a solicitors' firm hiding the defendant's evidence from the court, by not including it in the court bundle, is brushed aside because, in the words of the "legal firm", it's "OK, because the defendant did not rely on it in court". Not to mention ASB Case Officers being able to freely commit perjury, with utter bollox falsely entered as being "the truth" in the witness statements. Judges regard the incorrect, inaccurate, downright lies as being the truth and will happily plough on with the prosecution instead of stopping the trial in its tracks.

      It allows legal representatives to totally misinform the ICO, the SRA, the Housing Ombudsman and others who may be seeking the truth.

      Anyway, as a professional with more than 30 years in this sector, I can only say bring on Chernobyl and wipe out this fallacy that all of the data stored is indeed valid, verified, authentic, rather than a Sunak type of Data Unicorn which must be kept in the data silo, at all costs.

      ALF

  4. veti Silver badge

    Good trend

    Someone - and I'm thinking CISOs as a group are better placed to do this than most anyone else - needs to quantify the true cost of collecting and keeping data. Then send that bill to the people (mostly marketing, I imagine) who do it.

    Then maybe anyone who wants to add a new tracking cookie will need to submit a cost-benefit analysis. Maybe that will slow them down a bit.

    1. Anonymous Coward

      Re: Good trend

      Hear hear. Read the line "the cost of managing data sometimes exceeds its value" and came to the comments section to say - good, we're heading in the right direction. How about larger fines for data leaks? It would be a Good Thing for businesses to really stop and think through what data they MUST have, especially about customers (and even more so, about people who they're trying to get to be customers). For non-customers, the answer really should be "zero". (Note: someone seeing an advertisement is NOT a customer.)

  5. Neil Barnes Silver badge
    Mushroom

    The best place for uranium is in the ground

    Unless you need it for something useful, like keeping the lights on.

    So perhaps not the best analogy; I remain unconvinced about the benefits of critical masses of information. Mostly, _that's_ best left in the ground.

  6. Guy de Loimbard Bronze badge
    Facepalm

    Glad to hear it's being discussed

    Just because we can store data and keep doing so until either the cost of the cloud storage goes through the roof, or our NAS/SAN is full, doesn't mean we should.

    The problem is as much about scalability without any real thought, as it is about whether we should be collecting so much data in the first place.

    Just look at your classic office user, look at how large their email storage is, if it has a quota limit at all. Reams of inboxes have mega/terabytes of crap stored just because "I may need it" and there's no consequences if you don't manage it, until it's full of course.

    My analogy to most users is: "If that was physical mail coming through your post box, you would have got rid of most of it within a matter of a day or two, why are you storing emails from 10 years ago?"

    "Unlimited storage" is baked into the cost of your licensing most of the time, but it's well hidden to the point it doesn't appear to have an empirical value that can be scrutinised.

    So, until we re-educate data users, we've got a long way to go before we can secure it all.

    1. TheBruce

      Re: Glad to hear it's being discussed

      Had to start a quota on emails once the workers ran out of their local disk quota...

    2. Doctor Syntax Silver badge

      Re: Glad to hear it's being discussed

      An inbox is not the place to store read mail. A well-run paper-based system will have a filing system to handle old documents. What doesn't fit the filing criteria won't get filed. Problem solved.

      Having said that, and having worked in an organisation with a filing system like that, it was important to keep lab notes, instrument charts and any case document received from outside because it might become important a few years down the line. That would have applied even if the information was incorrect; in fact it might have been even more important to have preserved a copy if it was incorrect. Cases can have a long life so the case files have to as well. The best way of dealing with that sort of storage problem back in those days was microfilm.

    3. hitmouse

      Re: Glad to hear it's being discussed

      I had to deal with over-zealous creators of PDF newsletters sending their difficult-to-read wares everywhere so that thousands of people had copies in their email storage.

      The problem was vastly compounded by the PDFs containing very-very-high resolution photos of guest speakers (20MB files) that they dropped in without reducing them to a respectable 50k or less for the thumbnail visible.

      Trying to convince these people to send links to a URL where the newsletter sat (and could be post-corrected/updated) was like asking them to sacrifice their first born.

  7. Locomotion69 Bronze badge

    And in the near future the problems with data only get worse due to AI learning efforts.

    Not to mention the need for power to run -and cool- all the storage systems around.

  8. Bebu sa Ware
    Windows

    the medium ...

    Not quite channeling Marshall McLuhan but once upon a time all the "older" data was migrated onto tape which was stored nearby and after a period was shipped off to cold storage in some warehouse in the boondocks to molder and be forgotten.

    Unfortunately there is now so much hot storage that old data lingers on like an ancient inconveniently long lived relative kept alive by a mercantile medical industry.

    A lot of big tech would benefit from an extensive application of involuntary euthanasia. (Literally and figuratively. :)

    A decent Carrington Event could solve this and a few other problems. ;)

    1. Doctor Syntax Silver badge

      Re: the medium ...

      OI, I'm now ancient and find that there's nothing inconvenient about being kept alive.

  9. Wang Cores

    We will get data protection...

    ...Right after they decide it's time for a big war (gotta distract from those yuge tariffs) and beat the drums for a big Shooty Shooty Bang Bang in the Pacific and, the Opposing Nation, being more than likely to be a near-peer state with multiple experienced hacking ops, will simply zero out everyone's accounts or DDOS financial processors.

    1. O'Reg Inalsin

      Re: We will get data protection...

      Not just the Pacific. 100K NK troops already on their way to Europe.

      1. Wang Cores

        Re: We will get data protection...

        I say the Pacific because that's pretty much the only area where the incoming admin won't sell out an ally... cheaply. The Ukrainians are straight fucked if they expect anything short of "capitulate and we won't roll the Bones (B-1 Lancers) on you."

  10. Doctor Syntax Silver badge
    Megaphone

    At last.

    Icon: need to amplify sound of penny finally dropping.

  11. ecofeco Silver badge
    Mushroom

    Yes, but which hand is crucial to human survival!

    Are they saying it's NOT paramount to know which hand we wipe our arse with? That's just crazy commie talk! Won't someone please think of the data center bros and list sellers! And what about the knock-on detrimental effect to hackers and ransomware markets! And just who will train the AIs, pray tell?!

    This will not stand! There is no such thing as too much useless trivial data! Hoarding is a vital component of human existence!!

  12. MrAptronym

    I worked in a lab that had, a decade before my arrival, bought a kilo of a uranium compound (not yellowcake, not even enriched). The lab only actually used a gram or two a year at most. The lead for the lab did this because it was much cheaper to buy in bulk: Maybe 4 times the price for 20 times the chemical.

    Then the funding for uranium projects dried up and we just had about 980 grams of the stuff left. The govt takes accounting very seriously, so the material (held in a locked room) had to be accounted for every quarter. The area was checked for contamination monthly, security was maintained and paperwork was kept in perpetuity. We could have gotten rid of it, but it turned out the material would cost thousands to dispose of. So a part of my job was to maintain compliance on this single locked up container of the stuff. Every month I swabbed the room, I did my training, I filed and signed my paper work and calibrated the detection instruments. It also meant I got to take the locked room as my own private lab space, away from other pesky people in the lab :P

    I wonder how many people are in a similar situation with servers full of unused data right now.

  13. captain veg Silver badge

    Nothing to hide

    I have no principled objection to, say, the recording of the fact that I ran over a pressure strip on a road near my home in the pursuit of analysing traffic patterns.

    I have every objection to the photographic recording of my mug whilst I did so on my bicycle. Or of my number plate whilst motorised.

    There's data, and there's PII. The latter needs regulating to within a millimetre of its life.

    -A.

    1. Anonymous Coward

      Re: Nothing to hide

      > There's data, and there's PII. The latter needs regulating to within a millimetre of its life.

      There's PII and there's Personal Data - PII is typically a USA term and is a subset of "Personal Data", which is the term that the (UK) GDPR and (EU) GDPR use.

      It makes sense to use the correct term when referring to something (I'm assuming you're in the UK not USA).

    2. Ian Johnston Silver badge

      Re: Nothing to hide

      You did the driving or cycling in public, so you can't complain if someone keeps a record of it. Sorry.

  14. Anonymous Coward

    Email deletion policy

    Default delete all emails older than x years. Solves so many ills.

    1. Doctor Syntax Silver badge

      Re: Email deletion policy

      You have a dispute about some object or service you bought years ago. The other side has kept its paper trail, you don't. Guess who loses.

      1. Ian Johnston Silver badge

        Re: Email deletion policy

        The Guardian would have lost its case against Aitken if credit card receipts from the Ritz hadn't survived in a box. When cases of alleged wrongdoing are brought decades after the events in question it makes sense to store relevant data for decades too.
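The two positions in this sub-thread (delete by default, but keep the paper trail you may need in a dispute) are not incompatible. Below is a minimal sketch of a default-delete rule with an explicit keep flag. The `Message` class, the `on_hold` flag and the 365-day year are all assumptions made for illustration, not any mail system's real API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Message:
    subject: str
    received: date
    on_hold: bool = False  # hypothetical flag: tied to a purchase, dispute or case


def expired(msg: Message, today: date, max_age_years: int) -> bool:
    """True if the message is past the age limit and not flagged to keep."""
    age_days = (today - msg.received).days
    # A year is approximated as 365 days; good enough for a sketch.
    return not msg.on_hold and age_days > max_age_years * 365


def apply_policy(mailbox: list[Message], today: date, max_age_years: int) -> list[Message]:
    """Return the messages that survive the default-delete rule."""
    return [m for m in mailbox if not expired(m, today, max_age_years)]


mailbox = [
    Message("Newsletter", date(2015, 3, 1)),
    Message("Ritz receipt", date(2015, 3, 1), on_hold=True),
    Message("Lunch?", date(2023, 9, 12)),
]
kept = apply_policy(mailbox, today=date(2024, 11, 1), max_age_years=5)
print([m.subject for m in kept])
```

The hard part is deciding, years in advance, which messages deserve the flag, and no code solves that.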

  15. hitmouse

    The data will never be managed properly while there are unimaginative CIOs who just look at the rapidly expanding stockpile as an opportunity to grandstand over new cloud platforms, and "smart AI management" and the rest. (This resume-padding approach applies in other business areas obviously).

  16. Tron Silver badge

    There is an alternative.

    Doing stuff on paper is a lot cheaper with less risk. Creating and then having to curate data you don't need is nuts.

    The original design of social media could have gone several ways. Run all data through the company server so they can monitor it and monetise it, encrypt it whilst it moves through company systems or move the data user to user, so the company never sees it.

    They chose the first way on the basis that more data is better. But it backfired. Now they are expected to spy on it, censor it, block it and gate access to it, which will eventually cost more than they can make on it, and their services will end.

    They should have chosen the last way, moving data user to user via the e-mail protocol. The company would have still been able to monetise it, but without the costs of bandwidth, storage, and responsibility as a publisher, libeler, disseminator or enabler.

    They could simply change, but there is no innovation in GAFA any more, and the window for being allowed to may be closing as governments take over.

  17. amanfromMars 1 Silver badge

    For a Braver Neureal World of Surreal Order ?????

    Easy to Imagine and Impossible to Prevent as be Evidenced by Current Happenings/Novel Trails and Noble Trials and AI Tribulations.

    Surely, Mark Pesce, regarding ....

    CISOs therefore increasingly feel that the cost of managing data sometimes exceeds its value. Those I observed have found themselves wishing for a world with less data that needs securing.

    ....the actual real live, realtime and expanding increasingly problematic difficulty and relentless exploitable vulnerability which more than just a few would advise is impossible to address and resolve presently with a universally accepted solution, and which is leading the ignorant and arrogantly fooled to future unavoidable slaughter, is more accurately shared with the information that CISOs find themselves wishing for a world in which they are not floundering and failing to protect security with exclusive executive secrets suddenly made known for open intelligent source sharing/world wide web presentation ...... for once out there and readily available for viewing and global dissemination, are those genies never going back into any type of Majestics' MAJIC* lamp.

    AI certainly recognises the situation for publishing and is both able and able to enable in A.N.Others the making and taking of overwhelming advantage in the opportunities newly realised and freely shared.

    * ....... MAD** Allied JOINT*** Intelligence Committee

    ** ...... Mutually Assured Destruction

    *** ..... JOINT Operation Internetworking NEUKlearer Technology

  18. Stoic Skeptic

    ... the cost of management sometimes exceeds its value ...

    A sign of poor management.

    1. neilg

      the cost of management sometimes exceeds its value

      the cost of management always exceeds its value.

  19. file

    Is the issue the amount of data, or the complexity of systems?

    As I see it, the problem is as much about munging lots of systems together to achieve some kind of greater functionality. The individual systems are often from different teams or vendors, all with their own way to store data, their own ideas about unique identifiers and access control.

    It becomes rather hard to manage it all...
