Tired: Data scientists. Wired: Data artists

Data scientists are important, but what the world needs now is data artists, according to analysts at Gartner's Data and Analytics Summit in Sydney, Australia. Analysts Sally Parker and Peter Krensky explained that data artists are people who ask wider – even perhaps tangential – questions about data, and what it might reveal …

  1. Pete 2 Silver badge

    Wondering out loud

    > Asking really good questions about what data can describe matters more than collecting more info

    Has anyone collected any info to back up that claim?

    1. xyz Silver badge

      Re: Wondering out loud

      I'm doing the very thing just now and management is really going for it. It takes imagination as well as a deep understanding of the data. Never heard the term data artist until now... I think the techs refer to me as the data arse.

      1. Bitsminer Silver badge

        Re: Wondering out loud

        It takes imagination as well as a deep understanding of the data business.

        FTFY.

  2. Blergh

    > The pair cited the example of a hotel chain that analyzes customers using just two data points: whether they use the gym, and if they choose healthy food.

    And how did they manage to narrow it down to just those two data points? By analysing a big pile of lots of data points!

    And then of course in six months' time the question they want to answer will change, and they will then need three data points. But which ones?

    1. Anonymous Coward

      You know the presentation is written by the technically impaired when instead of philosopher you read "Data Artist". There are other words that could have been used: analyst, statistician, surveyor, but it was an out-of-work philosopher that wrote this drivel.

      Sit and let me tell you about... shapes.

  3. tiggity Silver badge

    Poor models

    The Belgium bus issue seems like really poor data scientists to me - should not have needed the people with real world knowledge to point out what they had woefully overlooked.

    If you are modelling vehicles then the effects of road conditions (be that "hills", the number of extra stop-starts for junctions, scheduled stops on a route, pedestrian crossings, time spent at different speeds etc.) should all be part of the model. E.g. most owners of non-EV cars know that a lot of short, low-speed urban journeys cause a lot more wear & tear than covering the same distance in a single journey primarily on "interruption free" roads, mainly at 50 - 60 MPH.

    1. trevorde Silver badge

      Re: Poor models

      or the data scientists could have just talked to the operations people at the start. They're the ones at the coal face and know what's going on. They don't need a whole lot of numbers to tell them what they already know. They were probably pointing out for years that rotating the buses would spread the wear...

      1. Mike 137 Silver badge

        Re: Poor models

        "or the data scientists could have just talked to the operations people at the start"

        The problem is the ethos of AI+ML practice. It's assumed (at least subliminally) that the automaton will reveal unanticipated answers. Asking those at the coal face before running the automaton is assumed to introduce "bias".

        When the answers can in fact be anticipated, using the automaton may not be necessary (but we can't have that can we?)

      2. chapter32

        Re: Poor models

        Call me naive but how can data science work in cases like this if there isn't a meaningful dialogue between the people with the domain knowledge and the data scientists? We have a huge amount of operational data where I work and to make better use of it people are teamed up with data scientists or they are offered some basic data science / analysis training. There is a great deal of benefit to be had from this data that doesn't require the full blown 'scientific' approach.

        1. Evil Auditor Silver badge

          Re: Poor models

          Well, call me uninformed but how can data scientists do any meaningful work without having a thorough understanding of the real world?

          1. Il'Geller

            Re: Poor models

            How does it work in an AI database? In it, annotations are incredibly huge. For example, a single number, symbol, image fragment, or word can be annotated with many thousands of meaningful phrases, which are organized by timestamps. Moreover, these annotations are done without human intervention, automatically.

            For a typical SQL (or noSQL) database, by contrast, such annotations are simply impossible, because they are too costly both in terms of the necessary manual work and the evident lack of required information. AI can overcome all SQL problems through AI-annotating.

            SQL is over.

  4. Ali Dodd

    Seems actually sensible

    Instead of grabbing everything, perhaps look at what you can grab and start off by targeting more narrowly. You end up with less data that is more relevant, which is easier to analyse. The land grab of recording 'EVERYTHING' has been massively wasteful in storage and processing, as most organisations don't tend to use 90%+ of the data in the end; more often than not you use the same data you did before.

    List every item you can record, plan out what you might need and target that; you can always expand it if you know what is available.

    Perhaps a short term big pile is useful to do this but doing that evermore is just a waste of time and resources

  5. trevorde Silver badge

    Uptitling

    bartender --> ale artisan

    coffee maker --> barista

    cocktail maker --> mixologist

    sales person --> vice president of sales

    help desk --> customer support executive

    data scientist --> data artist

    data controller --> data concierge

    1. Sceptic Tank Silver badge

      Re: Uptitling

      conman -> data artist

      1. Bitsminer Silver badge

        Re: Uptitling

        cynic --> commentard

    2. Pete 2 Silver badge

      Re: Uptitling

      purveyor of smut -> vice president

      blogger -> journalist

      pick-up artist -> dater wrangler

    3. breakfast Silver badge

      Re: Uptitling

      Astrologer -> SEO specialist

  6. Anonymous Coward

    Artistry Anyone?

    Strange that the phrase "big data" is not mentioned here. In fact the general tone of the article seems to reflect a focus on "relational" which Mr Codd would have recognised.

    I thought that in modern times "data" was much "richer" than SQL......you know......tables of course, but graphics, 3D models, video, DNA profiles......and who knows what else. Plenty there for the typical "data artist" to dabble with.

    I think we should be told about "big data"......I'm sure there are folk in Cheltenham (or Fort Meade) who would love to develop more "artistry"!!!

  7. OhForF' Silver badge

    So do I have this right: a data scientist is someone who can collect data, produce nice graphs and calculate things like average and mean, and someone who can actually interpret and understand the graphs and statistical values is a "data artist"?

  8. Anonymous Coward

    Artist my arse

    <Rant> Why do arts majors always assume they are inherently superior when it comes to imagination? </Rant>

    The people they describe are not 'data artists', they are the people the data scientists should be treating as internal customers - i.e. the people best placed to relate the data to its real-world meaning or application. This should not be a revelation to anyone or require a new term.

    Ideally this should also be a closed loop of these customers requesting information relevant to a problem and subsequently analysing it, but since one often needs historical data before a problem is identified, there is a natural tendency to record everything in case it is needed and then (as noted in posts above) think that this *will* lead to insights in and of itself.

    / Just remember that correlation does not equal causation before concluding that switching from beef to quorn in the canteen will result in an increase in widget shipments, or something...

  9. Anonymous Coward

    No science without hypotheses, and no hypotheses without creativity

    I doubt that "artists" have any kind of inherent superiority here, and I was a Fine Art major.

    Indeed, any correlation between Fine Art majors and arrogance about imagination does not equal causation.

    But if your 'data scientists' aren't actually coming up with any testable hypotheses then... where is the science?

    "Gather all possible data and wait until something sticks out" is not a hypothesis.

  10. DrG

    I am not a scientist, but wouldn't rotating the buses just distribute the same amount of wear and tear on more buses, resulting in the same total maintenance/repairs over time?

    Some specific buses will break less often, but the fleet will have the exact same cost associated with it.

    Unless they mean rotating within a single day, meaning being on easy route -> hard route -> easy route allows for cooldown periods that reduce effective wear... But as stated, it is a good example of wasting time looking at data ;)

    1. Caver_Dave Silver badge

      Buses are (or at least used to be) replaced when they reach a certain mileage. Thus if you can reduce the wear so that more of them reach the replacement mileage before a major breakdown occurs, then you are quids in.

      This whole topic is just a rehash of the old consultancy rule. "Consultants never tell you something that wasn't already known in the organisation."
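
      The replacement-mileage argument above can be sketched with a toy model. Every number here (wear per km, replacement mileage, breakdown threshold) is a made-up assumption for illustration, not data from the Belgian bus case. It only shows that the same total fleet wear can produce different numbers of major breakdowns once there is a threshold.

```python
# All figures are hypothetical and chosen only to illustrate the argument.
WEAR_PER_KM = {"hilly": 3.0, "flat": 1.0}  # assumed wear units per km by route type
REPLACEMENT_KM = 100_000                    # assumed mileage at which a bus is retired
BREAKDOWN_WEAR = 250_000.0                  # assumed wear level that triggers a major breakdown


def wear_at_replacement(route_shares):
    """Cumulative wear when a bus reaches replacement mileage.

    route_shares maps a route type to the fraction of the bus's km driven on it.
    """
    return sum(
        WEAR_PER_KM[route] * share * REPLACEMENT_KM
        for route, share in route_shares.items()
    )


def breakdowns(fleet):
    """Names of buses whose wear reaches the breakdown level before retirement."""
    return [
        name
        for name, shares in fleet.items()
        if wear_at_replacement(shares) >= BREAKDOWN_WEAR
    ]


def total_wear(fleet):
    """Total wear across the fleet at replacement mileage."""
    return sum(wear_at_replacement(shares) for shares in fleet.values())


# Fixed assignment: one bus always on the hilly route, one always on the flat route.
fixed = {"hilly bus": {"hilly": 1.0}, "flat bus": {"flat": 1.0}}
# Rotation: each bus splits its km evenly between the two routes.
rotated = {
    "bus 1": {"hilly": 0.5, "flat": 0.5},
    "bus 2": {"hilly": 0.5, "flat": 0.5},
}

print(total_wear(fixed), breakdowns(fixed))
print(total_wear(rotated), breakdowns(rotated))
```

      Under these assumptions the fleet-wide wear is identical in both cases, as DrG suspected, but only the fixed assignment pushes a bus past the breakdown level before it is retired.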

  11. General Purpose

    Operations Research

    Understanding data's always a challenge.

    Step 1 (routine): collect data on where the holes are in returning bombers.

    Step 2 (radical): reinforce the bits that didn't have holes.

  12. RobLang

    Data engineers are needed first

    In my experience, the largest part of "data science" isn't training neural networks, producing beautiful infographics or pushing back the boundaries of the field. It's understanding what data is there and what the limitations of it are. You have to do that before you do any "science", let alone art.

    Collating, mapping, defining, filtering, adjusting, understanding data is more like an engineering task than a science one. You design your process based on known methods, carry it out, investigate deviations and repeat. That's more like engineering. The article hints as much. You don't want to invent anything particularly clever to do that; often it's best not to. Good old-fashioned dependable statistics should be applied first. Do your fancy modeling once you are certain you know what you've got. To do that, you need to talk to the experts.

    I'm also suspicious of simulated data - I've seen stats and ML models performed on simulations that overfit the simulation itself and aren't effective in the messy, ugly real world.

    Inventing a new term sounds more like Gartner trying to stay relevant rather than offering anything helpful.

  13. BebopWeBop

    Modelling

    Having worked (and built a successful business) modelling complex systems, I have observed that the act of thinking about the model, discussing it, and considering the data that might or might not be available to support it is frequently as valuable as any predictions that are made by executing it.

    It frequently leads to low-cost, rapid solutions being developed with a good understanding of why the system is behaving in the way it is (or will do when constructed) and quite often whether the questions being asked are actually sensible ones.

  14. SloppyJesse

    Domain knowledge is important

    Reminds me of a data analytics team that spent several months coming up with a complex model to identify high risk credit card customers. Quickly demolished by one analyst who matched their performance using a much simpler test - has the customer taken cash out at a cash point.

    His reasoning was simple - if you're taking cash out on a credit card, you're either on holiday or in financial trouble.

    Some of the IT management that have swallowed Gartner's previous kool aid about big data should probably take note that it didn't magically fix everything.
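
    The analyst's one-question test can be sketched as a baseline rule to compare any complex model against. The `Customer` record and its `cash_withdrawals` field are hypothetical names invented for this sketch; the original post does not say how the data was stored or how performance was measured.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    # Hypothetical field: number of cash-point withdrawals made on the credit card.
    cash_withdrawals: int


def is_high_risk(customer: Customer) -> bool:
    """One-rule baseline: any cash-point withdrawal on the card flags the customer."""
    return customer.cash_withdrawals > 0


# Made-up sample records, purely for illustration.
customers = [
    Customer("A", 0),
    Customer("B", 3),
    Customer("C", 1),
]

flagged = [c.customer_id for c in customers if is_high_risk(c)]
print(flagged)
```

    A rule like this costs almost nothing to run, so it makes a sensible yardstick: if the months-long model cannot beat it on the same customers, the extra complexity is hard to justify.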

  15. Il'Geller

    Data specialists as well as artists are no longer needed. Data can organize itself without human intervention, which is achieved by automatic annotation and the establishment of internal relationships. This is AI database!
