back to article Geneticists throw hands in the air, change gene naming rules to finally stop Microsoft Excel eating their data

Geneticists have issued new guidelines in naming human genes – after spending years wrestling with Microsoft Excel and similar software that automatically converts the names of genes to dates. The Gene Nomenclature Committee of the Human Genome Organisation (HUGO), which sets the standard for the titles and shorthand labels of …

  1. DavCrav Silver badge

    "Symbols must also only contain uppercase Latin letters and Arabic numerals, and cannot be offensive or derogatory."

    So how exactly should I abbreviate my new gene dicovery. Cytosine: Upper Nasal and Tracheal?

    1. jake Silver badge

      How?

      Veeeerrrrrryyyy carefully, at least in today's atmosphere of word censorship at the drop of your knickers a hat.

      More to the point, when are they going to get rid of such symbols as BRCA (Breast Cancer)? It is clearly micro aggression, indicating Gene researchers want all women to wear burkas.

      For the humo(u)r impaired, that's .... ah, forget it.

      1. jgarbo

        Re: How?

        BR**ST??!! How dare you mention that evil appendage that nourishes the new born. Call for the guards.

        1. gerdesj Silver badge
          Childcatcher

          Re: How?

          "Wicked child ... devil's dumplings"

          (Please complete your own White Adder quote)

          1. Stumpy Silver badge

            Re: How?

            Thanks for the reminder that I have a thingy shaped just like a turnip

            1. Psmo Silver badge
              Thumb Up

              Re: How?

              Username checks out.

      2. Anonymous Coward
        Anonymous Coward

        Re: How?

        I think you've misunderstood modern feminism mate.

        I think they will want science to find a way for men to get breast cancer too so we can "experience their pain"...because men won't have any diseases like it therefore we can't understand. Because we're dumb men.

        1. Anonymous Coward
          Anonymous Coward

          Re: How?

          Joke's on them then, since men can and do indeed get breast cancer. They just have less tissue in the area, so are less likely to develop it than your average woman.

          1. Doctor Syntax Silver badge

            Re: How?

            And are also liable to prostate cancer.

        2. Anonymous Coward
          Anonymous Coward

          Re: How?

          Men can get breast cancer, biology is ahead of you there - more prevalent in over 60s but happens in younger men too so worth keeping an eye / having a feel for.

      3. jason_derp Bronze badge

        Re: How?

        "...it is clearly micro aggression... "

        Did something happen at work? This topic line is like your whole identity in the comments.

        1. jake Silver badge

          Re: How?

          "This topic line is like your whole identity in the comments."

          No, not really. But a portion of my identity is poking fun at idiots.

    2. jgarbo
      Facepalm

      How old are the children doing this work? "Derogatory science" is an oxymoron, you bloody morons. It's fcuking science! Kindergarten researchers must now be accompanied by parents' affidavit of family sanity...and niceness.

      [I despair]

    3. Anonymous Coward
      Anonymous Coward

      Wrong tool for the job

      Classic examples of how off the shelf software is fundamentally limited to what vendor can understand, and users embodying classic hammering of square peg into round hole.

      I have often wondered how much money/time is spent around the world battling business software into half doing something other than business when a bespoke solution would do it out of the box. Odd when you consider the wasted brainpower of who is doing the battling.

      1. Version 1.0 Silver badge

        Re: Wrong tool for the job

        We get purchase orders from distributors and corporate customers sent as excel spreadsheets and HTML files every few months. But we get malware infection attempts via spreadsheets and HTML files almost every day.

      2. The obvious

        Re: Wrong tool for the job

        If scientists can’t manage to find/use the import wizard where you can tell it that that A specific column data is text then I’m not sure there’s hope for the human race.

        1. Fluffy Cactus

          Re: Wrong tool for the job

          Well, but if you are a dedicated scientist, then you won't throw up your hands, but try and find ways

          to work around the problem.

          For example, if you are faced with a column of text data, where some of it has been changed into dates,

          then you can fairly easily copy that column into a new spreadsheet, change the format of that column

          to 'Text', and then use the 'Search and replace' function to undo the damage. You could even write

          a little macro, or a formula to do something like that. Yes, Excel is wrong and dumb, as they do not let the user decide up front what data format an imported data can have.

          Way back when, I once wrote a macro in Lotus 123, that replaced all the missing zeros in a system

          that required "left-filled zeros", and formatted the field as text. No good reason why, but can you argue with tax authorities?

          With Lotus 123, I could actually do that. With Excel, not so much, because their macro language is not intuitive, not in the least. In fact, I consider it so utterly un-intuitive that it stinks.

          Nonetheless, a scientist ought to be resourceful, and if they are not, they are not good scientists.

    4. Efer Brick

      See you next Tuesday, Wednesday and Thursday

    5. Brewster's Angle Grinder Silver badge

      The old Copper nanotubes back again.

    6. Dog Eatdog
      Childcatcher

      Little do you know that there really *is* a gene called fucK

      https://www.uniprot.org/uniprot/P11553

      1. Ken Moorhouse Silver badge

        Re: Little do you know that there really *is* a gene called fucK

        It would appear from the link that the use of a codon is mandatory for reproduction to occur.

        (IANAB)

  2. Drew Scriver Silver badge

    I must be missing something...

    Geneticists find it easier to rename genes than to format a cell (or column) as "text"?

    This kind of problem will keep hounding them. I doubt gene names are the only issue.

    1. HildyJ Silver badge
      Facepalm

      Re: I must be missing something...

      Clearly formatting cells is rocket science, not bio science.

      1. jake Silver badge

        Re: I must be missing something...

        Wouldn't formatting cells be micro-surgery?

        1. bpfh Silver badge
          Go

          Re: I must be missing something...

          I see what you did there :D

        2. Anonymous Coward
          Anonymous Coward

          Re: I must be missing something...

          You'd certainly want to steer clear of macro-surgery.

    2. DS999

      Re: I must be missing something...

      I think it is more about importing CSV files. They get mangled when imported, and you can't unmangle them by reformatting the cells.

      1. Sampler

        Re: I must be missing something...

        You can set field types on import.

        As someone who works with Market Research data and has to do this very often and would kindly prefer if Microsoft stopped helping, it's a trivial step but seemingly manageable by this uni-drop out, so, one would hope not that difficult for phd's...

        1. Charlie Clark Silver badge

          Re: I must be missing something...

          The problem is that it is a manual step that it is easy to avoid so people forget to do it.

          I don't know why you think scientists who are experts in their field, should be any better at office software than they are say at plumbing or arc-welding.

          1. ibmalone Silver badge

            Re: I must be missing something...

            People doing data analysis on genetics shouldn't really be using excel, I think the quoted "Biologists in particular are reluctant to invest time in learning programming skills." is somewhat condescending when this is the community that produced Bioconductor. Experts in their field know how to analyse their data.

          2. Anonymous Coward
            Anonymous Coward

            Re: I must be missing something...

            Yeah but giving them a free pass on this, because IT, is like saying you can't expect them to drive to the highway code because they're not professional drivers

          3. doublelayer Silver badge

            Re: I must be missing something...

            "I don't know why you think scientists who are experts in their field, should be any better at office software than they are say at plumbing or arc-welding."

            They should because the office software is part of their job. Now probably it shouldn't be that big a part, but if they insist on using Excel for their database, then they need to know how to use it for the database-style things they intend to do. Column typing is one of those things.

            I'm a programmer. I only need to know how to write code, so I shouldn't have to know very much about infrastructure which I don't administer, right? We probably all know programmers like that. Yet that knowledge is crucial to understanding how my code will be working and therefore making my products useful. It isn't a thing specifically named in my job description, but if I don't know how to do it, I am not as good an employee.

            1. Tim Almond

              Re: I must be missing something...

              "They should because the office software is part of their job. Now probably it shouldn't be that big a part, but if they insist on using Excel for their database, then they need to know how to use it for the database-style things they intend to do. Column typing is one of those things."

              It's no excuse to not know about the tools you use. No chef would say that they don't know about the different knives and pans they use. No violinist would be blase about the violin they use. As a programmer I'm really fussy about my hardware, hosting and tools.

              I've known that Excel makes assumptions for years. One of the reasons I try and get people using a combination of SQL Server + some sort of sql reporting tool is that SQL doesn't mess with your data. You can also set up things like SSIS jobs that will import your data in a consistent way (like filtering data or removing columns). And this stuff is cheap now.. Put it on a server and run PowerBI.

              1. AndrinaW

                Re: I must be missing something...

                Exactly - I find it disturbing that this happens at all

                One should stick one's data in an SQL db if it's as important as almost anything really

              2. Mark 65

                Re: I must be missing something...

                Double-clicking to open an Excel associated file type like a CSV is like hitting "I'm feeling lucky" on a Google search with questionable search terms. It might work out well, but it probably won't.

      2. oiseau Silver badge
        Facepalm

        Re: I must be missing something...

        I think it is more about importing CSV files.

        Indeed ...

        And the ham-fisted default setting in Excel when opening the file.

        O.

        1. Falmor

          Re: I must be missing something...

          The default CSV import settings seem to have been designed to do the worst possible job. Think they may have been trying to discourage people from using Excel to look at CSV files.

        2. Adrian 4 Silver badge

          Re: I must be missing something...

          I despise the auto data conversion (and many other autocorrect features that seem to get it wrong far more times than they get it right : I turn them off for a more productive experience).

          But if you were regularly importing CSV into a spreadsheet, wouldn't you write a macro that set the column types first, then imported it ? And pass that macro around the community ?

          1. Uffish

            Re: But if you were regularly importing CSV ...

            "But if you were regularly importing CSV into a spreadsheet, wouldn't you write a macro that set the column types first, then imported it ?"

            No, I would continue to expect the spreadsheet to treat my data as data and not pre-mangle it into corrupted data. If some of the data required cell formatting I would do that, as and when I wanted.

            Bloody stupid marketing departments !!!!!!!!!

      3. E_Nigma

        Re: I must be missing something...

        The frustrating thing is that for txt files there is a nice import wizard on open that allows you to set, among other things, the type of each column, so you can label the sensitive ones as text, but CSVs just open and you don't get to stop Excel from doing nasty things to them.

        The workaround is to change the extension to txt before opening, but I'm sure that a lot of people in those research institutes aren't aware of it and don't know how or don't even have the permission to make Windows show extension for known types.

        The additional problem with CSVs and Excel is regional formatting, because a lot of times a CSV gets exported under one set of local settings and then opened on another computer with different ones, such as reversed decimal and thousands separator (probably more prevalent in the Balkans, where half the computers run with US settings and half with local, but any large international cooperative is likely to bump into something like that).

        Excel and CSVs, definitely not a match made in heaven, but if MS simply added an import wizard such as the one for txt files, it would almost make things bearable.

        1. A.P. Veening Silver badge

          Re: I must be missing something...

          The additional problem with CSVs and Excel is regional formatting, because a lot of times a CSV gets exported under one set of local settings and then opened on another computer with different ones, such as reversed decimal and thousands separator (probably more prevalent in the Balkans, where half the computers run with US settings and half with local, but any large international cooperative is likely to bump into something like that).

          How about the field separator? About half the world uses "," and about half uses ";", with the remainder using more exotic characters. Luckily this one can be worked around using a non-data header line specifying the separator.

          1. The obvious

            Re: I must be missing something...

            About half uses “;”, so not comma-separated at all?

            1. A.P. Veening Silver badge

              Re: I must be missing something...

              Nope, as that conflicts with a comma as decimal "point" (nearly the complete non-Anglic world), but what's in a name?

          2. AndrinaW

            Re: I must be missing something...

            I like using the tab character as a separator

        2. David 132 Silver badge

          Re: I must be missing something...

          The frustrating thing is that for txt files there is a nice import wizard

          I believe I once heard David Bowie singing about something similar for this type of data - the “Gene Genie”.

          1. John Brown (no body) Silver badge

            Re: I must be missing something...

            On the other hand, Microsofts handling of this PR kerfufle makes them look like a bit of a Gene Hunt.

            1. David 132 Silver badge
              Happy

              Re: I must be missing something...

              Or we just hand the whole problem off to those tiny little guys in pointy hats and beards who lurk around genetics labs holding fishing rods.

              You know, the Gene Gnomes.

        3. hitmouse

          Re: I must be missing something...

          "if MS simply added an import wizard such as the one for txt files, it would almost make things bearable."

          Like the one on the ribbon labelled "Text/CSV"?

        4. Ken Moorhouse Silver badge
          Coffee/keyboard

          Re: if MS simply added an import wizard...

          Upvoted your post for the general sound advice offered, but your suggestion (above) did make me laugh.

          ====

          Those of us using OpenOffice are well-served with options - prompted-for - when doing similar operations.

      4. Kevin Johnston Silver badge

        Re: I must be missing something...

        One of the tasks I have to do often is taking a sub-set of data from a system and producing a report to show status. One of the identifiers is a Hex string but if Excel sees no letters it converts it to a number with exponent and due to the length of the string it will lose the last 8 digits...forever

        1. Rol Silver badge

          Re: I must be missing something...

          An option in Excel settings. That's all what is needed. For Christ's sake they have options for every other conceivable thing under the Sun in there except "Would you like me to permanently stop guessing at what you are trying to do? Y/N"

          God help us if Excel ever gets a real AI upgrade.

          "Hi! Snoopy here, I've just instructed your bank to move all your money into Jelly Beans, based on your spreadsheet, "Investment tutorial for beginners"

          1. nijam

            Re: I must be missing something...

            > An option in Excel settings...

            Yes, for example to delete Excel and run some other spreadsheet.

            1. StephenH

              Re: I must be missing something...

              I had a similar issue with customer reference numbers for bPay payments. These are 16 digits and excel changes the last digit to zero. I was using google sheets as it didn't do this.

              Then google announced an upgrade to make it more compatible with Excel and introduced the same problem.

          2. SloppyJesse

            Re: I must be missing something...

            > An option in Excel settings. That's all what is needed.

            No one would use that option. If there is no guessing then when opening any text file it would have to assume all columns are text. That would make any genuine numeric and date data unusable without the pain of excels data type conversion functions.

            There are 2 problems with the current behaviour

            First it guesses on a cell by cell basis, not a column basis. So if you have some values that might be a date they get converted but the rest of the column looks OK.

            Second there is no visibility of what's been converted so when the user sees the first screen full and it looks alright they assume it's all good.

            Even a notification on opening a non-native file that it is being interpreted/converted would be an improvement - I'm sure I'm not the only person in financial services that's had to break out the luhn algorithm to regenerate the last digit of credit card numbers after a csv file has been opened and saved in Excel.

    3. Psmo Silver badge

      Re: I must be missing something...

      Don't forget that sometimes Excel will autoflip the formatting back.

      Not using Excel is safer.

      1. jgarbo
        Linux

        Re: I must be missing something...

        Using LibreCalc on Linux. Input DEC10 and it stays there. Never seen this problem. Is it only Excel?

        1. James Anderson Silver badge

          Re: I must be missing something...

          Happens in Google sheets as well. Enter "10-12" to indicate pages 10 and 12 and you get Dec 10th or 12th Oct depending on your locale.

          Incidentally did they take into account locales e.g. MAI would presumibly be mangled if users Locale was set to French so when nameing genes you would need to avoid all of this:

          https://web.library.yale.edu/cataloging/months

          1. tip pc Silver badge

            Re: I must be missing something...

            “ Enter "10-12" to indicate pages 10 and 12”

            10,12 would indicate pages 10 and 12, 10-12 would be pages 10 through 12.

            1. James Anderson Silver badge

              Re: I must be missing something...

              Why so many up votes for a pedant!

              1. Anonymous Coward
                Anonymous Coward

                Re: I must be missing something...

                Are you new here???

                1. James Anderson Silver badge

                  Re: I must be missing something...

                  No -- but really!

              2. Anonymous Coward
                Anonymous Coward

                Re: I must be missing something...

                You missed off the question-mark! HTH :-)

              3. DryBones
                Pint

                Re: I must be missing something...

                Because they were correct.

            2. MOH

              Re: I must be missing something...

              Or page -2

          2. Anonymous Coward
            Anonymous Coward

            Re: Happens in Google sheets as well.

            Happens in Google sheets as well. Enter "10-12" to indicate pages 10 and 12 and you get Dec 10th or 12th Oct depending on your locale.

            For additional fun share a spreadsheet with people on different locales and ask them to fill the cells. Happened on an onedrive-shared Excel file. it was so annoying I had to split the date into three columns.

            1. A K Stiles Silver badge
              Pint

              Re: Happens in Google sheets as well.

              All dates on sheets I create are now entered in 'ISO 8641 Date' format for those very reasons.

              Sometimes somebody will comment - "What is that value?" to which the answer is "A date, formatted unambiguously in compliance with an internationally recognised standard". People have complained that it is difficult to understand, to which the response is along the lines of "Well that's not a problem with the data or the system..."

              Long week - must be about time to go to the brewery!

              1. Drew Scriver Silver badge

                Re: Happens in Google sheets as well.

                I work in the tech department of a Fortune 500 company in the USA.

                Can't even get people to use 24H clocks, let alone ISO 8641 date format...

                Having grown up in Europe, where even a six-year-old knows how to tell time in 24H, I'm baffled by the insistence here on using am and pm. At first I thought it was just my unfamiliarity with the system, but it wasn't long before I noticed US-born colleagues oversleeping because they thought the dot on their alarm clock meant am, not pm. Or missing planes because they got to the airport 12 hours late. Or setting up a cron job to restart a server in the middle of the night, only to have it go down at 15:00.

                And if I got a penny for all the times I've been asked if 00:00 is the end of the day or the beginning of the day...

                1. Martin Howe

                  Re: Happens in Google sheets as well.

                  As for 00:00, I find it useful to think of "Midnight AM" and "Midnight PM", as the context (end of one day or beginning of next) is often important but not obvious from the original sentence.

                  1. J.G.Harston Silver badge

                    Re: Happens in Google sheets as well.

                    00:00 is the start of the day, 24:00 is the end of the day.

                    1. Rol Silver badge

                      Re: Happens in Google sheets as well.

                      I'd argue 23:59:59 is the end of the day, and that 24:00 is actually 00:00.

                      1. PeteA
                        Coat

                        Re: Happens in Google sheets as well.

                        Unfortunately, the day doesn't really have "an end" point - there's a concrete *start* at 00:00, and various times leading up to that such as 23:59:59.9999. But no specific end point, just an analogue rollover. There's also no such time as 24:00; the constraint is 0 <= HH <= 23. So midnight is always YYYY-MM-DD 00:00, where DD represents the day that is just starting [which feels a bit counter-intuitive].

                        FWIW, the most serious Excel issue here doesn't seem to have been noted: the default association of a CSV with an application that doesn't handle CSV's properly. All one of those scientists has to do to destroy their data is double-click the file and then hit "Save"; there is no point at which it is even possible to intervene during the "import". This has caused grief in just about every business I've worked in, particularly when the marketing team have been involved.

                        Mine's the one with the group policy to associate CSV's with notepad....

                        1. NorthernMonkey

                          Re: Happens in Google sheets as well.

                          There was 24:00 up until 2019!

                      2. NorthernMonkey

                        Re: Happens in Google sheets as well.

                        And up until 2019 your argument would have been wrong - however TIL since ISO 8601-1:2019, both the term "midnight" and "24" as an hour value have been deprecated. You're now meant to say "beginning of the day" (as opposed to the end of the day, which is presumably "00:00"). Retrograde meddling.

              2. AliBear

                Re: Happens in Google sheets as well.

                What has an ISO standard about metric self-locking nuts got to do with date formats?

        2. NATTtrash Silver badge
          Boffin

          Re: I must be missing something...

          With LO true for DEC10. But did you try DEC-10 and DEC 10?

          1. Vometia Munro

            Re: I must be missing something...

            I'd be more impressed if it autocorrected to PDP-10 or KL10.

      2. logicalextreme Silver badge

        Re: I must be missing something...

        Excel: not even #VALUE!

    4. MJI Silver badge

      Re: I must be missing something...

      I hate Excel it just breaks data by default.

      Customer gets an EDI file does a small mod to a piece of text then saves.

      Item codes with leading zeros, gone, long barcode numbers changed completely.

      Importing prices for customer 016, does a tidy in Excel, now customer 16.

      I use CSVED and Notepad++ as Excel will make me swear.

      Finally NEVER open a DBF with it and save, it will not save in that format but completely change it.

      I remember the telling off of the custoomer who did it and the trawls through backups.

      Does almost as such damage as crypto worms.

      1. hoola Bronze badge

        Re: I must be missing something...

        And not forgetting that the data is formatted on input so when it then turns something into a date or whatever, when you then go and put the correct formatting in the 10/07 that was turned into 10 July gets converted into complete irrelevance.

        It is called progress and the usual developers knowing best without any concept that someone may not want it to do this. The fact that you do not appear to be able to turn it off is just as infuriating.

        It is "usability" gone nuts.

      2. J.G.Harston Silver badge

        Re: I must be missing something...

        I've notice my build sheets have telephone numbers entered with a leading 'o' instead of '0' to prevent it being numberified... which would be neccesary if the morons entered the telephone numbers correctly in the first place.

    5. Charlie Clark Silver badge

      Re: I must be missing something...

      It's one of the reasons why so many scientists are switching to Python's Jupyter and Pandas, with Excel relegated to the format for reports.

      Excel's import of text files has always been miserable, though this isn't helped by the deficiencies of the CSV format. But it really would be useful to be able to disable type inference as a preference and not fiddle with it, file by file.

      1. Trygve Henriksen

        Re: I must be missing something...

        We need a 'CSV 2' format, with first line containing 'CSV2', the second containing formatting info, the third containing the labels fior each field, and a 4th line containing 200 commas...

        The data, from line 5 and down, will all begin with a comma.

        That should make such a complete mess of Excrement's import function that whoever is using it can't help but notice and will be forced to take action...

        Add a 'CSV2_CSV2_CSV2_CSV2' line every 100 line or so, with a random number of commas before it to screw with autogenerated field widths...

        1. PeteA
          Trollface

          Re: I must be missing something...

          Perhaps tweak the first line identifier to "#!/bin/vi" ?

      2. John Brown (no body) Silver badge

        Re: I must be missing something...

        "Excel's import of text files has always been miserable, though this isn't helped by the deficiencies of the CSV format. But it really would be useful to be able to disable type inference as a preference and not fiddle with it, file by file."

        Thankfully, I've not had to deal with the vagaries of CSV files and various programs' methods of imparting them for many, many years. But I do remember having to wrangle CSV files to get them into DBase II, Delta, SuperCalc and Lotus123, often writing code specific to a particular CSV file to make it look like what the destination program expected.

    6. Doctor Syntax Silver badge

      Re: I must be missing something...

      Files get passed around. If you rely on setting the program that opens them to something non-default you come up against the problem of relying on everyone to do that correctly every time otherwise munged files start getting circulated. It's a bit like you rely on everyone to use mv or rm -r correctly in root. Of course nobody ever got the latter wrong did they?

    7. Lance-Corporal Obvious

      Re: I must be missing something...

      UPVOTE. I was literally about to post the exact same thing. I know people aren't all stupid, but really how difficult is it to define a spreadsheet column as text. I've had that problem myself when some entry in a spreadsheet column is defined by the spreadsheet as a date field. Right click, define as text, it's probably different in different software but takes two ticks (2 February 1943 to genetic scientists)

  3. tcmonkey

    Easy to scream and shout at MS for this one, but methinks this is more a case of everything looking like a nail to a man who has only a hammer. Spreadsheets were not designed for such tasks, and jamming them in where they do not fit was bound to cause problems. Anyone who has worked support in an office environment will tell you the same.

    1. logicalextreme Silver badge

      And this alleged "solution" to the problem is depressingly like those that we see in companies every day — rather than use an appropriate tool, they've decided to beat and mangle what they've got until it just about fits into the holes they've already decided to use (I know that there isn't any one authority out there that's decided they're going to use spreadsheets, but I feel like the drum could have been sounded a little louder on the message of not using spreadsheets for anything that needs data integrity).

      I look forward to the new names being broken by a future Excel "feature" update (or even a localisation issue in the current version), requiring another rename drive and at least two sets of mapping data to translate between the 3+ generations of naming convention. Personally, I'll stick with the time-honoured tradition of vetoing software that corrupts data silently.

      1. NATTtrash Silver badge

        I see you try to write SomeThinG ImPORTanT?

        I look forward to the new names being broken by a future Excel "feature" update...

        Indeed. Or any other software TBH. As for example people know who try to write a publication/ text with words/ abbreviations that have a different format of caps and lower case than your software thinks appropriate...

      2. Zippy´s Sausage Factory
        Windows

        I look forward to the new names being broken by a future Excel "feature" update

        The last time my work Excel had a "feature update" it became a crash-prone mess for three months until the next feature update. I ended up installing Libre Office so I could actually get some work done...

    2. Anonymous Coward
      Anonymous Coward

      “ Spreadsheets were not designed for such tasks,”

      What alternative would you suggest?

      Spreadsheets are designed exactly for such tasks, it’s just that spreadsheet manufacturers felt it was helpful to auto format some data on import/input, if that default behaviour could be changed then problem solved.

      https://en.wikipedia.org/wiki/Spreadsheet

      A spreadsheet is a computer application for organization, analysis and storage of data in tabular form.[1][2][3] Spreadsheets were developed as computerized analogs of paper accounting worksheets.[4] The program operates on data entered in cells of a table. Each cell may contain either numeric or text data, or the results of formulas that automatically calculate and display a value based on the contents of other cells. A spreadsheet may also refer to one such electronic document.

      1. Dave 126 Silver badge

        If this issue affects gene researchers, it might also affect others. Whilst the gene researchers have, as a group, decided on a workaround, it would be sensible for MS to allow users to configure Excel to not auto-format when importing files - this way benefitting other users who might otherwise run afoul. Alternatively, have Excel bring up a dialogue box telling the user how to convert the cell type - "Hey, it looks like you might be working with dates in this column. If you wish to convert it to a date format, select the column by double clicking the header and then right click, Format Cells".

        Lots of people use Excel for some pretty exotic stuff, including control and automation. That horse has bolted. Some options, clear and well communicated, would seem to be the sensible way forward. Maybe just a choice at install: Are you an accountant or an engineer?

        1. oiseau Silver badge
          Facepalm

          ... sensible for MS to allow users to configure Excel to not auto-format when importing files ...

          Yes, it would be.

          But you do understand that Excel is a MS application, don't you?

          Where do you think the term sensible applies?

          O.

          1. A.P. Veening Silver badge

            Where do you think the term sensible applies?

            In the case of Microsoft directly following the negation.

            1. Rol Silver badge

              I look forward to the time that Excel gets a real AI appendage to its program.

              Maybe the Doomsday clock will start ticking a little faster at that point, but have no fear, we can force M$ to take on the maintenance of that sayer of the end, and a few updates later, it will no longer be 23:59 31st December, but 23rd of the 59th month 31 minutes past 12 and humanity will have been saved forever by m$'s unflinching hubris.

              1. logicalextreme Silver badge

                Ah yes, we'll all look back fondly on the time the clock updated to #N/A

        2. 96percentchimp

          "Alternatively, have Excel bring up a dialogue box telling the user how to convert the cell type - "Hey, it looks like you might be working with dates in this column. If you wish to convert it to a date format, select the column by double clicking the header and then right click, Format Cells"."

          Maybe they could give it a little personality, and a name, say "Clippy"? (or don't, really, please don't)

        3. Alumoi

          OMG! No! Not Clippy again!

          1. Sudosu

            The movie trailer;

            In a dystopian future, where the Microsoft world believes you cannot think for yourself, an unwanted hero has has returned...

            Clipp-AI

            Clip-AI - "Hey I see you are trying to do something productive, so I'm going too help by converting some of your data randomly to date format"

            Helpless user - "NOOOOOOOOO!"

            Clipp-AI - "Glad I could help! Now lets take a look at your bank account and see what great new cloudy services we can sing you up for"

            1. logicalextreme Silver badge

              I mean it's better than the dystopian present, where it does that without telling you.

      2. Anonymous Coward
        Anonymous Coward

        @ "Spreadsheets were not designed for such tasks", indeed they were designed to get rid of book keepers of such business records as invoicing and sales.

        @"What alternative would you suggest?", software designed for the purpose by people who understand the subject i.e. use the right tool for the job

        "A spreadsheet is a computer application for organisation, analysis and storage of data in tabular form", your definition also applies to other software including databases, report etc so let us agree that not everything that uses the format is EXCEL/SQL, that a lot of time is being wasted because EXCEL/SQL is selected because the people holding the purse strings recognise the name.In reality the tabbed format is just easy for humans to process since they are trained at school to extract data from reports in the same format.

        1. Anonymous Coward
          Anonymous Coward

          "your definition "

          not my definition, I lifted it from Wikipedia and included the Wikipedia link before the definition.

          Yes i have a science degree,

          Yes i was taught to reference my sources.

          Yes i could have written my own definition but why bother creating definition pollution if someone else has done a perfectly adequate job already!!

          1. Anonymous Coward
            Anonymous Coward

            Science degree and you trust wiki as credible source?

            1. AndrinaW

              Why wouldn't it be?

        2. Trygve Henriksen

          Once upon a time the most popular Word Processor on the PC was Lotus 123...

          Because the tabbed format made it easy to visualise where the text ended up when you printed it later.

      3. Doctor Syntax Silver badge

        "What alternative would you suggest?"

        How about an application that tabulates data but defaults formats to text unless told otherwise on a by-column or maybe by-row basis?

        1. Rol Silver badge

          You mean like how Excel used to work.

    3. Richard Gray 1
      Pint

      Provides a useful tool

      I have been fortunate enough to work for a University Lab doing medical research.

      One of the researchers asked me (a few years ago now) for a 4k laptop and 64bit office.

      As you can imagine this was an arm and a leg, and at the time 64 Bit office was as stable as a very unstable thing.

      When I asked why, he showed me ( I don't pretend to understand a fraction of what they do other than it was really important work for Diabetes ). They import the data then use the conditional colours to see the various gene snips.

      It can make it much more obvious where changes / outliers are. you look down the column of red and see a blue indicates a different gene snip that can possibly indicate the faulty \ different gene sequence. Much easier than looking down a list of numbers \ letters

      Knowing I was helping people in a very small way did sort of help my soul recover after working for _ing Lawers.

      Beer (or Non Alcoholic beverage of their choice I know some of them don't drink, but the others take up their share ) for all the wonderful medical workers \ researchers trying to find cures for all kinds of stuff.

      1. Anonymous Coward
        Anonymous Coward

        Re: Provides a useful tool

        My experience of spreadsheet for science in academia is that the format is selected as it allows some protection of functions so as limit expertise theft and/or where the academic in question cannot understand/recreate the calculations themselves.

        Yet another example of dumbing down due to IT again allowing the inept to present themselves as otherwise.

      2. jake Silver badge

        Re: Provides a useful tool

        "t can make it much more obvious where changes / outliers are. you look down the column of red and see a blue indicates a different gene snip that can possibly indicate the faulty \ different gene sequence."

        That is trivial to do in vim or XEMACS. Has been since the 16-bit days. On 4 megs of RAM. (Well, at least 8 meg for EMACS + 4 more for the GUI, of course ... and back then vim was Stevie & ran on an Atari ST.)

        1. Anonymous Coward
          Anonymous Coward

          Re: That is trivial to do in vim or XEMACS

          If "Databases for biologists" can be a tough sell for those using spreadsheets, I'm not holding not much hope for "Vi and Emacs for biologists" :-)

          1. Anonymous Coward
            Anonymous Coward

            Re: That is trivial to do in vim or XEMACS

            Obviously something simpler is needed ... perhaps ed? :-)

          2. Doctor Syntax Silver badge

            Re: That is trivial to do in vim or XEMACS

            Some-time biologist who moved into databases here. That's databases as a complete career change. You can keep your emacs but vi is hardwired to the fingers.

            Nevertheless a database might not be the right solution here. As soon as you start talking about tabular data and databases in the same breath you start thinking about RDBMS but the time taken to set that up for what might be a one-off is probably not worth it. There's almost certainly a need for a simple way to tabulate data without making too many assumptions about the nature of the data. It's a principle I've observed many times; the more assumptions you make in the design of an application the less capable it becomes.

            In this case take on the data as text columns which is probably the most minimal assumption you can make. If you want to flag the column* as date - or anything else - that's the time to let the application reformat the text. If you want to use a database to store the data you can, of course, have the application generate the schema at this point.

            * Spot the implied assumption here.

  4. VeganVegan
    Trollface

    And you thought that Clippy went away

    He’s more insidious hiding among all those actions that lurk in the software. After all, M$ knows better than you what your data should be, and who are you to argue with that?

    1. Rol Silver badge

      Re: And you thought that Clippy went away

      I vaguely remember the analogy "If cars developed like computers, a Rolls Royce would cost £50, do 5,000 miles a gallon, etc"

      If cars had developed like Excel, most of the world would be reading this from their hospital bed.

  5. Binraider

    I have to admit, the ability to customise Excel so it doesn't do certain automated tasks like this would be beneficial. But for hacking the kind of quantities that are normally talked about in genetics; I think I'd be looking at a language of some sort.

    1. IGotOut Silver badge

      Right click, format as text.

      1. NATTtrash Silver badge

        File > Excel Options > Proofing > AutoCorrect

      2. Binraider

        Per-file editing. That's not the same as configuring it to NOT auto-format for specific circumstances.

    2. hitmouse

      There is an entire Data section on the Excel ribbon, of which the left hand side is given over to importing and formatting data. The underlying functions have been available for decades.

      The problem is largely around the complete failure of universities and other institutions to take software training seriously. Since most of these people can't even book a meeting properly, forget about them trying to manipulate data.

      1. Joe W Silver badge

        No, the problem is the brain dead default software behaviour of excel. People are lazy and always will be. And excel is a stupid program for any data analysis that is more complex than basic bookkeeping... Ok, that's three problems.

        Looking at typical excel course curricula, the data import is probably in the advanced course. Fun thing: this stupid autoconvert happens (at least used to) also when your LOCALE setting is not English or American. In many other languages the string 1DEC has no calendaric interpretation.

        Yeah, functions for this exist. They are a pain to use, unless you have scripted some sort of import filter - otherwise you have to jump through the hoops for every single file you use. Excel also has problems with csv (comma separated values) in the default settings when trying to load the file by just opening it.

        1. Wellyboot Silver badge

          I'll go with a poke at both sides.

          As a curmudgeonly git I expect that whatever I type into an excel sheet* should stay exactly as I type it - until I do something about it.

          Learning how to 'Format Cells...' isn't hard. Being a prima-donna about not learning to use the simple formatting tools is just unprofessional.

          The first 20 years of desktop spreadsheets (visicalc onwards) had a lot of usage restrictions and yes formatting was a pain at times but at least you knew that the data you typed stayed as typed.

          * I'd extend this to wordprocessors as well, underline in red by all means but don't change the letters I typed which may well be a correct technical reference or foreign word (or both).

          1. Doctor Syntax Silver badge

            "As a curmudgeonly git I expect that whatever I type into an excel sheet should stay exactly as I type it - until I do something about it."

            The triumph of hope over experience.

        2. aks Bronze badge

          Using CSV? That's far dumber and open to misinterpretation than Excel's default of interpreting text strings as dates or numbers using Standard format rather than Text.

          Use Tab-separated files rather than CSV wherever possible.

          1. Anonymous Coward
            Anonymous Coward

            WTF?

            God no - Tab separated files are the work of the devil and more or less guaranteed to not be readable by the bit of software you need to read it. CSVs are mostly brilliant - and about the only time they get their knickers in a twist is people spraying apostrophes round.

          2. AndrinaW

            Tab FTW

            Then you don't need any quotes etc

        3. Doctor Syntax Silver badge

          "And excel is a stupid program for any data analysis that is more complex than basic bookkeeping... Ok, that's three problems."

          I'm sure there's a much larger set of suitable problems than that. However there are other problems for which it isn't suitable and for which a similarly accessible application doesn't exist.

      2. A Non e-mouse Silver badge

        When opening a CSV I think Excel (& LIbreOffice) should give the user the choice: Either let the program auto-guess the column types or allow the user to specify them. (Ideally allow the user to supply a template of the column types so you don't have to rebuild the format every time)

        1. Dave 126 Silver badge

          Whatever it does, it should let the user know it is doing it. Minimum. Ideally, it should ask the user if it should do it.

        2. I ain't Spartacus Gold badge

          When I open .csv files Excel does exactly nothing to them. Perhaps that's because I open them as .csv rather than importing them into Excel? I then save them as .csv without making any changes. If I want to import data into an Excel file I usually copy and paste it into places where I've already set up the formatting. So dates get turned into dates in a way that I control. It's 20 years since I was a heavy Excel user, so I've forgotten loads of it now.

          It does suffer from being a program designed to do financial stuff - that's morphed into a Swiss Army Knife - which then gets abused for jobs a chainsaw would do better. Yes, I did once use Excel to print a crude poster for a seminar on the Water Regulations, who's asking?

        3. hitmouse

          You do have that choice - and people are complaining about the auto-guess step. If you import via file open then you get to override those auto-guesses with a simple columnar application of format buttons. if you import via well.."file import from text/csv" then you can nerd out on column type choices and token separators.

          This stuff has been around for decades.

          This response exemplifies the user issue of simply not doing a very simple check on how to use the tool properly.

      3. Anonymous Coward
        Anonymous Coward

        "The problem is largely around the complete failure of software companies to take interaction design principles seriously"

        FTFY

      4. Paul Kinsler

        complete failure

        It doesn't have to be /complete/ failure. Just one person in the handling chain has to forget once, and the mangling is done.

      5. oiseau Silver badge
        Stop

        ... ribbon ...

        Ribbon?

        No, pleeeeeaaaase, nooooo.

        Not the goddamn ribbon.

        Aughhhhhhhhhhhh!!!!

        The #!|?¡¿$ ribbon is why, when for some reason I have to use Excel for anything (instead of LO Calc which I still have to get used to), I just pull up Office '97 in a VM.

        O.

        1. Uncle Slacky Silver badge
          Stop

          I use UBitMenu to get old-style menus: https://www.ubit.ch/software/ubitmenu-languages

  6. Imhotep Silver badge

    Interesting - I just today read Humble Pi which touches on this exact problem. And I'm sure we've all had to wrestle with Excel when it insists on treating text as numbers and then mangling them.

    It does seem odd to tailor your processes to accommodate a vendor's quirky software, but realistically: Excel isn't going to change.

    1. Warm Braw Silver badge

      realistically: Excel isn't going to change

      Realistically, biologists aren't going to change. That's actually more concerning.

      1. David Nash

        if someone sends a CSV and it opens in excel just by double-clicking, then people are going to double-click. You can't stop people doing things the easy way except by actually preventing it from working.

        (Corrupting the data silently so they may not notice doesn't count as preventing it from working)

      2. Doctor Syntax Silver badge

        Realistically the requirement isn't going to change. What's needed is a more appropriate means of fulfilling the requirement.

      3. Paul Hovnanian Silver badge

        "Realistically, biologists aren't going to change."

        'Evolve' is the appropriate term.

      4. TrumpSlurp the Troll Silver badge
        Pint

        Biologists?

        There's been a lot of similar commentardery with two common themes:

        (1) I've spent my entire sad and geeky life buried in data. Why can't these people do what I do?

        (2) Biologists should be able to deal with this.

        Please remember that these people are just Lusers. They have all the aspects of the vast herd of non-biologists who drive the average IT support person to drink on a daily basis.

        Your average Luser will, early on, learn to use a word processor and spread sheet at a very simple level and will then be bound by the few functions that they learned for the rest of their working life.

        Trying to break them of the habit, or educate them, is a way to profound despair.

        Hammer/nail. Financial people will do everything in a spreadsheet, however unsuitable this seems. Marketing people will use PowerPoint in a similar way.

        Deciding to change what you can control instead of another decade or more of fighting what you can't control seems eminently sensible to me.

        Just thinking about it makes me need ->

  7. Kanhef

    Not just Excel

    LibreOffice also likes to auto-detect data types, and then auto-format based on what it thinks you're doing.

    One solution would be to have a setting for default cell formatting. Auto-detect can still be the default option, but let people set it so negative numbers are always red and in parentheses, for example.

  8. jake Silver badge

    They are creating a database ...

    ... so why are they using a spreadsheet?

    Horses for courses & all that.

    1. doublelayer Silver badge

      Re: They are creating a database ...

      Some people have this very strong aversion to databases. They don't necessarily provide a reason, and if they did you wouldn't understand it. I've seen lots of people do this, including several types who know about databases but still use spreadsheets. Part of it might be that they don't want to have to write the UI around the database and the only reasonably common portable database format is SQLite*, but that's not a great reason.

      *For example, the MS Access database format isn't easy to open if you don't have a license for Access. Dumps from other databases might need tailoring if you're using a different server, or people just don't want to run servers. I view an SQLite file as a perfectly valid way to send a dataset, but I'm comfortable issuing SQL queries. I don't know if there are good GUIs for that which allow viewing, sorting, adding, and all of that without needing to learn SQL, but if not maybe we should write one for the biologists. Anyone want to collaborate on that open source project?

      By the way, XKCD has been prescient as always.

      1. Dave 126 Silver badge

        Re: They are creating a database ...

        You beat me to the XKCD, very relevant!

        At least Polaroids are an honest record of known optics and chemistry - there are no algorithms in the middle.

    2. Anonymous Coward
      Anonymous Coward

      Re: They are creating a database ...

      Data sharing? CSV is a useful format for sharing data. Unfortunately, given Excel's behviour, the following scenario is probably: click on "download" link, file opens in Excel auto-corrupting date-like data, researcher saves file to disk using Excel's "save" function, imports into real database.

      I did suggest that adding a machine-readable data format (JSON) would be helpful on the last project I worked on, but this wasn't taken up as it wasn't common for people to be able to process it.

      1. Anonymous Coward
        Anonymous Coward

        Re: They are creating a database ...

        I think the reasoning is like this: in a spreadsheet, the user can see where everything is, and navigate it easily - it's just a virtual grid of pigeon holes. If you can navigate yourself K3 or whatever, and know how to reference sets of row or columns, it's all easy, and the mental model is clear and simple.

        However, who know what goes on in a "database"?-- anyone who prefers a spreadsheet probably doesn't, and very likely doesn't want to know either. Here the mental model is probably something like handing your stuff to a cloakroom attendant who puts it away where you can't see, and you can only do things by learning some weird query language which you will probably keep forgetting, and then have to export as a spreadsheet anyway -- again with some command or other -- to send to other people. Perhaps such misconceptions can be changed, but will it be easy?

        Still, perhaps if this auto-mangling of spreadsheets gets more profile then attitudes could shift. Then again, IT stuff does weird incomprehensible shit all the time and ordinary people learn to shrug, adapt, and cope - more or less - with that. :-)

      2. Doctor Syntax Silver badge

        Re: They are creating a database ...

        "I did suggest that adding a machine-readable data format (JSON) would be helpful on the last project I worked on, but this wasn't taken up as it wasn't common for people to be able to process it."

        CSV is also a machine-readable format. The problem isn't having the data in a machine readable format, it's having the machine read it in the right way. People have a machine to read the CSV - it's just the wrong machine. For your JSON you'd also have to provide the machine.

    3. Primus Secundus Tertius

      Re: They are creating a database ...

      "... so why are they using a spreadsheet?"

      Answer: it used to cost extra to get a version of Office that included Access.

      But I understand Access is there in Office 363⅔.

      1. Anonymous Coward
        Anonymous Coward

        Re: They are creating a database ...

        It's not in Home&Student edition. Also, Access is Windows-only, Mac users need not apply.

    4. Doctor Syntax Silver badge

      Re: They are creating a database ...

      Are they?

      Not necessarily. They just want a quick and dirty way of viewing and exchanging tabular text data. Excel is a case of "if you have a hammer...". An RDBMS would make that a sledgehammer. What's needed is something simpler, not something more complex. And I say that as a biologist who changed to become an RDBMS specialist mid-career.

  9. Winkypop Silver badge
    Facepalm

    User Error

    Don’t blame the tool

    1. jvf

      Re: User Error

      I am an excellent mechanic. While it’s true that good mechanic doesn’t blame his tools, I have held plenty of crappy and badly performing tools in my hands. Excel’s auto format (and for that matter Word’s and most cell phones spelling auto correct “feature”) have caused me much grief over the years as these programs try to do my thinking for me. Sorry Winkypop, it IS the tool that is causing the problem.

      1. jake Silver badge

        Re: User Error

        No, it's not the tool. Its the user choosing the wrong tool, in this case a spreadsheet to do the work of a database. You don't use Metric tools on your old Ford[0], do you?

        [0] Chevy, Mopar, pick your poison ...

        1. jvf

          Re: User Error

          Hi Jake,

          My Snap On dealer once told me a story about a Ford production line where the car came down the belt with metric on one side and English on the other. True? Who knows, but funny. The argument about Excel (or any other program) being appropriate can easily be made (especially when a database should be used instead of a spreadsheet). However, in the real world, people use what’s at hand or what they’re familiar with. And, since most are not really skilled, the subtleties of controlling the formatting are not evident. But, even with me paying attention, I have had many bad experiences with Excel formatting.

          1. jake Silver badge

            Re: User Error

            "where the car came down the belt with metric on one side and English on the other. True? Who knows"

            Not unless the Union negotiated an extra 15% for all lug-nut installers in order to reflect their updated training. 20% for lefties on the right side and vice-versa, of course.

            1. Neil Barnes Silver badge

              Re: User Error

              Sometime mid seventies, some UK Fords came off the line with imperial heads on metric bolts. (Or possibly the other way around - it's been a long time since I took one to bits). I always assumed it was something to do with their production line transition to metric fixings.

              (And I recall a Datsun of the same vintage which was metric throughout, except for a single 3/4 BSP thread for the temperature sensor. Though that had at least some excuse: the engine was a licensed version of the BMC engine found in Minis and Morris Minors.)

              1. david 12

                Re: User Error

                BSP -- British Standard Pipe -- is a metric thread. The metric thread for pipes.

                I can only imagine the negotiations and horse trading which led to that result (standards creation is notoriously corrupt), but out of all the thought that went into creating the metric standards, BSP was chosen without modification.

                For nuts/bolts, a modern thread profile was wanted, which was why all the existing British bolt threads were abandoned, and Britain had the choice of going with Europe, or continuing with the new standard it had just developed in co-operation with the USA. Britain abandoned the new trans-Atlantic standard, and adopted the European standard instead. And Europe chose BSP for pipe threads. Was that the best choice for Europe? I have no idea what their pipe threads looked like before metric threads. But after metric threads, I know what their pipe thread standard is: BSP.

          2. jake Silver badge

            Re: User Error

            "However, in the real world, people use what’s at hand or what they’re familiar with."

            Of course. But it's still not the tool's fault when somebody tightens the wheel-bearing nuts on a Dana 60 with a Stillson wrench (for the uninitiated, that's the proper name for your common or garden pipe wrench).

            1. Tom 7 Silver badge

              Re: User Error

              I still think its MS selling you a lie. 'Here is a tool that will make computing easy' has cost people far more than even MSes massive profits. If data was plastic the seas would be solid with the waste data they have created.

            2. Anonymous Coward
              Anonymous Coward

              Re: User Error

              "But it's still not the tool's fault when somebody tightens the wheel-bearing nuts on a Dana 60 with a Stillson [...]"

              Resorting to a Stillson is possibly because of the need for leverage. Even a socket spanner's T-bar can bend - especially when getting extra leverage with a piece of steel pipe.

              The recurring frustration when undoing my Range Rover's wheel nuts was only solved by a very large one-piece four-legged socket "spider". For a country which didn't have department stores in the early 1980s - Luxembourg City had a large Aladdin's Cave shop of tools. They even stocked road construction equipment.

              1. jake Silver badge

                Re: User Error

                Not lug nuts. Wheel bearing nuts. On a Dana 60, the retaining nut for the wheel bearings is a large (1.25") nut that, following the proper procedure, is ultimately tightened to just barely finger-tight[0], basically setting the bearing pre-load. A Stillson is massive over-kill for such a delicate operation, and has lead to many early wheel bearing failures. So why do people use the pipe-wrench? Because most shade-tree mechanics don't have the proper socket/spanner in that size ... and/or can't be arsed to read and follow the simple proper procedure.

                Sound familiar, in the context of this series of comments?

                [0] Finger tight for me, your mileage may vary. Consult your manual.

        2. Anonymous Coward
          Anonymous Coward

          Re: User Error

          "No, it's not the tool. Its the user choosing the wrong tool, in this case a spreadsheet to do the work of a database. You don't use Metric tools on your old Ford[0], do you?"

          Yes, if it fits, better than scrabbling through a box of AF, BA, Whitworth and BSF before finally grabbing an adjustable wrench.

          1. Anonymous Coward
            Anonymous Coward

            Re: User Error

            Adjustable wrenches are great. No matter what unholy combination of metric or imperial nuts you are presented with, you can round them all off in moments with just one tool.

          2. jake Silver badge

            Re: User Error

            Remind me not to purchase a used vehicle from you.

            Actually, one look at the condition of the fasteners will tell me all I need to know.

        3. Anonymous Coward
          Anonymous Coward

          Re: User Error

          No, it very much is the tool.

          Consider the following scenario: web site offers data in CSV format for download. Researcher clicks "download" link - the browser downloads the file and helpfully opens it ... in Excel - at this point the data is now mangled. The researcher clicks "save as" to save it somewhere for import into something else later.

          If you don't consider that the software is the problem in the above scenario, then I would suggest you go and listen to J Nielsen and D Norman for a bit - unexpected behaviour is a defect in the software when a sufficient proportion of people find it unexpected.

          Honestly, Excel's auto-formatting of dates is a menace - CSV is a perfectly fine data format for data exchange, but Excel's behaviour does make it problematic

          1. jake Silver badge

            Re: User Error

            It's not the tool. It's the choice of tool. The tool is doing EXACTLY what it was designed to do. It is hardly the tool's fault that it doesn't do what somebody expects it to do.

            It amazes me that some readers of this august forum seemingly expect the tool to read the mind of the user, and then do that thing instead of what it was programmed to do.

            So let me get this straight ... if you order the fish at a steakhouse, and you actually received fish (as ordered) instead of steak you bitch to the waiter about it "because everybody knows that you eat steak at a steakhouse"? That's hardly sensible, now is it?

            Be very careful what you wish for ... DWIM functionality causes far more headaches than it fixes. See modern autocorrection if you're not old enough to remember Damn Warren's Infernal Machine.

            1. albaleo

              Re: User Error

              "It amazes me that some readers of this august forum seemingly expect the tool to read the mind of the user"

              I think it's the opposite. By changing data without any express instruction, Excel does try to read the mind of the user. It wouldn't be so bad if it only changed the appearance of the data, but kept the original data intact behind the scenes.

              1. jake Silver badge

                Re: User Error

                Note that I didn't say I liked Excel. I don't. In fact, I never use it. Not for over ten years.

                However, the reality is that for the vast majority of users it is behaving exactly the way they need/want it to behave. And for the few who don't like the default behavio(u)r, it can be changed to whatever they want it to do (in this particular case, anyway).

                But no. Instead the gene researchers are going to change the nomenclature that the entire planet has been using for decades. Who, exactly, is going to re-write (and then pay to publish & distribute) all of the textbooks and other documentation to reflect this change? Or are they going to turn it into a giant clusterfuck because they can't be arsed to learn how to use the proper tool for the job?

                If the job is important to you, LEARN TO PROPERLY USE THE PROPER FUCKING TOOLS TO DO THAT JOB OR BE PREPARED TO FAIL MISERABLY! How fucking hard is this to understand?

                1. david 12

                  Re: User Error

                  Nobody wants this feature of Excel.

                  Lots of people like Excel, and lots of people use Excel, but nobody likes this feature of Excel. There are a large number of people who haven't been bitten by this feature of Excel recently, who don't care today, and there are people who have been bitten, who do care, and there are people who don't use Excel, but there are no Excel users who want this feature of Excel.

            2. Doctor Syntax Silver badge

              Re: User Error

              "It's not the tool. It's the choice of tool."

              I agree it's the choice of tool. But the correct choice of tool isn't something more complex than a spreadsheet, it's something simpler. The standard PC, whatever the OS, doesn't have it. If it were text you'd have the choice of your word processor or Notepad/Kedit/whatever. For tabular data such as CSV you have a choice of spreadsheet or.

              1. hitmouse

                Re: User Error

                If you're in a university then most of these alternate tools are actually available, if not in a standard disk image for computers given to researchers, then by a download from a standard set of enterprise-licensed research tools.

              2. jake Silver badge

                Re: User Error

                "For tabular data such as CSV you have a choice of spreadsheet or."

                Or any decent text editor.

          2. Primus Secundus Tertius

            Re: User Error

            "CSV is a perfectly fine data format for data exchange"

            Oh no it isn't! Capital letter 'O' instead of numeral '0', etc. Some years ago, exam results in Scotland were screwed up by that problem in the data.

            XML with a schema that enforces data types is something you could write into a contract to ensure reliable data transfers between organisations. You cannot reasonably type XML manually, but Excel can, with difficulty, be made to export that way.

            But clearly XML is a non-starter for the situation the article describes.

            1. Anonymous Coward
              Anonymous Coward

              Re: User Error

              Oh I wondered how long before someone brought up XML schemas. has anyone ever actually seen one that works, is supported by software and actually used to exchange data?

              I bet there's a beautiful XML schema for gene wrangling. And I bet that 99% of the really clever (not Excel) software geneticists use won't read it because it's a ridiculously complicated construct that by trying to catch all the weird edge cases that got put up in the millennia of committee that wrote it, ends up so complicated that no-one in their right mind would try.

              1. Imhotep Silver badge

                Re: User Error

                I have had to supply data in XML files to CMS (Medicare/Medicaid), so there is at least one schema that works and is routinely used to exchange data.

                They even helpfully supplied a sample file, so I was able to write a program to extract our data and create the file they required.

                Would this have any use to anyone besides moving data? Nah, certainly not a useful format to view or manipulate data.

        4. Anonymous Coward
          Anonymous Coward

          Re: User Error

          "You don't use Metric tools on your old Ford[0], [...]"

          In general a set of Metric spanners will fit most Imperial hex bolt heads. There are only a few Imperial sizes that are needed for some interpolations. Even then the bolt head or the spanner tolerance - or the bolt head wear - can mean the expected spanner doesn't quite fit. I have several nominal 13mm and 1/2inch spanners which are mix&match against some bolts/nuts.

          Apparently LandRover put different size heads and nuts on one bolts' fitting - so you didn't need two spanners of the same Imperial size to adjust it.

          1. 2+2=5 Silver badge

            Re: User Error

            > so you didn't need two spanners of the same Imperial size to adjust it.

            I've never understood why no one produces a standard C and ring spanner set with different sized ends on each spanner for exactly that reason.

            1. Anonymous Coward
              Anonymous Coward

              Re: User Error

              "I've never understood why no one produces a standard C and ring spanner set with different sized ends on each spanner [...]"

              It can be done - but possibly not in that way. You can buy "double ended" spanners with different sizes at the ends. They come as either both ends "open" or both ends "ring". So two spanners - one of each type - will replace two "combination" spanners of each size. You are still stuck if you want to use only the open or ring ends though.

              https://www.germantools.co.uk/product-category/proxxon-spanners/

            2. jake Silver badge

              Re: User Error

              They do make such sets. Most mechanics hate them.

          2. jake Silver badge

            Re: User Error

            "In general a set of Metric spanners will fit most Imperial hex bolt heads."

            For values of "fit" that includes "completely ruins bolt heads".

        5. Primus Secundus Tertius

          Re: User Error

          "a spreadsheet to do the work of a database"

          I have regularly used Excel as a database for membership lists of organisations. From there I could do Mail Merge postal or email messages to the members. Also it was a format that most other committee members could handle if necessary.

  10. RM Myers Silver badge
    Headmaster

    Time to be pedantic

    These are not the gene names, they are the common abbreviation used for the gene. Thus, the gene abbreviated as BRCA1 is named "Breast Cancer 1, early onset'. And yes, I realize people rarely use the full name, and then probably only once per article. This article is not about changing the name, but rather the standard abbreviation.

    Also, excel is being used to display genetic information in a format people can easily read and manipulate for small subsets of data. The source genetic data typically uses formats like SAM or BAM or other specialized formats, which are much easier to work with programmatically, but rather difficult to browse (and they can be HUGE). There is a large amount of open source software for analyzing genetic data, but much of it is fairly complex and probably overkill for small subsets of data, particular for non-IT literate scientists.

    1. Joe W Silver badge
      Pint

      Re: Time to be pedantic

      Have one - - - >

    2. Tom 7 Silver badge

      Re: Time to be pedantic

      I wonder if this day and age non-IT literate scientists are not scientists in the same way engineers without maths aren't.

      1. Dave 126 Silver badge

        Re: Time to be pedantic

        You would lose a lot of talent, skills, various ways of seeing the world, and thus insights, if the only people who were allowed to be scientists were IT literate to a certain level. However, there some issues to be sorted out.

        In a similar vein, a good number of working scientists aren't statisticians, and can as a result fall into traps. Some universities therefore employ some full time statisticians for researchers to consult.

        So perhaps the answer is to educate scientists in IT enough so that they know what they don't know, and thus seek assistance - and then provide said assistance. Of course, the learning would go both ways, with the IT specialists learning about the sort of issues in IT the scientist in his or her field might come across in their workflow*.

        *And that's another thing - if you're at the cutting edge of something, you don't always adhere to an established work flow. You might carefully record data in a logbook (or Excel file) out of habit, but only discover it to be significant down the line when something surprising has happened.

        1. RM Myers Silver badge
          Thumb Up

          Re: Time to be pedantic

          Excellent comment, particularly concerning the statistics. Statistical errors seem to be one of the most common reasons for retractions and corrections of scientific studies. It would be nice if all studies could include a professional statistician, since many of the issues are not at all obvious to non statisticians.

          1. Intractable Potsherd Silver badge

            Re: Time to be pedantic

            @RM Myers: "It would be nice if all studies could include a professional statistician, since many of the issues are not at all obvious to non statisticians."

            Many studies would love to have a statitician on board, but there aren't enough statisticians to go around - there aren't even enough to go round for research ethics committees! My last REC was fortunate enough to have a really good statistician, and she (and, to a much lesser extent, I) used to get VERY frustrated with really fairly basic errors which meant that the project wasn't testing what it thought it was. All thus because no-one was adequately teaching stats to researchers, and there was no-one to run a project past.

        2. J.G.Harston Silver badge

          Re: Time to be pedantic

          In today's world, "being able to use IT" is yesterday's "being able to drag a pencil across a piece of paper". Without it you are functionally illiterate.

      2. Snapper

        Re: Time to be pedantic

        Exceedingly well put Sir, I doff.

      3. MJI Silver badge

        Re: Time to be pedantic

        This IT literate person knows to not use Excel for data handling.

        I use database tools, I KNOW what it will do and not screw up my data.

        1. Anonymous Coward
          Anonymous Coward

          Re: Time to be pedantic

          "[...] I KNOW what it will do and not screw up my data."

          I THINK I KNOW what it will do and not screw up my data.

          FTFY

          Ah - the computer is always right.

          There once was a database which was apparently losing track of important records on terminal searches. The symptom was only spotted when an astute operator was asked if they wanted to add a very common family name to the indexing eg "Smith".

          An investigation showed that several names had been re-entered as "new" when they already existed - and this had happened several times for the same names. The result was that any previous records linked to that name became orphans. The records were still accessible as serial data - so were not actually lost.

          The fault lay in the disk controller. When the logic rack voltage was in spec - but not spot on - it would very occasionally fail to recognise a data match. The relevant data key was for unique names - on which the system could request searches***.

          ***The database was serially archived and reloaded for efficiency every so often - by tidying up the additions and deletions. This changed which names' keys were potentially affected by their position on the disk.. Only the last record on a track would be potentially missed.

          1. Imhotep Silver badge

            Re: Time to be pedantic

            And Null is an actual surname that has caused database entry problems.

            1. RM Myers Silver badge
              Pint

              Re: Time to be pedantic

              And so is Void, which can make many of the database issues null and void.

              Sorry, bad joke. Time for some liquid refreshments ==>

  11. Allan George Dyer Silver badge
    Coat

    and cannot be offensive or derogatory

    I have a sudden urge to make DELEC a swearword...

    "You stupid DELEC!"

    1. Anonymous Coward
      Anonymous Coward

      Re: and cannot be offensive or derogatory

      From Azure Function Host (https://github.com/Azure/azure-functions-host/issues/6480):

      Update terminology used in runtime source #6480

      @@ -75,30 +75,30 @@ public bool TryGetFunctionMetadata(string functionName, out FunctionMetadata fun

      /// Gets the function metadata array from all providers.

      /// </summary>

      /// <param name="forceRefresh">Forces reload from all providers.</param>

      - /// <param name="applyWhitelist">Apply functions whitelist filter.</param>

      + /// <param name="applyAllowList">Apply functions allow list filter.</param>

      /// <returns> An Immmutable array of FunctionMetadata.</returns>

      public ImmutableArray<FunctionMetadata> GetFunctionMetadata(bool forceRefresh, bool applyWhitelist = true)

      public ImmutableArray<FunctionMetadata> GetFunctionMetadata(bool forceRefresh, bool applyAllowList = true)

      1K issues to deal with just with the Azure Function Host, and this one bubbles to the top of the list.

  12. Anonymous Coward
    Anonymous Coward

    The 'Do it our way or not at all'

    rule No 1 in the Microsoft Mantra wins again.

    F**K Mircosoft!

    1. Anonymous Coward
      Anonymous Coward

      Re: The 'Do it our way or not at all'

      Or uncheck the "auto-format" box we gave you.

      1. Doctor Syntax Silver badge

        Re: The 'Do it our way or not at all'

        This is of limited value when the data arrived mangled because someone who passed it to the person who passed it to you didn't do that. Or possibly the person who passed it to them. Or an assistant who opened the CSV file that had been emailed to the boss.

  13. A Non e-mouse Silver badge

    Not just genes

    Shall we change the global phone network and get rid of leading zeros from phone numbers too?

    1. Giles C Bronze badge

      Re: Not just genes

      That one often causes me an issue, which is why I always put a space in the number. But of course the USA doesn’t have leading zeros in their domestic numbers.....

    2. J.G.Harston Silver badge

      Re: Not just genes

      If you bloody enter the bloody phone numbers bloody correctly, then you'll have no problems.

      1. SloppyJesse

        Re: Not just genes

        > If you bloody enter the bloody phone numbers bloody correctly, then you'll have no problems.

        My number is 018118055. That is all the digits in the right order. Why do you think that is not entered correctly?

  14. Neil Barnes Silver badge
    Joke

    I have to use Excel at work

    And now I've grown three extra fingers and a spare ear. To whom should I complain?

    1. Spherical Cow Bronze badge

      Re: I have to use Excel at work

      Not us, thank you.

      1. Doctor Syntax Silver badge

        Re: I have to use Excel at work

        With a handle like that you're not the one to complain.

    2. The Oncoming Scorn Silver badge
      Coat

      Re: I have to use Excel at work

      Your parents, especially if they are also brother & sister.

      1. Dave 126 Silver badge

        Re: I have to use Excel at work

        Why complain? You can now Alt Ctrl Del with one hand!

        1. jake Silver badge

          Re: I have to use Excel at work

          "You can now Alt Ctrl Del with one hand!"

          If you honestly can't do that already, you can either buy a new keyboard or re-map your existing one.

  15. ecofeco Silver badge

    WTH?

    Why don't they use some other spreadsheet of database?

    This is insane!

    1. Anonymous Coward
      Anonymous Coward

      Re: WTH?

      Consider using a CSV file for data exchange, and then consider how the Excel issues might bite you ...

  16. adfh

    This REALLY shits me about Excel

    I remember, once upon a time, you'd open a CSV and it'd give you the wizard.

    Now, it defaults to just opening the CSV - and if you want it to open the Wizard, you have to rename it to .TXT and open it - or use some weird data import wizard that configures the CSV as a data source ?

    Grrr.. humbug...

    1. sabroni Silver badge

      Re: This REALLY shits me about Excel

      The put the wizard on a menu so people who didn't understand it weren't scared off when they opened a csv while capable users could still run the wizard from a menu.

      What a bunch of complete cunts.

  17. Roger Greenwood

    Interesting quote

    "Biologists in particular are reluctant to invest time in learning programming skills."

    Of all people Biologists should know that populations that are the least adaptable are doomed....

  18. Anonymous Coward
    Anonymous Coward

    Where is the outrage?

    So hundreds of comentards moan about removing derogatory language but are clearly not fussed about biologists changing their nomenclature to avoid spreadsheet application auto correct.

    People’s standards are deplorable, I’m glad I don’t have to come into close proximity to you lot.

    1. Dave 126 Silver badge

      Re: Where is the outrage?

      You must have read a different comments thread. Most of the people here are either bashing MS, suggesting how Excel should behave, suggesting that scientists should a database, defending the use of excel in this context, talking about cats, or saying things about scientific research culture.

      Reading between the lines, most of the comments here are based upon an unease that scientists should have to change their way of working to fit a tool. That most here aren't expressing their unease as outrage actually speaks well of them.

      ( As for removing loaded, potentially offensive terms, yeah, I might differ from the majority here, but in the appropriate threads I merely suggest to them that experiments and studies have been conducted and that it's worthwhile looking at them. It's not so much offense that is an issue, but continuously and subtly perpetuating views and assumptions that do not withstand objective scrutiny. )

      1. jake Silver badge
        Pint

        Re: Where is the outrage?

        Cats? What cats? ElReg must have automatically killfiled the cats because The Beaky Ones have decided that that's the way most users want to see the thread.

        Try to think about what I just typed in the context of this conversation before taking it literally. Perhaps have a beer while you cogitate. I'll get this round in.

        1. Dave 126 Silver badge

          Re: Where is the outrage?

          Cheers Jake! On a thread about about software changing users' input - cars to cats typed on a phone - we must never forget that Cats might be a deliberate change for the purposes search engine optimisation!

          1. jake Silver badge

            Re: Where is the outrage?

            I did wonder ... but cats being the topic that TehIntraWebTubes was built on (after pR0n, of course!), I just figured the mods were being pussies and had nuked a cat-fight.

    2. Doctor Syntax Silver badge

      Re: Where is the outrage?

      " I’m glad I don’t have to come into close proximity to you lot."

      You're here aren't you? And the only comments I've seen here about removing derogatory language are those taking the piss out of such snowflakes.

      1. jake Silver badge

        Re: Where is the outrage?

        I usually hate the word "snowflake" when not talking about actual snow ... but in this context it is growing on me. Perhaps English is mutating in front of our very eyes.

  19. Robert Grant Silver badge

    > and cannot be offensive or derogatory

    This will be the easiest rule to troll, ever.

    1. jake Silver badge

      Yes. It will. And indeed already is.

      See above.

  20. PassiveSmoking
    Thumb Down

    The Ghost Of Clippy

    I freaking hate software that thinks it's smarter than you are, especially if it doesn't make it easy to turn that functionality off.

  21. Mike 137 Silver badge

    It's even worse than that

    Excel doesn't actually change e.g. 'DEC1' to '01-Dec' when your enter it. That's just what it shows the user. It actually modifies the data into the internal clock tick representation of the date (in this case the number 44166). Because of this, if you reformat the cells to "general" and back to "date", it now reads "01/12/2020", which is even further from your original entered data, but is still 44166 internally..

    This has always been the default behaviour for anything that looks remotely like a date. It can only be overridden by opening Excel and then importing the data. At that point you get to specify how it's to be interpreted. We had this problem once when analysing a c. 10,000 entry password dump - almost half the passwords were corrupted by Excel.

    1. Omgwtfbbqtime

      Re: It's even worse than that

      However, if you know that has happened, you can import the Excel abomination into something like Alteryx and run a formula on a column that contains string data to replace 44166 with DEC1 again, then output into a useful format.

      After all - you do have the data specification for your inputs don't you? - Don't answer that...

  22. 0laf Silver badge
    FAIL

    Easy to point a finger at researchers for blaming their tools, but they are scientists not developers. It would be better if the tool maker produced a product that made it easier for lay-people to achive what they want without having to go through additional training.

    But as another posted pointed out; in this current era of MS making changes to your configuration without proper warning (yeah yeah you publish your changes but to my users that's like changing one digit in a phone book and expecting them to notice) it's a significant risk that if you change Excel to not carry out this particular piece of autoediting MS will reenable it randomly with updates or just for shits and giggles.

    MS should consider the implication (but I know they won't) that a significant body of professional workers cannot trust their product to process information which is of massive importance to humanity.

    1. hitmouse

      The user interfaces for most of the scientific software I have seen (which is quite a lot) suggest that their practical and aesthetic sensibilities were developed sometime around 1990. While it wouldn't take much effort to update most of these interfaces, their file import/export, and god-forbid their network handling, they will also charge a phenomenal amount of money for updates. For those people who think Office is expensive (because they don't remember when Lotus 1-2-3 by itself was $2000 in the mid-80s), science packages run into the $10Ks with huge hikes to "support" operating system updates.

  23. sizbut

    So at some point in the last few decades they couldn't have just said prefix all Gene names with "gene" or "G"!?

    1. Doctor Syntax Silver badge

      Or just suffix them with "SE" for "stuff Excel". Other consonants are available.

    2. RM Myers Silver badge

      Or just realize these are not the gene names, but just the common abbreviations used in place of the (much longer) name.

  24. Ross 12

    CSVs

    This is another great example of why CSVs are shit and need to be left in the past where they belong.

    1. I Am Spartacus

      Re: CSVs

      XML? JSON?

      Are these not good enough for Scientists?

      1. Doctor Syntax Silver badge

        Re: CSVs

        I think you're going the wrong way, complicating instead of simplifying. What's needed is something that just displays CSVs without trying to do anything more, rather like a text editor just displays text without worrying about fonts and font effects.

  25. Anonymous Coward
    Anonymous Coward

    CSV annoying

    It doesn't help that nobody made any attempt early on to make CSV abide by any rules at all and then have programs that export in this format stick to them. Excel seems to have a few things it can put in the first line.

    I build electronics and more often than not the easiest way to share co-ordinate data for components (aka a centroid file) is using a "CSV" . Exported from the Cad package this usually has some text at the top, author, owner, units etc - standards bodies could easily have agreed a format for that early on. These same programs will happily export fields that contain the separation character or worse, use tabs or spaces combined with excessively long fields that then break any attempt at using Calc or Excels methods of handling them.

    And then one you have hacked and chopped at the supplied file so it imports vaguely sensibly you find you've missed the column Calc (yes it does it too) or Excel randomly decides you don't want your leading zeros because it surely must be a number. I don't understand why this sort of thing is only done to how it is displayed not how it is stored so you could undo it. Even a nice checkbox, treat all columns as text by default on the import dialogue or in the settings would be good.

    Yes custom tools could do the job better but writing such a tool than also handles all the many ways people/CAD packages format the document is not a trivial job, all these mean the same thing... R1,R2,R3,R4,R5 or R1-5 or R1..5 or R1-R5 or R1,2,3,4,5,

    1. tip pc Silver badge

      Re: CSV annoying

      a CSV is a CSV,

      if you need more structure just use xml or if its specifically intended for opening in spreadsheets use the ODF spreadsheet format.

      there are solutions. use a acsv for sharing CSV data, if it needs more structure then use ODF & structure as you want and need.

      use a CSV then don't complain that there is no standard when there are standards you could use.

      Your issue is different to those that want to import data into a sheet or the spreadsheet mangles the data on entry/import.

  26. karlkarl Silver badge

    I turn all that auto formatting stuff off. It slows me down and having text flicker and faff about like that just looks too "broken".

    It isn't like any typos will get through because they are still highlighted in red.

    I also recommend the same with Visual Studio. I see more people fighting with the incorrect autoformatter than actually writing correct code.

  27. This post has been deleted by its author

  28. Anonymous Coward
    Anonymous Coward

    > "Really I think the issue is that non-explicit auto-conversion of data types is a bad default software behavior."

    Absolutely. It should never be a default behaviour to automatically and irretrievably alter the base data based on some half-assed assumptions.

    At most, it should be just a presentation layer that doesn't alter the original.

    I have a similar issue with putting hex addresses into a spreadsheet. Every one containing a E gets auto-incorrected to an exponent.

    Personally I get sick of having to turning off all the autoformat and autocorrect options. I'd like a single checkbox that says "What I typed is what I want. Don't fuck it up".

  29. Richard Cranium

    Excel is a problem but so is the CSV format

    There is an RFC but it's not really a standard.

    I have a regular task to import a very odd "csv" file to a MySQL database. My solution is some task-specific code. The generalised problem, for which I've only found one possible "one size fits all" conversion program (in Python & not tried it) is: identify the peculiarities of the incoming file, identify the requirements of the destination app, do the conversion.

    It may be possible to write a file parser to identify the characteristics of the input file but the user would need to enter the requirements of the destination app.

    My specific task, the incoming file uses tilde as field separator ("comma" equivalent) tilde being unlikely to appear in the data, then a mix of quoted and unquoted fields which may include quotes, commas, apostrophes, tabs, escape characters and a load of other "surprise" characters.

    I read that 20% of a large body of scientific papers on genomics included CSV data that would be misinterpreted by excel

  30. JDX Gold badge

    "This is apparently easier than changing the format of cells in Excel."

    Anyone else at all worried these are the people in charge of things like vaccines?

    What if I get the American 12-04 rather than European 04-12 version?

  31. Anonymous Coward
    Anonymous Coward

    setting Excel column type when importing CSV files

    er... what is csv?

    ...

    oh, that's interesting... so, why don't they call is xls or something...

  32. Tim99 Silver badge
    Windows

    Excel (now there’s a misnomer)

    Treats the year 1900 as a leap year because Microsoft’s main aim when they introduced it, was to kill Lotus 123. Lotus (on the PC) replaced VisiCalc on the Apple II who’s epoch was 1 Jan 1904, so VisiCalc didn’t go back that far. Microsoft’s original spreadsheet Multiplan didn’t cut it, so Excel (originally introduced for the Macintosh) was rapidly promoted to run on Windows... and spreadsheet-jockeys attempt to run major business with it...

  33. john 103

    other excel woes

    Credit Card numbers transformed to scientific notation -BTW these were test card #s - We weren't storing real customer numbers in excel.

    When loading to excel - excel looks at first 20 rows and "guesses" the datatype so you end up having to load a Dummy row with chars at the top to prevent excel deciding its an Integer column.

    Schizophrenic Compatibility Checker Dialog box - did i save it or not?

    Dates transformed into numbers based off 1900 -01-01 as 1 - Why?

    Non-standard copy-paste rules - you get one shot and then excel helpfully clears your clipboard.

    VBA is still the best way to program with Excel - In 2020 really?

    C# Interop is "not recommended" for Server code - what's plan B?

    Named Ranges - A clue as to the Range names would be nice.

    Formulas - can't append more than 255 chars - however CONCAT() works - why

    Still no easy way to cycle thru sheets via keyboard (hint CTRL-TAB would be nice)

    1. doublelayer Silver badge

      Re: other excel woes

      "Still no easy way to cycle thru sheets via keyboard (hint CTRL-TAB would be nice)"

      In mine, CTRL with page up and down does that. it isn't great if you're at one edge and want to go to the other one, but otherwise it's pretty useful. Maybe that will work for your situation?

  34. sketharaman

    SEPT2 ... automatically formatted to 2-Sep.

    Nope. Excel autoformats the numeral as year, not day. SEPT2 becomes 1-Sep-2002, not 2-Sep.

    1. Anonymous Coward
      Anonymous Coward

      Re: SEPT2 ... automatically formatted to 2-Sep.

      and don't try and work with data pre-1900 because as far as Microsoft is concerned if they made no profits in that century, the century doesn't exist

  35. I Am Spartacus
    Mushroom

    Why you don't use a spreadhseet as a database

    How many time have we said it: EXCEL IS NOT A DATABASE.

    It a wonderfully, full function spreadsheet. It handles spreadsheet data. It does it well, and for most people it is pretty inuitive. I normally bash Microsoft, but Excel is very good.

    But it is not a data storage or data transfer tool. We have very good databases (SQL Server or Access if you want to stick with Microsoft - other databases are available).

    So, Scientists, use a tool suitable for the job and stop whining when you use a hammer to crack a peanut and your peanut goes to dust.

    1. Doctor Syntax Silver badge

      Re: Why you don't use a spreadhseet as a database

      I'm not sure a database is the right application for just passing tabular data around. If this were someone complaining about importing straight text files into Word and getting unwanted auto-corrections you wouldn't tell them to use a desktop publishing application instead would you? You'd tell them to use Notepad. Here's the problem - there isn't the equivalent tell-it-like-it-is equivalent of Notepad for tabular data.

      1. Tim99 Silver badge

        Re: Why you don't use a spreadhseet as a database

        SQLite usually works for me...

      2. jake Silver badge

        Re: Why you don't use a spreadhseet as a database

        "there isn't the equivalent tell-it-like-it-is equivalent of Notepad for tabular data."

        As I pointed out earlier, EMACS and vi are both perfectly suited for this kind of thing. There are many other editors that can also be coerced tweaked into handling this kind of thing.

        And NO, vi and/or EMACS are NOT too hard for scientists to learn to use for this kind of thing. The subset of commands required to adequately use either is on the order of a couple of dozen, including the basic editing commands. My 9 year old granddaughter learned that much vi in an afternoon (with a hand-written, by her, cheat sheet).

        Or are geneticists devolving from their tool using ancestors?

  36. Chris Evans

    The new guidelines are where?

    The new guidelines only seem to be available to Nature subscribers

    I found out of date info at:

    http://www.hugo-international.org/HGM-News last news item 2018

    https://www.genenames.org/about/guidelines/ seems to be the 2002 version

    https://en.wikipedia.org/wiki/Gene_nomenclature

    https://en.wikipedia.org/wiki/HUGO_Gene_Nomenclature_Committee

    Presumable there is also a list of those genes that have been renamed but I didn't find it.

    To be useful they need to disseminate the info as widely as possible.

  37. Sherrie Ludwig

    Using the wrong tool

    As Matt Parker says in his book "Humble Pi", about when math(s) goes wrong, Excel is not a database, was never meant to be a database, and if you try to use it as one you get what's coming to you - errors, errors and more errors.

  38. N2 Silver badge
    Trollface

    Well done MS

    Pretty slick to just take 16 years

  39. Mage Silver badge
    Coffee/keyboard

    Crazy

    I heard about this years ago:

    First thing on a new install of LibreOffice: Disable Autocorrect except for smart quotes.Doesn't fix THIS problem.

    I'm actually considering turning off that too as “” those are AltGr V and AltGr B on Linux. Shift for singles.

    Though the auto format recognition is due to not setting cells to text, but having the default General. No idea how to change that default on Excel, Gnumeric or LO Calc.

    General is an evil default as Indicative numbers may be in a text column and never ever should be treated as numbers, except maybe on some of them to check checksums (ISBN, IBAN, credit cards).

  40. Anonymous Coward
    Anonymous Coward

    I think the plan is to roll out use of the new abbreviations over the next 3 years, one institution at a time.

  41. eldakka Silver badge

    Macro/VB Script?

    I'm surprised someone hasn't written a standard Macro or VB Script "Open Gene data file", so you just open Excel, press the hotkey to activate the Macro that imports a prompted-for file and does the formatting as part of the Macro. Then that Macro/VB script can be shared out and be a "must-have" 'plugin' for Scientists doing Gene-work.

    1. parlei

      Re: Macro/VB Script?

      No way that is going to happen, people will see no reason to use an extra step.

      There is a saying among molecular biologists that most mutations happen at the keyboard...

  42. Unicornpiss Silver badge
    Meh

    Machine think

    I hate it when any app tries to guess my intentions, as at least 50% of the time they seem to get it wrong. Microsoft software being probably the worst at it. If it's wrong that often, have you really saved any time when you now must proofread your work carefully not just for your errors, but for ones introduced by the software?

    I can't count the number of times I've had to work with a spreadsheet that someone else created and couldn't be bothered to simply 'format cells' and set the correct format. Or just change them to 'text' if no math needs to be done. Computing is still 'garbage in, garbage out' as it has been since the beginning.

  43. steve 157

    I blame the ribbon

  44. aj69

    Micros~1 Excel.

  45. brianj

    LibreOffice seems not to have this problem.

  46. Paul Hovnanian Silver badge

    Logically ...

    ... the Excel import wizard tries to make sense of the data in each cell and select (or suggest) an appropriate format. However, data in one column tends to be of the same type (or supertype in an object world). Failing that, it should be 'Text' (a binary blob or whatever the top level class is). I'm not sure what heuristics were used for the conversion. If row 1, column 5 reads 'DEC1' then a reasonable assumption is that this is a date. However if row 2, column 5 is 'BRCA2' then the date decision should be backtracked. My guess is that the spreadsheet type selection is either per cell or based on the first row (or several). This seems to be far too simplistic and a poor assumption on the part of the import wizard writer.

    1. hitmouse

      Re: Logically ...

      The heuristics are excellent for the 99% of cases that they encounter. I can't imagine that people importing gene sequences are more than about 0.00001% of users. Trouble is people are seeing their glass as 0.00001% empty rather than 99.9999% full.

  47. Zarno Bronze badge
    Terminator

    Oh lord no...

    Clippy and CRISPR/Cas9 would be a match made in Umbrella Corp...

    "Hi, I'm Clippy! It looks like you're trying to engineer a viral outbreak to kill all life on earth, can I help? [YES] [ALSO YES] [JUST CLICK THE BUTTON, YES] "

  48. DWRandolph

    when app has line break in comment field

    Another example of CSV-to-XLS(x) problems. Was asked to help migrate old data by exporting from their database into a format the new application could read. But the data included a Comment field, and that fields data entry allowed Carriage Returns. Meaning a single record could span multiple lines in the CSV file! They sent a nice letter to my boss about the extra effort developing a "bespoke migration tool" (really just a little PowerShell to stitch the lines back together by changing the extra CR to a flag character).

    Back to the main thread, yeah - default should assume incoming CSV columns are text. Then I will assign formatting as needed for display or calculations.

  49. razorfishsl Silver badge

    MS could have issues a patch to every version of Excel to turn off this problem, calling it "Gene mode"

    but nope...... in their complete arrogance they just continue to change users data.

    1. hitmouse

      This is how we get bloat, creating features to "solve" a problem that exhibits for 0.0001% of users who could easily RTFM and circumvent the issue.

  50. hayzoos

    Excel Sucks!

    I have used various spreadsheet programs from visicalc onwards. I use a spreadsheet for calculations, light database, forms, and anything else I see fits. For serious heavy lifting database use a spreadsheet will not do. A csv file is not serious heavy lifting database.

    I have tweaked Excel to import data without mangling it, but I am a bit more proficient than the average Excel user. Even when I have disabled everything I could find (cannot remember which version), Excel would still present a date in a pulldown which had data that resembled a date.

    I still spend way too much time disabling or working around Excel's propensity to alter what I am trying to enter.

    For those commentards suggesting some automated solution to this problem, How do you think we got into this mess? The default should be just leave the data as it is.

    There was no need to specify a standard for CSV or TAB files. They are quite simple. Each "record" is a line in the file terminated by the system's text file EOL character sequence. Each "record" has the same number of "fields". "Fields" are separated by a comma for CSV or a tab for TAB, (the delimiter cannot be part of the data). Any CSV or TAB file not following these simple rules was invalid. Both gave the ability to have varying length fields and having more compact files. The alternative was fixed field and/or record length which was usually inefficient at storage use, but usually gave better performance and there was no delimiter to worry about appearing in the data.

  51. low_resolution_foxxes Bronze badge

    I'm surprised Microsoft cannot put some kind of CSV auto-script that detects common issues like this.

    How hard would it be to scan the data, detect auto calendar changes, then offer to change the format, or say “genetic data detected - do you want to convert calendar dates? "

  52. parlei

    Further dating "abilities"

    It will also tend to interpret e.g. 10-12 as a date regardless of any preferences on your side. In my case I was making a list of rack positions for test tubes (rack 10, position 12), and saw it happen as I typed and told Excel to keep its lousy excuse for smarts to itself.

  53. hugo tyson

    64-bit intengers too....

    It's not just dates; I had a customer saying "your data is useless" when they had opened the CSV in Excel, just to

    check it wasn't corrupted, AND SAVED IT BACK. Thus corrupting it....

    All the 64-bit integers were now double floats. Thus removing the significance.

    These were 32 bits of latitude, and 32 of longitude, in 1/2^32 units of a full circle.

    Handily that resolution is about a centimetre, we were very pleased with that encoding scheme for mapping.....

  54. Anonymous Coward
    Anonymous Coward

    Lazy is as Lazy Does

    As someone who is not a programmer, yet spent years using spreadsheets to manipulate and make sense of huge database dumps, the fact that these so-called scientists can't be bothered to take 5 minutes to learn how to format cells during the import process, tell me they should be flipping burgers instead.

    They also must be pretty sloppy when it comes to peer review, if his kind of thing is rampant in the field.

    1. SloppyJesse

      Re: Lazy is as Lazy Does

      > can't be bothered to take 5 minutes to learn how to format cells during the import process

      From the users' perspective there is no import process.

      User looks at file in explorer - it's an Excel file because Excel has registered itself to handle .csv files

      User opens file - Excel silently corrupts random values

      User saves file - Data is now permanently corrupted

      PS. I took four goes to write ".csv" without my browser correcting a perceived typo.

  55. Ken Moorhouse Silver badge

    Satya Nadella

    Is probably leaning back in his chair right now, muttering "yes, we have now arrived, geneticists have finally agreed that Microsoft is superior to even genetics."

  56. Chopsticks

    Improper use of Excel, yet again

    For the millionth time, this is a spreadsheet analysis tool, not a Relational Database Management system.

    This article reads like "All the bus passengers are complaining about the lack of room on my scooter!"

    Besides that, you can select an entire column in Excel and format the cell to a specific type of data (like TEXT data, instead of letting Excel assume a date type).

    Leave it up to scientist to fail basic spreadsheet training. ;-)

  57. boatsman
    Coat

    the clone does better.

    they could have switched to libre office.

    its a clone ? yes.

    does it turn sep1 or SEP1 into a date ? no.

    actually, excel does not do that either.

    the spelling help ( this kind of help kills people, yes.... )

    first sees SEP1, decides for you that it is SEP 1 (with a space) and then excel kicks in and says "hey, that is a date, let me help you... "

    and there you have the real problem: use common word processing functions stupidly and non-context aware for anything. even in a spreadsheet's data cells....

    and that indeed proves that MS doesnt give a damn. which we can fix by upgrading to something less dismal...

POST COMMENT House rules

Not a member of The Register? Create a new account here.

  • Enter your comment

  • Add an icon

Anonymous cowards cannot choose their icon

Biting the hand that feeds IT © 1998–2020