There was a similar piece of research done on spreadsheets generated by financial bods. They found something like 10% of spreadsheets contained errors affecting the calculations.
Excel hell messes up ~20 per cent of genetic science papers
Scientific literature often mis-names genes and boffins say Microsoft Excel is partly to blame. “Automatic conversion of gene symbols to dates and floating-point numbers is a problematic feature of Excel software,” wrote Mark Ziemann, Yotam Eren and Assam El-OstaEmai, of the Baker IDI Heart & Diabetes Institute in Australia, …
COMMENTS
-
-
Thursday 25th August 2016 09:33 GMT Anonymous Coward
Yes, but in that case the act of doing this on Excel was deliberate - it's called plausible deniability..
They wouldn't have that using LibreOffice.
Hmm, as ODF is the UK government document standard, maybe that's an extra argument to mandate LO? I would love to be a fly on the wall at Microsoft HQ if anyone close to government would publicly float that idea :)
Yes, I'm an evil man. Why?
-
-
Thursday 25th August 2016 06:36 GMT Anonymous Coward
My pet gripe is
That the list of custom formats does not include
YYYY-MM-DD'T'HH:mm:ss
or
YYYY-MM-DD HH:mm:ss
Makes working with time data a PITA. The standard ones don't have the seconds part.
I am sure that many others have their own ones.
Fighing Excel and its wierdness in a company that seems to use it for all sorts of things (incluing many wrong ones) is something that I have to endure on a daily basis.
-
Thursday 25th August 2016 08:20 GMT DaLo
Re: My pet gripe is
"That the list of custom formats does not include..."
Are you being serious? If you are, do you not realise that you can type whatever format you want into it, that's why it's called 'Custom'. The lists in the box are just suggestions and examples and custom ones you have already created for that workbook.
You can also save that workbook with all your commonly used custom format as a template for subsequent spreadsheets.
-
Thursday 25th August 2016 09:48 GMT Robin
Re: My pet gripe is
Are you being serious? If you are, do you not realise that you can type whatever format you want into it, that's why it's called 'Custom'. The lists in the box are just suggestions and examples and custom ones you have already created for that workbook.
A condescending reply on a tech forum? That must be a first.
But anyway, another +1 here for the format the OP mentioned. I was surprised when they let me use it in a reporting function I made for an admin tool recently.
-
Thursday 25th August 2016 11:12 GMT DaLo
Re: My pet gripe is
Not meant to be condescending, just didn't know if the OP was being serious or sarcastic and didn't want to be trolled. Maybe it's bad UX but the OP seemed to know what format they required, just didn't realise that custom formats is designed to use any custom format and you can add your own.
-
Thursday 25th August 2016 12:51 GMT imanidiot
Re: My pet gripe is
What surprises me most is that YYYY-MM-DD still isn't the commonly used/accepted format. I work in a company that does a lot of bussiness with US companies. I don't know how many times DD-MM-YYv. MM-DD-YY format F(*^ups have resulted in problems but it has to be quite often. And even now there are many forms and procedures where DD-MM-YY is the only accepted format...
-
-
-
Friday 26th August 2016 12:39 GMT jimbo60
Re: My pet gripe is
Well, how about this then: If you type in a custom date / time format as suggested, it completely fails to work when creating pivot tables from that data. You have to use one of the non-custom fields for the pivot table to correctly separate the timestamps. Not cool when I'm analyzing data timestamped to fractions of seconds.
-
-
Thursday 25th August 2016 12:54 GMT Zippy's Sausage Factory
Re: My pet gripe is
I'm guessing the problem the OP is having here is other people. Namely that there is a specific need to format the dates and times in this format. The fact that Excel doesn't contain the format with seconds means people are using the format that contains only minutes and then there are support calls when they don't understand why the data doesn't look right.
In other words, Excel is making people take a shortcut down the wrong route.
(As an aside, if I had all the money I've seen wasted thanks to Excel diverted to my bank account, I'd be able to own my own private island, which I could fly to and from every day in a helicopter made of solid gold piloted by a hired supermodel.)
-
Saturday 27th August 2016 06:41 GMT Black Betty
Re: My pet gripe is
Cosmetics. And as pointed out by others, easily fixed by creating a formatting template.
If working with time data is your thing, then shouldn't decent time and date arithmetic be topping your wishlist? Not to mention a proper time storage format. FFS compacting them into a REAL to save space stopped making sense the day computer memories breached Bill Gate's infamous 640K limit.
-
-
-
Thursday 25th August 2016 08:25 GMT Martin an gof
problematic feature of Excel software
Reformatting affects all sorts of software, such that I usually find myself turning off nearly all 'autocorrect' features, just to stop the annoyances of having to re-type stuff. The one that's annoying me at the moment is at work, where Outlook insists on autocorrecting (in formatted emails) our postcode, which ends 3RD, to 3rd
And don't talk about the problems of writing emails with a mixture of langauges - in my case Welsh and English. Best just turn off all correction features.
M.
-
-
Thursday 25th August 2016 11:16 GMT Martin an gof
"formatted emails": there's your problem, right there ;)
Well yes. When I originate an email it's plain text, but when replying to an email it's invariably formatted, and getting rid of the formatting can muck up the "history".
You know, I still can't get used to top-posting and the way Outlook mangles / ignores signature separators...
M.
-
Thursday 25th August 2016 13:52 GMT VinceH
"Well yes. When I originate an email it's plain text, but when replying to an email it's invariably formatted, and getting rid of the formatting can muck up the "history"."
When I originate an email it's plain text - and it's also plain text when I reply to one, no matter what format the original came in!
I also interleave my replies, which did lead to one colleague asking me why I sent his email back to him without replying (his "preview pane" was small enough that he couldn't see my reply, which he expected to be at the top).
-
-
Thursday 25th August 2016 13:45 GMT Wensleydale Cheese
"Except, of course, any software which starts out attempting to be too clever by half ends up not being clever enough."
I've recently run into series of supplier invoice numbers which start with 118, and the "intelligent" pattern recognition system in OS X tries to interpret them as phone numbers.
In my country, 118 is the number for the Fire Brigade.
Oops.
-
-
-
Thursday 25th August 2016 11:21 GMT Just Enough
Everything is potentially a US formatted date
Using Excel in a chain of data transference always results in these kind of errors. Usual one is telephone numbers being rendered in scientific notation. Or the old favourite of a DD/MM/YYYY formatted date becoming a "DD/MM/YYYY" string, which then is interpreted as a MM/DD/YYYY formatted date. Excel will attempt to turn just about any code with forward slashes into a guessed date.
I also struggled with one that insisted in converting the user name June1 into a date.
-
Thursday 25th August 2016 11:26 GMT Omgwtfbbqtime
Re: Everything is potentially a US formatted date
I found the best way to deal with this was set your region to USian and leave it there. I don't get the mismatched date formats on the initial import (.CSV from a USian server) and once saved to .xlsx it keeps the dates the way they are meant to be for the other users.
-
-
Thursday 25th August 2016 11:43 GMT Anonymous Coward
"Automatic conversion of any user input to dates and floating-point numbers is a problematic feature of Excel software"
Hear hear.
Another example: exported lists of usernames and passwords sent as an Excel spreadsheet.
The passwords were 8-character random hex strings. It worked fine for 99.9% of the time - except for ones which happened to be like "123456e8" which Excel had munged into floating-point numbers.
Stripping leading zeros off telephone numbers is another problem.
-
Friday 26th August 2016 04:34 GMT gzuckier
"Automatic conversion of
gene symbolsany user input to dates and floating-point numbers is a problematic feature of Excel software"As in, 60,000 distinct 14 digit ID numbers pasted into excel all coming up as 5.42342E14 and when reformatted as number, all coming up as 54234200000000? You mean that's not what was desired?
-
-
-
-
Thursday 25th August 2016 11:22 GMT Anonymous Coward
Re: Storing semi text data in a tool designed to process numbers automatically.
You know there's a very simple way to insert data as plain strings, right? IIRC it dates back as far as 1-2-3, but I never used Multiplan or Visicalc.
Any tool you don't really learn to use will bite your back sooner or later.
-
-
-
-
-
Thursday 25th August 2016 09:47 GMT Doctor Syntax
Re: but...
" Nanny Microsoft knows more about what you are trying to do than you do."
It's not just Microsoft. LibreOffice Calc has similar problems - although it might be that they're trying to be bug-compatible with Excel. The real problem is with any dev who decides to try to double-guess what the user's going to do has put a foot on a slope which is far slippier than they ever imagined.
-
Thursday 25th August 2016 09:57 GMT m0rt
Re: but...
i also deal with a lot of phone numbers, and a lot or reporting that is invariably created as a CSV file.
Excel is the bane of my life. I personally use Libreoffice, but clients use Excel. They then complain when they open their CSV reports in excel, and excel subsequently craps all over the data.
Don't even get me started on phone numbers that are international format with a '+' at the beginning. Excel helpfully starts to see that as a formula.
Whereas Libreoffice will always show you csv options before pulling in the sheet, Excel just thinks you are too stupid and opens it in the way it thinks you want your data.
Clippy is dead, to come back and haunt us as a spreadsheet poltergeist.
-
Thursday 25th August 2016 10:15 GMT Anonymous Coward
Re: but...
This week I had someone from Sales asking me to "crack the password" on an Excel sheet. Turned out it wasn't password protected, just corrupt. Well, only Excel reported it corrupt* . LibreOffice opened it fine, and when I re-saved it Excel was happy with it too. Madness.
* I'm not saying it wasn't corrupt - it came from our Sales department after all.
-
Thursday 25th August 2016 11:29 GMT Anonymous Coward
Re: but...
for one part of my job I commonly have to deal with .CSV files, because that's the only format that a piece of software we rely on for certain important jobs is pretty much guaranteed to eb fien with.Oh, it's gotten better at trying to import.xls files, but it still can;t manage them as reliably as .csv's.
So, I receive .xls files from user. I look at what the job entails. Hmmn... I can merge those two together before using the combined file with this other file to get the desired end result.So - save out first file as .csv - all OK. Save out second fiel as .csv - seemed to go ok. Reopen first file so I can tack it on end of secon done - 'cannot understand file format' or some such nonsense from Excel. BUT YOU CREATED IT JUST A COUPLE OF MINUTES AGO! Grrrr... - I had about fifteen months of that kind of nonsense before Excel finallyseemed to become capable of reliably reading .csv's it had created itself.
And I'm miffed that Excel doesn't use BigEndian as the standard date format nor does it appear to do anything sensible with columns of dates in datafiles I receive, so I have to manually format columns with dates every damned time.
This on top of teh travesty that is Outlook. And people wonder why I've gone off all things IT in this last few years.
-
Friday 26th August 2016 04:42 GMT gzuckier
Re: but...
There is/was a weird bug in a recent version of Excel that would prevent it opening perfectly fine spreadsheets at pseudorandom; the same spreadsheet format with varying daily data would work 9 days out of 10 but the tenth day it wouldn't open. Has something to do with the new file format, zipped xml, in some way I assume.
I miss the older version.
-
-
-
-
Thursday 25th August 2016 07:08 GMT dan1980
Among the things Excel does to gene names include changing “SEPT2”, the name of a gene thought to have a role in proper formation of cell structure, to the date “2-Sep”.
And after the 'auto-formatting', it once again has a role in proper formation of a cell structure - as a date in an spreadsheet cell.
-
Thursday 25th August 2016 07:17 GMT Millennia
Doesn't happen with Google
It is the constant thing with all M$ products that they think they know what you want better than you and force change it. This has now crept into Apple with it's bloody autocorrect changing what you want to say, sometimes without even giving you an option to reject the change, so you have to go back over and over again or risk sending a non sensical message.
Why don't the devs behind this just piss off with their auto-corrects? By all means underline a word, or highlight a cell that may be wrong and give the user chance to correct or ignore the warning, but quit the auto-correct shit as it actually makes us waste time going back and forcing what WE want to actually say! </rant>
-
Thursday 25th August 2016 07:22 GMT james 68
Another issue is the inverse square law of intelligence, where the more specific intelligence in a given field is displayed by an individual, the less generalised intelligence/common sense is displayed by that individual.
In essence, trying to get a geneticist to understand excel is like trying to get Kim Kardashian to understand humility.
-
-
Thursday 25th August 2016 11:29 GMT Anonymous Coward
Re: More appropriately...
Just, the reference user for Excel is the beancounter, not a genetist or people who can't understand a database. Templates are there to tailor it to your specific needs, just most people usually like to reinvent the wheel each time, and waste time in the effort. I wonder what gene is responsible for it. Something alike SATURDAYDRUNK, probably...
-
-
-
-
-
-
Thursday 25th August 2016 10:36 GMT DJO
Re: An easy fix ...
You don't even need to do that. Just format the column as Text rather than general.
Yes yes, that works a treat if you are typing in data
Now try again with importing a multi megabyte CSV file full of, for example, phone numbers (leading zero gets stripped) or SIM numbers (truncated & converted to exponent notation).
It's relatively simple to write a pre-processor to put apostrophes into a CSV file to fix all that, but it should not be necessary.
Almost all Microsoft programs try to be helpful and in almost every case the "help" ends up creating more work than it saves.
-
-
Thursday 25th August 2016 14:01 GMT Wensleydale Cheese
Re: An easy fix ...
"It looks to me you never found the CVS importer in Excel which lets you set separators, string delimiters, column types, etc. etc. "
But IIRC* doesn't give you a means to save those carefully input settings, so you get to do them all over again the next time.
It was there in the Extras folder on the Office 97 CD, which meant it wasn't installed by default.
It was not present on any corporate installation of Office that I have come across.
* it's been a long time since I had to use Excel
-
Thursday 25th August 2016 17:09 GMT Anonymous Coward
Re: An easy fix ...
I routinely used it since at least Office 2000, if not earlier. I have also to admit I shamelessly copied it in a CRM application database, to ease importing customer data in csv format. Being there for at least sixteen years, doesn't look too bad to me. I agree with you it was better if it let save settings to be reused when needed - but I never checked if it can be scripted for repetitive tasks.
-
-
-
-
Thursday 25th August 2016 10:38 GMT Tchou
Text rather than general
Been there, done that.
Sometimes Excel insists it knows better what the formatting should be, even for TEXT cells (for instance, and as far as I can remember, date in a text cell).
Only way to go is with the apostrophe..
(of course I expose the programmer point of view using Excel to import/export data, not when you write down your expenses of the week by hand)
-
-
-
Thursday 25th August 2016 09:04 GMT Jimmy2Cows
Re: Dunno. Think our minds must be too highly trained Majikthise
Have seen many very intelligent people make some very basic mistakes.
Reasoning ranges from the somewhat understandable "I'm not that kind of mathematician", to egotistical bullshit like "trivial implementation detail" which is somehow beneath them despite it being their job.
Oh and the ever-classic "I'm a genius, it can't be my fault. Must be the software." (although, with Excel formatting it often is the software...)
-
-
Thursday 25th August 2016 13:00 GMT Dan 55
Re: An easy fix ...
Think you've sorted it with the apostrophe, do you?
Great until there's a VBA macro to run over the cell... then VBA reads the cell and works out once again what the date format is, using different criteria to the Excel cell.
Then you need to export to .csv, or import from .csv which again mangles the format.
Jesus wept, somebody make it stop.
-
Thursday 25th August 2016 07:38 GMT Ken Hagan
"The paper's silent on why genetic scientists, who The Register will assume are not short of intelligence, have been making Excel errors for years."
Since they only studied published papers and (by the sound of it) did not also study the papers as-submitted to publishers or at any even earlier stage of drafting, I'd say they haven't a clue when the errors are creeping in or which piece of software is responsible. Excel isn't the only piece of software that "helpfully" changes what you type and the original authors aren't the only people involved in the publication process. (I presume that authors are offered the final version to proof-read, but as long as it is largely correct I expect they just skim it. They're busy people, you know.)
-
Thursday 25th August 2016 16:37 GMT RDW
"Since they only studied published papers and (by the sound of it) did not also study the papers as-submitted to publishers or at any even earlier stage of drafting, I'd say they haven't a clue when the errors are creeping in or which piece of software is responsible."
They actually looked at supplementary data files, which were already in Excel format, rather than the main text of the papers. Typically, these files would be uploaded by the journal in the format provided by the authors, unedited. Since exactly this form of data corruption is a well-known 'feature' of Excel (there are previous published studies about Excel and gene names, which even located corrupted 'Excel genes' in one of the major NCBI databases), it's very likely its use is being correctly blamed for the problem. It's easy to see how this happens. Some upstream software spits out data with a gene symbol 'column' in CSV format. A naive user simply double-clicks the file, which will typically be registered to Excel, rather than using the import function and specifying data types for columns. Everything seems to have worked correctly, but in fact Excel has silently corrupted gene symbols that look like dates, with the first affected cell perhaps hundreds or thousands of rows down from the top. We've seen this problem in the lab, and warn new students about it.
-
Thursday 25th August 2016 07:43 GMT Oengus
Pet peeve
One of the best (and I see all of the time) is exporting data from a Web based application that offer to export in "Excel" format. The file is downloaded and named ???.xls. It opens quite successfully in Excel but when you try to load it in an application that is expecting a "true" Excel format file it fails miserably. When you investigate it is because the export is actually a CSV or tab-delimited text file that has just been named with the .xls extension. The users don't understand the difference they think that because it opens in Excel it is Excel formatted.
-
Thursday 25th August 2016 08:06 GMT magickmark
Re: Pet peeve
I have a similar grumble, I regularly have to export datasets from an online application to use in Excel (in this case they are proper .xls formats at least) and there are a number of tick boxes that export as True or False but rather than being logical values they are exported as plain text. Not a big problem as I now expect it and have a routine worked out but still a PITA having to go through and convert them all.
-
Thursday 25th August 2016 14:12 GMT VinceH
Re: Pet peeve
There's also a related problem whereby some web applications genuinely do export in Excel format - and only in Excel format, not considering that the user might want to use an alternative, or even that it might not be destined for a spreadsheet at all.
I have to deal with one such file on a weekly basis; I load it into Open or Libre Office Calc, and then save it as a CSV file myself. An extra step because the web app developer is either trying to be too helpful, or is too blinded by a love of Microsoft.
-
-
-
Thursday 25th August 2016 07:44 GMT Sgt_Oddball
It even does it it..
On data formats that don't require any changes on a data level (such as CSV) which is marvellous for the aforementioned floating point conversion when you view the database to find everything is 30204000000000 (or some variation of) because simply leaving a bloody number be is just beyond it's comprehension.
-
Thursday 25th August 2016 08:06 GMT DwarfPants
For your own sanity don't look at Excels handling of Boolean values
Enter True or false in a cell Excel will treat it as a Boolean value, try the same thing via code and enter a whole world incorrect types. The source of this particular pain is the auto conversion of the cell to Boolean type and there being no way of specifying a cell is a Boolean value.
I caveat this with "That I have manged to find", "I quickly got bored and just used integers", "Life's to short for messing around with this", "Really FFS"
-
Thursday 25th August 2016 08:07 GMT Anonymous Coward
Oh, Is that so?
It's not hard to change the default format of Excel cell to avoid changes of this sort: you can get it done in a click or three.
Excel datetime formatting is powered by a demented deamon bound into service by some serious necromantic ritual and it's struggling mightily against it's bindings.
Try getting "language" settings and "cell formatting" aligned so that the formatting you want is applied consistently to all the cells you want formatted from "murkin date" where sometimes a cell format is "text", sometimes "number" and sometimes "murkin date" to "euro date". All cells look like "murkin date", but, they are not.
Not possible. Not with macros either. Manually retyping *all* the dates in the proper format actually works.
Why do we struggle so? Because some tosspot wanker configures Primavera P6 to use murkin time and we have to produce reports as excel spreadsheets using data from "excel" exports from P6. Jezus fucking Christ.
I hate fucking Excel! And Oracle too!! And all of our management IT consultants should be fired in the traditional way: tied over the front of a field cannon!!
-
-
Friday 26th August 2016 10:47 GMT Anonymous Coward
Re: Oh, Is that so?
No. It does not. It just becomes a different problem. In different cells. There is no way to globally locking f.ex. a column to be of a particular format. What happens when one tries, is that Excel will interpret and convert a value before sticking it in a "typed cell", so that the cell is presenting the right thing while the underlying data has been changed to make the visual presentation match the user expectation.
I.O.W: The cell *format* can be applied to *showing* proper murkin* datetime, the underlying values inside the cells are: Integer, String, Swedish Date, US Date, or US date time-shifted some decades. Calculations don't work with this.
For example:
The useless con-slut-ant bastards locked(!) the P6 export format to 'yy-'mmm'-'dd'*, because *obviously* it would be too damn hard on the Oracle licensing budget or maybe the printer toner usage (?) to just use 'yyyy-'mmm'-'dd' or even 'yyyy'-'mm'-'dd'', so, while we indeed will see '12-jun-30' intended to mean '2012-06-30' presented - the cell value becomes '1930-06-12' - in this case.
One has to fix all the formatting using Python before letting Excel's "I'll give it a go, mate"-heuristics import anything. Even then, some cells will still be screwed up.
*) What I call murkin, because ... it's fucked.
-
-
Thursday 25th August 2016 08:08 GMT Phil Lord
Why this happens
The reason this happens is that large datasets with 10,000s of columns are often imported into excel. Unfortunately, excel does the heuristic to identify the type of a per cell basis. If only 5% of gene names look like dates (say DEC10 which does, compared to CDC28 which does not), then only 5% of the cells will be adjusted. The chances that you do not see this happening are large.
The problem is well known to bioinformaticians, but less well known to biologists. Hence the problem.
It's easy to mock, of course, but just because some one is deeply knowledgeable about biology and the experimental techniques associated with it, does not mean that they are expert at data handling. It's like when computer programmers produce some egregious piece of rubbish instead of a usable program, because they do not understand the domain or its requirements: something which happens say 40% of the time.
-
Thursday 25th August 2016 08:23 GMT Destroy All Monsters
Re: Why this happens
It's like when computer programmers produce some egregious piece of rubbish instead of a usable program
Not at all. When someone suddenly decides to write a program in INTERCAL, this will be readily apparent and a robust bollocking will hopefully ensue.
Spreadsheets in general (and the horrific Excel in particular) are the equivalent of INTERCAL for the business user. This is NOT readily apparent because they hide their in-built awfulness behind nice graphics, bells and whistles, tradition and the mindest that "there is nothing else". An acceptable idea back in the seventies, after Microsoft grabbed the market and added its "special user-friendly sauce" on top, they are now used way beyond what they should be used for.
Time to rethink this.
-
-
Friday 26th August 2016 11:04 GMT fajensen
Re: Why this happens
Because they haven't been told the full/correct requirements, clearly.
Rather, they had too many requirements and so the various hacks are fighting each other.
CEO's/CTO's are yelling at Microsoft: "Make what we are doing here at $BigCorp work in Excel NOW - or ELSE", Programmers get told: "Make work, or Else".
Programmers of course code what they are told, like this:
"Make Work"
else
"Make Work"
else
"Make Work" ....
Thus cell interpretation logic became a pachinko machine, rather than a state machine.
https://en.wikipedia.org/wiki/Pachinko
Same with Word. MS could have made on-the-fly formatting optional and used style-sheet only - except - some people want to use on-the-fly formatting and don't want to select an option "or Else". So Word does BOTH, which sucks when one gets the work of an "on-the-fly"-formatter.
-
-
Thursday 25th August 2016 10:29 GMT abedarts
Nice work if you can get it
Imagine being paid to examine thousands of Excel spreadsheets looking for these errors. Aside from the incredible tedium (give it to the office junior) its nice way to spend time in a warm comfortable office with no deadline or particular objective in mind and you can then feel good about publishing 'important' research.
-
Friday 26th August 2016 11:17 GMT Anonymous Coward
Re: Nice work if you can get it
In case you haven't noticed: Most "knowledge" work is a totally useless effort in futility. Brownian motion.
The only value created from it is CO2 for the office plants - except they are plastic and once in a while a contractor comes and polish the dust off the plastic leaves and insert fresh plastic flowers into their little plastic receptacles.
Maybe 10% of what I do at work creates value, the rest just creates work for other people. For this effort, I get paid a rather large salary, which is, I suspect, is really what makes me so valuable to the business. The PHB's all the way up the tree understand costs, not value. Big Costs == Big Value.
It's just a fact of modern life. Could be worse. Keeps a lot of useless fucks off the streets and under control. Just look at what the 3'rd world is like.
Anon - Of Course.
-
Saturday 27th August 2016 10:48 GMT Martin an gof
Re: Nice work if you can get it
Maybe 10% of what I do at work creates value, the rest just creates work for other people
You are Scott Adams and I claim my £10:
M.
-
-
Thursday 25th August 2016 10:49 GMT Lunatik
Devil's advocate
I know Excel/Microsoft is the boogeyman and we are all to pillory at every opportunity, but...
"Help! This software I'm using, which I think I'm competent in but in reality am duped by its omnipresence and apparent user-friendliness, is doing stuff that I don't understand and now everything is broken!"
With any other piece of software we'd be pointing and laughing at the user, but for some reason this is Excel's fault?
Assuming it's being used at some level for number-based statistical or analytical needs i.e. what someone might reasonably expect to use a spreadsheet for* then not knowing how to use the product is hardly a defence, Neither is Microsoft really to blame for creating a product that has such a low bar to entry yet contains powerful features and complex behaviour that you need to understand to use the product effectively.
If the academics were complaining that they couldn't couldn't get their email clients or printers working properly would that be the software's fault too?
* Given the user cohort I really hope we're not in the realms of PowerPoint-for-posters, Excel-for-printed-forms here
-
Thursday 25th August 2016 11:05 GMT Steve Davies 3
Re: Devil's advocate
You make some good points. Many of the posts here show frustration with MS for making the tool try to do your job for you when if it just left well alone we'd (or most of us) would be a lot happier.
A case in point is the use of the ' at the start of a value.
Great but... when you use that data after some manipulation in excel as input for another application is when the problems really start.
All many of is need is a way to stop Excel from making any assumptions on the data we are importing/using. Don't drop leading zero's, don't convert numbers to scientific notation (normal people don't understand that anyway) etc etc
If MS were to do that an awful lot of people here and in other forums would go away happy.
It is not too much to ask is it? It is our data after all or has MS laid claims to that now?
-
Thursday 25th August 2016 12:27 GMT Anonymous Coward
Re: Devil's advocate
The main and default data type for a spreadsheet cell is number. Users have to understand this. That's why any spreadsheet I used (1-2-3, Quattro, Excel) attempts to treat everything as a plain number, upon which calculations can be performed. A 'classic' spreadsheet contains mostly numbers, and a few text labels. Then people started to use them to tabulate data (because the forgot what the TAB key on the keyboard was used for), a database replacement. You can blame the software if it can't infer what you're typing is not a mathematical number but a telephone number. You have to tell it. I'm still surprised how many computer users (and developers) don't understand the difference between a data type and its on screen display. A numeric cell is not just a simple string of numbers.
Dates are different thing, it's just a helper that people should learn how to disable when needed.
-
-
Thursday 25th August 2016 11:08 GMT Doctor Syntax
Re: Devil's advocate
The problem here isn't specifically Excel or Microsoft. It's autocorrect. When it works it's a time saver. When it doesn't it can require considerable effort to get what you want into the data. That can distract whoever's entering the data and lead to other errors elsewhere. We end up with something that's supposed to save time and avoid errors costing more time and introducing more errors than if it hadn't been there. And yet the little blighter insists on elbowing its way in where it's not wanted.
-
-
-
Friday 26th August 2016 16:10 GMT Stevie
Re: repeat after me
Actually it is, with a rather nifty GUI bolted on top. Excel uses the JET, doesn't it? If you say it isn't an enterprise-level database, well now you have my agreement.
Something many have missed here is that inside the workbook NOTHING HAS BEEN CORRUPTED.
That's because what you are looking at when you view an Excel (or OO or OL) spreadsheet is GUI PRESENTATION of underlying data. I'm reminded of the furor over what the control panel data/clock showed when Windows 95 debuted. Another day on which the IT community showed the world how to run around like a chicken with no head.
Hence, when the scientists' graduate student free help (the actual people who probably did the data entry) get a clue and read the manual, they can change the default "best guess what the average person needs" out-of-the-shrinkwrap behavior by flipping a few switches and everything looks right again - except for the "scientific" "peer-reviewed" papers which apparently made it into print without anyone actually reading the stuff in them.
Which proves one important thing: Geneticists find the write-up of their science as boring as everyone else does and so don't read it with their brains switched on.
-
-
Thursday 25th August 2016 11:38 GMT BoldMan
UTF-8 CSVs in Excell? You are 'avin a larf!
Excel has never been able to handle CSV with UTF-8 characters in it and very few people seem to realise this. I used to have to use CSVs to upload translated messagebases into a system and I learned quickly to install OpenOffice (or some variation - LibreOffice) and map CSV to ALWAYS open in that rather than Excel.
I've since spent years tryng to train my various colleagues to only open CSVs with LibreOffice...
-
Thursday 25th August 2016 11:41 GMT Paul Woodhouse
Re: UTF-8 CSVs in Excell? You are 'avin a larf!
to be fair I've seen LibraOffice do some creative mangling of CSV files too... not as bad as Excel but...
my favourite one was the CSV where someone had put a note in a col. many many screens away to the right of where the data ended, and then deleted it before saving...
yeah, that file was just a huge mass of many MB's of comma's....
-
-
Friday 26th August 2016 16:16 GMT Stevie
Re: Run the csv importer
Nonono Ac, This is a Microsoft product. It is supposed to run like a Swiss watch under any circumstances and read your mind as to what you actually want *without* all that tedious reading of fine manuals or looking at documentation.
From shrink-wrap to genetics lab, law office and doctor's billing department - all things to all people with nil effort and zero learning curve. Force(tm)-enabled to read your mind.
All this talk of learning how to use the product properly.What on earth do you think this is: Linux?
-
-
-
Thursday 25th August 2016 11:51 GMT revilo
mixing data and action
it is in general a good idea to separate data and code. Spreadsheet programs (Excel is only one of them) hide code or actions which make them hard to audit. If code and data are separated, one can run it with different programming languages, or test the code with extreme data cases (which are often also a source of errors). When used in numerical computations, spreadsheets are often opaque also about at which instances rounding are done. This can produce substantial errors. One must understand vendors to leave the software as it is as any small change can break other things in unexpected cases. In a scientific environment especially, it is a good idea to be vendor independent, have data stored in an open format and being able to do the analysis in different programming languages/software as the later evolve over time. The error rate reported here for gene research is actually pretty low. It is estimated that over 90 percent of corporate spreadsheets are buggy producing huge costs. This was an estimate from 2005 ( http://www.theregister.co.uk/2005/04/22/managing_spreadsheet_fraud) but it probably remained the same, as habits are hard to change.
-
-
Thursday 25th August 2016 14:19 GMT DJO
It's not changing it to a time of day, it's just that it's impossible to know if two digits separated by a : are HH:MM or MM:SS so it defaults to HH:MM.
Enter unambiguous values such as (to use your example): "0:34:20" - a bit more typing but after a while it (should) become second nature.
-
-
Thursday 25th August 2016 12:34 GMT Fading
The problem is not autocorrect.....
I've all the autocorrect turned off but the conversion to dates still occurs.... If you accidently open a .csv or .txt rather than import, even with all the auto corrections off, you still get some interesting data entries (new one for me the other day was 10 P being turned into 10:00 PM - did not see that one coming) .
Just being able to turn off all the conversions would be good but even in Excel 2016 this is not an option (I've been complaining to MS since Excel 5).
-
Thursday 25th August 2016 12:51 GMT Mario Becroft
Killing spreadsheets for fun and profit
Spreadsheets just need to die.
If you think you need one you probably should be instead using a database that actually stores your data in a well-specified, consistent schema with named tables and attributes (not meaningless columns names like A, B, C) and type safety. SQLite is ideally suited to small-scale ad-hoc data storage/analysis tasks.
-
Thursday 25th August 2016 14:32 GMT Ken Hagan
Re: Killing spreadsheets for fun and profit
You appear to be assuming that everyone using Excel is using it as a database. Actually, some of us use it as a spreadsheet and it works fine.
But the auto-formatting, auto-correct, auto-as-you-type... Yeah, they all need to die. They also need to die in word-processors. (End-users should learn how to spell and computers should stop trying to enforce one rather ignorant person's grammatical and stylistic prejudices.) Actually, the only instance I can think of where I am occasionally grateful and never actively narked by a computer trying to change what I type is Google's search engine.
-
Friday 26th August 2016 00:45 GMT Pompous Git
Re: Killing spreadsheets for fun and profit
But the auto-formatting, auto-correct, auto-as-you-type... Yeah, they all need to die. They also need to die in word-processors. (End-users should learn how to spell and computers should stop trying to enforce one rather ignorant person's grammatical and stylistic prejudices.)
Having generated many thousands of documents with a computer versus hundreds with a Remington manual typewriter, I am appalled by your suggestion that we should treat computer document creation at the same level as that old manual typewriter.
My writing is usually technical in nature and I need to frequently type the same phrases again and again. Except I don't because of Autocorrect. I can store several hundred characters under a TLA (three letter Autocorrect entry) such as zxc. I type zxc and when I hit the space bar, the computer types what I stored for me. It seems to me that is what computers are for, automating repetitive tasks. What possible benefit is there to manually typing out the same phrases again and again and again?
Then there are typos. For some reason my fingers insist on typing Tasmainia, rather than Tasmania. Autocorrect fixes that for me. Ditto for certain other words. What possible benefit is there to manually correcting these, or worse, as many people do, leaving them to confuse the reader?
-
Saturday 27th August 2016 08:34 GMT Anonymous Coward
Re: Killing spreadsheets for fun and profit
I am not convinced that moving to computers from typewriters has made any improvement at my university. Certainly they are good for doing repetitive tasks but they make creating documents so easy that people send me thousands of pages of waffle which probably contains between 10 and 100 pages of information that matters and the rest is just filler and repetition.
Moving back to manual typewriters where there is an actual effort to created documents rather than just thought free cut and paste would eliminate much of the garbage and make life much harder for the NSA.
-
-
Friday 26th August 2016 13:13 GMT Anonymous Coward
Re: Killing spreadsheets for fun and profit
But the auto-formatting, auto-correct, auto-as-you-type... Yeah, they all need to die.
If you use Worrd, then Installing a Japanese language pack because you need to write Kanji occasionally does that for you. MS doing anything for their customers!
-
-
Thursday 25th August 2016 13:48 GMT W4YBO
If all you have is a hammer...
...everything looks like a nail.
Excel is more like a Leatherman. Plenty of tools, none of which are as good as a dedicated tool, but can produce acceptable results if used properly and carefully.
I've pinched a blood blister on my hand more than once using a pocket multi-tool. I've had similar results if I didn't pay attention to the vagaries in Excel.
-
Thursday 25th August 2016 13:54 GMT theOtherJT
If excel is the answer...
...you probably didn't understand the question.
I have never seen excel used for anything that wouldn't have been better done with either (in order of frequency)
* Raw text.
* About 4 lines of bash script.
* A word document.
* A graphics package.
* An actual database.
I'm not even kidding about the graphics one. I've seen so many people use this abomination as a way to generate pretty looking timetable type graphics to be printed out and handed to students... which immediately goes wrong because the printed form doesn't look the way it did on their screen.
Still get at least 2 a year try it tho.
I'm sure there _is_ a use case for excel... but I've never actually met it.
-
Thursday 25th August 2016 15:32 GMT MCMLXV
How to prevent autocorrect when importing a CSV
Save your CSV file with a .txt extension instead of .csv. In Excel, go to File -> Open, select your .txt file, and Excel will recognise it's a CSV and start the field/column selection wizard, but it doesn't do any autocorrecting. Numbers keep their leading zeroes, postcodes retain their postcodey charm, dates aren't magicked out of genes, etc. Works a treat.
-
Thursday 25th August 2016 15:43 GMT Anonymous Coward
The problem here is clearly, and unequivocally the researches, and not Excel (or equivalent applications). For normal business, and home use of Excel people expect something entered that looks like a date to be treated as a date, and something that's a number treated as a number, etc....
If the researchers have so little understanding of the tooling they're using to manage their data it brings into question the quality and validity of their entire research, i.e. if they don't understand that you should format the data in Excel if it's critical to the outcomes what else are they blindly just assuming does what they think it does in the entire end-to-end of their research.
-
Friday 26th August 2016 00:55 GMT Pompous Git
The problem here is clearly, and unequivocally the researches, and not Excel (or equivalent applications).
Having trained computer users who were researchers and lecturers, you would not believe the level of ignorance they had of the applications they were using (Word and Excel). Presumably this was why they were having the training provided by their employers. Sad thing was they didn't <i.want</i> to learn; they couldn't understand why it was necessary. "It's a computer that replaced my secretary; she used to do all that stuff for me."
Real world example from Phil Jones (CRU @ University of East Anglia):
I’m not adept enough (totally inept) with excel to do this now as no-one who knows how to is here.
He was confessing to Bob Ward that he didn't know how to add a trend line to an Excel chart.
-
-
Thursday 25th August 2016 18:27 GMT Ken Moorhouse
Spreadsheets are a useful Prototyping Tool...
...but as can be seen from the original article plus ensuing discussion thread...
...they are no good for Production Data.
That's what Databases are for.
One of my clients receives a multitude of Production data using XLS XLSX and CSV formats generated by a multitude of software on a daily basis from its own Customers. All different column definitions, inconsistent Attributes (e.g., Units of Measure), sometimes involving subsets of data. My client is expected to slipstream their own data into that format and send it back in that same format. Some of the data could affect people's lives if not sent back correctly. My role is to automate most of that process using database techniques and then populating a clean spreadsheet to return to the Customer (and yes, sometimes I'm asked what happened to their fancy formatting and macros).
The likes of Joe Celko have tried over the years to try and nail down some kind of standard for Attributes, but it is a moving target. Someone on a nearby Reg thread (to do with which hemisphere Melbourne resides in), mentioned a similar issue.
Whilst people use Spreadsheets for Real Data, people are going to have to live with the possibility that someone will screw up metres and miles.
-
Thursday 25th August 2016 22:48 GMT Anonymous Coward
Re: Spreadsheets are a useful Prototyping Tool...
Spreadsheets are tools to make many kind of calculations, especially financial ones, easier. Their ability to recalculate on the fly after a changes is what made the so useful to the finance depts. of most companies (and everybody involved in calculations involving money), propelling the PC business in the beginning.
Remember it was the spreadsheet to take the personal computers from the amateur hands and put in on companies desks - even word processing had no all the advantages a spreadsheet had over paper (the original "spread-sheet") and desktop calculators. And the beancounters could easily get approved the high prices then commanded by PCs and spreadsheets (IIRC a spreadsheet software could easily cost $499 or more in the '80s) much more easily than the average typewriter user,
I remember my grandfather when he was filling those papers and then performing calculations on his Olivetti Tetractys 24 (back then a true state of the art commercial calculating machine, still in my studio), storing all those paper strips to double check the entries and the results (my grandmother often did it), and the need to start anew if an error was found.
Databases are a better tool today for ledgers or the like (but entering data may be less comfortable without a good frontend - web applications just made it worse, not better, for personal use), but when it comes to planning and forecasting, and you need to create models quickly, a spreadsheet is better than a database, the latter being not flexible enough.
There was a time when dBase, Paradox and Access saw much more use. Then everybody turned to spreadsheets...
-
Friday 26th August 2016 01:03 GMT Pompous Git
Re: Spreadsheets are a useful Prototyping Tool...
Spreadsheets are tools to make many kind of calculations, especially financial ones, easier.
We used to call them ledgers when they were paper written on with ink, or more likely pencil. In the mid-70s I used to spend a goodly part of every Sunday tediously checking my week's business activity with a four function calculator. In the late 80s I acquired a Tandy 200 with Multiplan in ROM. I got my Sundays back!
-
Friday 26th August 2016 07:30 GMT Ken Moorhouse
Re: planning and forecasting
I agree if the data is not already there. As I implied before though however, if data when recorded is not "granular" enough, it is going to be a struggle to get useful information from a system. Relational databases tend to focus attention on a better discipline of design, which means a well-targeted query can home in on the precise information required, even if it is held in a multitude of disparate, yet relationally connected tables.
-
-