Everyone should read this...
... https://www.amazon.co.uk/How-Lie-Statistics-Penguin-Business/dp/0140136290/ref=sr_1_1?ie=UTF8&qid=1487625972&sr=8-1&keywords=how+to+lie+with+statistics
How could El Reg ignore this? – two University of Washington professors have assembled a course to teach students to identify bullshit. Biology professor Carl Bergstrom and assistant professor at the university's Information School Jevin West have devised the “Calling Bullshit in the Age of Big Data” course as their small …
It's not about lying with statistics. It's about how statistics can lie.
The first is about how you present data to support your opinion. The second is about how there can be false correlations in data that look significant, but aren't.
We arrived here because of the data-discovery nonsense of just rummaging about looking for insight, rather than starting with a hypothesis and using the data to prove or disprove it.
Don't get me started on significance and degrees of freedom...
It would be well worthwhile to make Marketing, Statistics and Economics compulsory study topics for high school kiddies so they could tell when they were being lied to.
Unfortunately that's exactly why it'll never happen: too many politicians have too much skin in the game to ever allow it.
Grumpy wrote:
"worthwhile to make Marketing, Statistics and Economics compulsory study topics for high school kiddies so they could tell when they were being lied to."
You do have to start somewhere, but I think an honest BS Detection course appearing in early primary school curricula is something to strive for.
> You do have to start somewhere, but I think an honest BS Detection course appearing in early primary school curricula is something to strive for.
I had exactly that, when I was 11-12 or so. That year we were living in a third world country ruled by a ruthless dictatorship. Yet we had a critical reading course where we were taught *how* to read journalistic discourse, distinguish between fact and opinion, identify bias, etc.
And no, this wasn't some fancy elite school. My dad, always the leftist, decided that if public school was good enough for the locals, it was good enough for me. He wasn't wrong, for once.
"It would be well worthwhile to make Marketing, Statistics and Economics compulsory study topics for high school kiddies so they could tell when they were being lied to."
Many people are fooled every day by simple percentages and that's been part of school education for a long, long time and at quite an early stage of the education process.
"CANCER RISK UP BY 600%!!!!!" type stories are common in the newspapers and it seems that many readers believe them. Technically, they are true, of course, but without the actual numbers they're meaningless.
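To put numbers on that: a minimal sketch of why a relative increase means little without the baseline. The two baseline risks below are made up for illustration and are not taken from any real study.

```python
def absolute_risk_change(baseline_risk, relative_increase_pct):
    """Return (new_risk, absolute_increase) for a given relative increase in percent."""
    new_risk = baseline_risk * (1 + relative_increase_pct / 100)
    return new_risk, new_risk - baseline_risk

# Hypothetical baselines: 1 in a million versus 1 in 100.
for baseline in (1e-6, 1e-2):
    new_risk, extra = absolute_risk_change(baseline, 600)
    # "Up by 600%" means seven times the baseline in both cases,
    # but the number of extra cases differs enormously.
    print(f"baseline {baseline:.4%} -> {new_risk:.4%}, "
          f"extra cases per 100,000 people: {extra * 100_000:.1f}")
```

Same headline, wildly different consequences: under these assumed baselines the "600%" is either well under one extra case per 100,000 people or several thousand.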
Ah, well that's because they aren't being taught the right thing. The school teaches them "percentages" and then stops. It has to go on to teach "how to lie with percentages".
In the UK, teenagers are almost taught the right things in English. There's a bit in the GCSE course about "persuasive writing" (see how they bottled it?) which is almost "how to lie with words". This is a good start and needs to be encouraged.
They are also almost taught the right thing about IT. All that stuff about "online safety" clearly *implies* that there are people out there who will steal all your personal data, lie about their age, trick you into sex, etc... but I fear that once again this isn't quite explicit enough.
Until you teach "how to be the bad guy *yourself*", you can't really expect people to be able to defend themselves properly against other people doing the same things. It's learning by doing rather than learning by listening to some boring adult droning on about it.
Someone decided to teach critical thinking at the college level? Inspired by perhaps a visit to the nearby MS campus?
On the non-snarky side... this is a good thing. Maybe some of the students once they graduate can help us old cusses show others how to see through the marketing BS... like IoS.
I don't think numbers lie, unless I've taken a wrong turn somewhere and ended up in Bizarro something.
Is it not the good folks presenting the numbers, for nefarious ends, who lie?
Still, spotting a bullshitter and bull shit related methods must help and have productive outcomes for society, especially when applied to the political vomit-sphere.
All power to you.
Now if we can just get some practical skills into secondary schools, like how to do your taxes and balance a credit card, alongside running a household, we'll be on to a winner.
> I don't think numbers lie, unless I've taken a wrong turn somewhere and ended up in Bizarro something.
> Is it not the good folks presenting the numbers, for nefarious ends, who lie?
Sometimes the numbers do lie all on their own. See The Deluge of Spurious Correlations in Big Data (pdf) by Cristian S. Calude and Giuseppe Longo, where the abstract, in part, reads:
For example, we prove that very large databases have to contain arbitrary correlations. These correlations appear only due to the size, not the nature, of data. They can be found in “randomly” generated, large enough databases...
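A rough way to see the effect for yourself. This is only an illustration with random numbers, not the argument the paper makes, and the row and column counts are arbitrary choices. Every column is independent noise, yet the strongest pairwise correlation found can only go up as more columns are searched.

```python
import random
from itertools import combinations

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def strongest_correlation(columns):
    """Largest absolute correlation over every pair of columns."""
    return max(abs(pearson(a, b)) for a, b in combinations(columns, 2))

rng = random.Random(0)          # fixed seed so the run is repeatable
ROWS, MAX_COLUMNS = 20, 100     # arbitrary sizes for the demonstration
columns = [[rng.gauss(0, 1) for _ in range(ROWS)] for _ in range(MAX_COLUMNS)]

# Search the first 5, 20 and 100 columns of the same pure-noise table.
results = {k: strongest_correlation(columns[:k]) for k in (5, 20, 100)}
for k, r in results.items():
    pairs = k * (k - 1) // 2
    print(f"{k:3d} columns ({pairs:4d} pairs): strongest |r| = {r:.2f}")
```

Nothing in that table is related to anything else, so whatever "best" correlation turns up is purely a product of how many pairs were examined.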
Sad to see that I'm now so old that stuff I learnt at Uni is becoming fashionable again! My degree in Actuarial Science featured a unit on Medical Statistics. We looked at heart attacks by month of birth. I can't remember which month it was, but one month had significantly higher death rates. It was easy to create a narrative to explain why (the time of year of conception), but it was just a quirk in a very large data set.
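One common way a quirk like that arises is multiple comparisons: test twelve months separately and one of them "winning" is not much of a surprise. A back-of-envelope sketch, assuming twelve independent tests at the usual 5% level (the original analysis may well have been set up differently):

```python
def chance_of_false_alarm(num_tests, alpha=0.05):
    """Probability that at least one of num_tests independent tests
    looks significant at level alpha when there is no real effect."""
    return 1 - (1 - alpha) ** num_tests

def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test significance level that keeps the overall false-alarm rate near alpha."""
    return alpha / num_tests

months = 12
print(f"Chance at least one month looks 'significant': {chance_of_false_alarm(months):.1%}")
print(f"Bonferroni-corrected per-month threshold: {bonferroni_threshold(months):.4f}")
```

Under those assumptions there is close to an even chance of finding a "significant" month in data where month of birth makes no difference at all.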
>Tugging strongly on The Register's heart-strings, they call out some of the world's richest known sources of bullshit: “the TED talk you watched last night … the latest New York Times or Washington Post article fawning over some startup's big data analytics”...
>We almost weep with joy to see them ask “Can you tell when a clinical trial … is trustworthy, and when it is just a veiled press release for some big pharma company?”
Actually, that was the subject of a series of TED talks, "Bad Pharma" where they look at how clinical trials are manipulated to the point of being "less than optimal."
There are a couple of issues. The first is morality. There is a difference between showing off your thing in a good light and completely manipulating the message for the purposes of misrepresentation. We seem to have a distinct lack of morality in our vendors and a surprising willingness on the part of customers to tolerate it rather than blacklisting them.
The second issue is the models. The problem with stats is that you have so many variable factors that your models can easily fail to reflect reality. There are also so many shades of meaning attached to data before it goes into the lake that get lost by the time the data comes out.
Big Data is what big companies do when they can't be bothered to get to know their customers/internal management and their customers' problems. There is certainly a place for stats and multi-discipline analysis, but I'm not sure that throwing it all together into chaos and hoping order comes out is a model worth banking on. Even matching up things like network traffic and latency issues can be a non-trivial exercise.
The Reg says:
> “the TED talk you watched last night … the latest New York Times or Washington Post article fawning over some startup's big data analytics”.
> “Can you tell when a clinical trial … is trustworthy, and when it is just a veiled press release for some big pharma company?”
It should have been called "How not to invest in Theranos".
That course will render most of Microsoft's marketing strategy pointless. If you need any sample BS material, get hold of any Microsoft sales presentation - the higher up the target, the easier it is to identify the misuse of statistics.
Further sample material can be obtained from the Donald Trump press team. Just ask Sean Spicer or, for material that is out of date or otherwise no longer current, Kellyanne Conway...
:)
I fired a doctor because he couldn't tell me the difference between absolute risk and relative risk.
One of my favorite sources to send someone to about the massively nasty way people misuse statistics to fool others into thinking they're telling the truth is a YouTube video by Tom Naughton, creator of the documentary "Fat Head", which debunks the documentary "Super Size Me". The video is called "Science for Smart People". It's worth a look. https://youtu.be/y1RXvBveht0
When I was at university, but, you know, after they switched the language of instruction from Latin to English, there was a department called "Philosophy". Although I was a computer science major, the classes in rhetoric and epistemology were among the most beneficial. Those and the "intro to logic" (not circuits, but syllogisms, and a splash of statistics) taught in the philosophy department by a professor who also taught my calculus class.
Of course, by then I had been marinating in television, radio, and newspaper advertising long enough to be familiar with the concepts, if not the terms.
Book recommendation: besides "How to lie with statistics", I strongly suggest the books by Tufte on data visualization. Once you can recognize "chartjunk", you have a real leg up on BS detection.