Just sharing some details from the paper
Hello, Register community.
I'm JV Roig, the paper author. I just spent the past two days answering some of the more interesting comments, and I haven't really had the time (until now) to actually make a post that I wanted - a post that shares some interesting things about the paper itself and how it came about.
First, let me go ahead and share what I believe are the actually interesting findings (which the article does not share). This is the table of results when users with breached passwords are categorized per GPA tier:
GPA tier | total students here | students using breached passwords | % of students here with breached passwords
3.5 – 4.00 | 39 | 5 | 12.82%
3.0 – 3.49 | 203 | 32 | 15.76%
2.5 – 2.99 | 446 | 66 | 14.80%
2.0 – 2.49 | 464 | 92 | 19.83%
1.5 – 1.99 | 99 | 19 | 19.19%
1.0 – 1.49 | 1 | 1 | 100.00%*
*The single student who had a lower than 1.5 GPA also happened to use an unsafe password. At least he/she is consistent, eh?
This table shows students sorted according to tiers of GPA. The top GPA tier is pretty low at 12.82%, and the rate gets up to over 19% near the bottom. The rise isn't perfectly steady (the 2.5 – 2.99 tier sits slightly below the tier above it), but you can clearly see that the trend goes up.
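For anyone who wants to re-check the percentages, here is a minimal Python sketch that recomputes them from the counts in the table. The variable and function names are just mine for illustration; this is not code from the paper.

```python
# Counts copied from the table above:
# (GPA tier, total students in tier, students using breached passwords)
tiers = [
    ("3.5 - 4.00", 39, 5),
    ("3.0 - 3.49", 203, 32),
    ("2.5 - 2.99", 446, 66),
    ("2.0 - 2.49", 464, 92),
    ("1.5 - 1.99", 99, 19),
    ("1.0 - 1.49", 1, 1),
]


def breached_rate(total, breached):
    """Percentage of students in a tier who use a breached password."""
    return 100.0 * breached / total


# Per-tier rate, rounded to two decimals like the table.
rates = {tier: round(breached_rate(total, breached), 2)
         for tier, total, breached in tiers}

for tier, total, breached in tiers:
    print(f"{tier} | {total} | {breached} | {rates[tier]:.2f}%")
```

The rounded values should match the last column of the table.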
I wish the article itself showed this (instead of the data snippet that ended up being featured), but Tom Claburn (due to the timezone difference with me) did not have enough time to reach me before needing to publish. I woke up with this article already done and published, so all I could do was join the forum for some interesting discussions.
****** ABOUT THE EXPERIMENT ***********
This experiment took us over 3 months to finish data collection, since we basically had to let the midterms and finals periods conclude. Those are the peak seasons of system activity (professors submitting grades, students checking grades online, etc), and so most users are captured by our login hooks during those periods.
***** WHAT I ORIGINALLY THOUGHT THE DATA WOULD SAY **********
Another thing I'd like to share is that well before I got data to crunch, I already thought I knew how this would turn out - with absolutely no difference between the highest-GPA group and the lowest-GPA group. In fact, my abstract then (I prepared in advance because I thought I knew what the data would say) concluded with:
"Correlating these with academic performance data from each student’s grade history, the researchers found no relationship between a student’s academic performance and the likelihood of using a weak, compromised password. Our results suggest that weak passwords aren’t because of any level of intelligence, but simple disinterest or lack of awareness. Relevant password policies should be formulated taking this into account."
Once the data was crunched, I had to revise the abstract as necessary.
******** GOING FORWARD *************
I still believe, however, that this result is a fluke due to how many fewer students are in the top GPA tier, and the overall low population of students sampled. There's also concern that the metric used here is too blunt (all these concerns are noted in the paper's conclusion).
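To give a rough feel for the small-sample concern, here is a sketch that puts an approximate 95% Wilson score interval around two of the tiers from the table (the 39-student top tier and the 464-student 2.0 – 2.49 tier). To be clear, this is an illustration I'm adding here, not an analysis from the paper, and the choice of the Wilson interval and of z = 1.96 is an assumption on my part.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate Wilson score interval for a proportion.

    z = 1.96 corresponds to roughly 95% confidence (an assumption for
    this illustration, not something taken from the paper).
    """
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Counts taken from the table: (breached, total) per tier.
top = wilson_interval(5, 39)      # 3.5 - 4.00 tier
lower = wilson_interval(92, 464)  # 2.0 - 2.49 tier

print(f"3.5 - 4.00: {top[0]:.1%} to {top[1]:.1%}")
print(f"2.0 - 2.49: {lower[0]:.1%} to {lower[1]:.1%}")
```

With only 39 students, the top tier's interval comes out much wider than the one for the 464-student tier, and the two intervals overlap, which fits the caution above.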
A far bigger sample of students (from our sister school) will shed more light on it, which is already a planned follow-up study. In the end, repeated experiments and studies (across more institutions) would likely converge on my original (planned) conclusion - the reasons for weak passwords are more psychological than intellectual.
But, until those studies come in, I really had no choice but to describe the data as it arrived. There's a small difference found here, but this really should be taken as more of a curiosity and a first step in what should be a series of more experiments.