Re: Pass the salt...
Hi, I'm JV, the paper author.
The methodology is described in detail in the paper, but the CliffsNotes version is:
1.) The process is no different from how a normal login works: only hashes are stored in the database, so whatever you type is hashed first and then compared against the stored hash in the authentication database. This is why a modern login system never really needs to handle a plaintext password, other than to hash it before the actual comparison.
2.) In our case specifically, since we control the internal systems, whenever a user submits a password to log in, we hash it with SHA-1, the format Troy Hunt's list uses. That SHA-1 hash is then compared against Troy Hunt's list. If we get a match, we know it's a breached/compromised password, although we still don't know what the password itself was. To make sure we handle all data ethically, we never store plaintext passwords within the research infrastructure. We don't store usernames either: the usernames themselves are hashed too, so that even the user identities are anonymized within the research databases (yes, despite not storing passwords in the first place).
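As a rough illustration of the two steps above (not the code used in the paper), a minimal Python sketch might look like the following. The function names, the in-memory set standing in for the breached-hash list, and the keyed HMAC-SHA-256 used for usernames are all assumptions made for the example; the only details taken from the description are that passwords are hashed with SHA-1 before being compared to the list and that usernames go through some form of hashing.

```python
import hashlib
import hmac


def sha1_hex(password: str) -> str:
    """Return the uppercase hex SHA-1 digest of the UTF-8 encoded password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def is_breached(password: str, breached_hashes: set) -> bool:
    """Check the password's SHA-1 against a set of known-breached SHA-1 hashes.

    Only the hash is compared; the plaintext is never stored.
    """
    return sha1_hex(password) in breached_hashes


def pseudonymize_username(username: str, secret_key: bytes) -> str:
    """Hypothetical username anonymization: keyed HMAC-SHA-256.

    The keyed hash is an assumption for this sketch, not the paper's method.
    """
    return hmac.new(secret_key, username.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    # Stand-in for the breached-password list: a tiny in-memory set of hashes.
    breached = {sha1_hex("password"), sha1_hex("123456")}

    key = b"research-only-secret"  # hypothetical key, kept outside the database
    record = {
        "user": pseudonymize_username("alice", key),
        "breached": is_breached("password", breached),
    }
    print(record)
```

Note that the record that ends up in the research database holds only a hashed user identifier and a yes/no flag, which matches the point that neither plaintext passwords nor usernames need to be stored.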
In general, what we did was really no different from what the June 2017 NIST guidelines mandate. We just figured that the process would produce interesting information, so we looked at what data could be uncovered from it.