Have I Been Pwned to go open source – 10bn credentials, not so much, says creator Hunt

Credential breach website Have I Been Pwned (HIBP) will be going open source, site creator and maintainer Troy Hunt has told the world. The site, at the time of writing, hosts details of roughly 10 billion hacked accounts from 473 separate websites. You input your email address and HIBP tells you whether or not the address …

  1. Anonymous Coward

    In fairness, it's not exactly the most complex codebase in the universe. You got a list of compromised credentials A. A list of people subscribed to tell you when their email appears in a list of compromised credentials B. When new intersections of A and B appear, send an email saying 'you've been pwned'. If it's more complex than that, he's doing something majorly wrong.
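
    The "intersection of A and B" idea above can be sketched in a few lines. This is only an illustration of the set logic the comment describes, not HIBP's actual code; the function names, the `already_notified` bookkeeping and the lower-casing of addresses are all assumptions made for the example.

```python
def normalise(email: str) -> str:
    """Assumed normalisation: trim whitespace and ignore case."""
    return email.strip().lower()


def newly_pwned(breached, subscribers, already_notified):
    """Return subscribed addresses that appear in the breach data
    and have not been emailed yet (hypothetical helper)."""
    a = {normalise(e) for e in breached}        # list A: compromised credentials
    b = {normalise(e) for e in subscribers}     # list B: people who asked to be told
    done = {normalise(e) for e in already_notified}
    return (a & b) - done                       # new intersections only


breached = ["Alice@example.com", "bob@example.com"]
subscribers = ["bob@example.com ", "carol@example.com"]

first_run = newly_pwned(breached, subscribers, already_notified=[])
second_run = newly_pwned(breached, subscribers, already_notified=first_run)

for address in sorted(first_run):
    print(f"you've been pwned: {address}")
```

    The second call returns nothing because the only match has already been notified, which is the "when new intersections appear" part of the description.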

    1. EvilGardenGnome

      At the core, sure, but he has other systems tied into it (password checker and API, for instance). I imagine it's not too complex, but the dust that collects on a personal project is high, as is the anxiety/embarrassment of showing off your dirty underwear. I personally understand wanting to have trusted people look first.

      * cautiously eyes own private repos *

    2. grizewald

      Fair comment. I also don't see why the sale needs to include any stolen credentials in a useful form. Hash the mail addresses and only publish the hashes. It wouldn't change the site's ability to tell you if you've been pwned or not.

      Finding anyone trustworthy and dedicated enough to run, maintain and, most importantly, update the site that Troy created so that it retains its reputation is probably the hardest part of trying to pass it on.

      I'd say that some things people create on the Internet are much like children: once you have given them life, you have an inherent responsibility for them. This responsibility may include giving them accommodation at a hotel (your house) for many more years than you may have expected!

      The only hope I'd see for the site is if a truly independent non-profit organisation with the right competence and drive offered to take it over. EFF comes to mind as they already publish quite a few tools to help people avoid some of the more common dangers on the Internet. This kind of resource should be right up their street.

      1. KorndogDev

        no no no

        "Hash the mail addresses and only publish the hashes"

        NO. Such hashes would be broken in hours. New video cards can generate billions of hash values per second. And email addresses are NOT built from completely random characters, which makes the whole process much easier. Simply brute forcing them with some not-so-clever rules (e.g. string must end with '@gmail.com') is a task for a high school student.
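
        A toy version of the rule-based guessing described above, assuming the published values are plain unsalted SHA-1 digests of lower-cased addresses (the thread does not name a hash function, so that choice is an assumption, as are the tiny candidate lists). A real attack would use GPU tooling and far larger candidate sets; this only shows why structured inputs are easy to enumerate.

```python
import hashlib
from itertools import product


def email_hash(email: str) -> str:
    """Unsalted SHA-1 of the lower-cased address (assumed scheme)."""
    return hashlib.sha1(email.lower().encode("utf-8")).hexdigest()


def recover(published_hashes, local_parts, domains):
    """Hash every local@domain candidate and report those that match."""
    targets = set(published_hashes)
    found = {}
    for local, domain in product(local_parts, domains):
        candidate = f"{local}@{domain}"
        digest = email_hash(candidate)
        if digest in targets:
            found[digest] = candidate
    return found


# The "published" list contains only hashes, no addresses.
published = [email_hash("jsmith@gmail.com")]

# Hypothetical guessing rules: common local parts, a few big providers.
local_parts = ["admin", "jsmith", "john.smith", "info"]
domains = ["gmail.com", "hotmail.com", "example.org"]

recovered = recover(published, local_parts, domains)
print(recovered)
```

        With 4 local parts and 3 domains the search is only 12 hashes; the point of the comment is that even realistic candidate spaces remain small compared with what modern hardware can hash per second.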

      2. Charlie Clark Silver badge

        There's no real need to conceal the e-mail addresses as these are already publicly available.

        But the database is not the code and there is no need to make it available with it – there is no benefit and it's probably significantly larger.

        1. disgustedoftunbridgewells Silver badge

          The database is the valuable bit.

    3. Bitsminer

      If it's more complex than that...

      Troy has published several blog posts outlining how he has taken full advantage of numerous content-delivery-networks to reduce the cost of supporting millions of lookups per hour on this database. The intersection of A and B is conceptually easy, making it cheap is not.

      Also, data quality has been a big time consumer; the data dumps stolen and provided by hackers never meet the expected standard. Pikers.

      I expect the governance model to take some time to get right, and he and co-contributors should plan for at least one big failure event.

  2. Mark192 Bronze badge

    What a nice guy <-- massive understatement

  3. overunder Silver badge

    Huh, does it log searches?

    If you search for your email and that search is logged, the logs become $$$.

    1. chivo243 Silver badge

      Re: Huh, does it log searches?

      Even with the best of intentions from the site owners, I always wondered about entering your personal info on a site like this or like this nonsense:

      https://www.theregister.com/2020/07/30/genderify_shuts_down/

      1. 142

        Re: Huh, does it log searches?

        > I always wondered about entering your personal info on a site like this

        It does require trust, but I think you can usually tell by how they talk about the potential issues. This guy's always been open about that worry, and it's always been clear he actually understands people's concerns in that respect.

        HIBP's policy:

        > When you search for an email address

        > Searching for an email address only ever retrieves the address from storage then returns it in the response, the searched address is never explicitly stored anywhere. See the Logging section below for situations in which it may be implicitly stored.

        > Logging

        > Only the bare minimum logs required to keep the service operational and combat malicious activity are stored. This includes transient web server logs, logging of unhandled exceptions using Raygun, Google Analytics to assess usage patterns and Application Insights for performance metrics. These logs may include information entered into a form by the user, browser headers such as the user agent string and in some cases, the user's IP address.

        Ok, you still have to trust him that's true, but I've met plenty of people who would gleefully hoard people's data, and they'd never in a million years phrase their lies like that.

        It's the ones who talk vaguely, or dismiss concerns outright, that I'm wary of. I don't even have to go looking into that Genderify outfit to know their privacy statements would have been meaningless waffle, exaggerated promises, or doublespeak...

        1. EnviableOne Silver badge

          Re: Huh, does it log searches?

          Troy said he came to realise that a large percentage of the value in HIBP was the trust the community put in him specifically.

          The detail he puts into his blog posts about every change he makes, explaining why he has made each decision, and the fact he is still running this in his spare time on a not-for-profit-like basis combine to increase that level of trust.


Biting the hand that feeds IT © 1998–2020