35m Google Profiles dumped into private database

Proving that information posted online is indelible and trivial to mine, an academic researcher has dumped names, email addresses and biographical information made available in 35 million Google Profiles into a massive database that took just one month to assemble. University of Amsterdam Ph.D. student Matthijs R. Koot said he …


This topic is closed for new posts.
  1. Mikel

    Quite the point of the thing, what?

    They used to distribute these bulky books with people's names and phone numbers - sometimes even their physical addresses. Just printed in a book for all to see: and they gave the book to everybody who would take one. Millions of trees were pulped to make this possible.

    The point of publishing information online is, well, to put information online. The fact that this fine article - and even this comment - is out there on the Interwebs all indexed and indelible isn't destroying the author's privacy at all. Putting stuff online is what these services are FOR.

    1. Anonymous Coward

      "Think of the trees"

      and ignore the elephant in the room.

      Question is, why is Google so laboriously compiling all this information? This is not White Pages stuff. Were personal photo albums ever in the White Pages? Slideshare presentations? Twitter posts?

      Think how easy it is now to target people who don't agree with something you like.

      How easy it is for spammers to target Gmail addresses with a name and lots of info.

      Most people don't even know Google is keeping this about them.

      WHY are they doing this? This is their battle against Facebook, a service people actually have to sign up for. Google can't get to that data and that is driving them mad, there can't be ANY information out there that Google can't turn into money!

      So instead they create their own service and sign everyone they know up automatically.

    2. KitD


      In the 21C, there is little real value in an index that has to be perused by human eye/fingers. Machine searching & indexing changes the range of potential (ab)uses of "private" information such that misuse can be done by a hacker in a few minutes and affect millions of people. Unthinkable in the days of phone books.

      1. Danny 14


        This stuff is only compiled because people are stupid enough to put it there in the first place.

    3. Anteaus

      Go... as in go read.

  2. David Hicks

    Can we all agree now?

    That if you put information online in a way that it's accessible to anybody else at all, you've made it public?

    And that even if you haven't made it accessible to anyone else, you've probably made it public anyway?

    And that if you don't want information to get into the hands of hackers/your boss/your mum, it's best not to publish it in the first place?

    1. Anonymous Coward

      That is far too sensible an outlook

      ...for most users of Web2.0 services!

  3. Eugene Crosser


    Isn't this the responsible behavior - to make the information that the user wanted to be public really public? I.e. easily searchable and aggregatable by any member of the public?

    As opposed to only by the owner of the system, for the purpose of their own profit?

  4. J 3

    whether the scraping violates the company's TOS

    Yeah, shoot the messenger... As if scammers and other such low-lifes ever read, let alone respected, anyone's terms of service.

    1. Anonymous Coward


      Google makes its living by scraping and data mining the internet on an epic scale on a daily basis. It would be slightly hypocritical of them to stop others doing the same to their sites.

      Google profiles are private unless you consciously make them public. If you don't want people to see it or scrape it why make it live?!

      1. Anonymous Coward

        Google profiles ARE public

        There are no private profiles in Google; you can only choose whether they can be searched (e.g. by name) or not.

        If "you've been writing reviews on Google Maps, posting buzz on Google Buzz, creating articles on Google Knol, sharing Google Reader items, or adding books to your Google Book Search library" you'll have a profile already.

        Even if you can't search for them, if you have the user's Gmail account the profile is pretty easy to find.

        1. Kay Burley ate my hamster

          I don't have one

          I use a large number of Google services, but I don't have a Google Profile. These 35M users have the option of giving a shit about their online data like I do: they can read before they click, and they can secure their online data.

          As other people have said, if you enter your information into a website you should expect it to be on a website, or check how to make it secure.

  5. Anonymous Coward


    It 'might' violate Google's 'terms of service' and would breach 'Facebook's policies'.

    Phew, I thought there was nothing to stop people doing this for a minute...but the threat of breaching somebody's policy or terms of service means we can all sleep safe again - after all, the threat of never being able to use Google or Facebook again (without changing your IP or username) is more than enough to put nasty people off the idea...isn't it?

    Mind you, it won't be long before somebody tries to make it a criminal offence to breach terms of service...then we will have corporations making laws....oh, hang on a minute...they already do (but it's called lobbying at the moment).

    Seriously, am I bovvered? (Nah, not daft enough to have my info up there in the first place.)

  6. Shannon Jacobs

    EVIL is the keyword for Google these days

    I used to believe that Google was sincerely trying to avoid being evil. However, now I believe they are just going with the flow, and unfortunately the flow of the American legal system is to make companies evil. Anything less, and your shareholders (or other parties) are apparently entitled to sue you to death.

    Who to blame? Not really Google or even the lawyers. I say it's the crooked businessmen who bribe the professional politicians to write the laws. There are plenty of honest and moral businessmen out there, but they are NOT the ones who are donating to the political campaigns or high-pressure PACs. It's the crooked businessmen who want to (1) legitimize their previously shady deals or (2) create new shady regions. Unfortunately, it's also a stable situation in the worst way. The more money they get by 'investing' in these evil methods, the more money they can get to buy more politicians.

    It isn't actually a crime to be a statesman in America. However, you will only get elected by accident and the big money will drive you out of office before you can really change anything for the better.

    In conclusion, Google is just playing the game. Too bad for us that the rules of the game call for more and more EVIL.

    1. Anonymous Coward

      Crooked businessmen

      So you mean like Eric Schmidt and many Google execs (who are now professional politicians too)

      Plenty of crooks at Google's address. No need to bring in external parties to blame for their evil.

  7. Mark 65

    Terms of Service

    "A Google spokesman said he was exploring whether the scraping violates the company's terms of service."

    Like anyone gives a shit - it's not like a criminal would pass up the opportunity just because of a terms of service breach. What a twat.

  8. MarkieMark1


    I instinctively checked my google/buzz profile, settings, etcetera, while re-reading the article, then noticed that, well, in fact the information I had put there, as well as the searchability setting, had been for professional reasons, so no need to make it more private at all :-D

  9. supervan

    That paper deserves an 'F'

    We should be shocked that public information is public?

  10. multipharious

    Legality for Google

    Without an explicit opt-in (not the implicit kind these cavalier post-privacy advocate posters mean), it violates German Datenschutz laws to expose this type of data. Have at 'em boys.

    And for f%ck's sake Google, it's a bit of internet public policy called a robots.txt. It matters little (I read the script) but it at least shows a policy decision and is accepted protocol.

  11. petur

    Shannon Jacobs FAIL

    Not only is your post completely OT, it seems you even completely missed what TFA was all about.

    Somebody managed to write a script that scrapes *PUBLIC* profiles, and didn't realize that posting this as security research made him look like a complete idiot.

    Nothing to see here, move along...

    To counter your pointless negative post, I shall now mention the fact that Google Summer of Code is underway again, an initiative where Google pays students to write open source. This goes on the whole summer, as if they had a real summertime job. Surely there must be something evil in that too...

    1. Anonymous Coward

      Shill FAIL

      In other news BP just gave candy to school children last week.

      Surely they are not that evil for spilling a few million barrels of oil?

  12. Robert E A Harvey

    Knock Knock

    I'm waiting for the news that the professor is subject to a McKinnon-style extradition order to face trial in the USA, probably under the DMCA.

    Making an American company look like pillocks is a capital offence, surely?

  13. Anonymous Coward

    tbh, tl;dr

    Did he just crawl the sitemaps?

    wow, impressive ... erm... wait... no...

  14. Anonymous Coward

    Do people really not understand this?

    Let me make it clear. Quoted text is from Google themselves:

    "If you've been writing reviews on Google Maps, posting buzz on Google Buzz, creating articles on Google Knol, sharing Google Reader items, or adding books to your Google Book Search library, you may already have a profile."

    WHY? I share Google Reader items with friends, why do I need a public profile created automatically?

    "At a minimum, your first name, last name, and photo will be public on the Internet. You can then provide a variety of additional information about yourself in your profile"

    At a minimum?!?

    Then they link YOUR profile to direct AND indirect connections based on YOUR use of their products. The famous Social Circle:

    There is no point to any of this outside of Google's desire to become the OPT-OUT, default, Facebook of the Internet.

    It's madness and it needs to stop before it ends in tears; they've had ample warnings already. Really, inform yourselves and see how you can still condone their current behaviour.

  15. Tony Barnes

    Agree with the above sentiments - it's public information

    This is not a security problem in any way, shape, or form. He collected information that people have happily allowed to be public. If you check out what your Google profile says about you, it will likely, like mine, say basically nothing bar my name. If you've added other information to it, there is probably a reason you have - as above for your work being one of them.

    Though it is interesting that he managed to grab 35GB of info this way, I'm not sure that anyone in their right mind would have tried to put up security to prevent someone accessing public information? Sounds like a waste of time to me...

    1. sabroni

      look where your post is though..

      ...directly below one explaining how use of google services can unwittingly result in you creating a profile whether you intend to or not.

      Yeah, it's in the Ts&Cs somewhere for the service you used, so strictly speaking you gave permission for this, but to clearly state "this is not a security issue" seems a little premature.

  16. Doug Glass

    Boy am I glad ...

    ...I'm somebody else.

    1. mitch 2

      No- I'm somebody else!

      The post is required, and must contain letters.

  17. peddley

    I agree, it's public information

    I agree that if the information is marked as being publicly available then it's right that it should be indexed.

    I think as with most things, what is needed is better user education, perhaps something that says:

    Tick this Box if you don't want a spammer or scammer to include your details in a massive database of 35M records.

  18. Anonymous Coward

    Low marks for both the study and the article

    While there are issues around this topic, I feel that this article isn't very good.

    The information on Google is a public profile and is explicitly public. I have no public profile; perhaps Google needs to make it clearer to some users that the profile data is all fully public.

    The information on FB is controlled by settings that have been repeatedly reported as difficult to understand and manage; data that users do not intend to be public becomes public.

    These are both fruit, but one is like an apple and the other is like a banana.

  19. Anonymous Coward

    Wait a tick

    You expect people to actually read a EULA ???? What a concept.

  20. NoneSuch

    Just checked my info

    and it is all there for all to see.


    Sean Massive Liar Esq.
