Google Oompa Loompas cloaking user agents?

Over 10 per cent of Google's internal machines are hiding their software makeup from the outside world, according to data collected by Net Applications, a web analytics outfit that captures user traffic on more than 40,000 sites across the net. When visiting webpages, the firm says, 11 to 13 per cent of internal Google …

COMMENTS

This topic is closed for new posts.
  1. Gavin McMenemy
    Stop

    So why is Net Applications seeing what it's seeing?

    What they want to see?

  2. J
    Joke

    Hm...

    They are using Windoze but feel ashamed of that (or afraid of being fired?), that's why...

  3. filey
    Thumb Down

    because...

    it sounds more menacing and scary to say they are cloaking

    total non-story

  4. Simon
    Black Helicopters

    Catching the bad guys

    There might be a non-sinister explanation. A lot of the work Google does is related to quality, so anyone on the anti-spam, AdWords quality, fraud detection or search quality teams could be behind it. At http://www.justlanded.com we have seen Google servers pretend to be different OSes and user agents from the same IP at the same time (see the log-scan sketch below). Yahoo do the same stuff, and from time to time we have to unblock their IPs when they use things like Wget to pull down thousands of pages (typical email-harvesting or scraper behaviour).

    People will see black helicopters everywhere... that's not to say they won't be launching a distro or some other MicrosoftWhack though...
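
    For illustration only, a small Python scan that flags IPs presenting many distinct User-Agent strings, the pattern described above. The file name "access.log" and the threshold are assumptions, and this is not justlanded.com's actual tooling:

        # Flag IPs that present an unusually large number of distinct
        # User-Agent strings in an Apache combined-format access log.
        import re
        from collections import defaultdict

        # Combined Log Format ends with: "referer" "user-agent"
        LINE = re.compile(r'^(\S+) .*"([^"]*)"$')

        def suspicious_ips(path, threshold=5):
            agents = defaultdict(set)
            with open(path) as log:
                for line in log:
                    match = LINE.match(line.rstrip())
                    if match:
                        ip, ua = match.groups()
                        agents[ip].add(ua)
            return {ip: uas for ip, uas in agents.items()
                    if len(uas) >= threshold}

        for ip, uas in suspicious_ips("access.log").items():
            print(ip, "sent", len(uas), "distinct user agents")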

  5. Charlie van Becelaere

    Perhaps

    the user agents are really there, it's just a bit too cloudy to see them properly?

    <getting coat>

  6. Henry Wertz Gold badge

    Just someone who is "paranoid"?

    I wonder if it's just someone who is "paranoid". You know, Flash off, no Java, no JavaScript, clear the cookies and browser history all the time. Some people at Google might just think "Ha! I'm going to clear the user-agent string too!" Or they're testing anonymization tools. Something like that. I just don't see Google hiding a new OS by clearing it all out entirely. Maybe it's even just an internal test build of Chrome that had a bug making it leave the user-agent blank 8-).

  7. joe_bruin
    Go

    Rampant speculation

    The most likely reason for blank user agents: the Googlers have decided that they want to encourage websites to be standards compliant instead of detecting the browser type and building a page for that one. This sounds pretty consistent with a company that has just released a minority browser platform.

  8. James
    Gates Horns

    An innocent proxy?

    As I recall, blanking or replacing the user-agent string is a standard feature of Squid (and presumably of other proxy servers as well); a toy version of such a proxy is sketched below. OpenBSD's pf firewall has a "modulate state" option, which does something similar at the TCP/IP level (randomising the initial sequence numbers, making it harder to identify the OS generating the traffic).

    If I wanted to hide my secret OS/browser, I'd have it report itself as something like a Subversion build of Firefox/Gecko running on WinXP - looking normal in logs, while having an obvious explanation for any odd behaviour server admins might notice (it's a work-in-progress version of an open source browser, of course it's not acting in exactly the same way as the last released version!).

    Blanking the user-agent, on the other hand, would make sense in two ways. First, a paranoid sysadmin wants as little information getting out as possible, so you blank the user-agent and probably have a firewall randomising parameters too. Second, it helps catch sites running spider-traps which serve up pages of link-spam to anything other than IE. (In fact, the comment about these 'appearing to be real people, not spider activity' could be exactly the point: comparing the pages seen by real people, and their proxy, to the pages served up to Googlebot.)

    Or option 3: they don't want the world knowing that for all the hype about using 'Goobuntu' and having their own web browser, 90% of their staff are still using IE 7 on XP!
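
    A toy Python version of such a user-agent-blanking forward proxy, purely illustrative (Squid does this with its header-access controls; nobody outside Google knows what they actually run, and the listen address is arbitrary):

        # Minimal forward proxy that blanks the User-Agent header before
        # passing requests upstream.
        import http.server
        import urllib.error
        import urllib.request

        DROP = {"user-agent", "host", "proxy-connection", "connection"}

        class UABlankingProxy(http.server.BaseHTTPRequestHandler):
            def do_GET(self):
                # For a forward proxy, self.path is the absolute URL requested.
                headers = {k: v for k, v in self.headers.items()
                           if k.lower() not in DROP}
                headers["User-Agent"] = ""  # blank, rather than forward or forge
                req = urllib.request.Request(self.path, headers=headers)
                try:
                    upstream = urllib.request.urlopen(req)
                except urllib.error.HTTPError as err:
                    upstream = err  # an HTTPError is itself a response object
                body = upstream.read()
                self.send_response(upstream.getcode())
                for k, v in upstream.headers.items():
                    if k.lower() not in ("transfer-encoding", "connection"):
                        self.send_header(k, v)
                self.end_headers()
                self.wfile.write(body)

        if __name__ == "__main__":
            http.server.HTTPServer(("127.0.0.1", 8080),
                                   UABlankingProxy).serve_forever()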

  9. Simon Painter
    Dead Vulture

    what a total load of fud

    There is no way that Google don't proxy outbound traffic, and knowing them they are not buying stuff off the shelf when they can make it themselves. It's eminently possible that they have their own proxy that doesn't put in a user agent.

    Where's the 'slow news day' icon?

  10. Anonymous Coward
    Alert

    when are they going to grow some

    and come up with a "Windoze"-compatible OS and put MS to the sword?

  11. Anonymous Coward
    Stop

    Surely they're just trying to catch cheaters?

    Some cheaters send one set of content if they see Googlebot in the UA string, and another if they see any other agent. So it makes a lot of sense to me if Google double-checks the Googlebot results by fetching the same URL again with a different UA string and seeing whether the same results come back (roughly as sketched below).
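
    A rough Python sketch of that double-fetch. The URL and UA strings are placeholders, and a real check would tolerate dynamic content rather than demand byte-identical bodies:

        # Fetch the same URL as Googlebot and with a blank UA, then compare.
        import urllib.request

        URL = "http://example.com/"  # placeholder

        def fetch(user_agent):
            req = urllib.request.Request(URL,
                                         headers={"User-Agent": user_agent})
            with urllib.request.urlopen(req) as resp:
                return resp.read()

        as_googlebot = fetch("Googlebot/2.1 (+http://www.google.com/bot.html)")
        as_blank = fetch("")

        print("possible cloaking" if as_googlebot != as_blank
              else "same content served")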

  12. Anonymous Coward
    Anonymous Coward

    Mobile optimisation?

    I've seen quite a few of these from Google, and reached the conclusion they were requests from mobile users passing through Google, where the pages are optimised for display on a phone. The name of the system escapes me this early in the morning, but there was a lot of discussion of it some time ago on webmasterworld etc.

  13. Ken Hagan Gold badge
    Happy

    Re: Rampant speculation

    "The most likely reason for blank user agents: the Googlers have decided that they want to encourage websites to be standards compliant instead of detecting the browser type and building a page for that one."

    My thoughts entirely. Perhaps the world would be a better place if *everyone* did that and the broken sites discovered that they weren't getting customers anymore.

  14. Jon
    Thumb Down

    ng ng ng...

    They're using a proxy, you idiot.

  15. ryan
    Black Helicopters

    @James

    or that even Google aren't using Chrome...

  16. Anonymous Coward
    Anonymous Coward

    Let the people browse

    I see Google non-spiders come and look at some of my sites; I sort of expect people at Google to use the web themselves. And yes, the headers are fairly stripped, and there are some blank ones, but we can all do that if we like.

    The reason is the same as for the no_exist-google87048704 requests: they are seeing what the site returns to a browser with no user agent. Or maybe some of them just don't wish to mention the user agent; there is no law requiring it :)

  17. Anonymous Coward
    Boffin

    use of User-Agent

    "A browser user agent not only identifies the browser a machine is using, but also its operating system."

    Not necessarily. RFC 2616 only specifies that "User agents SHOULD include this field with requests. The field can contain multiple product tokens and comments identifying the agent and any subproducts which form a significant part of the user agent." (§14.43) The operating system is not necessarily a significant part of the user agent.

    As several people have mentioned, Google may be testing how sites behave when they are not given a known User-Agent header. Sending an empty string instead of forging one (e.g. "fake user-agent") could be an attempt to make the requests less noticeable in logs. They're clearly up to something, and trying not to leak any information about it (the sketch below shows the difference between omitting the header and sending it empty).
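
    A quick sketch using Python's http.client (which, unlike urllib, adds no default User-Agent of its own); the host is a placeholder:

        # Three cases: header omitted entirely, sent empty, and a forged token.
        import http.client

        HOST = "example.com"  # placeholder

        def fetch(user_agent=None):
            conn = http.client.HTTPConnection(HOST)
            headers = {} if user_agent is None else {"User-Agent": user_agent}
            conn.request("GET", "/", headers=headers)
            resp = conn.getresponse()
            status, size = resp.status, len(resp.read())
            conn.close()
            return status, size

        print(fetch())                     # no User-Agent header at all
        print(fetch(""))                   # User-Agent sent but empty, as observed
        print(fetch("NotARealAgent/1.0"))  # an obviously forged token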

  18. charles uchu

    maybe these oompas are their new "consensus engine"

    So, mysterious blank user agents

    Coinciding with some human ranking component to Google search results

    Makes for a nice little no-useragent widget of googoompa-loompa desktops that has them clicking happily all day, voting for search results, and improving their chocolate... I mean search results.

    Hmmmm...
