Works pretty well for me:
Likelihood of you being FEMALE is 0%
Likelihood of you being MALE is 100%
One of the problems that's plagued netizens since the inception of the world wide web is that their browsers have a habit of leaking every site they've visited in the recent past. A quick stop at Blowupdolls.com, Mysecretbusinessproject.net or any other site is available to any webmaster with rudimentary coding skills. Now the …
I'm quite sure that guys are worried about their money as much as gals are (esp. gals who're more than happy to spend their guy's dosh ;) -- but online banking sites turn up at < 1.0 M/F ratio. As do Gmail, Amazon, and the works.
Guess I should start visiting more manly sites ...
Likelihood of you being FEMALE is 50%
Likelihood of you being MALE is 50%
Likelihood of me setting Firefox to keep history for 0 days, not remember what I enter in forms and search bar, delete all history when I close Firefox, have a blank homepage, have 0 bookmarks except for del.icio.us bookmarklets, having Adblock Plus, filterset.g and customizegoogle add-ons cranked up to max is 100%
Likelihood of a snooper finding out anything useful about me is... I dunno. But doing the above leaves me with a few less things to worry about.
There's a zen-like re-assurance when you have to log in to each site on your first visit since closing and opening up the browser. This whole 'remember me on this computer' and 'remember my password' stuff just dupes and encourages complacency in the average 'mum and dad' level user.
Check me out - living dangerously by not posting anon. :-)
Because you have just got me writing code to check this out.
Ever since I have been coding JavaScript I have never been able to get at the history urls in the history object, and most texts will claim you cannot.
Now, there is a trick that involves going to a domain that matches; you could detect that.
So, what we are saying here is a site loads up a number of URLs from hotswedishnannieswithpompoms.com to allgoodchistainsunite.org (both available for anyone complaining you can't get a domain nowadays), then it checks them off, and prevents the page moving or perhaps uses multiple frames.
Unless I am missing something here, if you can read the history object in JavaScript then it is broken, it should be unreadable, therefore if it is readable then it will be fixed.
So OK, it is possible to brute-force the URLs out. That could be stopped; I don't know anyone who really uses that feature anyhow. Though now I think about it, it has a use: "oh, what was the last page on theregister.co.uk I was on?" history.go('www.theregister.co.uk'); will move the window to the last page on theregister you were on, even if run on a site 10 hops away. But still, it is not great.
Now, hmm could this be combined with a search engine, possible but probably not, you can be sent back to google, but then you have to get that url and you cannot unless you use a cross site scripting attack (which is a flaw which should be plugged).
Ok, I have gone to the site now, I had to take down the security, it was being blocked. He uses 10K of URLs and combines that with the history.go, possibly css computed values and a detector of some sort (exceptions, frames etc).
Hmm, not really that worrying, and sort of detectable: he will have a huge chunk of data somewhere and you will see connections made back to each site.
I think you are hyping this one a bit. If you could read the history object, now that would be covert: a quick read and an X req call back to your server.
As to these other gaping holes, well, there is only the referrer one, and that is one deep and helps to determine who linked to you; it can be turned off as well.
Now, something like google urchin and phorm is a different matter, those do work.
But this, well it is obvious, and noisy - hackery but not crackery. It is like saying that safe is not secure because I can point a tank at it and blow the bloody doors off. Still, people should be aware of how it can be used and should watch out for it, but its general use is more benign, maybe a quick reordering of links depending if they have already gone there.
But it is misleading to say 'Mysecretbusinessproject.net' if they don't specify the URL then they won't know it.
A good habit to get into is to do your stuff in batches, and make sure the browser history is cleared.
I read the JS code for that. Awesome!
I had to use the "Send feedback..." link in my browser to report this. Privacy or not, it has to be fixed because it's a resource-intensive dictionary attack. It's already bad enough that I have to keep building new rules to exclude abusive Flash ads. Now advertisers are going to scan their list of the 100,000 most interesting URLs on every page load.
Perhaps browsers should let the user impose an upper limit on how many times any given script can check the history against arbitrary URLs, or how many times per second. This limit could be made available in the browser's preferences. A reasonable default value might be 5 or 10. This script couldn't check 10,000 URLs then.
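A rough sketch of that limit, in JavaScript. Everything named here (`makeVisitedCheckBudget`, `maxChecks`, `reallyVisited`) is hypothetical: no browser exposes such a preference, and this only illustrates the counting idea, not how any browser actually styles links.

```javascript
// Hypothetical per-page budget for "was this URL visited?" lookups.
// Once the budget is spent, every further lookup is answered "not visited".
function makeVisitedCheckBudget(maxChecks, reallyVisited) {
  let used = 0;
  return function isVisited(url) {
    if (used >= maxChecks) {
      return false; // over budget: reveal nothing more
    }
    used += 1;
    return reallyVisited(url);
  };
}

// Example: a budget of 5 against a pretend history of two sites.
const pretendHistory = new Set(["http://a.example/", "http://f.example/"]);
const isVisited = makeVisitedCheckBudget(5, (url) => pretendHistory.has(url));

const probes = ["a", "b", "c", "d", "e", "f"].map((s) => `http://${s}.example/`);
const answers = probes.map(isVisited);
console.log(answers);
// prints [ true, false, false, false, false, false ]
```

The sixth probe is in the pretend history but still comes back false, because the budget of 5 has already been used up. A 10,000-entry dictionary would learn about five sites at most.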
which is me rethinking.
It can go covert, I haven't run it yet, but if just the css computedValues are being used on a generated link, then there are a number of ways to keep the noise down on that.
So, I will retract the 'over hyping', and the 'hackery not crackery' statements :)
Just a reminder that I posted this *for fun*. The point was to demonstrate the vulnerability, not to provide a tool on how to do this. Quoting myself...
"Kind of cute right? Don’t worry — I am not storing your history in any way, this is purely for fun. [...] In case it isn’t obvious — please don’t do this for real."
There's a little feature I highly recommend: every time you shut down Firefox, it wipes your history, cache, passwords etc...
So in this example, it guessed 95% male based on my current browser tabs. God help me what it would have guessed if it knew what I was looking up last night...
Likelihood of you being FEMALE is 40%
Likelihood of you being MALE is 60%
Site             Male-Female Ratio
google.com       0.98
telegraph.co.uk  1.5
Does this mean that El Reg isn't in the Quantcast top 10K or whatever? If they're only looking at US sites why is the ET in there?
Oh, and since I've stopped reading the ET since the format change I'll be 50-50 pretty soon.
If you look at the JavaScript source (http://www.mikeonads.com/gender/SocialHistory.js) you can see that rather than querying the browser history directly, it creates a link for every site in its dictionary, then checks whether the browser has given it the "never visited before" or "visited before" link colour in order to determine if you've been there in the past.
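For anyone who wants the idea without reading the whole file, here is a minimal sketch of the link-colour trick. It is not the code from SocialHistory.js: `findVisited` and `makeFakeDocument` are made-up names, the two colour strings are assumptions, and the fake document only exists so the sketch runs outside a browser. A browser that reports the unvisited colour to scripts regardless of history would defeat it.

```javascript
// Sketch of the link-colour trick. `doc` is any DOM-like document and
// `visitedColor` is the colour the page's stylesheet gives visited links.
function findVisited(doc, urls, visitedColor) {
  const view = doc.defaultView;
  const visited = [];
  for (const url of urls) {
    const link = doc.createElement("a");
    link.href = url;
    doc.body.appendChild(link);
    // Ask the browser which colour it actually applied to the link.
    const color = view.getComputedStyle(link).color;
    if (color === visitedColor) {
      visited.push(url);
    }
    doc.body.removeChild(link);
  }
  return visited;
}

// Stand-in document (an assumption, for demonstration only): it "colours"
// a link purple when its href is in a pretend history, blue otherwise.
function makeFakeDocument(history) {
  return {
    body: { appendChild() {}, removeChild() {} },
    createElement() {
      return { href: "" };
    },
    defaultView: {
      getComputedStyle(link) {
        const seen = history.has(link.href);
        return { color: seen ? "rgb(85, 26, 139)" : "rgb(0, 0, 238)" };
      },
    },
  };
}

const fakeDoc = makeFakeDocument(new Set(["http://www.theregister.co.uk/"]));
const dictionary = ["http://www.theregister.co.uk/", "http://www.example.com/"];
console.log(findVisited(fakeDoc, dictionary, "rgb(85, 26, 139)"));
// prints [ 'http://www.theregister.co.uk/' ]
```

Note that the script never touches the history object; it only learns a yes or no for each URL it already thought to ask about.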
It's quite a clever idea, but it's still having to brute-force its way to your history, and with that many sites it's very obvious that something's happening from the way the browser hangs, so I can't imagine any other site using it with any more than a handful of URLs.
Hmmm, it seems SafeCache on FF2 has it confused too, even in IE pretender mode. But I did notice that at one point the CPU hit 100% for a minute or two trying to run that script, and that evil pesky M$ "send an error report" dialog popped up as well on the first attempt, as the system literally froze and almost went BSOD too!
95% Male ... so it got my gender bang to rights. The wife might say that my feminine side shows more than that .. but then this is my work machine. Testosterone bursting from every pore in the competitive workplace ..
I am going to have to try this on my home machine to see if my gender bias varies by which machine I use. Will I be more androgynous using the Mac at home?
Likelihood of you being FEMALE is 57%
Likelihood of you being MALE is 43%
I thought that, this being my computer at work, it couldn't miss with sites like El Reg and a bunch of technical sites (no porn, though), but Gmail had me nailed. :D
Although, I must admit that a psychologist friend told me that, according to Bem's scale, I am mostly feminine.
"Mozilla, Microsoft and the rest of the gang have long refused to do anything about it because fixing the problem would make it hard for users to tell sites they've visited from those they haven't."
I appreciate I know nothing about the specifics of this, but exactly how hard is it to ensure a piece of data is usable by the person sat at the PC, but doesn't get sent out to the outside world ? If not trivial, then what sort of crap system has the industry developed?
I've never felt so manly.
I was surprised that the most male-tilted site on my list was a music site (Harmony Central) and that search engines and email are slightly girly.
And it's good to know that occasionally looking at allegedly-amusing pictures of cats over at icanhascheezburger.com makes me more of a man, not less. I'd always assumed it would be the other way round.
Like most problems of this nature, it is easily avoided - simply run everything in RAM (Bart PE/Golden Dragon or similar) or reimage the fixed disk using Ghost or ZENworks or similar weekly/daily. Problem solved. If reimaging a fixed disk OS weekly/daily then also do a full format using Darik's Boot and Nuke or a similar multipass eraser/formatting tool. Use someone else's internet connection or a free wifi spot if you want to be all legit.
If that's too much effort for you, then you're probably not doing anything that interesting to anyone else, other than marketing chimps and that ilk, and who cares what they try to sell you!? Be my guest, spend all your time and money trying to sell me shit which I will never buy, regardless of my sex. Surely it's easier to tell someone's sex by their name? Or by calling them and speaking to them? I mean, why would anyone invest such effort in such a long clunky way of determining something as inane as someone's sex?!
Likelihood of you being FEMALE is 50%
Likelihood of you being MALE is 50%
This is only because Firefox remembers nothing about me or where I've been, like several other commenters.
I carefully set it up this way, therefore I'm 100% likely to be male, as ladies don't surf dodgy sites that require such an approach. Do they?
Paris, because her history is known to everybody.
... run two versions of Firefox, one for the day to day stuff which sits on your HDD as usual and the other one Portable Firefox which you run from a USB stick and which doesn't cache anything and wipes cookies and history when you quit so you can browse the more interesting stuff safely...
... Erm, allegedly!
It appears that just because I handle the finances in my family, this algorithm has decided there's a 93% probability of me being female, rather than the real answer of male. I guess the fact that drudgereport.com and slashdot.org were in the list wasn't enough to sway the decision the other way.
Likelihood of you being FEMALE is 29%
Likelihood of you being MALE is 71%
so 2/7ths of the time I'm female?
Although, as I use FF and IE, I checked FF as well (much quicker) and I got 85% MALE. Looks like IE makes you effeminate. Actually, my FF config avoids my work proxy filter, which really means that the sites I'm normally allowed to look at @ work are more girly.... interesting
"Even when you turn off Javascript, they have other tricks up their sleeves that are much harder to foil, says wally of wally corp, who brought the tool to our attention."
The button did absolutely nothing on IE with Javascript turned off.
Standardized LARTmeter is hitting 11/10 on the BS index.
Privoxy blocked this; I declined the offer :)
"Request for blocked URL
Your request for http://www.mikeonads.com/2008/07/13/using-your-browser-url-history-estimate-gender/ was blocked.
See why or go there anyway."
Nah, I went anyway and clicked on the button and nothing happened. No results :( I am not using any Mozilla or Microsoft related browser.
Gave errors: boo hoo
JavaScript - http://www.mikeonads.com/PRIVOXY-FORCE/2008/07/13/using-your-browser-url-history-estimate-gender/
Inline script thread
Error:
name: ReferenceError
message: Statement on line 3: Undefined variable: urchinTracker
Backtrace:
Line 3 of inline#4 script in http://www.mikeonads.com/PRIVOXY-FORCE/2008/07/13/using-your-browser-url-history-estimate-gender/
urchinTracker();
stacktrace: n/a; see 'opera:config#UserPrefs|Exceptions Have Stacktrace'
JavaScript - http://pagead2.googlesyndication.com/pagead/show_ads.js
Linked script not loaded
JavaScript - http://www.mikeonads.com/PRIVOXY-FORCE/2008/07/13/using-your-browser-url-history-estimate-gender/
Event thread: click
Error:
name: ReferenceError
message: Statement on line 1: Undefined variable: SocialHistory
Backtrace:
Line 1 of inline#3 script in http://www.mikeonads.com/PRIVOXY-FORCE/2008/07/13/using-your-browser-url-history-estimate-gender/: In function startAnalysis
function startAnalysis() { user = SocialHistory(); var listOfVisitedSites = user.visitedSites(); document.getElementById('analyze').style.display='block';document.getElementById('analyze').src="http://www.mikeonads.com/gender/analyze.php?sites="+listOfVisitedSites;}
Line 1 of function script
startAnalysis()
...
stacktrace: n/a; see 'opera:config#UserPrefs|Exceptions Have Stacktrace'
It's fairly trivial to do the calculation server-side, for each site you wish to check add style rules such as this:
a.site0001:link {
background-image: url(/check.pl?userid=666&siteID=1&visited=0)
}
a.site0001:visited {
background-image: url(/check.pl?userid=666&siteID=1&visited=1)
}
then in the body of the page
<a href="http://www.theregister.co.uk/" class="site0001">x</a>
Then, upon page load, the client will request one of the two background images (without needing to run any javascript). The check.pl server side script then can store in a database whether the user given the id of 666 (some kind of session identifier) has visited the url http://www.theregister.co.uk/ or not.
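Writing those rules by hand for thousands of sites would be tedious, so here is a sketch that generates them. `buildProbeCss` and `buildProbeLinks` are made-up names; the `check.pl` path, the `userid` value and the `siteNNNN` class pattern are simply copied from the example above, and the URLs are assumed to be trusted (nothing is escaped).

```javascript
// Class name for the Nth site, e.g. 1 -> "site0001".
function probeClass(siteId) {
  return "site" + String(siteId).padStart(4, "0");
}

// One :link / :visited rule pair per site, numbered from 1.
function buildProbeCss(sites, userId) {
  return sites
    .map((url, i) => {
      const siteId = i + 1;
      const cls = probeClass(siteId);
      const base = `/check.pl?userid=${userId}&siteID=${siteId}`;
      return (
        `a.${cls}:link { background-image: url(${base}&visited=0) }\n` +
        `a.${cls}:visited { background-image: url(${base}&visited=1) }`
      );
    })
    .join("\n");
}

// The matching anchors for the page body.
function buildProbeLinks(sites) {
  return sites
    .map((url, i) => `<a href="${url}" class="${probeClass(i + 1)}">x</a>`)
    .join("\n");
}

const sites = ["http://www.theregister.co.uk/"];
console.log(buildProbeCss(sites, 666));
console.log(buildProbeLinks(sites));
```

For the single site above this prints the same two rules and the same anchor as the hand-written version, each rule on one line.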
Likelihood of you being FEMALE is 23%
Likelihood of you being MALE is 77%
The BBC website helped me out though, having a male-to-female ratio of 1.44! However, www.aa.com let me down. Stupid car safety company type thingy!
Hooray! And just for the record, I am male ;)
Well, worked here, 99% male (which I am too, apparently).
Funny that the music sites are so skewed towards male. That newegg.com, macrumors, and ubuntuforums.org would have the highest m/f ratios (all above 2, I think) is not that surprising though... I had lots of sites with low m/f ratios (search, email, financial, etc.), but I guess these gave it all away.
Curious to try this at home and see if it changes though...
Likelihood of you being FEMALE is 17%
Likelihood of you being MALE is 83%
Site            Male-Female Ratio
nytimes.com     1.13
dell.com        1.04
bbc.co.uk       1.44
ebay.co.uk      1.17
bikebandit.com  1.74
google.co.uk    1.35
tiscali.co.uk   1.08
Suppose it was bikebandit that done it (online motorcycle spares)
Black chopper 'cos if THEY start to do this for real we're screwed
Nothing to see here in the way of "exploits".
It's a very simple script that does NOT check your browser's history. All it does is spew out a bunch of links into a temporary iframe, check to see which of them has adopted the "visited" link color, and then returns a count of the links that matched that simple test. Sure, the browser (only IE or FF with this script) does rely on your history to determine which of the test URLs have been "visited", but this script (or any script) cannot actually read your history. And it certainly doesn't send any data back to the mothership, so it's a one-sided exercise executed exclusively in your own browser.
This isn't a problem or a "security flaw" or something that needs "fixing" with web browsers ... it's nothing to be alarmed about. It's just a game and cannot do you any harm. Web browsers are doing what they are designed to do. You can easily disable the history in any browser to cripple this type of shenanigan. But, again, there is no reason to worry about this particular exercise.
It is fun to see the purported "experts" pump up the FUD factor, tho'.
Looks like I'm in trouble - in work I'm only 53% man but at home I'm worse:
Likelihood of you being FEMALE is 60%
Likelihood of you being MALE is 40%
Site            Male-Female Ratio
google.com      0.98
microsoft.com   1.08
download.com    1.27
amazon.co.uk    1.11
gmail.com       0.9
channel4.com    1.11
plaxo.com       0.82
freerice.com    0.61
lastminute.com  0.67
google.co.uk    1.35
I suppose I'll have to stop visiting those charitable sites that turn me into a lady.......
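For the curious: multiplying the listed ratios together and treating the product as male-to-female odds happens to reproduce the figures posted in this thread (40% male for the list above, and 50-50 when no sites match). The sketch below does just that. It is a guess that fits the posted numbers, not the site's actual code, and `estimateMaleLikelihood` is a made-up name.

```javascript
// Treat each site's male-to-female ratio as odds and multiply them together,
// starting from even odds (1). Then turn the combined odds into a probability.
function estimateMaleLikelihood(ratios) {
  const odds = ratios.reduce((acc, ratio) => acc * ratio, 1);
  return odds / (1 + odds);
}

// Ratios from the list above.
const ratios = [0.98, 1.08, 1.27, 1.11, 0.9, 1.11, 0.82, 0.61, 0.67, 1.35];
const male = estimateMaleLikelihood(ratios);

console.log(`Likelihood of you being FEMALE is ${Math.round((1 - male) * 100)}%`);
console.log(`Likelihood of you being MALE is ${Math.round(male * 100)}%`);
// prints:
// Likelihood of you being FEMALE is 60%
// Likelihood of you being MALE is 40%
```

With this method a couple of strongly skewed sites (freerice.com at 0.61, lastminute.com at 0.67) outweigh a long run of near-1.0 ratios, which would explain why the charitable sites tip the result.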
Despite having history set to 0 days, all cookies blocked except for specified sites, it seems that Opera is quite happy to give away my private data to all and sundry.
Even worse, "Delete Private Data" doesn't stop the leakage: you'd have to close Opera and then re-open it. That's utterly outrageous !
The dreaded IE and Safari weren't as bad: they both leaked, but not after "Delete Browsing History" and "Clear History" were used.
So goodbye Opera, it's Firefox for me from now on. It didn't leak, even before I clicked on "Clear Private Data".
>It's a very simple script that does NOT check your browser's history.
Not directly, but it does discover your browsing history, which is intended to be private.
>And it certainly doesn't send any data back to the mothership,
Of course it could though, you'd have to be pretty thick not to see that.
>This isn't a problem or a "security flaw"
If I were to target your bank account, I would first want to know what your bank is. A few people have already pointed out that the results include their bank's name.
Then we know what kind of fake bank emails to send....
>Web browsers are doing what they are designed to do.
Web browsers are terribly designed; if you look at the system as a whole it's a hodgepodge of standards and workarounds.
Since the browser history is normally unavailable, it's safe to say the intent of the designer was that the browser history cannot be discovered. Here it can, so here there is a flaw (in the design; the program does the right thing).
>It is fun to see the purported "experts" pump up the FUD factor, tho'.
You're very casual about this, it's not us that will get caught out by this thing, it's your gran who gets a well crafted email purportedly from her bank..
It's hardly panic stations though.
I have been tinkering around with this.
And I have just noticed that I have not got history selected, which probably accounts for a:visited never appearing correctly in my browser. I often have to tweak that when I see it on another's computer.
So, noscript is defeated, chuckle, good because JavaScript functionality is quite nice for the web experience.
I think you can just turn history off and it works:
Edit -> Preferences -> Privacy -> Uncheck remember pages visited.
Back and forward still work for me as they have always done.
Mozilla likes to do things like this, they keep things in multiple locations. Now I am not 100% sure but if others check as well, then I think just doing the above sorts it out.
Of course a lot of people don't play with those settings, but that is their lookout; Mozilla appears to offer a fix and has done for quite some time.
The browser's history has never been "secure". I can't think of any reliable source that makes such a claim. If people have believed that it was "secure", then they have been fed some misinformation.
In this script's case, if your history does not contain any matches to the quite limited list included as the data array, then there are no matches. I see that the bank I use, for example, is not included in the list. Therefore, this script could not have been used to even guess at my bank.
And with regard to sending info back to the mothership; the data you could send includes everything you give up to every web server you hit, anyway. The only addition is that your history may have matched some of the elements of the test array. There's nothing sensitive, here, unless you've stumbled upon a child pron honeypot and have leaked details about your latest expedition. There's certainly not enough additional info being generated to be sending phishing emails to my grandmother.
I agree with you in that this is definitely nothing to panic about ... or even concern yourself with. Still classifies as FUD.
It's like someone being concerned that their screen resolution has been detected and collected. Who cares? If anyone is truly nervous about their history, there are oodles of suggestions on how to eliminate that "threat", both noted above and in basic Browser 101 (i.e. set your History to clear itself whenever you close the application, and don't go to unknown sites during the same session as when you go to your banking website.)
>The browser's history has never been "secure".
No, but it's _supposed to be_, otherwise why hide the history from js at all?
>I see that the bank I use, for example, is not included in the list.
Mine is and other peoples are, besides, _this list_ is chosen for picking genders.
>Therefore, this script could not have been used to even guess at my bank.
Not with _this list_.
>There's nothing sensitive, here,
Who my bank is, is sensitive information.
>unless you've stumbled upon a child pron honeypot
Yes, because everyone who wants their browsing history secret is a pervert.
>There's certainly not enough additional info being generated to
>be sending phishing emails to my grandmother.
Although if she kept getting e-mails from _her bank_.
>Still classifies as FUD.
Yes, it's not scary.
>It's like someone being concerned that their screen resolution
No, it's not, it's personal info, it's almost as bad as Phorm.
>History to clear itself whenever you close the application, and
>don't go to unknown sites during the same session as when
>you go to your banking website.)
Yeah, my gran understands that. Besides, history is useful while browsing; it makes the back button work. I want history without telling everyone else what it is.
Which is what it's _designed to do_.
>It's still not a "flaw"
Yes it is, because it's not supposed to do it.
>and it does not need to be "fixed" ...
Yes, it does, there should be hope of fixing all flaws.
> and the browsers are still working as intended.
NO THEY AREN'T! The intention is that your history is secret.
>It's JUST A GAME ... harmless.
This one is.
that's my bet.
And to be fair most of us will probably roll something like this, it is quite easy to create, and it could yield some interesting AI for the site.
But, people should be made aware if you are using it, and given the option to opt in to it.
You can access the history object through a signed script, I think it would be better to allow access via a dialog. That way the user can say yeah or nay, and the code can be written properly.
To fix this security bug, though, and keep your history, they would have to limit the a:visited capabilities, perhaps by just getting rid of it. But I find that easy to say because I don't often code with a:visited being distinct; others may do, and for good reason.
Unless your grandmother's email address is @herbank.com, then how the hell is an email going to get to her? SHE'S IN NO DANGER!!! Is her user name visible in her history? No, it isn't.
Now when they start reading COOKIES, there will be security implications, but this is just a silly parlor trick and cannot POSSIBLY result in any harm to anyone at any time. All of your examples of how you think this could be dangerous involve elements other than the browser history and knowledge beyond that which the history could provide.
Give a legitimate security risk as an example (aside from the "it's my data and I don't want anyone to see it" canard that doesn't constitute any real danger, but rather a very mild privacy invasion) and maybe you can be taken seriously, pervert or not.
You are aware that the back button functions perfectly well when the history is set to be cleared upon application exit? And that setting the history to 0 days still maintains the history for the current session? Has your back button ever worked beyond the current session? I think not.
And you are aware that every one of your packets is being inspected by commercial organizations at the ISP level from literally every location you could use? Not Phorm or NebuAd, but surfing data aggregate companies, like Hitwise.
I'd say that's a real, if neutered, threat ... not a TOY threat like this history-sniffing Javascript. Even the massive quantities of data accidentally released by AOL some many months ago could only provide guesses as to the identities of any of the surfers who were captured by it. This little script, even if it were supplied with a huge list consisting exclusively of bank domains, could not conceivably do any harm to anyone who encounters it. If you think it's so dangerous ... tell us how. I'm willing to be persuaded.
This piddling exercise is neither a threat nor a flaw in need of fixing ... unless someone can show a true danger.