DNS lookups can reveal every web page you visit, says German boffin

Domain-name lookups only reveal websites visited, not individual pages viewed, right? Wrong: the interaction between a user and the DNS is more revealing than previously believed, according to a paper from German postdoc researcher Dominik Herrmann. In work published at pre-print server Arxiv (in German – thank you, Google …

  1. Pomgolian
    Big Brother

    Now we know..

    ..why Google has a free public DNS. So much for blocking cookies, DNT headers and NoScript. The bar stewards are still watching you.

    1. mr. deadlift

      Re: Now we know..

      you pop chrome on that stack and it's game over.

    2. Anonymous Coward
      Anonymous Coward

      Re: Now we know..

      ..why Google has a free public DNS. So much for blocking cookies, DNT headers and NoScript. The bar stewards are still watching you.

      To be honest, I'm surprised that this is reported as news. That's been an obvious one for, umm, pretty much as long as the Internet exists. In my (long gone) younger years when I worked for an ISP I even had zone transfers secured because in those days it would have given away who our customers were (these days that's a default, but we're talking long ago, when USENET was still usable).

      But, even if you didn't know that you should have had an inkling that something was up if Google offered it for free.

      1. Anonymous Coward
        Anonymous Coward

        Re: Now we know..

        > To be honest, I'm surprised that this is reported as news. That's been an obvious one

        As "obvious" as the Sun going around the Earth and not vice-versa?

        There is a difference between hypothesis and research. Where are your Arxiv papers?

    3. big_D

      Re: Now we know..

      Just use your own local DNS server.

      1. Arthur the cat Silver badge
        Facepalm

        Re: Now we know..

        Just use your own local DNS server.

        I do. The statically assigned IP addresses are a bit of a giveaway though.

      2. Pu02

        Re: Now we know..

        And then prevent any internal clients from talking to Google's DNS.

        Which breaks media boxes, televisions, all manner of IOT devices and software apps, practically all media streaming apps that enforce DRM, eg. Netflix, etc.

        Blocking these hosts effectively without causing failures (perhaps I should say, to prevent an impact) is not trivial even if you have the infrastructure in place to do this across your network.

        And even then, they'll be watching your smartphone, which if it has a third party app installed or is an Android might be behaving most promiscuously with many of the Google inquisition's global public (if not private) nodes.

    4. Mage Silver badge

      Re: Now we know..

      It's not news and I've been telling people not to use Google's DNS. Use your ISP's DNS, they can know every page you look up anyway.

      1. Gene Cash Silver badge

        Re: Now we know..

        Use your ISP's DNS

        Oh yeah, the assholes that hijack DNS failures to serve ads, causing SAMBA and other scripts to crap out.

        They also have pretty much given me an assigned IP. Neither bouncing my cable modem nor releasing my DHCP lease gets me a new IP. I've had the current one for a year now.

        1. Ken Hagan Gold badge

          Re: Now we know..

          "Use your ISP's DNS.

          Oh yeah, the assholes that ..."

          Careful now. The advice was sound. Your ISP is, technically, a good place to look for DNS services. If it isn't a good idea in your particular case, it is because your ISP is crap. That's a separate problem.

          1. big_D

            Re: Now we know..

            Careful now. The advice was sound. Your ISP is, technically, a good place to look for DNS services.

            Except in the USA they were given carte blanche last week to use any and all data on their customers and to sell it to third parties as they see fit (Congress voted to repeal the FCC's broadband privacy rules).

            1. Ken Hagan Gold badge

              Re: Now we know..

              Sigh. So all US ISPs have been given carte blanche to be crap. That is, as I said, a separate problem.

              Your ISP already and inevitably (because there is only one wire out of your house) gets to see all your traffic, so there is no new opportunity for a privacy breach and if DNS data depends on where you pick it up from then the DNS is technically broken, so ... The advice was sound. Your ISP is, technically, a good place to look for DNS services.

  2. frank ly

    Explanation please?

    I still don't understand how analysis of DNS records for my IP address can reveal that I looked at en.wikipedia.org/wiki/Alcoholism (or whatever). Are they saying that Wikipedia's responses are different in such a way that the page can be distinguished from other pages?

    1. Charles 9

      Re: Explanation please?

      Yes. Most of the images don't come from wikipedia itself, for example, but from the Wikimedia Commons (another domain, another lookup). How many pictures does that one page contain compared to other pages on the 'Pedia? What distribution of 'Pedia/Commons requests are made?

      Put simply, because of all these side requests, just one page can create a fingerprint that can be combined with other pages to create a distinct trail. And unlike what the article says, many of us have longer-term IP allocations (otherwise, home servers don't work so well). Worst part is that this sniffing is all done via basic Internet protocols; trying to mask them will require changing the protocols which may not be efficient or even possible.
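The fingerprint idea described above can be sketched in a few lines of Python. This is an illustration only, not the method from Herrmann's paper: it assumes the observer already holds a table of which hostnames each page causes a browser to resolve, and it compares sets of names rather than the request counts and distributions mentioned in the comment. The page names, the `.example` hostnames and the function names are all invented for the example.

```python
# Hypothetical fingerprints: the hostnames a browser resolves while loading each page.
# Every page name and hostname here is made up for illustration.
PAGE_FINGERPRINTS = {
    "page-a": frozenset({"wiki.example", "commons.example", "maps.example"}),
    "page-b": frozenset({"wiki.example", "commons.example"}),
    "page-c": frozenset({"news.example", "cdn.example", "ads.example"}),
}


def jaccard(a, b):
    """Overlap between two sets: 1.0 means identical, 0.0 means disjoint."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def guess_page(observed, fingerprints=PAGE_FINGERPRINTS):
    """Return (page, score) for the fingerprint closest to the observed lookups."""
    observed = frozenset(observed)
    return max(
        ((page, jaccard(observed, hosts)) for page, hosts in fingerprints.items()),
        key=lambda item: item[1],
    )
```

Given the lookups `{"wiki.example", "commons.example", "maps.example"}`, `guess_page` picks `page-a` over `page-b` because the extra `maps.example` lookup separates the two, which is the sense in which side requests act as a fingerprint.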

      1. big_D

        Re: Explanation please?

        @Charles9 if you are already using your own home server, add a DNS service to it, to serve local devices, then the problem goes away.

      2. Anonymous Coward
        Anonymous Coward

        How does DNS work in the real world?

        OK, I understand how the browser's DNS lookups might in theory help form some kind of clue.

        Now my recollection of how DNS works in the real world is that there's potentially quite a lot of caching between me and (e.g.) my ISP's DNS server (assuming that's the one I use). So 'my' DNS lookups visible to the outside world at my ISP may or may not match the DNS lookups in an uncached world.

        I'm ignoring the possibility that someone's been bodging about with DNS-related stuff I nominally control. If they can do that they likely have easier and more effective ways of snooping on me than this idea.

        [And then for a different approach to killing this idea, e.g. by data poisoning, there's stuff like Trackmenot or logical successors].

        What's wrong here? The 'research'? The article? My limited understanding of DNS?

        [edit: big_D seemingly has a similar train of thought at the same time as me]

        1. SImon Hobson Silver badge

          Re: How does DNS work in the real world?

          Now my recollection of how DNS works in the real world is that there's potentially quite a lot of caching between me and (e.g.) my ISP's DNS server

          Actually, you'd be surprised how little caching there is between a user and the caching resolver they use - and many routers will default to handing out the ISP supplied DNS resolvers to internal clients.

          From reading it, it's clear that this technique will instantly lose most of its potency once you are separated from the client by a decent cache - hence some suggestions to run your own internal DNS cache/resolver. If you do that, then unless you set your ISP's resolver as a forwarder for your local resolver, they would have to sniff traffic to get your DNS queries - and they will be vastly less useful due to the caching.

      3. John Smith 19 Gold badge
        Gimp

        "Put simply, because of all these side requests, just one page can create a fingerprint "

        I'd guess the pix in particular may well be near unique to each Wiki page.

        2 good rules of thumb are

        a) If Google supplies it how does it allow them to extract more knowledge about you (because if Google supplies it it always will)?

        b) Don't use Google.

    2. Anonymous Coward
      Anonymous Coward

      Re: Explanation please?

      I still don't understand how analysis of DNS records for my IP address can reveal that I looked at en.wikipedia.org/wiki/Alcoholism (or whatever). Are they saying that Wikipedia's responses are different in such a way that the page can be distinguished from other pages?

      No, the issue is that a page is very rarely just local HTML. You will have scripts, fonts (again a Google hit, which is why we avoid Google fonts - and thus do not run Wordpress), images (which could have very meaningful titles in themselves), etc etc - each of which is likely to require a lookup that cannot be served out of cache.

      In short, you're looking at another data source to feed Big Data based profiling.

  3. Anonymice
    IT Angle

    A bit short on details...?

    Come on guys, this ain't the BBC, where're the technical details?

    You rope us in with a headline about supposed privacy leaks in DNS, and then spend the entire article talking about old-hat browser fingerprinting & behavioural analysis. That was news 15 years ago!

    “Many websites produce a so distinctive DNS retrieval pattern” that requests can be recognised “more or less unequivocally.”

    How does the content on a *website* produce a distinctive enough pattern to identify specific pages?

    "IT?" 'cause who the freud do you think your readership are?

    1. JLV
      Black Helicopters

      Re: A bit short on details...?

      It's explained fairly well near the end.

      Each page has links, any links to another domain will require dns. If you know what links are on say 500 pages then as someone reads those pages their dns will fire in predictable patterns and you can guess where they are likely to be.

      Although host-based blacklists on the client would befuddle things somewhat. Maybe.

      Reminds me of Rainbows End when the avatar is randomizing his response delays so that they can't "geolocate" via timing patterns ;-)

    2. Anonymous Coward
      Anonymous Coward

      Re: A bit short on details...?

      > Come on guys, this ain't the BBC, where're the technical details?

      Here: https://arxiv.org/abs/1703.05953 as was pointed out in the second paragraph of the article.

  4. Franklin

    So does that mean...

    ...running a client on your computer that makes DNS queries and sending page lookups to random (legitimate) Web sites in the background will confuse the trail?

    1. Adam 1

      Re: So does that mean...

      Another way would be to have a collection of DNS servers configured locally that get round robin'd for each request, since profiling requires combining the pattern of DNS lookups from specific pages.

      That, or if you're feeling like a real crazy cat, use an ad blocker and VPN.

    2. Anonymous Coward
      Anonymous Coward

      Re: So does that mean...

      "running a client on your computer that makes DNS queries and sending page lookups to random (legitimate) Web sites in the background will confuse the trail?"

      You might possibly think that, I couldn't possibly comment.

      See e.g. the Trackmenot browser extension. Only been around since 2006 or so. Surprisingly few people know about it.

  5. Anonymous Coward
    Anonymous Coward

    Simple fix

    1) install DD-WRT on your router

    It has a caching DNS server built in and running by default, so you won't keep looking up DNS names you recently resolved downstream over and over again, where your ISP (or whoever) can get at the patterns to figure out where you've been.

    Honestly, every router vendor should include a simple caching name server. Then not just the small minority who reflash their routers can benefit, and ISP DNS servers will see far less traffic.

    As always, those stupid enough to reconfigure their PC to use Google's DNS servers deserve what they get.

    1. Charles 9

      Re: Simple fix

      And if that's not possible (I can't on my R7000 because the latest versions still don't support WiFi)?

      1. big_D

        Re: Simple fix

        If your router can't do it, then set up an old PC on the inside to act as your DNS resolver, probably a better bet than using the router DNS cache, long term.

        1. Anonymous Coward
          Anonymous Coward

          Re: Simple fix

          "If your router can't do it, then set up an old PC on the inside to act as your DNS resolver, probably a better bet than using the router DNS cache, long term"

          Isn't this one of the things that PiHole does, with less space and power than either a router or a PC?

          https://pi-hole.net/

          (edit: I see others have also suggested this particular solution already. Must trype faster.)

    2. TRT

      Re: Simple fix

      And I suppose one could also spread the DNS loading out to different servers across many vendors as a software function in the client or the client's trusted (local) DNS relay. No one DNS gets the whole of the fingerprint.
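Spreading lookups across several resolvers, as suggested above, can be sketched like this. It is a toy model under stated assumptions: nothing is actually sent over the network, the resolver labels are placeholders, and `spread_lookups` is a hypothetical helper that only records which resolver would have seen which name.

```python
from collections import defaultdict
from itertools import cycle


def spread_lookups(hostnames, resolvers):
    """Hand each lookup to the next resolver in turn; return what each one sees."""
    seen = defaultdict(list)
    for host, resolver in zip(hostnames, cycle(resolvers)):
        seen[resolver].append(host)
    return dict(seen)
```

With two resolvers, each sees only every other lookup, so neither holds the full pattern for a page. A local cache in front of the rotation would thin each resolver's view further.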

    3. Anonymous Coward Silver badge
      Boffin

      Re: Simple fix

      Most SOHO/home routers include that anyway. At least 90% of the ones I've looked at in the last 20 years have, but I guess your experiences might be different.

      1. Anonymous Coward
        Anonymous Coward

        Re: Simple fix

        Most SOHO routers can be remotely compromised.

        Use a VPN service, and associated DNS.

        And read up on browser fingerprinting. If you are serious about privacy, use TOR with your VPN, and read the TOR project notes on ways you can compromise your privacy.

    4. Anonymous Coward
      Anonymous Coward

      Re: Simple fix

      Not optimal.

      Use the DNS server provided by your VPN supplier.

  6. itzman
    WTF?

    Since ISPs have to use dynamic IP addresses to cope with the IPv4 address shortage, a user's address changes, making it harder to track them over time.

    Er, what? The days of dial up modems are long gone squire, everybody is online 24x7 these days, so you need as many IP addresses as there are customers.

    There is no logic to using dynamic IP addresses for most ISPs.

    1. Charles 9

      Unless you're using a CGNAT...

    2. Adam JC

      Whilst there's no 'logic', I'm afraid they do.. All the big players do... (Bar Zen/aaisp) - In fact, BT/Sky won't even give you a static IP on a residential line if you PAY them. You'll need to switch to a business package for that (Of which Sky don't even offer, might I add).

    3. Ken Hagan Gold badge

      "There is no logic to using dynamic IP addresses for most ISPs."

      You're thinking like an El Reg reader. Try thinking like a money-grubbing bar steward. The logic is that it means a static IP can be a chargeable extra.

  7. Anonymous Coward
    Anonymous Coward

    Ehh, enforced IP address changes by German ISPs are silly enough as is. (Every midnight, the ISP actually cuts your connection and waits for the modem to reestablish it!)

    Forcing the change *hourly* would be mad. Imagine all your downloads failing, SSH connections freezing, VoIP calls cutting, and so on – as soon as the clock ticks to :00.

    And besides that, most people use the DNS resolver hosted *by* their ISP by default – the same ISP which assigns them the IP addresses in the first place. So forcing IP addresses to change would be just theater, it wouldn't prevent the ISP from correlating the DNS log and the IP log whenever they wanted.

  8. Charles 9

    Guess it's time for yet another use for a Raspberry Pi:

    Use your raspberry Pi as a DNS cache to speed up your internet

    1. thondwe

      Add an Ad Blocker on the router too?

      So Cache DNS, Add some sort of DNS based Ad blocker to the system too, so reduces the DNS lookups per page quite a lot? Then randomize DNS lookups across a range of root DNS Servers.

      Aside - My ISP gives me a "dynamic" IP which hasn't changed in 4 years! So why should I pay them more for a static one?

      1. This post has been deleted by its author

      2. Me too

        Re: Add an Ad Blocker on the router too?

        Use Pi-hole. DNS, local DHCP (so if you're using a BT router where you can't alter DNS settings, you can turn DHCP off and use the Pi instead), and ad-blocking.

        It's awesome.

        It's here

      3. Steve Evans

        Re: Add an Ad Blocker on the router too?

        https://pi-hole.net

        Blackholes many adverts, and caches for 300 seconds iirc.

        Doesn't need a Pi, it runs nicely on lots of the little boards.

        Got mine running on an old Odroid C1 (Native 1Gb LAN).

        1. Anonymous Coward
          Anonymous Coward

          Re: Add an Ad Blocker on the router too?

          Can't use DNS blocking since some sites MUST be whitelisted (my credit union ended up on the blacklist because it's connected to the government, being a MILITARY credit union). Plus as noted, to deal with this problem, you need a more persistent DNS cache.

      4. Anonymous Coward
        Anonymous Coward

        Re: Add an Ad Blocker on the router too?

        Run your own ISC BIND server internally and use Response Policy Zones to blacklist all known advertising/tracking sites so all the crud on the web page will never be resolved.

        Even if you run your own DNS Server remember it still has to go out and perform the initial query which is then cached for the TTL of the DNS record. So it is visible upstream if the ISP is performing DNS traffic recording. What will be more difficult is to determine how many times you access an individual page on the web site.
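The point about TTL caching can be made concrete with a minimal sketch. It assumes an `upstream` callable that returns an address and a TTL, and an injectable `clock` so that time can be controlled; both are hypothetical stand-ins, and this is not how BIND or any real resolver is implemented.

```python
class TtlCache:
    """Minimal DNS-style cache: an answer is reused until its TTL expires."""

    def __init__(self, upstream, clock):
        self.upstream = upstream      # callable: hostname -> (address, ttl_seconds)
        self.clock = clock            # callable returning the current time in seconds
        self.entries = {}             # hostname -> (address, expiry_time)
        self.upstream_queries = []    # what the ISP (or anyone upstream) gets to see

    def resolve(self, hostname):
        now = self.clock()
        cached = self.entries.get(hostname)
        if cached and cached[1] > now:
            return cached[0]          # served locally, nothing leaves the network
        self.upstream_queries.append(hostname)
        address, ttl = self.upstream(hostname)
        self.entries[hostname] = (address, now + ttl)
        return address
```

Repeated lookups inside the TTL window add nothing to `upstream_queries`, so an upstream observer sees that a name was resolved but not how many times the page was visited. The first lookup, and every lookup after expiry, is still visible.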

      5. Anonymous Coward
        Anonymous Coward

        Re: Add an Ad Blocker on the router too?

        > Aside - My ISP gives me a "dynamic" IP which hasn't changed in 4 years! So why should I pay them more for a static one?

        Having a dynamic (or better yet, NATed) IP address is a great privacy advantage for both residential and business users. This does not exclude the option of also having a fixed address used for inbound requests.

        In my configuration, dynamic NATed IPv4 is used for browsing and most other activities, while fixed IPv6 is used for inbound and a limited number of outbound connections (mostly SSH).

    2. Wensleydale Cheese

      Careful with that Raspberry Pi example

      "Use your raspberry Pi as a DNS cache to speed up your internet"

      Careful with the instructions at that link, for they contain the lines:

      server=8.8.8.8

      server=8.8.4.4

      which are Google's DNS servers.

      P.S. In the past few years I've come across a lot of people who should know better recommending Google's DNS servers. Even in the workplace.
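A quick way to catch those lines before copying a config is a small check along these lines. It assumes the instructions use dnsmasq-style `server=` entries, as the quoted lines suggest, and it only knows the two IPv4 addresses named above; `google_dns_lines` is a made-up helper name.

```python
GOOGLE_PUBLIC_DNS = {"8.8.8.8", "8.8.4.4"}


def google_dns_lines(config_text):
    """Return (line_number, line) pairs whose server= entry points at Google's DNS."""
    hits = []
    for number, raw in enumerate(config_text.splitlines(), start=1):
        # Drop comments, any "#port" suffix, and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line.startswith("server="):
            continue
        # Assumed form: server=[/domain/]address, so the address is the last field.
        address = line[len("server="):].rsplit("/", 1)[-1]
        if address in GOOGLE_PUBLIC_DNS:
            hits.append((number, raw.strip()))
    return hits
```

Running it over the text of a config file returns the offending lines with their line numbers, and an empty list means neither address is set as an upstream server.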

  9. poohbear

    Dumb question

    Okay, having run a DNS server in the distant past, I fail to see why it is necessary to log all the requests. Seems like a massive waste of disk space.

    Or am I missing something? (My question is aimed at normal DNS providers, not people interested in deep analysis of your browsing habits to serve you better ads).

    1. big_D

      Re: Dumb question

      The problem is, today, that those hosting the DNS are also interested in deep analysis of browsing habits, generally speaking.

      In the US, the ISPs have just received the right to sell any and all information gathered about their users, so DNS logging and patterning would make a nice little earner, to bolster profits.

      Disk space is cheap and selling browsing habits is lucrative.

      Set up your own DNS server and cut them off at the pass.

  10. Anonymous Coward
    Anonymous Coward

    There seem to be several potential technical spoilers to this theory.

    1) Going to the DNS for every request is inefficient. Caching for some period would be assumed to be standard. That period could be assumed to be more than a few minutes.

    2) ISPs with IPv4 using NAT may have several external IP addresses on multiple load sharing boxes. The ISP users' browser connections can be multiplexed on any of those IP addresses by dynamic source port assignments. A different destination IP address will have to open a new connection. There is no guarantee it will be multiplexed onto the same external IP address as previous ones from that user - even in the same session.

    1. Charles 9

      "1) Going to the DNS for every request is inefficient. Caching for some period would be assumed to be standard. That period could be assumed to be more than a few minutes."

      Unless, of course, it's the ISP that's doing the caching. Then they can still track you.

    2. Badvok

      Yes, I think pretty much every OS you could possibly be running has a local DNS cache along with pretty much every home router.

      They must have explicitly turned off all caches to get this analysis to work.

      1. Charles 9

        No, the caches are just too short. Even Pi-hole only caches for five minutes. They're saying you can solve it by keeping your cached entry for MUCH longer: say until it no longer works.

  11. alain williams Silver badge

    www.theregister.co.uk

    I got a different address from the one in the article, but they use cloudflare - a CDN.

    More importantly: the addresses are IPv4 ones, when is El-Reg going to go IPv6? Even Virgin Media say they will support IPv6 - so why not El Reg ? I have had it at home for 6-7 years.

  12. Anonymous Coward
    Anonymous Coward

    Didn't we already know this though? How else would Google Analytics allow you to see which actual pages people had visited, with so much detail about them? And that's from the outside... wouldn't we expect to be able to get everything off the DNS from inside?

    (Although I suppose this may explain why I was met with blank looks five years ago when I asked Infrastructure to pull a less detailed version of this info off the servers for marketing purposes...)

    1. Rich 11

      Didn't we already know this though? How else would Google Analytics allow you to see which actual pages people had visited, with so much detail about them?

      GA doesn't use DNS analysis. The webmaster places a scrap of JavaScript on each of their pages they want usage figures for, so it's the client which directly tells GA what they're looking at and for how long.

      1. Anonymous Coward
        Anonymous Coward

        it's the client which directly tells GA

        ... luckily you can turn that (s)crap off by blocking GA scripts in your browser, using browser extensions that block scripts, if your browser does not do it natively, which they all should- unless they explicitly gained your conscious agreement to wholesale and ongoing activity theft.

        But with no limits on script sources and pay-load, these protections are in no way fool-proof.

  13. K

    But there is a key ingredient missing..

    The "spy" would have to know the content and links from each page to pattern match your request and identify if you visited it.

    1. Anonymous Coward
      Anonymous Coward

      Re: But there is a key ingredient missing..

      > The "spy" would have to know the content and links from each page to pattern match your request and identify if you visited it.

      The premise is that the spy doesn't care about which page you visited, all it wants is to associate your IP to you via a DNS-based fingerprinting approach.

  14. bin

    Surely switching my modem "on and then off again" will do nothing other than leave me with no connection?

    1. Anonymous Coward
      Anonymous Coward

      > Surely switching my modem "on and then off again" will do nothing other than leave me with no connection?

      Yes, but it will frustrate most tracking attempts with a high probability of success.

      1. Charles 9

        No it won't, as the article notes. They can re-establish your trail, especially if it's the ISP (who ASSIGNS your IP) doing the tracking.

        1. Anonymous Coward
          Anonymous Coward

          > No it won't

          "No it won't" what exactly? Would you terribly mind quoting whatever you might be replying to, if it's all the same to you?

          1. Charles 9

            No, changing your IP won't deter the snoops for very long.

            Quote: "However, Herrmann writes, someone with access to the infrastructure can easily watch a user's behaviour while they have one IP address, create a classifier for that user, and look for behaviour that matches that classifier when the IP address changes."
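The re-identification step in that quote can be sketched roughly as follows. This is a stand-in, not the classifier from the paper: it builds a hostname frequency profile per user and uses cosine similarity to match a session seen on a new IP address against the known profiles. All function names and the sample data are hypothetical.

```python
from collections import Counter
from math import sqrt


def profile(lookups):
    """Count how often each hostname was looked up during one session."""
    return Counter(lookups)


def cosine(p, q):
    """Cosine similarity between two hostname-frequency profiles."""
    dot = sum(p[host] * q[host] for host in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def relink(known_profiles, new_session):
    """Return the known user whose profile best matches the new session."""
    new = profile(new_session)
    return max(known_profiles, key=lambda user: cosine(known_profiles[user], new))
```

If one user habitually resolves `a.example` and `b.example` and another resolves `c.example` and `d.example`, a fresh session containing `a.example` and `b.example` is linked to the first user even though it arrives from a different address.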

  15. Anonymous Coward
    Big Brother

    DNS lookups reveal website visits

    "The fix is simple: turn your modem on and off again to get a new IP address. Or ask your ISP to assign them more often"

    How do you defeat against your own ISP recording your browsing history.

    'List of authorities allowed to access Internet connection records without a warrant'

    1. Paul Crawford Silver badge

      Re: How do you defeat against your own ISP recording your browsing history?

      very simple: use a VPN provided from another country, ideally one without odious retention policies.

      Don't use the PPTP protocol as it's pants in security, ideally use OpenVPN. Then check the VPN is doing its job by visiting one of the test sites (such as ipleak.net or check.ipredator.se etc)

      But as others have pointed out, using DD-WRT or similar on your router plus ad-blocking will go a long way for this particular attack. You can even buy routers pre-configured with DD-WRT and VPN in there so all of your home devices get privacy (not too cheap though).

      1. Charles 9

        Re: How do you defeat against your own ISP recording your browsing history?

        But can you REALLY trust those VPN providers to actually have the servers located in the countries listed AND not talk to Five Eyes on the sly?

        And some of us can't use ad-blocking domain lookups due to false positives.

        1. Paul Crawford Silver badge

          Re: How do you defeat against your own ISP recording your browsing history?

          "But can you REALLY trust those VPN providers to actually have the servers located in the countries listed AND not talk to Five Eyes on the sly?"

          In any absolute sense - no

          But the probability that they do honour the privacy guarantee is much higher than the probability of my ISP preserving my privacy.

          Also I don't really have much to fear from the "five-eyes" style of secret service spying, but I do have much to consider if I end up in some dispute with some petty local bureaucrat who can access my web history and I can't access theirs. That is the whole point - to reset that asymmetry in power that the snooper's charter provides.

  16. Anonymous Coward
    Anonymous Coward

    RaspberryPi + PiHole

    sorted.

    1. Charles 9

      Re: RaspberryPi + PiHole

      Not so sorted. PiHole only caches for five minutes.

      1. Paul Crawford Silver badge

        Re: RaspberryPi + PiHole

        Configurable, surely?

  17. RichMcc

    I use this service: - https://dns.watch/index

    No logging, no filtering etc...

    I have suspected Google DNS to be a security/privacy risk for a long time, especially if coupled with your search history.

    Also the Pi will cache, but still needs to perform lookups for sites you've not previously visited. Which would then default to your ISP or Google depending on your setup so it's still bad.

    1. Charles 9

      Still, you have to wonder how long they'll be able to stay afloat with just one sponsor and what operationally amounts to a money sink.

  18. Anonymous Coward
    Anonymous Coward

    Use DNScrypt, makes it much harder to do MITM type analysis!
