Fire and brimstone coming down from the skies! Rivers and seas boiling!
40 years of darkness! Earthquakes, volcanoes!
The dead rising from the grave!
Human sacrifice, dogs and cats living together, Apple email The Register!
Responding to concern that its Safari browser's defense against malicious websites may reveal the IP addresses of some users' devices to China-based Tencent, Apple insists that Safari doesn't reveal a different bit of information, the webpages Safari users visit. Apple may deny users in China VPN protection, it may deny Hong …
How?
The first 32 bits of a SHA-256 hash are sent to an API, along with your public IP address.
The API returns all the sites whose hashes start with that prefix. The browser then compares the list to the site you are trying to visit, and if it is malicious it raises an alarm. The service provider (Google or Tencent) does not see the full URL you are trying to reach, as it doesn't leave the machine. I'm not saying the site you go to isn't tracking you, or that the Chinese state won't be, but in this case the service isn't.
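A rough sketch of the exchange described above, for anyone who wants to see it in code. This is a simplification under stated assumptions: the function names, the in-memory blocklist and the example.com-style URLs are all made up for illustration, and the URL canonicalisation a real service performs is skipped.

```python
import hashlib

PREFIX_BYTES = 4  # 32 bits, as described in the comment above


def full_hash(url: str) -> bytes:
    """SHA-256 digest of the URL string (no canonicalisation in this sketch)."""
    return hashlib.sha256(url.encode("utf-8")).digest()


def url_prefix(url: str) -> bytes:
    """The only URL-derived value that leaves the machine: the first 32 bits."""
    return full_hash(url)[:PREFIX_BYTES]


# Hypothetical blocklist held by the provider (Google or Tencent in the thread).
PROVIDER_BLOCKLIST = {full_hash("http://malware.example/payload")}


def provider_lookup(prefix: bytes) -> set:
    """Provider side: return every listed full hash that starts with the prefix."""
    return {h for h in PROVIDER_BLOCKLIST if h.startswith(prefix)}


def is_flagged(url: str) -> bool:
    """Browser side: send the prefix, then compare the full hash locally."""
    candidates = provider_lookup(url_prefix(url))
    return full_hash(url) in candidates


print(is_flagged("http://malware.example/payload"))
print(is_flagged("https://forums.theregister.co.uk/"))
```

The point of the design is that the final comparison happens on the device, so the provider only ever learns the prefix and the address the request came from.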
You may as well say that the entire DNS system is a GDPR breach, as that sends the URL of the site you are looking for along with your IP address to allow the reply to come back to you.
Everywhere you go, your IP address is logged, stored, directed, hashed, rehashed, stamped and stored.
If this is where people draw the line then they're on for some severe disappointment. Every web page with adverts knows significantly more about you than this.
GDPR violation? Gimmie a break! Let's lock up all the advertisers on the web first (actually no, seriously, they're the scum of the Earth and do little or nothing to safeguard our privacy or safety).
The problem is that this is data that Tencent / Google wouldn't normally be receiving.
And despite only being the first quarter of a URL, they get a list of possibilities. Some basic analysis will tell them exactly which of those possible URLs a user visited by looking at the other URLs that person visited in short order.
So let's say I visit a webpage. It's going to make the call to the first resource on the page, then make several more calls in rapid succession as the browser loads all the additional objects on the page (images, scripts, ads, etc). So all someone would have to do is see what URL appears most often in each list of visited URLs to reverse engineer the originally called URL. That isn't difficult since even looking at the front page of El Reg encompasses more than 50 URLs, each with its own hash. It'd be trivial for someone with just the truncated hash to figure it out with 50 individual calls.
And since Tencent and Google are search engines, it is likely that they have compiled a list of the hashes of URLs, -and- the hashes of every URL each one references. So while the hash 'deadbeef-xxxxxxxx-xxxxxxxx-xxxxxxxx' could reference a near infinite number of URLs, if my browser then downloads a referenced image from the URL 'c0ffee24-xxxxxxxx-xxxxxxxx-xxxxxxxx', that narrows the possibilities down to a few thousand, and if the page then references an image at 'beefcafe-xxxxxxxx-xxxxxxxx-xxxxxxxx', that narrows it down even further.
Normally search engines will only see the URLs they displayed as results and possibly the URL that you selected, but nothing beyond that. The DNS system only sees the domain component of the URL. Your ISP would only see your IP and the IP of the server you are getting data from (assuming you aren't using their DNS servers).
And this is ignoring that a surveillance state could easily look for people who browse pages whose hashes collide with that of a URL that is verboten. That would narrow down who to spy on, and it could then keep tracking to see if a user visited other pages whose hashes collided with those of other verboten URLs. Enough of that and you can whittle a list down to a small handful of people against whom to implement much more intensive spying techniques.
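To make the narrowing-down argument concrete, here is a minimal sketch. It takes the comment's premise at face value, namely that a prefix is sent for every URL the browser loads and that the provider holds a crawl index of pages and their sub-resources. The crawl index, URLs and function names are invented for the example.

```python
import hashlib


def prefix(url: str) -> bytes:
    """First 32 bits of the SHA-256 digest of a URL."""
    return hashlib.sha256(url.encode("utf-8")).digest()[:4]


# Hypothetical crawl index: page URL -> URLs of the resources it pulls in.
CRAWL_INDEX = {
    "https://site-a.example/story": [
        "https://site-a.example/logo.png",
        "https://cdn.example/a.js",
    ],
    "https://site-b.example/home": [
        "https://site-b.example/banner.png",
        "https://cdn.example/a.js",
    ],
}


def fingerprint(page: str) -> set:
    """Prefixes a browser would emit while loading the page and its resources."""
    return {prefix(page)} | {prefix(r) for r in CRAWL_INDEX[page]}


def rank_candidates(observed: set) -> list:
    """Score each indexed page by how many observed prefixes it explains."""
    scores = {page: len(fingerprint(page) & observed) for page in CRAWL_INDEX}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Prefixes seen from one IP address in a short window (here: a visit to site A).
observed = fingerprint("https://site-a.example/story")
for page, score in rank_candidates(observed):
    print(score, page)
```

Each individual prefix is ambiguous, but the page that explains the most prefixes in the burst floats to the top, which is the intersection effect the comment describes.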
"What do you mean, the IP address is data Google/Tencent would not normally be receiving ? You send a data request, they have to know where to send the answer."
Yes, I am aware of how basic networking operates. What I'm saying is that the safe browsing feature is making a call to Google/Tencent that would otherwise not have been made, so that the safe browsing feature can function. The download from Google's servers is quite likely to also include such data as my user-agent string and/or advertising ID, again, information that would not otherwise be sent to Google.
I see at the very least one huge gap in your reasoning.
Who would compile that list of hashes of verboten URLs, as you put it? Even if Tencent does have a list of all the URLs in the world (mighty unlikely), somebody would need to go around tagging those the government doesn't want accessed.
Unlike for phishing websites, it's not an easily crowdsourced job since, by definition, the same government is already doing its best to block access to those sites, via radically simpler means. And, obviously, there are many more of those websites in the first place, since they're not illegal in the whole world like scams are.
So, while I do see the GDPR issue, and agree that should be fixed, I don't see a credible security issue, because there's no way that would be any better than what governments *already* use.
> Every web page with adverts knows significantly more about you than this.
Whatabouttery isn't a legitimate defense under GDPR.
That others collect more doesn't have any bearing on whether something is, or isn't, a GDPR violation. GDPR specifically lists IP addresses as PII, and Google/Tencent wouldn't normally receive these requests unless the safe-browsing functionality is implemented. Ergo, it's not _required_ in order to provide the service your browser is primarily there for: browsing.
They almost certainly need active user consent under GDPR.
Oh, and those adverts you mentioned? They generally need consent too.
> That's nonsense. To provide the safe-browsing service, your IP address IS required, and so if you choose to use that service, no further consent is required.
I think you're misunderstanding how this works. The *service* in this context is your web-browser, not the safe-browsing service.
You got a browser from (in this case) Apple, in order to browse the net.
Apple's default settings cause these requests to be made, you didn't "choose" to use it. Sure, you can opt-out, but GDPR has something to say about Opt-in vs Opt-Out too.
Those requests are, essentially, optional. Your browser still works as a browser with safe-browsing turned off. The problem is GDPR requires that you explicitly consent to it, Opt-Out isn't OK.
Your interpretation is one of the tail wagging the dog.
Of course Tencent are going to see your IP when you connect to them, but the point is that without consent you shouldn't be connecting to them in the first place.
I happen to think the net is a better/safer place *with* safe-browsing, but that doesn't mean it's automatically compatible with GDPR. This is, in fact, just one of the many conflicts that most predicted would arise from GDPR.
> I think you're misunderstanding how this works. The *service* in this context is your web-browser, not the safe-browsing service.
I really think that you are confused. If you insist that the service is your web browser, then by your logic, the browser should request your explicit consent before sending your IP address to any third party.
By its nature, a web browser sends your IP to any entity it connects to, whether that be a DNS server, a website, or a safe-browsing service.
This is opt-out - you can choose not to browse any sites, but then having the browser running is sort of pointless really.
China actively tracks people's online activities and makes decisions about what they can do in "real life" as a result.
> > I think you're misunderstanding how this works. The *service* in this context is your web-browser, not the safe-browsing service.
> I really think that you are confused. If you insist that the service is your web browser, then by your logic, the browser should request your explicit consent before sending your IP address to any third party.
By choosing to visit theregister.co.uk, I am explicitly telling my browser that is the site I wish to visit. It could be I've seen a link in the results to a specific search query, or someone has sent me a link to an article in response to requested information etc - whatever the case I explicitly choose to visit El Reg.
By my choosing to visit El Reg I give that permission quite explicitly.
However, El Reg also calls to "admedo" who I do not wish to communicate with, and it would do so automatically if I had PiHole/NoScript turned off (not sure if admedo is in PH, must try to remember to check/add it). I have NOT consented to this (not under GDPR as not in EU but beside the point).
What the stuff in this article does is worse still - it requests data from Google/Tencent about the site (or about a portion of the URL). That gives McSlurpies where I am (especially if I am using a tablet with GPS turned on, or turned OFF but, well, you know, McSlurpy thinks I actually might really want them to know after all), what time I am browsing, how long/how many sites I visit during a browsing session, and, with the collecting and analysing of requests as covered above, what site(s) I am visiting.
This is NOT something I would consent to, and as it's built in to Safari (and, I assume, Chrome) and turned on by default, they are being given information I not only did not consent to but probably (given the knowledge of the average user in these cases) did not know I was giving out and wasn't even told about (given the way Google tries to hide what they're actually doing behind reams and reams of legalspeak mixed in with tons of feelgood gobbledegook).
Yep, falls foul of NZ's privacy act and likely GDPR as that's a stronger law.
> The API returns all the sites which start with that hash
Remember this is a list of malicious URLs. Unless there are significantly more than 4 billion there will not be many, if any, dupes for each 32-bit hash prefix. You're only trackable if you visit a site on the list, but if you do, chances are the checking system can guess the URL visited.
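A back-of-envelope check of the "not many dupes" point, assuming hashes are spread uniformly over the 2^32 possible prefixes. The list sizes tried here are guesses for illustration only, not figures from Google, Tencent or the article.

```python
import math

PREFIX_BITS = 32
PREFIX_SPACE = 2 ** PREFIX_BITS  # 4,294,967,296 possible 32-bit prefixes


def p_prefix_shared(n_listed: int) -> float:
    """Chance that one listed URL shares its prefix with at least one of the
    other n_listed - 1 entries, assuming uniformly distributed hashes."""
    if n_listed <= 1:
        return 0.0
    # 1 - (1 - 1/2^32)^(n-1), computed in a numerically stable way.
    return -math.expm1((n_listed - 1) * math.log1p(-1.0 / PREFIX_SPACE))


for n in (1_000_000, 100_000_000, PREFIX_SPACE):
    print(f"{n:>13,} listed URLs: {p_prefix_shared(n):.4%} chance of a shared prefix")
```

Only when the list size approaches the size of the prefix space does a shared prefix become likely, which is why a prefix that does hit the list usually points at one specific entry.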
The danger is that there's nothing to stop the CCP asking Tencent to add a bunch of hash prefixes for sites it wants to monitor accesses to, to the list (e.g. freetibet.com), and then getting Tencent's log of IP addresses that visit those ones. Or getting the log of IP addresses versus hash prefixes requested and generating its own list for comparison.
I always switch those checks off.
I expect it does contain original, unhashed URLs (at least somewhere); I don't think I claimed it did not? However, the source database would be unnecessarily and impractically large if it hashed every site's URLs. Also, that's not needed. No one cares if you're visiting MrsMurgatroydsMacrameMushroom.com.
Bad actors just need to poison the database with the more modest number of URLs for sites that worry them. They can avoid showing their hand by flipping a bit past the first 32, so they get the check request for logging, but the browser gets back a negative hit and the user see a malware report for the URL.
*and the user won't see a malware report for the URL.
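The bit-flip idea can be shown in a few lines. This only demonstrates the hash arithmetic (same 32-bit prefix, different full hash); the URL and helper names are hypothetical, and nothing here touches a real list.

```python
import hashlib

PREFIX_BYTES = 4  # the 32-bit prefix discussed in the thread


def full_hash(url: str) -> bytes:
    return hashlib.sha256(url.encode("utf-8")).digest()


def flip_bit(digest: bytes, bit_index: int) -> bytes:
    """Return a copy of the digest with one bit inverted (bit 0 = most significant)."""
    out = bytearray(digest)
    out[bit_index // 8] ^= 0x80 >> (bit_index % 8)
    return bytes(out)


real = full_hash("https://watched.example/page")  # hypothetical URL
decoy = flip_bit(real, 32)                        # first bit after the prefix

prefix_still_matches = decoy[:PREFIX_BYTES] == real[:PREFIX_BYTES]
warning_would_fire = decoy == real

print("prefix still matches:", prefix_still_matches)
print("warning would fire:  ", warning_would_fire)
```

Because the browser only warns on a full-hash match, a listed entry like the decoy would trigger the lookup without ever producing a visible warning.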
Also consider the value of URL visit info vs. dest IP only when HTTPS in use, as described by Crazy Operations Guy further down. This list can be used to trigger a check transaction by adding URL hash prefixes to the list downloaded in advance.
"The danger is that there's nothing to stop the CCP asking Tencent"
Apart from the fact that they would get much more reliable data from China Mobile or whoever.
Tencent would get the IP address of the cellphone tower, not the individual subscriber.
> You may as well say that the entire DNS system is a GDPR breach, as that sends the URL of the site you are looking for along with your IP address to allow the reply to come back to you.
It doesn't send the URL to the DNS server; it sends the Fully Qualified Domain Name that you are accessing, i.e. the part between the scheme and the first '/' of the path.
e.g.
The URL I am using to post this reply is:
https://forums.theregister.co.uk/post/reply/3893296
The FQDN component of the URL, forums.theregister.co.uk, will be sent to DNS servers, not the context path, /post/reply/3893296.
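The split is easy to check with Python's standard `urllib.parse`, using the URL from the example above. Only the host name is what a resolver would be asked about.

```python
from urllib.parse import urlsplit

url = "https://forums.theregister.co.uk/post/reply/3893296"
parts = urlsplit(url)

dns_query_name = parts.hostname  # what a DNS lookup is made for
context_path = parts.path        # never part of the DNS query

print("sent to DNS:    ", dns_query_name)
print("not sent to DNS:", context_path)
```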
Yes and no. As long as the DNS server doesn't log your IP address, that is fine.
If the IP address is stored, for example in a log file or a database, that would contravene GDPR.
There might be a get-out for ISPs, if it is in the T&Cs that customers sign when they get broadband, but it wouldn't cover 3rd party DNS servers, such as Google, Cloudflare etc.
"EUs laws do not apply to the entire world"
Yes and no. When you are processing data belonging to EU citizens (like IP), EU data protection legislation applies.
Location of said processing is irrelevant: what matters is whose data you have.
Not quite true. I have managed sites that are not in EU jurisdiction and are not relevant to anything EU - no trading with EU people, no content of specific interest to EU people, nothing.
If someone from the EU visits, their interactions with the site get managed just like everyone else's. If they behave then their details sit in log files till rotated out. If they misbehave, then whatever info I can get is passed on to whoever can help.
The businesses aren't based in the EU, and don't trade with the EU, therefore the rules of the EU (or yanks, or ozzies, or whoever) don't apply.
And who is a member of this "Privacy Community" he is referring to? Clearly it doesn't include myself or the many privacy advocates I associate with.
I see little utility in it anyway. Malicious websites tend to disappear as quickly as they appear; by the time that Google is made aware of one, the attacker has probably already abandoned it. Besides, I protect myself in other ways: I use Privoxy to strip away scripts from websites that aren't on my trusted list, I keep my software up to date, I run as an unprivileged user on a hardened OS, my data is backed up on read-only media, I use a different device for financial management (loading money into a PayPal account, which I then use to actually pay for things), and so on.
"Clearly it doesn't include myself or the many privacy advocates I associate with."
I work in the computer security field myself, and it doesn't include me or literally any of my colleagues as far as I'm aware.
"The privacy community, he said, has mostly come to terms with the privacy trade-off"
And who is a member of this "Privacy Community" he is referring to? Clearly it doesn't include myself or the many privacy advocates I associate with.
Add myself and those I talk with to the list as well.
While I do sometimes use their search engine, and I know lots of other sites feed in stuff where they can, I block as much of Google as I can. I certainly would never trust any of their safe site systems. Nor would I trust them to honour "right to be forgotten" even if the smallest breach came with a torturous sentence.
You only make a connection to Tencent if you are running with the region code set to China, otherwise it goes to Google. People inside China are likely having their ISP collect their DNS lookups, all IPs they visit, and more so there isn't even a reason for Tencent to forward this information to the Chinese government. They already get far more from the ISPs, what little they get from Tencent is duplicate information.
Apple could improve privacy a bit by having its servers act as an intermediary so no direct communication happens with either Google or Tencent. Of course THAT would also outrage the people who are fake-outraged at getting it directly from Google and Tencent, and they'd probably also be outraged at Apple if safe browsing was by default disabled. There really isn't a way to implement this without getting a list of sites from someone, unless they want to do it themselves (and probably get criticized that the list isn't as comprehensive or timely as Google's...)
If you don't like it, you can turn off safe browsing. Not sure how much the "safe browsing" stuff any browser does really helps given how many ways there are for sites to dodge this - i.e. have a main site that forwards you to the malicious site and when it gets onto the malicious list automatically rotate to a different site that you get forwarded to.
But the URL reveals more than just the domain the person visited. If the user is using HTTPS, the only way to figure out what page they visited on a server would be to get the server's logs, or the user's search history. As an example, if I were to browse a page on https://www.theregister.co.uk, all the ISP, DNS provider, and anyone spying would see is me communicating with that server, but they would be unable to see the actual page I requested. But if they had the truncated hash of the URL, they could compare the list of returned possible URLs and find the one that includes the domain "theregister.co.uk", which would come back with a much smaller list, assuming there is more than one at all.
The likelihood of two arbitrary URLs hosted on the same server having the same truncated hash is very low.
You're missing a critical piece. There is a list of hashed prefixes downloaded from Google (and/or Tencent if you are using the China region) which Safari compares to the hashed URL you're trying to visit. That long list is presumably refreshed every once in a while (daily, hourly, not really sure...) to keep it up to date. Only if the hash matches does Safari query the server to get the hashed list of full URLs that match the prefix hash, which Safari then compares and if found will trigger the warning. The server doesn't know if you ended up matching one of those or not, just that you matched the prefix hash.
Presumably such a "hit" where you match the hash isn't going to happen too often. How often depends on the number of bits in the hash and the number of prefixes on the list, but if it is only 16 bits then each listed prefix would match only 1 in every 65,536 URLs you visit! While it is possible a visit to some URL beginning with www.theregister.co.uk will match a hash, there is absolutely no way for them to know that's the site you visited when the hash matches. All they will know is that you matched a hash to one of the dodgy sites with the same hash, which a nearly infinite number of other possible URLs would also match.
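The two-stage flow described above can be sketched like this. It is an illustration under assumptions, not Safari's or Google's actual code: the `Provider` class, its method names and the URLs are stand-ins, and the `queries` list exists only to show when a request would leave the machine.

```python
import hashlib

PREFIX_BYTES = 4  # assumed prefix length for this sketch


def full_hash(url: str) -> bytes:
    return hashlib.sha256(url.encode("utf-8")).digest()


class Provider:
    """Hypothetical stand-in for the list provider's server."""

    def __init__(self, bad_urls):
        self._full_hashes = {full_hash(u) for u in bad_urls}
        self.queries = []  # every prefix the server actually gets to see

    def prefix_list(self) -> set:
        """Stage 1: the list of prefixes the browser downloads in advance."""
        return {h[:PREFIX_BYTES] for h in self._full_hashes}

    def full_hashes_for(self, prefix: bytes) -> set:
        """Stage 2: only called after a local prefix hit."""
        self.queries.append(prefix)
        return {h for h in self._full_hashes if h[:PREFIX_BYTES] == prefix}


def check(url: str, local_prefixes: set, provider: Provider) -> bool:
    digest = full_hash(url)
    if digest[:PREFIX_BYTES] not in local_prefixes:
        return False  # no local hit, so no request is made at all
    return digest in provider.full_hashes_for(digest[:PREFIX_BYTES])


provider = Provider(["http://malware.example/payload"])
local_prefixes = provider.prefix_list()

print(check("https://www.theregister.co.uk/", local_prefixes, provider))
print("server queries so far:", len(provider.queries))
print(check("http://malware.example/payload", local_prefixes, provider))
print("server queries so far:", len(provider.queries))
```

In this model the server learns nothing about ordinary browsing; it only sees a prefix (and the caller's address) when the local list already matched.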
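For the "how often" question: the 1-in-65,536 figure is the chance of matching a single 16-bit prefix. With more prefixes on the list the chance scales up accordingly. A small sketch, assuming uniformly distributed hashes and distinct prefixes; the list sizes are hypothetical.

```python
def hit_probability(n_prefixes: int, prefix_bits: int) -> float:
    """Chance that a random URL's hash prefix appears on a list of
    n_prefixes distinct prefixes of the given bit length."""
    return min(1.0, n_prefixes / 2 ** prefix_bits)


for bits, n in ((16, 1), (16, 10_000), (32, 1), (32, 1_000_000)):
    p = hit_probability(n, bits)
    print(f"{bits}-bit prefixes, {n:>9,} listed: {p:.6%} of URLs trigger a lookup")
```

So the prefix length and the list size together decide how often a lookup request leaves the machine.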
Yes and no. Sending the information to Google is also not allowed under GDPR in Europe without an explicit opt-in from the user. If this is a default setting and it wasn't explicitly stated, when activating, that the data will be sent to Google, it would contravene GDPR.
If Apple acted as an intermediary, that would solve the problem as they are not passing on PII to a third party without explicit permission.
Surely the GDPR doesn't extend to hashed information. How's that personally identifying?
If it is "they get your IP address when you send that hashed information" then you'd violate the GDPR by connecting to a web site without a notice (somehow) prior to connection that your IP address would be shared.
No, the hashed information, on its own, is fine. The problem arises when it is combined with PII (in this case the originating IP address). That is why Apple, acting as an intermediary and just passing on the hash and not the user's IP address, would be in compliance.
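A toy model of the intermediary idea, just to pin down what "passing on the hash and not the IP address" means. The types and the `relay` function are invented for this sketch and do not describe any real Apple service; the address is from a documentation-only range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientRequest:
    client_ip: str      # known to the intermediary, since the device connects to it
    hash_prefix: bytes  # the truncated URL hash


@dataclass(frozen=True)
class UpstreamRequest:
    hash_prefix: bytes  # the only field forwarded to the list provider


def relay(request: ClientRequest) -> UpstreamRequest:
    """Forward the prefix and drop the client's address."""
    return UpstreamRequest(hash_prefix=request.hash_prefix)


incoming = ClientRequest(client_ip="203.0.113.7",
                         hash_prefix=bytes.fromhex("deadbeef"))
outgoing = relay(incoming)
print(outgoing)
```

The provider would then see requests arriving from the intermediary's address only, which shifts the trust question onto whoever runs the relay.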
The IP address is required. That is why the site the user visits can have it. They just can't store it or pass it on to third parties without the express permission of the user. That is why the whole web advertising model is currently up in the air.
I don't know if there has ever been an opinion poll of the public asking how many people know what an IP or MAC address is. In the past decade I've tended to try in vain to train elderly folk who repeatedly ask, "What is a browser? What is a folder? What is a cookie?"
I used to teach youngsters who'd all instantly assume they were hackers, that was worse.
I'm sure most if not all of us here are aware of what we give legal consent to online, but it feels akin to sexploitation of 98% of the population by our industry.
As the article notes, the Chinese government already controls ISPs, cell networks, etc. and presumably has access to all of their user-activity logs with much higher granularity. That means they are also positioned to MITM users' https connections. Given that threat environment, I'm not clear on why Tencent getting some hashes and IP addresses is such a BFD.
The Taiwan thing is not unique to Apple. A lot of major companies that do business in China are very circumspect in how they refer to Taiwan. At one Fortune 100 company where I worked as a tech writer, we were specifically forbidden to refer to Taiwan as a country under any circumstances, even indirectly or by implication. We were required to use "Country/Region" or "Countries/Regions" for any description or list that included Taiwan.
Word from on high was that any deviation from that (even unintentionally) could cause major problems for the company with the Chinese government, and immediate defenestration for the person who created the doc.
There are now so many intermediaries between my computer and the domains I try to contact that I never know whether I'm going to connect, either to web sites or by email.
I wish to exercise my own judgement about the validity of sites and content and I want to ensure my emails get through as my business depends on them, but the self-appointed nannies assume I'm not smart enough to make these decisions (or at least not as smart as them). Oh for the early days of the web when we still had control ourselves.
The corporations will probably respond that most people aren't capable of making such decisions. I answer - educate them then, rather than progressively concealing and obfuscating the information that would enable them to gain the requisite knowledge. But of course the last thing vendors want is an informed and questioning public.