ping
Are we aware of legitimate uses for "ping"?
Or, in other words, would anything not-evil break if we simply filter it out of all HTML behind, for example, mod_proxy_html in error-correcting mode?
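For anyone wanting to try the filtering idea outside Apache, here is a rough sketch in Python rather than mod_proxy_html. The class and function names (`PingStripper`, `strip_ping`) are made up for illustration. It re-serialises the markup through the standard-library parser, so it may normalise quoting and does not handle processing instructions; treat it as a toy, not a drop-in proxy filter.

```python
from html import escape
from html.parser import HTMLParser


class PingStripper(HTMLParser):
    """Re-emit HTML while dropping the ping attribute from <a> and <area>."""

    def __init__(self):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _emit_tag(self, tag, attrs, closing=""):
        parts = [tag]
        for name, value in attrs:
            if tag in ("a", "area") and name == "ping":
                continue  # the only thing this filter removes
            if value is None:
                parts.append(name)
            else:
                # The parser hands us unescaped values, so escape them again.
                parts.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + closing + ">")

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, " /")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_ping(html_text):
    """Return html_text with ping attributes removed from links."""
    parser = PingStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

The link itself is left alone, so navigation should be unaffected; only the extra notification list disappears.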
Apple recently released Safari 12.1 for iOS 12.2 and macOS 10.14.4, bringing with it both privacy improvements and an unexpected regression. On Friday, Jonathan Davis, web technologies evangelist for Apple, highlighted a handful of feature additions, the most significant being version 2.1 of the company's Intelligent Tracking …
It seems Mozilla must have restored support for Ping at some point because this discussion in 2014 was about turning it on by default - "https://bugzilla.mozilla.org/show_bug.cgi?id=951104".
[browser.send_pings] is currently defaulted to false, but the discussion shows a desire to change that.
The lesson for me is - don't just trust in what someone did 11 years ago: check what they did yesterday.
I'm not surprised someone lobbied to have it enabled. Later someone asked instead to have sendBeacon() - a close relative of ping, and more powerful - disabled by default:
https://bugzilla.mozilla.org/show_bug.cgi?id=1454252
But with browsers increasingly the offspring of pure ads operations, and with these features probably buried deep in the core, only proper and enforceable legislation could stop the tracking - no technical spec can achieve it, as it can be circumvented.
That's why CEOs now pretend to be "privacy conscious" - what they don't want is any law imposing privacy in a way they can't easily bypass.
I use Linux Firefox on my PC and checked, and send_pings is at the default false, but there isn't any way to know if its absence causes problems without someone telling us what a legitimate use of it might be. Not all sites work perfectly in every browser; if something works differently between Firefox on Linux vs Windows, or vs Chrome, how would I know whether it is because of ping or something else?
If it doesn't do anything useful then Apple ought to default it to off, and if anyone complains they can add the preference back for the macOS version (AFAIK there isn't any way to set preferences like that on the iOS version, other than maybe with a jailbreak).
I'd prefer Apple to err on the side of caution for EVERYTHING that can be used for tracking. There will still be ways to track, I'm not under the illusion that Safari can give me perfect privacy. But I want those bastards trying to sell stuff to sweat bullets and have to waste a lot of effort finding workarounds (and then for Apple to plug those workarounds and repeat the cycle!)
I believe "non-tracking" uses are quite limited - I could think of using it in some test situations for easier telemetry or the like, or for external auditing of some kinds of access, but it's quite evident it was designed for tracking. Ironically, it was thought of as a "privacy feature" because it would have made the tracking transparent, and it was supposed to come with an option to disable it. Just like any mechanism that can be easily circumvented, it is quite useless.
Reading the spec, the request "may be done in parallel with the primary fetch, and is independent of the result of that fetch [...] User agents must ignore any entity bodies returned in the responses. User agents may close the connection prematurely once they start receiving a response body."
So nothing should break if the user agent disables the feature "User agents should allow the user to alter this behavior. For example, in conjunction with a setting that disables the sending of HTTP referrer (sic) headers. Based on the user’s preferences, user agents may ignore the ping attribute completely, or selectively ignore URLs (for example third party URLs)"
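To make the quoted latitude concrete, here is a small sketch of the decision a user agent could make before sending anything. The function name and the two preference flags (`send_pings`, `same_host_only`) are illustrative, not any browser's real settings, and comparing hostnames is only a crude stand-in for a proper "third party" test.

```python
from urllib.parse import urljoin, urlsplit


def pings_to_send(ping_attr, document_url, send_pings=True, same_host_only=False):
    """Return the ping URLs a user agent would contact under the given preferences."""
    if not send_pings:
        return []  # user has opted out: ignore the attribute completely
    doc_host = urlsplit(document_url).hostname
    targets = []
    # The attribute holds a whitespace-separated list of URLs.
    for token in ping_attr.split():
        url = urljoin(document_url, token)  # resolve relative URLs
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https"):
            continue
        if same_host_only and parts.hostname != doc_host:
            continue  # selectively ignore URLs on other hosts
        targets.append(url)
    return targets


attr = "/log https://ads.example/p"
page = "https://shop.example/page"
print(pings_to_send(attr, page, same_host_only=True))
# ['https://shop.example/log']
```

Whatever the function returns, the link navigation itself goes ahead unchanged, which is why a conforming page cannot depend on the pings arriving.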
So anything that relies on it to work is breaking the specification - but of course entering a marketer's mind is always a dangerous undertaking because of all the stinky garbage you'll encounter, so it's impossible to rule out that someone did something stupid.
I think Apple found it had left a privacy option available, albeit in a hidden way, to the peons, and removed it - it's probably now available only in internal builds for the anointed ones...
"but it's quite evident it was designed for tracking"
There is a difference between tracking traffic patterns, and tracking individuals. Every time I visit a page on a website, the server logs the event and webmasters use it to track page popularity, pathways through sites, etc. Do I have a problem with that? No. Tracking isn't inherently evil or a privacy violation. Many times when I go into a shop, an electric eye at the door counts me in. That's not invading my privacy. I'm just an anonymous statistic. It only becomes creepy once they start using recognition systems, whether it's a facial-recognition camera on the door, or a cookie on my PC, and start profiling me.
When I buy a product in the shop, the shop likes to know what people are buying so that it can get a better stock selection next time, but it doesn't need to track people individually to do that. Likewise, if I am offering a selection of links to external resources, it is good to know which ones people collectively find most useful so I can improve the website content.
So yes, ping might well be designed for tracking, but tracking is not synonymous with privacy invasion.
But this "ping" feature allows for third party tracking of a user clicking a link. For each click it can contact n other servers which may be outside the actual domain. And because links can be custom generated, it allows for individual tracking as well. And it may happen without user knowledge.
Initially I was shocked to hear about this "ping" URL, and was about to go into all "Daily Mail" mode...
Though thinking about it, as long as the 'ping' ability doesn't have a way of setting local data (cookies etc.), then they could gather the exact same data without ping.
Any website subscribed to some service that used it would instead only have to forward its access log for the real URL on to the advertising service.
Sure, it would require a bit more collating and organising of data server-side, but it wouldn't be much.
What am I missing? Don't get me wrong, they should get rid of it altogether - it should not be up to the clients to aid tracking of any sort... But without it, what more data could they actually retrieve?
"Are we aware of legitimate uses for ping?"
One instance springs to mind where I'd welcome outbound link tracking. Sometimes I have found something on a website and need to click to pay, but they use an external payment service, like Paypal, or Sage. There have been times when I've clicked on the link, fully intending to make the purchase, or to register for a conference, or to sign up to a mailing list, but the external service has failed for some reason, or I've reached the external service and thought it looked suspicious, or whatever. If the site owners are not tracking outbound links in any way, they will never know the difference between people who were only window shopping, and people who wanted to buy but couldn't because of the third party service letting them down. Again, it needn't involve any privacy violations to see this sort of metric.
I wouldn't say that.
Where something like ping might have a legitimate role is in some application none of us (reg commentards here) has thought of. Such things may be out there, without us ever having encountered them.
We can draw analogies from the past. For example, when Sun first came out with the (Java) applet in the mid-1990s, they had some silly/pointless demos ("duke" or something?). Lots of third-parties also used it to produce toys and eye-candy and probably cat videos, and it was widely dismissed as fluff. Meanwhile some of us were producing serious applications for the real world: in my case, providing interactive access to explore satellite image datasets, including some quite sophisticated GIS and visualisation tools.
But obviously that's a specialist area, and most web users would never encounter it.
Hence my question above. I can contrive legitimate use cases for this ping. What I struggle to see is a use case that isn't at least open to abuse and more-or-less sure to be used abusively if released into the wild. But I wouldn't dismiss it just because my own imagination ain't what it used to be.
"Where something like ping might have a legitimate role is in some application..."
Consider a website which is a directory of suppliers, and you want to know how many of your visitors are following the links from your website to the suppliers' websites. This isn't at all concerned with tracking the individuals but instead tracking the total amount of traffic sent to each supplier. All you are trying to record is that someone, and it doesn't matter who, has clicked on an outgoing link. Lots of people do something similar, though I accept that many of them very much want to link that to the individual user.
The ping option would be ideal for this if it was a standard feature of HTML. It isn't, so people work around it either by sending to an intermediate url which logs the hit and then refreshes to the external site, or they use javascript to add onClick options to the links, both of which are a bit clunkier than a simple ping option on the links but achieve much the same thing. All of them can be a privacy violation mechanism, just like cookies can be (and often are) used for privacy violation, so it seems strange to single out ping and turn a blind eye to other technologies which can do the same thing.
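For the directory case, the aggregate-only counting could be as small as the sketch below, whichever mechanism (ping, redirect or click handler) delivers the hit. The function name is made up, and the choice to keep only the destination host is my assumption about what "doesn't matter who" implies: nothing about the visitor is stored, and any query string on the link is discarded.

```python
from collections import Counter
from urllib.parse import urlsplit


def record_outbound_click(counts, target_url):
    """Count a click against the destination host only; store nothing else."""
    host = urlsplit(target_url).hostname
    if host is None:
        return  # not an absolute URL, nothing sensible to count
    counts[host] += 1


clicks = Counter()
record_outbound_click(clicks, "https://supplier-a.example/products?ref=abc123")
record_outbound_click(clicks, "https://supplier-a.example/contact")
record_outbound_click(clicks, "https://supplier-b.example/")
print(clicks.most_common())
# [('supplier-a.example', 2), ('supplier-b.example', 1)]
```

The same handler could just as easily log the visitor's cookie alongside each hit, which is the point being made: the mechanism is neutral and the storage decision is where the privacy question lives.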
I can also imagine that some sort of outbound click tracking would be useful for better understanding how people are using the website and improving the usability, to help you understand if people are visiting the page to read the info, or visiting the page to find the link to a source document. You can track the path of people through the site without needing to know anything at all about them as individuals, just like road traffic managers map the paths of cars through junctions, to better understand the popular routes and bottlenecks, etc, without needing to know the identity of the individual cars, just the total numbers.
And again, people who say "you don't need ping for that, you can do all that and more with javascript" are missing the point. The ping mechanism isn't the privacy problem, it's the people who think tracking is acceptable who are the problem. In fact, the ping mechanism is a lot more open and up front than some of the javascript solutions which need a lot of code analysis to work out what is going on.
Maybe the ping option could be restricted to urls on the parent website, i.e. no pinging external websites when someone clicks on a link, and maybe it could also not pass cookies to the pinged url, as a way of reducing tracking, but let's face it, that doesn't stop people adding the session id to the url as a parameter etc. Whatever we do, the problem remains that we keep trying to come up with technical solutions to privacy, people with no regard to privacy find workarounds, and big companies reward them for finding loopholes in these pesky privacy limits.
"Singling out ping doesn't mean turning a blind eye to the rest. As they're all implemented separately they have to be challenged separately."
Fair point, but on the other hand, no-one is challenging the use of onClick to do the same thing. When you look at the links on a page of Google search results, they all have click events attached to them, and that's the way Google tracks our interests to profile us, tracks which adverts we click on etc. Whilst people complain about the privacy intrusion, I don't recall them complaining about the mechanism.
"...potentially break the connection between ads and customer conversions driven by those ads if the conversion event (e.g. purchase) ..."
People actually buy stuff from ads?
If I see an ad for something I want, I check all the usual sellers for the best deal. I have NEVER followed an ad link.
If targeting worked they would know that all ads should be totally null.
Whenever I drop by and check my Google profile it only shows one interest area, computer hardware which happens to be completely accurate. And, on occasion I do click on such an ad. Almost always seen on this site so the odd fraction of a cent in the direction of El Reg, sometimes also Ars Technica.
Now when on my tablet, well Amazon shows a complete mess but I've so thoroughly confused its recommendation engine, it's no surprise the ads on sites are just as hopeless.
Targeting can work with a good dataset. On me, Google has a good dataset. They should, we've known each other since the beginning and, no, I don't have a problem with that. Of course..., there's a ton of stuff I do that they don't have data on. That requires real work but isn't too hard. My setup for that hasn't had to change much over the years, though.
Side note but back on topic. As soon as I caught the ping issue, Firefox went back into the mix of browsers here since the Chromium based ones all have the same issue. The UI still pisses me off, but....
My Google profile is satisfyingly empty but the ads I see are an interesting mix. A fair bit of electronic parts, which is relevant. They spent a long time trying to sell me gene sequencing machines. Alongside ads for yarn and knitting needs (also relevant). What I really want to know, though, is why they think if they just keep trying, eventually I will cave and decide that expensive underpants aren't ridiculous and buy a whole whack of 'em.
I learned that Apple is sending more than 5MB of data to Google every day as Google is the default search engine in iOS, it's creepy! See https://e.foundation/privacy-facts/ . This is not what privacy conscious people want, and not what I want! We need a more ethical mobile OS like what the eFoundation offers. I've been testing the beta version of /e/ (e.foundation) on my Samsung Galaxy S9 and it works great. Go and check it out on their website: https://e.foundation
"Specifically, the latest Safari caps the expiration of persistent cookies created client-side via the Document.cookie property to seven days, because lengthy cookie lifetimes lead to less privacy."
Does this actually make any difference? Surely it's trivial to just create a new cookie for every session that copies all the information from your old cookie. As long as they visit you at least once a week, which will probably be the case most of the time, it doesn't matter if older cookies expire, you still get unbroken tracking. The only way to avoid tracking is to remove cookies after every session, otherwise there will always be persistent data that can be linked between sessions.
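A quick way to check that reasoning is to model it. The sketch below assumes the site's script simply re-sets the identifier with a fresh expiry on every visit, and that a cookie counts as gone once the cap has fully elapsed; the function name and the day-number model are mine, not anything from Safari.

```python
def identifier_survives(visit_days, cap_days=7):
    """Return True if a re-set-on-every-visit cookie lasts across all visits.

    visit_days is an ascending list of day numbers on which the user visits.
    """
    expires = None
    for day in visit_days:
        if expires is not None and day >= expires:
            return False  # cookie had already expired before this visit
        expires = day + cap_days  # script writes the cookie again
    return True


print(identifier_survives([0, 5, 10, 16]))  # True: no gap reaches 7 days
print(identifier_survives([0, 5, 13]))      # False: 8-day gap breaks the chain
```

Under those assumptions the cap only bites on visitors who stay away for a week or more, which matches the point made above.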
I think it would make a difference if they used that logic for all cookies, or if there were an option inside the browser to manage cookies better. Just limiting it to javascript-created cookies isn't really a limit. Like you said, it is trivially easy to get around. So many sites create cookies with lifespans of two years or more, sometimes much more, and that's seriously creepy. For instance, the Register has set a cookie which will not expire until 2028. That is as good as forever and will outlive the life of the PC. If the browser would cap ALL cookies at a lifespan of, say, one month, that cuts out long-term gratuitous tracking. Sites which I genuinely want to be remembered at, such as, say, the bank, or my utility company, use a log-in system to recognise me, and don't depend on a cookie to keep track of me.