"Version numbers, platform details, model information, etc. [..] with every request"
And they were doing that when we were all on 56K dial-up.
Bastards.
Google's Chrome team has delayed its User-Agent Client Hints (UA-CH) makeover until at least 2021 due to the impact of the COVID-19 coronavirus on the web development ecosystem. "While work on UA-CH continues, we don’t currently know in what ways or for how long COVID-19 will impact the web ecosystem’s ability to test and …
I've often wondered if there shouldn't have been a HTTP-header-compression scheme that tokenized things like "Mozilla/5.0" since pretty much everything says that since... for-ev-ah... and we all know it's a lie! :D
Which is not to say that it should be done. The problem is with web sources that use something other than HTML+CSS and don't provide a fallback if that something is not available at the receiving end, not with the browser.
Page provider's argument: I can do all these exciting things if I know more about your browser.
My argument: you don't also need to know the colour of my socks. Code to the minimum required *by the customer, not your marketing department* and add bells and whistles optionally.
Ever since researchers produced a whitepaper on how to fingerprint a browser using fonts, HTML5, vector drawings, PDFs, hardware etc., it's been darn near impossible to block fingerprinting.
https://arstechnica.com/information-technology/2017/02/now-sites-can-fingerprint-you-online-even-when-you-use-multiple-browsers/
Case in point: I received a text message that contained a shortened URL link that was (allegedly) from the data provider of my cellphone.
I ran the shortened URL through urlscan[.]io and found an interesting bit of hex delivered by Akamai that used every single fingerprinting technique listed by the researchers in the Ars article above.
(Pastebin of obfuscated browser fingerprint code)
https://pastebin.com/2tW06app
Google knows. Since it controls most of the browser market today, it really doesn't need it anymore, so it can start to starve competitors by reducing the number of data points they can get.
Not that this is a bad idea, IMHO instead of user agent names they should announce web standard compliance (and web standards should ease it with clear identifiers) - just it should not be Google to decide how it works - otherwise it's just the new IE.
> Not that this is a bad idea, IMHO instead of user agent names they should announce web standard compliance (and web standards should ease it with clear identifiers) - just it should not be Google to decide how it works - otherwise it's just the new IE.
Came here to say much the same thing but wondering if web standards compliance is enough? Perhaps there should be something about whether the output display is touch enabled or not? And screen dimensions + pixel density perhaps?
It should definitely include a disability string to indicate things such as screen reader in use; contrast enhancement in use; hearing impairment, so go easy on the sound effects etc.
Quote: "Not that this is a bad idea, IMHO instead of user agent names they should announce web standard compliance (and web standards should ease it with clear identifiers) - just it should not be Google to decide how it works - otherwise it's just the new IE."
Seems the minimum with UA-CH is "brand (i.e. browser)"; v="significant version"
e.g.
Sec-CH-UA: "Browser"; v="73"
Everything else, full browser version number, platform, platform version, architecture etc. is optional, and has to be specifically asked for by the server, and the client chooses whether to provide those details or not.
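To make that exchange concrete, here is a minimal Python sketch of the server side, under some assumptions: the function names are mine, the regex is a simplified stand-in for a real Structured Field parser (it ignores escaped quotes), and the hint names passed to Accept-CH follow the later "Sec-CH-UA-*" naming, which may differ from the draft revision being discussed here.

```python
import re

# Simplified pattern for one list member such as "Browser"; v="73".
# Assumption: no escaped quotes inside the brand or version strings.
_BRAND = re.compile(r'"([^"]*)"\s*;\s*v="([^"]*)"')


def parse_sec_ch_ua(value):
    """Return a list of (brand, significant_version) pairs from a Sec-CH-UA value."""
    return _BRAND.findall(value)


def accept_ch(*hints):
    """Build the Accept-CH response header a server sends to ask for extra hints."""
    return "Accept-CH: " + ", ".join(hints)


if __name__ == "__main__":
    print(parse_sec_ch_ua('"Browser"; v="73"'))
    print(accept_ch("Sec-CH-UA-Platform", "Sec-CH-UA-Arch"))
```

The brand and version come back as plain strings on purpose, since the spec says servers must accept arbitrary values for them.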
I also noticed this in the spec: Quote:
"User agents SHOULD keep these strings short and to the point, but servers MUST accept arbitrary values for each, as they are all values constructed at the user agent's whim."
One thing I did notice that seems to be specifically missing is anything related to what the browser capabilities are, e.g. HTML version etc.
Which seems an odd choice to me, as won't that mean servers will need to keep track of browsers and version numbers, in order to know what standards they can utilise?
The draft spec isn't all that long, and can be found here: https://wicg.github.io/ua-client-hints/
Looking at https://github.com/WICG/ua-client-hints
Snip: "For that use case to work, the server needs to be aware of the browser and its meaningful version, and map that to a list of available features. That enables it to know which polyfill or code variant to serve."
So seems they do expect all servers to have a list of browser versions and capabilities.
From the 'Browser bug workaround' section below that, seems they expect web servers to already be doing this anyway, in order to work around existing browser bugs.
Exactly. Some data defined there are just silly and useless.
I would expect a browser to declare what it supports, not what browser it is. And I do expect that if you declare you support say "HTML 7.5" you fully support that standard, or you can't declare it - it should be mandatory to declare the most recent standard you fully support.
Why should web applications take into account brand, device models, etc.? An application has to understand what features are supported, and not by having a database of brands and models. Of course device information like screen size, etc. can be useful.
Even any OS specific "translation" should be done by the browser.
But this is exactly why we can't let Google decide how the web should work - its interests are often not aligned with those of developers and users.
I would expect a browser to declare what it supports, not what browser it is.
Take a look at my post on Feature Detection. This is already possible - but it only works client-side (with a library like Modernizr). There's no way a browser is going to send a list of capabilities with every single request. That means detecting features on the server-side is hard. Which is why the user agent string has been used as a means to send a browser identifier to the server, and then the server can generate a response it thinks is appropriate. The problem with this is that maintaining a list of browser features on a server is far from ideal. It's prone to problems if browsers get updated and the corresponding code that generates the server side response isn't updated in line with it.
@LDS
I agree. The current proposition seems to mean that someone, somewhere, i.e. the web developer, the server devs, or more likely the browser devs, needs to maintain a list of their browser versions and the capabilities at each version, in a format that can be imported into and parsed by a web server, and this list will need to be maintained.
What are the rules if the web server doesn't recognise the browser? Such as one of the smaller players, with say a high security browser? Do you drop back to a basic web page, with no extras? If that's the case, then as this 'brand' field (i.e. the browser) is free text, you'll end up with smaller browsers spoofing as 'Chrome' or something else, in order to get the 'real' page.
There may well be valid reasons for asking for the browser name and version at times, but to me those should be in the optional section, with only the basic browser capabilities being in the mandatory section, (i.e. HTML version supported), with other capability checks being optional.
There may well be valid reasons for asking for the browser name and version at times, but to me those should be in the optional section, with only the basic browser capabilities being in the mandatory section, (i.e. HTML version supported), with other capability checks being optional.
Yes, you've just stated what the reason is in your first paragraph. If the list of browser capabilities is stored on a web server, how would the server identify which browser the user had without an identifier being passed? That's what the user agent string does.
Your comment about "HTML version supported" suggests you think this is a case of "yes" or "no" to a question like "does the browser support HTML 5?". What you actually need to know is that these specifications encompass lots of different features, and it's whether a browser supports individual features that you often need to know. If you take a look at https://caniuse.com/ you'll understand this more. Some browsers on there technically support CSS 3, but if it comes to using a particular feature you'll see that there is huge variation in browser support.
If you read my post on Feature Detection you'll see that this is entirely possible to do already using a client-side library like Modernizr for example. The issue is that it cannot be done (easily) server-side. The current method relies on maintaining a list of capabilities on a server - as you've suggested in paragraph 1. But you of course have to pass something from the client to the server to match the two things. That's the user agent string. The reason it's been done like this is because passing a relatively tiny User-agent string within the request data is much more favourable than passing a huge object - on every single request - of all the features a browser has.
Which seems an odd choice to me, as won't that mean servers will need to keep track of browsers and version numbers, in order to know what standards they can utilise?
That's pretty much what they do now, isn't it? My user agent string looks like "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/XXX.XX (KHTML, like Gecko) Chrome/80.0.XXXX.XXX Safari/XXX.XX" good luck figuring out what standards that means...
good luck figuring out what standards that means...
Well you say that but it's actually pretty trivial. It's possible to parse that string for known browser names (e.g. "Chrome", "Safari" etc) and then extract the version by looking at what appears after /
In your case you're using Chrome v80.
Let's say I want to write some code that relies on knowing whether your browser supports the Battery Status API.
Yes, it does: https://caniuse.com/#feat=battery-status. Chrome v80 fully supports it.
It's not very difficult to build a table of that information, parse the User Agent string, and look up whatever features I want to check your browser supports. Indeed, this is happening now for a lot of websites you use whether you're aware of it or not.
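A rough Python sketch of that parse-and-look-up approach, with everything in it an assumption for illustration: the function names are mine, the browser list is deliberately tiny, and the capability table holds only the one Chrome 80 / Battery Status claim made above rather than real compatibility data.

```python
import re

# Browser tokens to look for, checked in order. Assumption: a short list
# is enough for illustration; real tables are far longer.
KNOWN_BROWSERS = ["Chrome", "Firefox"]

# Hypothetical capability table keyed by (browser, major version).
# The single entry mirrors the Chrome 80 / Battery Status example above.
FEATURES = {("Chrome", 80): {"battery-status"}}


def parse_user_agent(ua):
    """Return (browser, major_version), or None if no known token is found."""
    for name in KNOWN_BROWSERS:
        match = re.search(re.escape(name) + r"/(\d+)", ua)
        if match:
            return name, int(match.group(1))
    return None


def supports(ua, feature):
    """True/False for browsers in the table, None when the browser is unknown."""
    browser = parse_user_agent(ua)
    if browser is None or browser not in FEATURES:
        return None
    return feature in FEATURES[browser]


if __name__ == "__main__":
    ua = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/XXX.XX "
          "(KHTML, like Gecko) Chrome/80.0.XXXX.XXX Safari/XXX.XX")
    print(parse_user_agent(ua), supports(ua, "battery-status"))
```

The None result is the awkward case: an unrecognised browser has to be given some default. The naive token search also has known traps, since Chromium-based browsers such as Edge and Opera carry a "Chrome/" token too, and Safari puts its real version after "Version/".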
"Standards compliance" is a set of moving goal posts. Feature detection has been the preferred approach since Google started using the UA string to tell IE users that they were going to lose support for YouTube. But the downside of feature detection is that it is granular enough to be usable for fingerprinting.
"If this change is implemented, then advertisers will no longer be able to verify their adverts were served to humans when displayed in this manner by publishers," wrote James Rosewell, CEO of mobile detection biz 51Degrees last week in a GitHub issues post for the UA-CH spec. "Advertisers will direct their advertising spend directly to publishers and platforms that can provide that verification."
They will if they use Adwords :D
Libraries like Modernizr (https://modernizr.com/) already have a halfway solution to this. They work by testing what features a browser has, and then a developer can provide a fallback if it doesn't have a feature they checked for. For example you can ask the browser if it supports geolocation, and the code for doing this is the same for every browser.
Coding - (if browser == 'Internet Explorer 7') is a lot less desirable than coding (if browser.supports('geolocation')). Because if IE 7 receives an update and geolocation (hypothetical example) capabilities get added then that first code is broken.
The issue is Modernizr is a client-side library and relies on JS. That's why I call it a "halfway" solution because it doesn't work if you want to detect features server-side and then have the server render a particular response.
But that idea has been a pipe dream for years. Nobody is going to make browsers that send a list of supported features with every request. Essentially the sending of a user agent string means you could have that list of browser features on your server and then use the UA string as an identifier to look up the capabilities. It's not ideal, but to some extent it does work. And it's been done in so many applications that the legacy of "un-doing" this will be a nightmare in itself.
Alternate suggestion:
Some reputable site(TM) like Can I use, or even the W3C, defines a (yikes, horror!) BITMAP with one bit for each relevant feature of the different standards.
Then your new browser CrystalBall sends "HTML3.14 CSS2 CSS4=#ffff7ffffffffffffff0", meaning that it can do everything in HTML3.14, all of CSS2 and most of CSS4, including the new 'rotate the user in hyperspace' functions, but that echo location and all mouse actions except squeak have been disabled (the cat is currently pinning it down).
Browser makers can then say 'supports CSS4 to all 256bits!'
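For what it's worth, the encoding side of that idea is tiny. A minimal Python sketch, assuming a purely hypothetical four-entry registry in which a feature's position in the list is its bit index (no such registry exists, and the names are only placeholders):

```python
# Hypothetical registry: a feature's position in this list is its bit index.
REGISTRY = ["grid", "flexbox", "pre-wrap", "rotate-user-in-hyperspace"]


def encode(supported):
    """Pack a set of feature names into a hex string, one bit per registry entry."""
    bits = 0
    for index, name in enumerate(REGISTRY):
        if name in supported:
            bits |= 1 << index
    return format(bits, "x")


def decode(hex_string):
    """Unpack a hex string back into the set of feature names it claims."""
    bits = int(hex_string, 16)
    return {name for index, name in enumerate(REGISTRY) if (bits >> index) & 1}


if __name__ == "__main__":
    token = encode({"grid", "pre-wrap"})
    print(token, decode(token))
```

Note that each bit can only say "yes" or "no", which is the limit of the scheme.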
This approach isn't really practical unfortunately. First, as with any complex feature in any software, support is rarely a 1-or-0 state. Most browser engines support "white-space: pre-wrap" for instance, but none support it completely correctly. Second, any claim of support can go out of date with each nightly build.
The good news is preventing browser fingerprinting tends to get a lot of recognition from the relevant working groups, and the browser manufacturers are (from what I've seen) very hot on getting these features implemented once specified.
If a website fails because it's tried to sniff something, they've built it wrong. There's very little core layout you can't do safely in a cross-browser fashion these days, but if front-end devs want to get checking for "document.all" there's not much you can do to stop them.
There's only a few bits of information that are actually important for a legitimate web developer to know about your system.
Which actual browser you are using shouldn't really be important. All that's important is to specify what level of HTML standards it is complying with.
There are few actual browser-specific differences between HTML5 implementations these days, and in general if your code is relying on sniffing the browser type to determine which HTML5 to generate either you're doing it wrong or there's a bug in the browser which needs fixing.
Similarly, I don't care if you're using Mac, Windows, Linux, Atari ST or CP/M as long as your browser is compliant.
What I *do* really care about more than anything else is the physical type of device you have. Is it a phone? a tablet? a computer? This is something that can be sniffed currently to some extent from user agents, but for me this is really the only thing that's really important.
The final bit of information that the browser headers can tell me which is really important is which languages you prefer the content provided in. I don't care *where* you are. But knowing what language you'd like me to serve your content would be great.
So "I'm using a phone, it's HTML5 compliant, and I prefer my content in English, but I can also read German" is really all the information we need.
Anything else is useful for the marketeers but not for me as a developer.
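The language part already exists as the Accept-Language request header, where "English, but I can also read German" would look like "en, de;q=0.8". A minimal Python sketch of reading it, assuming a hypothetical helper name and a simplified parser that only looks at the q parameter:

```python
def parse_accept_language(header):
    """Return language tags ordered by descending q-value (missing q means 1.0)."""
    prefs = []
    for part in header.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        weight = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((tag.strip(), weight))
    # sorted() is stable, so tags with equal weight keep their header order.
    ranked = sorted(prefs, key=lambda pair: pair[1], reverse=True)
    return [tag for tag, weight in ranked if weight > 0]


if __name__ == "__main__":
    print(parse_accept_language("en, de;q=0.8"))
```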
So delaying the shift until a less hectic time makes sense.
:-) And magics up more useful time for Future Beta Testing of Browser Assets ..... and a Granting of Convenient Time to Fine Hone Programs to as Close to Perfection as will ever be possible.
:-) As you can imagine, a Vast Endless Task to be Immersed In, that's for sure true.
"There’s quite a bit of information packed into those strings (along with a fair number of lies)"
And when some of those strings tell no lies, what logically naturally follows ? ...... and what does an Optimised Google Search Engine Deliver/Reveal/Present ? .... are a few questions answered there?
Methinks that makes a JEDI type Program Practically Obsolete and Superfluous Surplus to Present Future Requirements .... and delivering a Furlough to Enjoy Future Specialist Training.
Well, you have to keep the military minded engaged in something worth laying down their lives for, surely.
Oh bollocks.
Just last week, my home server (running on a Pi) had the Googlebot trying various phpmyadmin hacks from a Chinese IP address. There's no question in my mind that the user agent was faked.
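For anyone who wants to check rather than assume, the usual test (and the one Google itself documents) is a reverse DNS lookup on the address followed by a forward lookup on the resulting name. A minimal Python sketch, assuming IPv4 only, a hypothetical function name, and resolver arguments that exist purely so the logic can be exercised without touching the network:

```python
import socket

# Domains Google documents for its crawler hostnames.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_real_googlebot(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Reverse-resolve the IP, check the domain, then confirm it resolves back."""
    try:
        host = reverse(ip)[0]
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        # The forward lookup must include the original address.
        return ip in forward(host)[2]
    except OSError:
        # Covers socket.herror and socket.gaierror (no record, lookup failure).
        return False
```

A claimed Googlebot whose address reverse-resolves to somebody's hosting provider fails at the suffix check, whatever its user agent says.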
Myself personally, I sniff user agent strings to determine whether the device is a mobile, a tablet, or something else. Mobile gets simpler layout and reduced quality images. Everything else gets the normal content. Beyond that, I don't care what browser, version, platform, country, planet, or universe the user inhabits.
User agent ought to specify the browser name, the base version (like 72.3 without all the subversions), and what platform class the device is (mobile, tablet, desktop). That's all it needs, surely?
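The sort of three-way sniffing described above usually boils down to a few substring checks. A minimal Python sketch, assuming a hypothetical function name and a common heuristic (look for "Mobi", treat Android without it as a tablet), not any particular site's rules:

```python
def device_class(ua):
    """Classify a user agent string as 'tablet', 'mobile' or 'desktop'."""
    # Tablets first: iPad strings also carry a "Mobile/..." token.
    if "iPad" in ua or "Tablet" in ua or ("Android" in ua and "Mobi" not in ua):
        return "tablet"
    if "Mobi" in ua:
        return "mobile"
    return "desktop"


if __name__ == "__main__":
    # Abbreviated sample strings, for illustration only.
    print(device_class("Mozilla/5.0 (iPhone) Mobile/15E148 Safari/604.1"))
    print(device_class("Mozilla/5.0 (X11; CrOS x86_64) Chrome/80.0"))
```

Anything the checks don't recognise falls through to "desktop", and a device whose string happens to contain the wrong token gets the wrong site.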
I don't know how much it actually protects anyone, but you can pretend to be a good citizen and report the address at https://www.abuseipdb.com [other blacklists exist]
If they're script kiddies from China, there may be a chance they'll actually lose some social credibility or whatever the term is for pretending to be imperialistic money grabbers.
Impersonating Google has been a standard trick to try to bypass access controls since sometime in the last millennium. I'm amazed that anyone actually parses UA strings, but of course they do. It's how Amazon and Lloyds bank both (wrongly) "know" that my Chromebook needs the phone version of their website.
Is there an icon for muppets getting nuked from orbit whilst being strung up by their saggy bits?