Client Hints

International standard
  • RFC 8942
  • User-Agent Client Hints - Draft Community Report
Developed by: Google, W3C
Website: https://wicg.github.io/ua-client-hints/

Client Hints are an extension to the Hypertext Transfer Protocol (HTTP) that allows web servers to ask the client (usually a web browser) for information about its configuration. The client can choose to respond by advertising the requested information in HTTP header fields, or by exposing the same information to JavaScript code running on a web page. The server can then tailor its responses to the client. For example, a server can choose to send a smaller image if a client advertises that it has a very small screen.

Client Hints were first proposed by Google engineers in 2013. Google later promoted them as a privacy-focused alternative to user-agent headers as part of Privacy Sandbox, its initiative to create standards that let websites access user information without compromising privacy. User-agent headers are strings sent by a client to a server to identify the client. While initially intended for statistical purposes, these headers had increasingly become a tool for tracking users across websites. Client Hints aimed to address this issue by providing a more controlled way to share the same information. Despite the focus on privacy, the initial design of Client Hints faced criticism from other browser vendors. One of the primary concerns was that the protocol could enable new forms of tracking by third-party domains, which are servers not owned by the website being visited from which a page loads resources like images and script files. Despite these concerns, Chrome implemented support for Client Hints in August 2020. By May 2024, over 75% of web users used browsers that supported Client Hints.

Privacy researchers have since raised concerns that Client Hints are primarily being used by JavaScript code that tracks users. In 2023, a study from KU Leuven and Radboud University found that, amongst the top 100,000 websites on the internet, most accesses of Client Hints came from JavaScript code used for tracking and advertising purposes.

Background

In 1992, an extension to HTTP introduced the User-Agent header, which was sent from the client to the server and contained a simple string identifying the name of the client and its version. The header was meant purely for statistical purposes and for tracking down clients that violated the protocol. Since then, User-Agent headers have become increasingly complex and have started to contain significant uniquely identifying information about the user. This information is often used to perform browser fingerprinting, allowing sites to track users across sites passively, without having to load any JavaScript.[1]

History

The original draft of the Client Hints specification was proposed in 2013 by engineers at Google. The specification became an Internet Engineering Task Force (IETF) draft in November 2015. Subsequently, in 2021, it was published as an experimental Request for Comments (RFC).[2] This designation indicated that the IETF had accepted the Client Hints specification for publication, but that it either still had unresolved questions or had not yet gained widespread adoption on the internet.[3] Around the same time, the specification for how web browsers should handle HTTP Client Hints was published as a draft W3C Community Group Report.[2]

In 2020, Google announced its intention to deprecate the user-agent (UA) string sent by the browser. This deprecation was part of Privacy Sandbox, a broader Google initiative to change the web so that websites can access user information without compromising privacy. Google cited Client Hints as a privacy-preserving alternative to user-agent headers, since they allowed for a more controlled way of sharing the same information.[1] The initial Client Hints proposal, however, had been met with pushback from other browser vendors due to privacy concerns. In 2019, Brave raised concerns about the initial proposal, citing ways in which it could be used to track users on the internet.[4] Mozilla, the company that makes Firefox, initially classified the proposal as harmful, and Apple, the company that makes Safari, also took a negative stance.[1] Despite these concerns, Chrome implemented support for HTTP Client Hints in August 2020. The deprecation of UA strings was delayed due to the COVID-19 pandemic and was completed in February 2023.[1]

Since their initial opposition, Mozilla has updated their stance to neutral and Brave has synchronized its implementation of Client Hints with that of Chrome.[1] As of May 2024, over 75% of all web users use browsers that support Client Hints.[2]

Mechanism

The Client Hints protocol defines two entities: a user agent (UA), typically a browser, and a server. These two entities communicate with each other to negotiate what kind of content should be served to the user.[5] The server starts the process by sending the UA a response with an Accept-CH HTTP header containing a list of the Client Hint headers that it wants to receive. The UA is then expected to include the requested Client Hints in each subsequent request, provided it supports them. The server uses these headers to decide what kind of content to serve the UA.[2] If the UA does not understand or support a particular Client Hint, it ignores that hint. When a cacheable response depends on a Client Hint, the server must list the relevant Client Hint headers in a separate Vary header sent to the UA.[1] This ensures that caching mechanisms understand that responses can vary based on different Client Hint values.[6] For Client Hints that specifically identify a browser, additional random browser identifiers are included as "grease" to prevent users of the protocol from relying on browser-specific idiosyncratic behaviours.[7]
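
The exchange described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an implementation of any real server or browser: the function names advertise_hints and ua_next_request_headers are hypothetical, headers are modelled as plain dictionaries, and DPR is used only as a second example hint. For simplicity the sketch puts Accept-CH and Vary on the same response.

```python
def advertise_hints(wanted):
    """Server side (hypothetical helper): build the response headers that
    ask the UA for the given Client Hints and tell caches the response
    may vary on them."""
    hint_list = ", ".join(wanted)
    return {"Accept-CH": hint_list, "Vary": hint_list}


def ua_next_request_headers(response_headers, ua_values):
    """UA side (hypothetical model): return the hint headers to attach to
    the next request. Hints the UA does not support are silently ignored."""
    requested = [name.strip() for name in response_headers.get("Accept-CH", "").split(",")]
    # HTTP header names are case-insensitive, so match on lower-cased names.
    known = {name.lower(): (name, value) for name, value in ua_values.items()}
    hints = {}
    for name in requested:
        if name.lower() in known:
            canonical, value = known[name.lower()]
            hints[canonical] = value
    return hints


# The server asks for two hints; this UA only supports Viewport-Width.
response = advertise_hints(["Viewport-Width", "DPR"])
request_hints = ua_next_request_headers(response, {"Viewport-Width": "1920"})
print(response)
print(request_hints)
```

In this run the UA returns only the Viewport-Width header, because the model has no value for DPR and therefore ignores that hint.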

For UAs that allow JavaScript, an additional option is available through the navigator.userAgentData JavaScript API. This API enables JavaScript to retrieve the same information as provided by the Client Hints headers.[1] The API separates the data it provides into two types: low-entropy data and high-entropy data. Low-entropy data is information that is likely to be similar across a large group of users, such as the platform on which the browser is running and the brand of the browser. High-entropy data, in contrast, may vary significantly between users and includes details like the exact version number of the browser and the model of the user's device. Low-entropy data is exposed directly as properties of the API object. High-entropy data, which can uniquely identify the user, must be explicitly requested by the script by calling the getHighEntropyValues() function, which allows the browser to ask for user permission or to perform additional checks.[8]
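
The split between directly readable low-entropy data and gated high-entropy data can be illustrated with a toy model. The sketch below is written in Python and is not the browser API: the class UserAgentDataModel, its constructor arguments, and the allow_high_entropy policy hook are all hypothetical, the sample values are made up, and the real getHighEntropyValues() returns its result asynchronously rather than as a plain dictionary.

```python
class UserAgentDataModel:
    """Toy stand-in for the browser object; NOT the real navigator.userAgentData."""

    def __init__(self, low_entropy, high_entropy, allow_high_entropy):
        # Low-entropy values are exposed directly as attributes.
        self.platform = low_entropy["platform"]
        self.mobile = low_entropy["mobile"]
        # High-entropy values stay private until explicitly requested.
        self._high_entropy = dict(high_entropy)
        # Hypothetical policy hook standing in for a permission prompt
        # or any other check the browser may want to perform.
        self._allow = allow_high_entropy

    def get_high_entropy_values(self, hints):
        """Return the low-entropy data plus each requested high-entropy
        value that exists and passes the policy check."""
        result = {"platform": self.platform, "mobile": self.mobile}
        for hint in hints:
            if hint in self._high_entropy and self._allow(hint):
                result[hint] = self._high_entropy[hint]
        return result


# Made-up sample values; the policy here refuses to reveal the device model.
ua = UserAgentDataModel(
    low_entropy={"platform": "Android", "mobile": True},
    high_entropy={"model": "Pixel 7", "platformVersion": "14.0.0"},
    allow_high_entropy=lambda hint: hint != "model",
)
print(ua.platform)
print(ua.get_high_entropy_values(["model", "platformVersion"]))
```

Routing every high-entropy read through one function gives the browser a single place to apply its checks, which is the design point the paragraph above describes.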

Example

To initiate content negotiation, an HTTP server adds the Accept-CH header to its response to an HTTP request:

HTTP/1.1 200 OK
...
Accept-CH: Viewport-Width
...

If the user agent supports the Viewport-Width client hint, it appends the Viewport-Width header to every subsequent request:

GET /gallery HTTP/1.1
...
Viewport-Width: 1920
...

The server can then use the information in the Viewport-Width header to decide what kind of content to serve the client. For example, if the server has a particular image that is extremely large, it can be configured to return a smaller image if the original does not fit the viewport.[9]
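
A server-side decision of this kind might look like the following Python sketch. It is only an illustration: the function pick_image, the width thresholds, and the image file names are hypothetical, and request headers are modelled as a plain dictionary rather than through any particular web framework.

```python
# Hypothetical image variants: (maximum viewport width, file name).
IMAGE_VARIANTS = [
    (480, "gallery-small.jpg"),
    (1024, "gallery-medium.jpg"),
]
# Served when the viewport is wider than every variant or the hint is unusable.
FALLBACK_IMAGE = "gallery-large.jpg"


def pick_image(request_headers):
    """Choose an image file based on the Viewport-Width client hint."""
    raw = request_headers.get("Viewport-Width")
    try:
        width = int(raw)
    except (TypeError, ValueError):
        # Hint absent or malformed: the UA may not support it.
        return FALLBACK_IMAGE
    if width <= 0:
        return FALLBACK_IMAGE
    for max_width, file_name in IMAGE_VARIANTS:
        if width <= max_width:
            return file_name
    return FALLBACK_IMAGE


print(pick_image({"Viewport-Width": "1920"}))
print(pick_image({"Viewport-Width": "320"}))
```

Falling back to the full-size image when the header is missing keeps the server usable for clients that never send Client Hints.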

Privacy concerns

When the Client Hints proposal was originally published, it was met with significant privacy concerns. Browser vendors like Brave and Mozilla pointed out that a provision in the initial draft allowed websites to instruct the browser to provide Client Hint data to third-party domains. Third-party domains are domains other than the website being visited, from which a page loads resources like images and script files.[4] The provision would allow third parties to track users across the web by instructing the browser to send Client Hint information to their servers. Such third parties include content delivery networks (CDNs), which distribute website content across a geographically dispersed group of servers to improve the speed and reliability of the website, and cloud service providers like Cloudflare and Google Cloud, which offer services like data storage, computing power, and infrastructure for websites and applications.[4][10] Concerns were also raised that the Client Hints proposal was too permissive and explicitly allowed servers to obtain new privacy-compromising information that could not otherwise be obtained by simply reading HTTP headers.[10] Privacy-preserving browser extensions like NoScript also opposed the proposal on the grounds that it would make it significantly harder to prevent sites from exfiltrating privacy-compromising information about users.[4]

Since the adoption of Client Hints by major browsers like Google Chrome and Microsoft Edge, privacy researchers have raised concerns over their real-world use for tracking.[2] A 2023 study by researchers from KU Leuven and Radboud University found that out of the top 100,000 websites, 60% of JavaScript files loaded by web pages accessed the Client Hints JavaScript APIs, with most being tracking and advertising scripts, many of which came from Google. Over 90% of these script files exfiltrated the obtained data to tracking domains.[1] A subsequent study in May 2024 by researchers from the Hochschule Bonn-Rhein-Sieg University of Applied Sciences noted that while overall adoption of Client Hints amongst websites on the internet was low, a significant number of third-party domains known for tracking accessed HTTP Client Hints data.[2]

References

  1. Senol, Asuman; Acar, Gunes (2023-11-26). "Unveiling the Impact of User-Agent Reduction and Client Hints: A Measurement Study". Proceedings of the 22nd Workshop on Privacy in the Electronic Society. ACM. pp. 91–106. doi:10.1145/3603216.3624965. ISBN 979-8-4007-0235-8. Archived from the original on 2024-06-26. Retrieved 2024-06-25.
  2. Wiefling, Stephan; Hönscheid, Marian; Iacono, Luigi Lo (2024-05-22), "A Privacy Measure Turned Upside Down? Investigating the Use of HTTP Client Hints on the Web", arXiv:2405.13744 [cs]
  3. Hoffman, Paul E.; Harris, Susan R. (2006-09-01). The Tao of IETF - A Novice's Guide to the Internet Engineering Task Force (Report). Internet Engineering Task Force.
  4. Cimpanu, Catalin (May 16, 2019). "Privacy concerns raised about upcoming Client-Hints web standard". ZDNET. Archived from the original on 2023-12-01. Retrieved 2024-06-02.
  5. Grigorik, I.; Weiss, Y. (February 2021). HTTP Client Hints. IETF. doi:10.17487/RFC8942. RFC 8942. Retrieved February 11, 2021.
  6. "HTTP Client hints". HTTP. MDN. 2024-03-05. Archived from the original on 2024-06-07. Retrieved 2024-06-02.
  7. Taylor, Mike; Weiss, Yoav, eds. (1 April 2024). "User-Agent Client Hints § 6.2. GREASE-like UA Brand Lists". WICG. Archived from the original on 18 June 2024. Retrieved 26 June 2024.
  8. "NavigatorUAData: getHighEntropyValues() method - Web APIs". Mozilla Developer Network. 2024-07-26. Retrieved 2024-09-21.
  9. "Improving user privacy and developer experience with User-Agent Client Hints". Privacy & Security. Chrome for Developers. Archived from the original on 2024-06-02. Retrieved 2024-06-02.
  10. "Brave's Concerns with the Client-Hints Proposal". Brave. 2019-05-09. Archived from the original on 2024-06-26. Retrieved 2024-06-02.