Internet-Draft | web-bot-auth-use-cases | April 2025 |
Hoyland & Hendrickson | Expires 18 October 2025 | [Page] |
TODO Abstract¶
This work originated from a discussion that occurred at an IETF 122 side meeting on authenticating bot traffic on the web.¶
This note is to be removed before publishing as an RFC.¶
Status information for this document may be found at https://datatracker.ietf.org/doc/draft-jhoyla-bot-auth-use-cases/.¶
Discussion of this document takes place on the WG Working Group mailing list (mailto:WG@example.com), which is archived at https://example.com/WG.¶
Source for this draft and an issue tracker can be found at https://github.com/jhoyla/draft-jhoyla-bot-auth-use-cases.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 18 October 2025.¶
Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The Web is increasingly accessed not just by human users operating browsers, but also by automated, non-human clients. These range from traditional web crawlers to AI agent systems.¶
Currently, authenticating the operator of these non-human clients often relies on IP address checking, reverse DNS lookups, and inspecting the User-Agent request header. These methods are brittle (e.g., rotating IP addresses), easily spoofed (e.g., User-Agent strings), and can negatively impact security and network operations (due to IP address sharing or rotation).¶
Any alternative authentication solution must operate within the context of the existing Web infrastructure, primarily designed for human interaction via web browsers. Solutions must be layered onto protocols like HTTP and TLS without breaking existing sites or causing detrimental user experiences (e.g., unexpected authentication prompts for humans).¶
This document is designed to serve as an informational starting point for potential future work in this area.¶
The document is specifically focused on authenticating non-human clients on the human-oriented Web. Note this may include automated systems acting on behalf of a human user.¶
This intentionally excludes:¶
Enable sites to reliably identify known, well-behaved crawlers used for purposes like search engine indexing, URL scanning services, web archiving, LLM training, and LLM inference. This allows sites or their subcomponents (e.g., CAPTCHA providers) to grant them different access levels or rate limits compared to unidentified traffic. This is implemented today through several common practices:¶
As mentioned in Section 1, common practices today include:¶
IP Address Allow/Block Lists: Maintaining lists of IP addresses or CIDR ranges associated with known bots. This is fragile due to dynamic IP allocation (e.g., on cloud platforms), the lack of a consistent IP list discovery mechanism, stale IP lists, and shared IPs behind CGNATs.¶
User-Agent String Inspection: Checking the User-Agent request header for strings declared by known bots. These strings are easily copied and offer no cryptographic assurance.¶
These methods lack cryptographic robustness, are often inaccurate, do not offer an owner discovery mechanism, and can produce both false positives (human traffic misclassified as non-human) and false negatives (bot traffic passing as human).¶
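The fragility of IP-based checks can be illustrated with a short sketch. The CIDR ranges below are documentation placeholders, not a real crawler's published list; in practice such lists go stale and cannot distinguish a crawler from other tenants of a shared or rotated address.

```python
# Sketch of an IP allow-list check against a hypothetical published
# crawler range list. Placeholder ranges only (RFC 5737 / RFC 3849
# documentation addresses).
import ipaddress

KNOWN_CRAWLER_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),   # placeholder IPv4 range
    ipaddress.ip_network("2001:db8::/32"),  # placeholder IPv6 range
]

def is_known_crawler(remote_addr: str) -> bool:
    """Return True if the client IP falls inside a listed crawler range."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in net for net in KNOWN_CRAWLER_RANGES)

print(is_known_crawler("192.0.2.10"))    # True: inside the listed range
print(is_known_crawler("198.51.100.1"))  # False: a rotated or CGNAT address
                                         # of the same crawler is invisible
```

Note that the check is purely positional: any client that acquires an address in a listed range is "authenticated", and a legitimate crawler that moves off the list is not.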
Using HTTP Message Signatures [RFC9421] (or a similar JWT-based approach) allows a client to sign parts of an HTTP request using an asymmetric key pair, providing authenticity and integrity assurance at the application layer. Key considerations include:¶
Replay Attacks: Signatures could potentially be replayed across different hosts or contexts. Mitigation requires signing components that bind the signature to the specific request context (e.g., the Host header, a timestamp, a nonce).¶
Key Management: Similar to mTLS, a mechanism is needed for discovering, provisioning, and managing the keys used for signing.¶
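The considerations above can be sketched in code. The example below builds a simplified RFC 9421-style signature base covering the method, authority, and path, and binds in a creation timestamp and nonce to limit replay. It uses the symmetric hmac-sha256 algorithm so it runs with only the standard library; a real bot would typically use an asymmetric algorithm (e.g., ed25519) so verifiers hold only a public key. The key id, shared key, and covered-component choices are illustrative, not normative.

```python
# Simplified sketch of an HTTP Message Signature (RFC 9421 style),
# using hmac-sha256 purely so the example is stdlib-runnable.
import base64
import hashlib
import hmac
import secrets
import time

SHARED_KEY = b"example-shared-key"  # placeholder; key provisioning is out of scope

def sign_request(authority: str, method: str, path: str) -> dict:
    created = int(time.time())
    nonce = secrets.token_urlsafe(16)
    # Covered components bind the signature to this request's context,
    # mitigating replay across hosts or paths.
    params = (f'("@method" "@authority" "@path")'
              f';created={created};nonce="{nonce}";keyid="bot-key-1"')
    signature_base = (
        f'"@method": {method}\n'
        f'"@authority": {authority}\n'
        f'"@path": {path}\n'
        f'"@signature-params": {params}'
    )
    tag = hmac.new(SHARED_KEY, signature_base.encode(), hashlib.sha256).digest()
    return {
        "Signature-Input": f"sig1={params}",
        "Signature": f"sig1=:{base64.b64encode(tag).decode()}:",
    }

headers = sign_request("example.com", "GET", "/robots.txt")
print(headers["Signature-Input"])
```

A verifier that knows the key for `bot-key-1` recomputes the same signature base from the received request and checks the tag, rejecting stale `created` values or reused nonces.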
Using TLS client certificates (mTLS) [RFC8446] for authentication provides cryptographic authentication tied to the TLS layer. Key challenges include:¶
User Experience: Browsers often present confusing or intrusive certificate selection dialogs to users if mTLS is requested unconditionally. draft-jhoyla-req-mtls-flag was presented as a potential way to mitigate this by allowing servers to signal optional client certificate authentication support, potentially avoiding issues for users without certificates (or clients that do not support them). Support for this flag would be needed in both TLS servers and clients using this method to positively authorize an interaction.¶
Deployment: Provisioning, managing, and revoking client certificates at scale can be complex, and must be carefully considered by the client owner.¶
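From the client's side, presenting a certificate is straightforward; the deployment burden noted above lies in issuing, rotating, and revoking the credentials. A minimal sketch using Python's stdlib `ssl` module, with placeholder file paths and host names:

```python
# Sketch of a bot presenting a TLS client certificate (mTLS).
# Certificate and key paths are placeholders; issuance, rotation,
# and revocation of these credentials are the operational challenge.
import ssl

def make_client_context(cert_path: str, key_path: str) -> ssl.SSLContext:
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    # Load the bot's certificate chain and private key; the server
    # identifies the bot operator by validating this chain during
    # the TLS handshake.
    ctx.load_cert_chain(certfile=cert_path, keyfile=key_path)
    return ctx

# Usage (placeholder paths and host):
# import socket
# ctx = make_client_context("bot-cert.pem", "bot-key.pem")
# with socket.create_connection(("example.com", 443)) as sock:
#     with ctx.wrap_socket(sock, server_hostname="example.com") as tls:
#         tls.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
```

Whether the server requests the certificate unconditionally or via an optional-authentication signal (as in draft-jhoyla-req-mtls-flag) is a server-side policy decision; the client-side code is unchanged either way.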
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
TODO Security¶
This document has no IANA actions.¶
TODO acknowledge.¶