OCaml HTTP cookie handling library with support for Eio-based storage jars

rfc docs

+126 -7
+79 -2
lib/core/cookeio.mli
@@ -45,7 +45,28 @@
     - IP addresses require exact match only
     - Path matching requires exact match or prefix with "/" separator

-    @see <https://datatracker.ietf.org/doc/html/rfc6265> RFC 6265 - HTTP State Management Mechanism *)
+    @see <https://datatracker.ietf.org/doc/html/rfc6265> RFC 6265 - HTTP State Management Mechanism
+
+    {2 Standards and References}
+
+    This library implements and references the following IETF specifications:
+
+    {ul
+    {- {{:https://datatracker.ietf.org/doc/html/rfc6265}RFC 6265} -
+       HTTP State Management Mechanism (April 2011) - Primary specification}
+    {- {{:https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-rfc6265bis}RFC 6265bis} -
+       Cookies: HTTP State Management Mechanism (Draft) - SameSite attribute and modern updates}
+    {- {{:https://datatracker.ietf.org/doc/html/rfc1034#section-3.5}RFC 1034 Section 3.5} -
+       Domain Names - Preferred Name Syntax for domain validation}
+    {- {{:https://datatracker.ietf.org/doc/html/rfc2616#section-2.2}RFC 2616 Section 2.2} -
+       HTTP/1.1 - Token syntax definition}
+    {- {{:https://datatracker.ietf.org/doc/html/rfc1123#section-5.2.14}RFC 1123 Section 5.2.14} -
+       Internet Host Requirements - Date format (rfc1123-date)}}
+
+    Additional standards:
+    {ul
+    {- {{:https://publicsuffix.org/}Mozilla Public Suffix List} - Registry
+       of public suffixes for cookie domain validation per RFC 6265 Section 5.3 Step 5}} *)

 (** {1 Types} *)

@@ -265,6 +286,31 @@

     Validation functions for cookie names, values, and attributes per
     {{:https://datatracker.ietf.org/doc/html/rfc6265#section-4.1.1} RFC 6265 Section 4.1.1}.
+
+    These functions implement the syntactic requirements from RFC 6265 to ensure
+    cookies conform to the specification before being sent in HTTP headers.
+    All validation failures return detailed error messages citing the specific
+    RFC requirement that was violated.
+
+    {2 Validation Philosophy}
+
+    Per RFC 6265 Section 4, there is an important distinction between:
+    - {b Server requirements} (Section 4.1): Strict syntax for generating Set-Cookie headers
+    - {b User agent requirements} (Section 5): Lenient parsing for receiving Set-Cookie headers
+
+    These validation functions enforce the {b server requirements}, ensuring that
+    cookies generated by this library conform to RFC 6265 syntax. When parsing
+    cookies from HTTP headers, the library may be more lenient to maximize
+    interoperability with non-compliant servers.
+
+    {2 Character Set Requirements}
+
+    RFC 6265 restricts cookies to US-ASCII characters with specific exclusions:
+    - Cookie names: RFC 2616 tokens (no CTLs, no separators)
+    - Cookie values: cookie-octet characters (0x21, 0x23-0x2B, 0x2D-0x3A, 0x3C-0x5B, 0x5D-0x7E)
+    - Domain values: RFC 1034 domain name syntax or IP addresses
+    - Path values: Any character except CTLs and semicolon
+
     These functions return [Ok value] on success or [Error msg] with a detailed
     explanation of why validation failed.

@@ -277,10 +323,20 @@
     Cookie names must be valid RFC 2616 tokens: one or more characters
     excluding control characters and separators.

+    Per {{:https://datatracker.ietf.org/doc/html/rfc2616#section-2.2}RFC 2616 Section 2.2},
+    a token is defined as: one or more characters excluding control characters
+    and the following 19 separator characters: parentheses, angle brackets, at-sign,
+    comma, semicolon, colon, backslash, double-quote, forward slash, square brackets,
+    question mark, equals, curly braces, space, and horizontal tab.
+
+    This means tokens consist of visible ASCII characters (33-126) excluding
+    control characters (0-31, 127) and the separator characters listed above.
+
     @param name The cookie name to validate
     @return [Ok name] if valid, [Error message] with explanation if invalid

-    @see <https://datatracker.ietf.org/doc/html/rfc6265#section-4.1.1> RFC 6265 Section 4.1.1 *)
+    @see <https://datatracker.ietf.org/doc/html/rfc6265#section-4.1.1> RFC 6265 Section 4.1.1
+    @see <https://datatracker.ietf.org/doc/html/rfc2616#section-2.2> RFC 2616 Section 2.2 - Basic Rules *)

 val cookie_value : string -> (string, string) result
 (** Validate a cookie value per RFC 6265.
@@ -289,6 +345,16 @@
     double quotes. Invalid characters include: control characters, space,
     double quote (except as wrapper), comma, semicolon, and backslash.

+    Per {{:https://datatracker.ietf.org/doc/html/rfc6265#section-4.1.1}RFC 6265 Section 4.1.1},
+    cookie-value may be:
+    - Zero or more cookie-octet characters, or
+    - Double-quoted string containing cookie-octet characters
+
+    Where cookie-octet excludes: CTLs (0x00-0x1F, 0x7F), space (0x20),
+    double-quote (0x22), comma (0x2C), semicolon (0x3B), and backslash (0x5C).
+
+    Valid cookie-octet characters: 0x21, 0x23-0x2B, 0x2D-0x3A, 0x3C-0x5B, 0x5D-0x7E
+
     @param value The cookie value to validate
     @return [Ok value] if valid, [Error message] with explanation if invalid

@@ -301,6 +367,17 @@
     - A valid domain name per RFC 1034 Section 3.5
     - A valid IPv4 address
     - A valid IPv6 address
+
+    Per {{:https://datatracker.ietf.org/doc/html/rfc1034#section-3.5}RFC 1034 Section 3.5},
+    preferred domain name syntax requires:
+    - Labels separated by dots
+    - Labels must start with a letter
+    - Labels must end with a letter or digit
+    - Labels may contain letters, digits, and hyphens
+    - Labels are case-insensitive
+    - Total length limited to 255 octets
+
+    Leading dots are stripped per RFC 6265 Section 5.2.3 before validation.

     @param domain The domain value to validate (leading dot is stripped first)
     @return [Ok domain] if valid, [Error message] with explanation if invalid
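Note on the character-set rules documented above: the token and cookie-octet ranges can be checked with a few lines of stdlib OCaml. The following is a minimal sketch only, not the code in `lib/core/cookeio.ml`; the names `is_separator`, `is_token_char`, `is_cookie_octet`, `check_name`, and `check_value` are hypothetical, the error strings are invented for illustration, and `String.for_all` assumes OCaml 4.13 or later.

```ocaml
(* Hypothetical sketch of the documented rules; not the Cookeio implementation. *)

(* The 19 separator characters from RFC 2616 Section 2.2. *)
let is_separator = function
  | '(' | ')' | '<' | '>' | '@' | ',' | ';' | ':' | '\\' | '"'
  | '/' | '[' | ']' | '?' | '=' | '{' | '}' | ' ' | '\t' -> true
  | _ -> false

(* Token characters: visible ASCII (33-126) that are not separators. *)
let is_token_char c =
  let code = Char.code c in
  code >= 33 && code <= 126 && not (is_separator c)

(* cookie-octet: 0x21, 0x23-0x2B, 0x2D-0x3A, 0x3C-0x5B, 0x5D-0x7E. *)
let is_cookie_octet c =
  match Char.code c with
  | 0x21 -> true
  | n when n >= 0x23 && n <= 0x2B -> true
  | n when n >= 0x2D && n <= 0x3A -> true
  | n when n >= 0x3C && n <= 0x5B -> true
  | n when n >= 0x5D && n <= 0x7E -> true
  | _ -> false

(* A name must be a non-empty token. *)
let check_name name =
  if name = "" then Error "cookie name is empty"
  else if String.for_all is_token_char name then Ok name
  else Error "cookie name contains a control or separator character"

(* A value is zero or more cookie-octets, optionally wrapped in double quotes. *)
let check_value value =
  let n = String.length value in
  let inner =
    if n >= 2 && value.[0] = '"' && value.[n - 1] = '"'
    then String.sub value 1 (n - 2)
    else value
  in
  if String.for_all is_cookie_octet inner then Ok value
  else Error "cookie value contains a character outside cookie-octet"
```

Under these assumptions, `check_name "bad;name"` is rejected because `;` is a separator, while `check_value "\"quoted\""` is accepted because the surrounding double quotes are stripped before the cookie-octet check.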
+47 -5
lib/jar/cookeio_jar.mli
@@ -18,7 +18,24 @@
     - Delta tracking for Set-Cookie headers
     - Mozilla format persistence for cross-tool compatibility

-    @see <https://datatracker.ietf.org/doc/html/rfc6265> RFC 6265 - HTTP State Management Mechanism *)
+    @see <https://datatracker.ietf.org/doc/html/rfc6265> RFC 6265 - HTTP State Management Mechanism
+
+    {2 Standards and References}
+
+    This cookie jar implements the storage model from:
+
+    {ul
+    {- {{:https://datatracker.ietf.org/doc/html/rfc6265#section-5.3}RFC 6265 Section 5.3} -
+       Storage Model - Cookie insertion, replacement, and expiration}
+    {- {{:https://datatracker.ietf.org/doc/html/rfc6265#section-5.4}RFC 6265 Section 5.4} -
+       The Cookie Header - Cookie retrieval and ordering}}
+
+    Key RFC 6265 requirements implemented:
+    {ul
+    {- Domain matching per {{:https://datatracker.ietf.org/doc/html/rfc6265#section-5.1.3}Section 5.1.3}}
+    {- Path matching per {{:https://datatracker.ietf.org/doc/html/rfc6265#section-5.1.4}Section 5.1.4}}
+    {- Cookie ordering per {{:https://datatracker.ietf.org/doc/html/rfc6265#section-5.4}Section 5.4 Step 2}}
+    {- Creation time preservation per {{:https://datatracker.ietf.org/doc/html/rfc6265#section-5.3}Section 5.3 Step 11.3}}} *)

 type t
 (** Cookie jar for storing and managing cookies.
@@ -101,24 +118,49 @@
   Cookeio.t list
 (** Get cookies applicable for a URL.

-    Returns all cookies that match the given domain and path, and satisfy the
-    secure flag requirement. Combines original and delta cookies, with delta
-    taking precedence. Excludes:
+    Implements the cookie retrieval algorithm from
+    {{:https://datatracker.ietf.org/doc/html/rfc6265#section-5.4}RFC 6265 Section 5.4}
+    for generating the Cookie header.
+
+    {3 Algorithm}
+
+    Per RFC 6265 Section 5.4, the user agent should:
+    1. Filter cookies by domain matching (Section 5.1.3)
+    2. Filter cookies by path matching (Section 5.1.4)
+    3. Filter out cookies with Secure attribute when request is non-secure
+    4. Filter out expired cookies
+    5. Sort remaining cookies (longer paths first, then by creation time)
+    6. Update last-access-time for retrieved cookies
+
+    This function implements all these steps, combining original and delta cookies
+    with delta taking precedence. Excludes:
     - Removal cookies (empty value)
     - Expired cookies (expiry-time in the past per Section 5.3)
+    - Secure cookies when [is_secure = false]
+
+    {3 Cookie Ordering}

     Cookies are sorted per Section 5.4, Step 2:
     - Cookies with longer paths are listed before cookies with shorter paths
     - Among cookies with equal-length paths, cookies with earlier creation-times
       are listed first

-    Also updates the last access time of matching cookies using the provided clock.
+    This ordering ensures more specific cookies take precedence.
+
+    {3 Matching Rules}

     Domain matching follows {{:https://datatracker.ietf.org/doc/html/rfc6265#section-5.1.3} Section 5.1.3}:
     - IP addresses require exact match only
     - Hostnames support subdomain matching unless host-only flag is set

     Path matching follows {{:https://datatracker.ietf.org/doc/html/rfc6265#section-5.1.4} Section 5.1.4}.
+
+    @param t Cookie jar
+    @param clock Clock for updating last-access-time
+    @param domain Request domain
+    @param path Request path
+    @param is_secure Whether the request is over a secure channel (HTTPS)
+    @return List of matching cookies, sorted per RFC 6265

     @see <https://datatracker.ietf.org/doc/html/rfc6265#section-5.3> RFC 6265 Section 5.3 - Storage Model (expiry)
     @see <https://datatracker.ietf.org/doc/html/rfc6265#section-5.4> RFC 6265 Section 5.4 - The Cookie Header *)
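
Note on the retrieval steps documented above: path matching, the Secure and expiry filters, and the Section 5.4 Step 2 ordering can be illustrated with a small standalone sketch. This is not the jar's implementation. The `cookie` record, `path_matches`, `compare_cookies`, and `select` are hypothetical names, times are plain floats rather than Eio clock values, and domain matching, delta handling, and the last-access-time update are left out. `String.starts_with` assumes OCaml 4.13 or later.

```ocaml
(* Hypothetical stand-in for a stored cookie; the real Cookeio.t differs. *)
type cookie = {
  name : string;
  path : string;
  secure : bool;
  creation_time : float;
  expires : float option; (* None means a session cookie *)
}

(* Path matching per RFC 6265 Section 5.1.4: identical paths, or the
   cookie path is a prefix that ends at a "/" boundary. *)
let path_matches ~request_path cookie_path =
  let cl = String.length cookie_path in
  request_path = cookie_path
  || (cl > 0
      && cl < String.length request_path
      && String.starts_with ~prefix:cookie_path request_path
      && (cookie_path.[cl - 1] = '/' || request_path.[cl] = '/'))

(* Section 5.4 Step 2: longer paths first, then earlier creation times. *)
let compare_cookies a b =
  match Int.compare (String.length b.path) (String.length a.path) with
  | 0 -> Float.compare a.creation_time b.creation_time
  | c -> c

(* Filter by path, Secure flag, and expiry, then sort. *)
let select ~now ~request_path ~is_secure cookies =
  cookies
  |> List.filter (fun c ->
         path_matches ~request_path c.path
         && (is_secure || not c.secure)
         && (match c.expires with None -> true | Some t -> t > now))
  |> List.stable_sort compare_cookies
```

In this sketch, a request for `/app/x` over a non-secure channel keeps cookies scoped to `/app` and `/`, drops Secure and expired ones, and lists the `/app` cookie first because its path is longer.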