+ISC License
+
+Copyright (c) 2025 Anil Madhavapeddy <anil@recoil.org>
+
+Permission to use, copy, modify, and distribute this software for any
+purpose with or without fee is hereby granted, provided that the above
+copyright notice and this permission notice appear in all copies.
+
+THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+63-11
README.md
-# ocaml-publicsuffix
+# ocaml-publicsuffix - Public Suffix List for OCaml

-Public Suffix List implementation for OCaml.
+An OCaml library for parsing and querying the Mozilla Public Suffix List (PSL) to determine public suffixes and registrable domains. This library implements the algorithm specified at [publicsuffix.org](https://publicsuffix.org/list/) and provides efficient lookups using a pre-compiled trie data structure.

-Parse and query the Mozilla Public Suffix List (PSL) to determine public suffixes and registrable domains. Supports ICANN and private domain sections, wildcard rules, and exception rules per the PSL specification.
+## Key Features

-## Data File
+- **Complete PSL Support**: Handles ICANN and private domain sections
+- **Full Rule Coverage**: Supports normal rules, wildcard rules (e.g., `*.uk`), and exception rules (e.g., `!parliament.uk`)
+- **Efficient Lookups**: Pre-compiled trie structure for fast domain matching
+- **Punycode Support**: Automatic handling of internationalized domain names via the `punycode` library
+- **Type Safety**: Uses the `domain-name` library for validated domain representations
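Reviewer note: the feature list above names three rule kinds and a trie lookup. As a rough illustration of how those pieces can fit together, here is a minimal, self-contained sketch of the publicsuffix.org matching algorithm. It does not use the `publicsuffix` API, and the library's generated trie may well be organised differently. Every name in it (`node`, `add_rule`, `suffix_length`, `suffix_of`, `registrable_of`) is hypothetical, and the five rules are a hand-picked sample rather than the compiled list.

```ocaml
module SMap = Map.Make (String)

(* One trie node per label; domains are walked from the rightmost label. *)
type node = {
  normal : bool;      (* a normal rule ends here, e.g. "co.uk" *)
  exception_ : bool;  (* an exception rule ends here, e.g. "!www.ck" *)
  wildcard : bool;    (* a "*.<labels so far>" rule exists, e.g. "*.ck" *)
  children : node SMap.t;
}

let empty =
  { normal = false; exception_ = false; wildcard = false; children = SMap.empty }

(* Insert one rule. Simplification: "*" is only recognised as the leftmost label. *)
let add_rule rule root =
  let is_exn = String.length rule > 0 && rule.[0] = '!' in
  let body = if is_exn then String.sub rule 1 (String.length rule - 1) else rule in
  let rec go node = function
    | [] -> if is_exn then { node with exception_ = true } else { node with normal = true }
    | [ "*" ] -> { node with wildcard = true }
    | l :: rest ->
        let child = Option.value (SMap.find_opt l node.children) ~default:empty in
        { node with children = SMap.add l (go child rest) node.children }
  in
  go root (List.rev (String.split_on_char '.' body))

(* Number of rightmost labels that form the public suffix. *)
let suffix_length root labels_rev =
  let rec go node depth best = function
    | [] -> best
    | l :: rest ->
        (* A wildcard rule at this node matches whatever label comes next. *)
        let best = if node.wildcard then max best (depth + 1) else best in
        (match SMap.find_opt l node.children with
         | None -> best
         | Some child ->
             let depth = depth + 1 in
             (* An exception rule prevails; its leftmost label is dropped. *)
             if child.exception_ then depth - 1
             else
               let best = if child.normal then max best depth else best in
               go child depth best rest)
  in
  (* When nothing matches, the implicit rule "*" applies: one label. *)
  go root 0 1 labels_rev

let split domain = List.rev (String.split_on_char '.' (String.lowercase_ascii domain))

(* Keep the rightmost [n] labels and join them back into a dotted name. *)
let take_suffix n labels_rev =
  List.filteri (fun i _ -> i < n) labels_rev |> List.rev |> String.concat "."

let suffix_of root domain =
  let labels_rev = split domain in
  take_suffix (suffix_length root labels_rev) labels_rev

(* Public suffix plus one label, or None if the domain is itself a suffix. *)
let registrable_of root domain =
  let labels_rev = split domain in
  let n = suffix_length root labels_rev in
  if List.length labels_rev > n then Some (take_suffix (n + 1) labels_rev) else None

(* Hypothetical sample rule set, for illustration only. *)
let rules =
  List.fold_left (fun t r -> add_rule r t) empty
    [ "com"; "uk"; "co.uk"; "*.ck"; "!www.ck" ]
```

With this sample rule set, `suffix_of rules "www.example.co.uk"` gives `"co.uk"`, `registrable_of rules "a.b.ck"` gives `Some "a.b.ck"` (the wildcard makes `b.ck` the suffix), `registrable_of rules "www.ck"` gives `Some "www.ck"` (the exception overrides the wildcard), and `registrable_of rules "b.ck"` gives `None`.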

-The `data/public_suffix_list.dat` file contains the public suffix list data. This file can be fetched from:
+## Usage

-https://publicsuffix.org/list/public_suffix_list.dat
+```ocaml
+(* Determine the registrable domain (public suffix + one label) *)
+let domain = Domain_name.of_string_exn "www.example.com" in
+match Publicsuffix.registrable_domain domain with
+| Some reg_domain -> Format.printf "Registrable: %a\n" Domain_name.pp reg_domain
+| None -> Format.printf "No registrable domain\n"
+(* Output: Registrable: example.com *)

-To update the data file to the latest version:
+(* Find the public suffix *)
+match Publicsuffix.public_suffix domain with
+| Some suffix -> Format.printf "Suffix: %a\n" Domain_name.pp suffix
+| None -> Format.printf "No public suffix\n"
+(* Output: Suffix: com *)

-```bash
-curl -o data/public_suffix_list.dat https://publicsuffix.org/list/public_suffix_list.dat
+(* Check if a domain is itself a public suffix *)
+let is_suffix = Publicsuffix.is_public_suffix domain in
+Format.printf "Is public suffix: %b\n" is_suffix
+(* Output: Is public suffix: false *)
 ```

-## Building
+For domains with wildcards and exceptions:
+
+```ocaml
+(* Example with wildcard rule: *.uk *)
+let domain = Domain_name.of_string_exn "example.uk" in
+match Publicsuffix.public_suffix domain with
+| Some suffix -> Format.printf "Suffix: %a\n" Domain_name.pp suffix
+| None -> ()
+(* Output: Suffix: uk *)
+
+(* Example with exception rule: !parliament.uk *)
+let domain = Domain_name.of_string_exn "parliament.uk" in
+match Publicsuffix.registrable_domain domain with
+| Some reg_domain -> Format.printf "Registrable: %a\n" Domain_name.pp reg_domain
+| None -> ()
+(* Output: Registrable: parliament.uk *)
+```
+
+## Installation
+
+```
+opam install publicsuffix
+```
+
+## Updating the Public Suffix List Data
+
+The `data/public_suffix_list.dat` file contains the PSL data, which is compiled into the library at build time. To update to the latest version:

 ```bash
-opam exec -- dune build @check
+curl -o data/public_suffix_list.dat https://publicsuffix.org/list/public_suffix_list.dat
+opam exec -- dune build
 ```

 ## Documentation
+
+API documentation is available via:
+
+```
+opam install publicsuffix
+odig doc publicsuffix
+```
+
+Or build locally:

 ```bash
 opam exec -- dune build @doc
+(*---------------------------------------------------------------------------
+ Copyright (c) 2025 Anil Madhavapeddy <anil@recoil.org>. All rights reserved.
+ SPDX-License-Identifier: ISC
+ ---------------------------------------------------------------------------*)
+
 (* gen_psl.ml - Generate OCaml code from public_suffix_list.dat

  This parser reads the Public Suffix List and generates OCaml source code
+5
lib/publicsuffix.ml
+(*---------------------------------------------------------------------------
+ Copyright (c) 2025 Anil Madhavapeddy <anil@recoil.org>. All rights reserved.
+ SPDX-License-Identifier: ISC
+ ---------------------------------------------------------------------------*)
+
 (* publicsuffix.ml - Public Suffix List implementation for OCaml

  This implements the PSL algorithm as specified at:
+5
lib/publicsuffix.mli
+(*---------------------------------------------------------------------------
+ Copyright (c) 2025 Anil Madhavapeddy <anil@recoil.org>. All rights reserved.
+ SPDX-License-Identifier: ISC
+ ---------------------------------------------------------------------------*)
+
 (** Public Suffix List implementation for OCaml

  This library provides functions to query the Mozilla Public Suffix List (PSL)
+5-6
publicsuffix.opam
 # This file is generated by dune, edit dune-project instead
 opam-version: "2.0"
-version: "0.1.0"
 synopsis: "Public Suffix List implementation for OCaml"
 description:
  "Parse and query the Mozilla Public Suffix List (PSL) to determine public suffixes and registrable domains. Supports ICANN and private domain sections, wildcard rules, and exception rules per the PSL specification."
-maintainer: ["Anil Madhavapeddy"]
+maintainer: ["Anil Madhavapeddy <anil@recoil.org>"]
 authors: ["Anil Madhavapeddy"]
 license: "ISC"
-homepage: "https://github.com/avsm/ocaml-publicsuffix"
-bug-reports: "https://github.com/avsm/ocaml-publicsuffix/issues"
+homepage: "https://tangled.org/@anil.recoil.org/ocaml-publicsuffix"
+bug-reports: "https://tangled.org/@anil.recoil.org/ocaml-publicsuffix/issues"
 depends: [
  "ocaml" {>= "4.14.0"}
- "dune" {>= "3.0" & >= "3.0"}
+ "dune" {>= "3.18" & >= "3.0"}
  "domain-name" {>= "0.4.0"}
  "punycode" {>= "0.1.0"}
  "alcotest" {with-test}
···
  "@doc" {with-doc}
  ]
 ]
-dev-repo: "git+https://github.com/avsm/ocaml-publicsuffix.git"
+x-maintenance-intent: ["(latest)"]
+5
test/psl_test.ml
+(*---------------------------------------------------------------------------
+ Copyright (c) 2025 Anil Madhavapeddy <anil@recoil.org>. All rights reserved.
+ SPDX-License-Identifier: ISC
+ ---------------------------------------------------------------------------*)
+
 (* psl_test.ml - Command-line tool for testing the Public Suffix List library

  Usage: