Phase 3, Issue 4: HTML Tree Builder
Implement a basic HTML tree builder that constructs a DOM tree from tokenizer output.
Requirements
Build a tree builder that processes tokens and constructs a DOM tree, handling the subset of elements needed for Phase 3.
Supported elements:
- <html>, <head>, <body>: structural
- <title>: document title
- <p>, <h1> through <h6>: block-level text
- <div>: generic block container
- <span>: generic inline container
- <a>: hyperlink (inline)
- <br>: line break (void element)
- <pre>: preformatted text
- Text nodes and comment nodes
Tree builder behavior:
- Maintains a stack of open elements
- Implicit element insertion (e.g., missing <html>, <head>, <body>)
- Void elements (<br>) are immediately popped
- Handles basic misnesting (e.g., <p> inside <p> closes the outer)
- Foster parenting not required for Phase 3
- <title> captures text content
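The stack-related behaviors above can be sketched as follows. This is a minimal illustration, not the required implementation: `SketchBuilder`, `NodeKind`, `Node`, the arena layout, and the simplified `Token` enum are all hypothetical names introduced here, and the real tokenizer's `Token` type may look different. It covers only the stack of open elements, void-element popping, and the rule that a new <p> closes an open <p>; implicit <html>/<head>/<body> insertion and <title> capture are left out.

```rust
/// Stand-in token type (an assumption; the real tokenizer's `Token` may differ).
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    StartTag(String),
    EndTag(String),
    Character(String),
}

#[derive(Debug)]
pub enum NodeKind {
    Document,
    Element(String),
    Text(String),
}

/// Nodes live in a flat arena; `children` holds indices into that arena.
#[derive(Debug)]
pub struct Node {
    pub kind: NodeKind,
    pub children: Vec<usize>,
}

/// Hypothetical builder used only for this sketch.
pub struct SketchBuilder {
    nodes: Vec<Node>,
    /// Stack of open elements; slot 0 is always the document root.
    open: Vec<usize>,
}

impl Default for SketchBuilder {
    fn default() -> Self {
        Self::new()
    }
}

impl SketchBuilder {
    pub fn new() -> Self {
        let root = Node { kind: NodeKind::Document, children: Vec::new() };
        SketchBuilder { nodes: vec![root], open: vec![0] }
    }

    /// Append a node under the current (topmost open) element.
    fn append(&mut self, kind: NodeKind) -> usize {
        let id = self.nodes.len();
        self.nodes.push(Node { kind, children: Vec::new() });
        let parent = *self.open.last().expect("root is never popped");
        self.nodes[parent].children.push(id);
        id
    }

    fn is_element(&self, id: usize, name: &str) -> bool {
        matches!(&self.nodes[id].kind, NodeKind::Element(n) if n == name)
    }

    /// Pop up to and including the nearest open `name`; no-op if none is open.
    fn close(&mut self, name: &str) {
        let found = self.open.iter().rposition(|&id| self.is_element(id, name));
        if let Some(pos) = found {
            self.open.truncate(pos);
        }
    }

    pub fn process(&mut self, token: Token) {
        match token {
            Token::StartTag(name) => {
                // Basic misnesting: a new <p> closes a <p> that is still open.
                if name == "p" {
                    self.close("p");
                }
                let is_void = name == "br";
                let id = self.append(NodeKind::Element(name));
                self.open.push(id);
                // Void elements are popped immediately, so they never get children.
                if is_void {
                    self.open.pop();
                }
            }
            Token::EndTag(name) => self.close(&name),
            Token::Character(text) => {
                self.append(NodeKind::Text(text));
            }
        }
    }

    /// Number of entries on the stack of open elements, including the root.
    pub fn open_len(&self) -> usize {
        self.open.len()
    }

    /// Compact rendering for inspection: elements as `(name...)`, text as `'...'`.
    pub fn outline(&self, id: usize) -> String {
        let node = &self.nodes[id];
        let kids: String = node.children.iter().map(|&c| self.outline(c)).collect();
        match &node.kind {
            NodeKind::Document => kids,
            NodeKind::Element(name) => format!("({}{})", name, kids),
            NodeKind::Text(text) => format!("'{}'", text),
        }
    }
}
```

Feeding the tokens for <p>one<p>two<br>three</p> through `process` yields two sibling <p> elements, with <br> as a childless element inside the second. An index-based arena is used here because it avoids reference counting and needs no unsafe code, which fits the constraints in this issue, but it is only one possible DOM representation.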
API:
- parse_html(input: &str) -> Document (convenience function)
- TreeBuilder::new() -> TreeBuilder
- TreeBuilder::process_token(token: Token) (feed tokens one at a time)
- TreeBuilder::finish() -> Document (return the built DOM)
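One way the four entry points could fit together is sketched below, with `parse_html` as a thin loop over the streaming methods. Several things here are assumptions rather than part of the spec: the `&mut self` and `self` receivers, the simplified `Token` enum, the `toy_tokenize` helper (a crude splitter standing in for the real tokenizer), and a `Document` that records only the title and remaining text instead of a DOM tree. The builder body is a placeholder that shows <title> text capture and nothing more.

```rust
/// Stand-in token type (an assumption; the real tokenizer's `Token` may differ).
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    StartTag(String),
    EndTag(String),
    Character(String),
}

/// Stand-in for the DOM: records only the title and all other text.
#[derive(Debug, Default, PartialEq)]
pub struct Document {
    pub title: Option<String>,
    pub text: String,
}

#[derive(Default)]
pub struct TreeBuilder {
    doc: Document,
    in_title: bool,
}

impl TreeBuilder {
    pub fn new() -> TreeBuilder {
        TreeBuilder::default()
    }

    /// Feed one token. The `&mut self` receiver is an assumption.
    pub fn process_token(&mut self, token: Token) {
        match token {
            Token::StartTag(name) if name == "title" => self.in_title = true,
            Token::EndTag(name) if name == "title" => self.in_title = false,
            Token::StartTag(_) | Token::EndTag(_) => {}
            Token::Character(s) => {
                if self.in_title {
                    // <title> captures its text content.
                    self.doc.title.get_or_insert_with(String::new).push_str(&s);
                } else {
                    self.doc.text.push_str(&s);
                }
            }
        }
    }

    /// Consume the builder and return the document. The `self` receiver is an assumption.
    pub fn finish(self) -> Document {
        self.doc
    }
}

/// Hypothetical toy tokenizer: splits on `<...>` and lowercases tag names.
/// It does not understand attributes, comments, or doctypes.
fn toy_tokenize(input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut rest = input;
    while let Some(lt) = rest.find('<') {
        // An unterminated `<` is left in `rest` and emitted as text below.
        let Some(len) = rest[lt..].find('>') else { break };
        if lt > 0 {
            tokens.push(Token::Character(rest[..lt].to_string()));
        }
        let tag = &rest[lt + 1..lt + len];
        tokens.push(match tag.strip_prefix('/') {
            Some(name) => Token::EndTag(name.to_ascii_lowercase()),
            None => Token::StartTag(tag.to_ascii_lowercase()),
        });
        rest = &rest[lt + len + 1..];
    }
    if !rest.is_empty() {
        tokens.push(Token::Character(rest.to_string()));
    }
    tokens
}

/// Convenience wrapper: tokenize, feed every token, then finish.
pub fn parse_html(input: &str) -> Document {
    let mut builder = TreeBuilder::new();
    for token in toy_tokenize(input) {
        builder.process_token(token);
    }
    builder.finish()
}
```

Having `finish` take the builder by value means a finished builder cannot accidentally be fed more tokens, and callers that already hold a token stream can skip `parse_html` and drive `process_token` directly.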
Acceptance criteria
- Parses <!DOCTYPE html><html><head><title>Test</title></head><body><p>Hello</p></body></html>
- Handles implicit element insertion for minimal documents like <p>Hello
- <br> is handled as a void element
- Nested elements create proper parent-child relationships
- Text nodes are created for character tokens
- cargo clippy -p we-html -- -D warnings passes
- cargo test -p we-html passes with tree builder tests
- No unsafe code
- No external dependencies