### Beginner Reading Path

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/STRUCTURE.md

This path guides new users through the documentation, starting with an overview and progressing to practical examples and core API references.

```markdown
README.md
↓
quick-start.md
↓
api-reference-parser.md
↓
api-reference-handler.md
↓
event-reference.md
```

--------------------------------

### Configuration Reading Path

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/STRUCTURE.md

This path guides users through understanding parser configuration, starting with the main configuration file and referencing parser constructor and type options.

```markdown
configuration.md
↓
api-reference-parser.md (Constructor)
↓
types.md (ParserOptions)
```

--------------------------------

### Install htmlparser2 using npm

Source: https://github.com/fb55/htmlparser2/blob/master/README.md

Install the htmlparser2 package using npm. This is the first step to using the library.

```bash
npm install htmlparser2
```

--------------------------------

### Stream User Reading Path

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/STRUCTURE.md

This path focuses on using the library with streams, covering quick start examples, writable streams, and entry points relevant to stream processing.

```markdown
quick-start.md (Streaming section)
↓
api-reference-writable-streams.md
↓
api-reference-entry-points.md
```

--------------------------------

### oncdatastart()

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-handler.md

Callback fired at the opening of a CDATA section. Indicates the start of character data.

```APIDOC
## oncdatastart()

### Description
Fires at the opening of a CDATA section. Called when `<![CDATA[` is encountered.
```

--------------------------------

### Output of the Basic HTML Parsing Example

Source: https://github.com/fb55/htmlparser2/blob/master/README.md

Output produced by the README callback example (with multiple empty lines removed).

```
--> Xyz
JS! Hooray!
--> const foo = '<<bar>>';
That's it?!
```

--------------------------------

### onprocessinginstruction

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Callback invoked when a processing instruction is identified. It receives the start and end indices.

```APIDOC
## onprocessinginstruction(start, endIndex)

### Description
Processing instruction identified.

### Method
Callback

### Parameters
#### Path Parameters
- **start** (number) - Required - Start index in the buffer.
- **endIndex** (number) - Required - End index (exclusive).

### Response
#### Success Response
- **void** - No return value.

### Request Example
```typescript
onprocessinginstruction(start: number, endIndex: number): void
```
```

--------------------------------

### Incremental Parsing Example

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/INDEX.md

Demonstrates how to parse HTML input incrementally by writing chunks to the parser and then ending the parsing process.

```typescript
parser.write(chunk1);
parser.write(chunk2);
parser.end();
```

--------------------------------

### onprocessinginstruction Callback

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Called when a processing instruction is identified. Provides the start and end indices.

```typescript
onprocessinginstruction(start: number, endIndex: number): void
```

--------------------------------

### Create and Use Node.js WritableStream

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-writable-streams.md

Instantiate a WritableStream to parse HTML content from a readable stream. This example demonstrates piping an HTML file into the parser stream and handling parsing completion.
```typescript
import { WritableStream } from "htmlparser2/WritableStream";
import fs from "fs";

const parserStream = new WritableStream({
    onopentag(name, attribs) {
        console.log(`<${name}>`);
    },
    ontext(text) {
        console.log("Text:", text);
    },
});

const htmlStream = fs.createReadStream("./page.html");
htmlStream.pipe(parserStream).on("finish", () => {
    console.log("Parsing complete");
});
```

--------------------------------

### Browser/Deno API Response Streaming with WebWritableStream

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-writable-streams.md

Utilize WebWritableStream to stream API responses in browser or Deno environments. This example fetches data and extracts trimmed text content.

```typescript
import { WebWritableStream } from "htmlparser2/WebWritableStream";

const titles: string[] = [];
const parser = new WebWritableStream({
    ontext(text) {
        if (text.trim()) {
            titles.push(text.trim());
        }
    },
});

const response = await fetch("https://example.com");
await response.body.pipeTo(parser);
console.log("Titles:", titles);
```

--------------------------------

### onopentagname

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Callback invoked when an opening tag name is identified. It receives the start and end indices of the tag name.

```APIDOC
## onopentagname(start, endIndex)

### Description
Opening tag name identified.

### Method
Callback

### Parameters
#### Path Parameters
- **start** (number) - Required - Start index in the buffer.
- **endIndex** (number) - Required - End index (exclusive).

### Response
#### Success Response
- **void** - No return value.

### Request Example
```typescript
onopentagname(start: number, endIndex: number): void
```
```

--------------------------------

### Add Source Position Tracking with withStartIndices

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/configuration.md

Enable tracking of the start index for each node in the parsed DOM.
Useful for debugging or analyzing source code locations.

```typescript
import { parseDocument } from "htmlparser2";

const dom = parseDocument(html, { withStartIndices: true });
// Each node now has a startIndex property
```

--------------------------------

### onopentagname Callback

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Called when the name of an opening tag is identified. Provides the start and end indices of the tag name.

```typescript
onopentagname(start: number, endIndex: number): void
```

--------------------------------

### Node.js File Streaming with WritableStream

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-writable-streams.md

Use WritableStream to process large HTML files in Node.js. This example streams a file and collects element names and attributes.

```typescript
import { WritableStream } from "htmlparser2/WritableStream";
import fs from "fs";
import path from "path";

const elements = [];
const parser = new WritableStream({
    onopentag(name, attribs) {
        elements.push({ name, attribs });
    },
});

fs.createReadStream(path.join(import.meta.dirname, "large.html"))
    .pipe(parser)
    .on("finish", () => {
        console.log("Found elements:", elements);
    });
```

--------------------------------

### Tokenizer Constructor and Usage

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Demonstrates how to create a new Tokenizer instance with specified options and callbacks, and how to feed it HTML content for tokenization. Use this for advanced or custom parsing needs.

```typescript
import Tokenizer from "htmlparser2/Tokenizer";

const tokenizer = new Tokenizer(
    { xmlMode: false, decodeEntities: true },
    {
        ontext(start, endIndex) {
            console.log("Text token from", start, "to", endIndex);
        },
        onopentagname(start, endIndex) {
            console.log("Tag name token from", start, "to", endIndex);
        },
        onend() {
            console.log("Tokenizing complete");
        },
    }
);

tokenizer.write("<p>Hello</p>");
tokenizer.end();
```

--------------------------------

### ondeclaration

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Callback invoked when a declaration (DOCTYPE) is identified. It receives the start and end indices.

```APIDOC
## ondeclaration(start, endIndex)

### Description
Declaration (DOCTYPE) identified.

### Method
Callback

### Parameters
#### Path Parameters
- **start** (number) - Required - Start index in the buffer.
- **endIndex** (number) - Required - End index (exclusive).

### Response
#### Success Response
- **void** - No return value.

### Request Example
```typescript
ondeclaration(start: number, endIndex: number): void
```
```

--------------------------------

### Initialize htmlparser2 Parser

Source: https://github.com/fb55/htmlparser2/wiki/Parser-options

Instantiate a new HTML parser with a handler and optional configuration options.

```javascript
const parser = new htmlparser.Parser(handler /*: Object */, options /*?: Object */);
```

--------------------------------

### ondeclaration Callback

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Called when a declaration (DOCTYPE) is identified. Provides the start and end indices.

```typescript
ondeclaration(start: number, endIndex: number): void
```

--------------------------------

### Instantiate Parser with Callbacks and Options

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-parser.md

Create a new Parser instance. Callbacks and options are optional. Processing begins when `write()` is called.

```typescript
import { Parser } from "htmlparser2";

const parser = new Parser({
    onopentag(name, attribs) {
        console.log(`Found opening tag: ${name}`);
    },
    ontext(text) {
        console.log(`Found text: ${text}`);
    },
    onclosetag(name) {
        console.log(`Found closing tag: ${name}`);
    },
});

parser.write("<div>Hello</div>");
parser.end();
```

--------------------------------

### oncdata

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Callback invoked for CDATA section content. It receives the start and end indices, and an offset.

```APIDOC
## oncdata(start, endIndex, offset)

### Description
CDATA section content.

### Method
Callback

### Parameters
#### Path Parameters
- **start** (number) - Required - Start index in the buffer.
- **endIndex** (number) - Required - End index (exclusive).
- **offset** (number) - Required - Offset within the buffer.

### Response
#### Success Response
- **void** - No return value.

### Request Example
```typescript
oncdata(start: number, endIndex: number, offset: number): void
```
```

--------------------------------

### onattribdata

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Callback invoked for attribute value data. It receives the start and end indices of the data.

```APIDOC
## onattribdata(start, endIndex)

### Description
Attribute value data.

### Method
Callback

### Parameters
#### Path Parameters
- **start** (number) - Required - Start index in the buffer.
- **endIndex** (number) - Required - End index (exclusive).

### Response
#### Success Response
- **void** - No return value.

### Request Example
```typescript
onattribdata(start: number, endIndex: number): void
```

May be called multiple times for a single attribute if the value contains entities or special sequences.
```

--------------------------------

### oncdata Callback

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Called for CDATA section content. Provides the start and end indices and an offset.
```typescript
oncdata(start: number, endIndex: number, offset: number): void
```

--------------------------------

### onparserinit(parser)

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-handler.md

Fires when the parser is initialized or reset. Useful for resetting handler state.

```APIDOC
## onparserinit(parser)

### Description
Fires when the parser is initialized or reset. Useful for resetting handler state.

### Method Signature
```typescript
onparserinit(parser: Parser): void
```

### Parameters
#### Path Parameters
- **parser** (Parser) - Required - The parser instance that was initialized.

### Example
```typescript
import { Parser } from "htmlparser2";

const handler = {
    onparserinit(parser) {
        console.log("Parser initialized");
    },
};
const parser = new Parser(handler);
```
```

--------------------------------

### oncomment

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Callback invoked when comment content is identified. It receives the start and end indices of the comment, and an offset.

```APIDOC
## oncomment(start, endIndex, offset)

### Description
Comment content identified.

### Method
Callback

### Parameters
#### Path Parameters
- **start** (number) - Required - Start index in the buffer.
- **endIndex** (number) - Required - End index (exclusive).
- **offset** (number) - Required - Offset within the buffer.

### Response
#### Success Response
- **void** - No return value.

### Request Example
```typescript
oncomment(start: number, endIndex: number, offset: number): void
```
```

--------------------------------

### Initialize WebWritableStream

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/INDEX.md

Adapter for the Web Streams API, enabling HTML parsing within modern web stream environments. Requires a Handler and optional ParserOptions.
```typescript
new WebWritableStream(cbs: Handler, options?: ParserOptions)
```

--------------------------------

### onattribname

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Callback invoked when an attribute name is identified. It receives the start and end indices of the attribute name.

```APIDOC
## onattribname(start, endIndex)

### Description
Attribute name identified.

### Method
Callback

### Parameters
#### Path Parameters
- **start** (number) - Required - Start index in the buffer.
- **endIndex** (number) - Required - End index (exclusive).

### Response
#### Success Response
- **void** - No return value.

### Request Example
```typescript
onattribname(start: number, endIndex: number): void
```
```

--------------------------------

### Configuration Options

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/COMPLETION_SUMMARY.txt

Details on the various options available for configuring the parser.

```APIDOC
## Parser Options

### Description
Configuration options that can be passed to the Parser constructor or entry point functions.

### Options
- **xmlMode** (boolean) - If true, operates in XML mode.
- **decodeEntities** (boolean) - If true, decodes HTML entities.
- **lowerCaseTags** (boolean) - If true, converts tag names to lowercase.
- **lowerCaseAttributeNames** (boolean) - If true, converts attribute names to lowercase.
- **recognizeCDATA** (boolean) - If true, recognizes CDATA sections.
- **recognizeSelfClosing** (boolean) - If true, recognizes self-closing tags.
- **Tokenizer** (custom class) - Allows providing a custom tokenizer.
- **[Other 4 configuration profiles]**
```

--------------------------------

### onclosetag

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Callback invoked when a closing tag is identified. It receives the start and end indices of the tag name.

```APIDOC
## onclosetag(start, endIndex)

### Description
Closing tag identified.

### Method
Callback

### Parameters
#### Path Parameters
- **start** (number) - Required - Start index in the buffer.
- **endIndex** (number) - Required - End index (exclusive).

### Response
#### Success Response
- **void** - No return value.

### Request Example
```typescript
onclosetag(start: number, endIndex: number): void
```
```

--------------------------------

### onprocessinginstruction(name, data)

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-handler.md

Callback for processing instructions. Handles XML processing instructions and DOCTYPE declarations.

```APIDOC
## onprocessinginstruction(name, data)

### Description
Fires for processing instructions. Called for XML processing instructions (e.g., `<?xml version="1.0"?>`) and DOCTYPE declarations (e.g., `<!DOCTYPE html>`).

### Parameters
#### Path Parameters
- **name** (string) - Required - Name of the instruction (e.g., `?xml`).
- **data** (string) - Required - Raw content of the instruction.

#### Query Parameters
- None

#### Request Body
- None

### Method
Callback

### Endpoint
N/A

### Request Example
```typescript
const handler = {
    onprocessinginstruction(name, data) {
        if (name === "?xml") {
            console.log("XML declaration:", data);
        }
    },
};
```

### Response
#### Success Response (200)
- None

#### Response Example
- None
```

--------------------------------

### oncomment Callback

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Called when comment content is identified. Provides the start and end indices of the comment, and an offset.

```typescript
oncomment(start: number, endIndex: number, offset: number): void
```

--------------------------------

### onattribname Callback

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Called when an attribute name is identified. Provides the start and end indices of the attribute name.
```typescript
onattribname(start: number, endIndex: number): void
```

--------------------------------

### Tokenizer Instance Properties

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Details the properties of the tokenizer instance.

```APIDOC
## Instance Properties

### `running`
Indicates whether the tokenizer is currently processing tokens. Set to `false` when `pause()` is called, `true` when `resume()` is called.

```typescript
running: boolean
```
```

--------------------------------

### onclosetag Callback

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Called when a closing tag is identified. Provides the start and end indices of the tag name.

```typescript
onclosetag(start: number, endIndex: number): void
```

--------------------------------

### resume()

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Resumes tokenization after it has been paused, setting the `running` property back to `true` and continuing processing from where it left off.

```APIDOC
### `resume()`
Resumes tokenization after being paused.

```typescript
resume(): void
```

**Description:** Resumes processing from where it was paused. Sets `running` back to `true`.
```

--------------------------------

### Parser Constructor

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/README.md

Instantiates a new Parser. You can provide optional callbacks and options to customize the parsing behavior.

```APIDOC
## Parser Constructor

### Description
Instantiates a new Parser. You can provide optional callbacks and options to customize the parsing behavior.

### Signature
```typescript
new Parser(cbs?, options?)
```

### Parameters
- `cbs` (object) - Handler callbacks (optional)
- `options` (object) - Parser options (optional)

### Options
- `xmlMode` (boolean) - Set to true for XML parsing. Defaults to false.
- `decodeEntities` (boolean) - Set to true to decode HTML entities. Defaults to true.
- `lowerCaseTags` (boolean) - Set to true to convert tag names to lowercase. Defaults to true.
- `lowerCaseAttributeNames` (boolean) - Set to true to convert attribute names to lowercase. Defaults to true.
- `recognizeCDATA` (boolean) - Set to true to recognize CDATA sections. Defaults to false.
- `recognizeSelfClosing` (boolean) - Set to true to recognize self-closing tags. Defaults to false in HTML mode; self-closing tags are always recognized when `xmlMode` is true.
- `Tokenizer` (class) - Allows providing a custom tokenizer.
```

--------------------------------

### Entry Point Functions

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/COMPLETION_SUMMARY.txt

These are the primary functions exposed by the htmlparser2 module for parsing HTML content.

```APIDOC
## parseDocument()

### Description
Parses an HTML string and returns a DOM structure.

### Method
```
function parseDocument(html: string, options?: ParserOptions): DOMNode
```

### Parameters
- **html** (string) - Required - The HTML string to parse.
- **options** (ParserOptions) - Optional - Configuration options for the parser.
```

```APIDOC
## createDocumentStream()

### Description
Creates a parser instance that can be used with DOM handlers, suitable for stream processing.

### Method
```
function createDocumentStream(options?: ParserOptions): { parser: Parser, dom: DOMHandler }
```

### Parameters
- **options** (ParserOptions) - Optional - Configuration options for the parser.
```

```APIDOC
## parseFeed()

### Description
Parses RSS, Atom, or RDF feed content into a structured object.

### Method
```
function parseFeed(feed: string, options?: ParserOptions): Feed
```

### Parameters
- **feed** (string) - Required - The feed content string.
- **options** (ParserOptions) - Optional - Configuration options for the parser.
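
### Example
A minimal usage sketch for `parseFeed()`, not taken from the source documentation. The RSS document below is a made-up sample, and the null check is a defensive assumption in case the input is not recognized as a feed.

```typescript
import { parseFeed } from "htmlparser2";

// Hypothetical RSS document, used only for illustration.
const rss = `<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item>
      <title>First post</title>
      <link>https://example.com/1</link>
    </item>
  </channel>
</rss>`;

const feed = parseFeed(rss);

// Defensive check (assumption): bail out if no feed was recognized.
if (feed) {
    console.log(feed.type, feed.title, feed.items.length);
}
```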
```

--------------------------------

### ontext

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Callback invoked when text content is identified. It receives the start and end indices of the text within the buffer.

```APIDOC
## ontext(start, endIndex)

### Description
Text content identified.

### Method
Callback

### Parameters
#### Path Parameters
- **start** (number) - Required - Start index in the buffer.
- **endIndex** (number) - Required - End index (exclusive).

### Response
#### Success Response
- **void** - No return value.

### Request Example
```typescript
ontext(start: number, endIndex: number): void
```
```

--------------------------------

### Create and Use WebWritableStream with fetch

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-writable-streams.md

Instantiates a WebWritableStream with handler callbacks and pipes the response body from a fetch request to it for parsing. Requires a runtime with Web Streams API support.

```typescript
import { WebWritableStream } from "htmlparser2/WebWritableStream";

const stream = new WebWritableStream({
    onopentag(name, attribs) {
        console.log("Opened:", name);
    },
    ontext(text) {
        console.log("Text:", text);
    },
});

const response = await fetch("https://example.com");
await response.body.pipeTo(stream);
console.log("Fetch and parse complete");
```

--------------------------------

### ontext Callback

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Called when text content is identified. Provides the start and end indices of the text within the buffer.

```typescript
ontext(start: number, endIndex: number): void
```

--------------------------------

### Initialize WritableStream

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/INDEX.md

Adapter for Node.js streams, allowing HTML parsing to be integrated with stream pipelines. Requires a Handler and optional ParserOptions.
```typescript
new WritableStream(cbs: Handler, options?: ParserOptions)
```

--------------------------------

### Basic HTML Parsing with Callbacks

Source: https://github.com/fb55/htmlparser2/blob/master/README.md

Demonstrates how to use the htmlparser2 Parser with a callback interface to process HTML content. Handles opening tags, text content, and closing tags. Note that text events may need to be aggregated.

```javascript
import * as htmlparser2 from "htmlparser2";

const parser = new htmlparser2.Parser({
    onopentag(name, attributes) {
        /*
         * This fires when a new tag is opened.
         *
         * If you don't need an aggregated `attributes` object,
         * have a look at the `onopentagname` and `onattribute` events.
         */
        if (name === "script" && attributes.type === "text/javascript") {
            console.log("JS! Hooray!");
        }
    },
    ontext(text) {
        /*
         * Fires whenever a section of text was processed.
         *
         * Note that this can fire at any point within text and you might
         * have to stitch together multiple pieces.
         */
        console.log("-->", text);
    },
    onclosetag(tagname) {
        /*
         * Fires when a tag is closed.
         *
         * You can rely on this event only firing when you have received an
         * equivalent opening tag before. Closing tags without corresponding
         * opening tags will be ignored.
         */
        if (tagname === "script") {
            console.log("That's it?!");
        }
    },
});
parser.write(
    "Xyz <script type='text/javascript'>const foo = '<<bar>>';</script>",
);
parser.end();
```

--------------------------------

### onopentag(name, attribs, isImplied)

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-handler.md

Fires when an opening tag is encountered, after all its attributes have been parsed. This is the main event for processing elements.

```APIDOC
## onopentag(name, attribs, isImplied)

### Description
Fires when an opening tag is encountered, after all its attributes have been parsed. This is the main event for processing elements.

### Method Signature
```typescript
onopentag(
    name: string,
    attribs: { [s: string]: string },
    isImplied: boolean
): void
```

### Parameters
#### Path Parameters
- **name** (string) - Required - The tag name (lowercased in HTML mode if `lowerCaseTags` is enabled).
- **attribs** ({ [s: string]: string }) - Required - Object mapping attribute names to values. Empty if no attributes.
- **isImplied** (boolean) - Required - `true` if the tag was opened implicitly (HTML mode only), `false` if explicit.

### Example
```typescript
const handler = {
    onopentag(name, attribs, isImplied) {
        console.log(`<${name}>`, attribs, isImplied ? "(implicit)" : "");
        if (name === "a") {
            console.log("Link href:", attribs.href);
        }
    },
};
```
```

--------------------------------

### Example Usage of QuoteType

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/types.md

Demonstrates how to check the quote type in the onattribute callback. The `quote` parameter can be undefined if the attribute has no value.

```typescript
import { QuoteType } from "htmlparser2";

const handler = {
    onattribute(name, value, quote) {
        if (quote === QuoteType.Double) {
            console.log(`Double-quoted: ${name}="${value}"`);
        } else if (quote === undefined) {
            console.log(`No value: ${name}`);
        }
    },
};
```

--------------------------------

### Streaming with Pause and Resume

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/INDEX.md

Shows how to use the pause and resume methods for backpressure management in a streaming context, allowing control over parsing flow.

```typescript
parser.pause();
setTimeout(() => parser.resume(), 1000);
```

--------------------------------

### API User Reading Path

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/STRUCTURE.md

This path is for users who need to interact with the library's API, covering parser, handler, entry points, types, and configuration details.
```markdown
api-reference-parser.md
↓
api-reference-handler.md
↓
api-reference-entry-points.md
↓
types.md
↓
configuration.md
```

--------------------------------

### Disable Entity Decoding

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/quick-start.md

Prevent automatic decoding of HTML entities by setting `decodeEntities: false`. For example, `&amp;` will remain as is.

```typescript
new Parser(handler, { decodeEntities: false });
```

--------------------------------

### Parser Constructor

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-parser.md

Initializes a new instance of the Parser class. It accepts an optional callback object for handling parsing events and an optional configuration object for parser options. The parser begins processing input only after the `write()` method is invoked.

```APIDOC
## new Parser(cbs?, options?)

### Description
Creates a new parser instance with optional callbacks and configuration. The parser will not begin processing until `write()` is called.

### Method
Constructor

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Request Example
```typescript
import { Parser } from "htmlparser2";

const parser = new Parser({
    onopentag(name, attribs) {
        console.log(`Found opening tag: ${name}`);
    },
    ontext(text) {
        console.log(`Found text: ${text}`);
    },
    onclosetag(name) {
        console.log(`Found closing tag: ${name}`);
    },
});

parser.write("<div>Hello</div>");
parser.end();
```

### Response
#### Success Response (Constructor)
N/A

#### Response Example
N/A
```

--------------------------------

### Parser Constructor

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/README.md

Instantiate the Parser with optional callbacks and options. Use options to configure parsing behavior like XML mode or entity decoding.

```typescript
new Parser(cbs?, options?)
```

--------------------------------

### onopentagend

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Callback invoked when an opening tag is complete. It receives the end index of the tag.

```APIDOC
## onopentagend(endIndex)

### Description
Opening tag complete.

### Method
Callback

### Parameters
#### Path Parameters
- **endIndex** (number) - Required - End index of the opening tag.

### Response
#### Success Response
- **void** - No return value.

### Request Example
```typescript
onopentagend(endIndex: number): void
```
```

--------------------------------

### ontext(data)

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/event-reference.md

Fired when text content is encountered. This event can fire multiple times for a single text node, and multiple calls should be concatenated to get the full text.

```APIDOC
## ontext(data)

### Description
Fired when text content is encountered. This event can fire multiple times for a single text node, and multiple calls should be concatenated to get the full text. Entities are decoded if `decodeEntities` is true.

### Method
`ontext(data: string): void`

### Parameters
#### Parameters
- **data** (string) - Required - Text content. Entities are decoded if `decodeEntities: true`.

### Example
```typescript
import { Parser } from "htmlparser2";

let text = "";
const handler = {
    ontext(data) {
        text += data; // Accumulate
    },
    onend() {
        console.log("Full text:", text);
    },
};
const parser = new Parser(handler);
parser.write("<p>Hello ");
parser.write("&amp; ");
parser.write("goodbye</p>");
parser.end();
// Output: Full text: Hello & goodbye
```
```

--------------------------------

### Parser Initialization

Source: https://github.com/fb55/htmlparser2/wiki/Parser-options

The htmlparser.Parser can be initialized with a handler object and an optional options object to customize parsing behavior.

```APIDOC
## Parser Initialization

### Description
Instantiate a new HTML parser with a handler and optional configuration options.

### Method
`new htmlparser.Parser(handler, options)`

### Parameters
* `handler` (Object) - An object containing callback functions for parsing events.
* `options` (Object) - Optional. An object containing configuration options for the parser.
```

--------------------------------

### Find Element by ID using DomUtils

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/README.md

This example shows how to find a specific DOM element by its ID after parsing an HTML string. It uses the `parseDocument` function and the `DomUtils.getElementById` utility.

```typescript
import { parseDocument, DomUtils } from "htmlparser2";

const dom = parseDocument(html);
const element = DomUtils.getElementById("target", dom);
```

--------------------------------

### Initialize Parser Event

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/event-reference.md

Fired when the parser is created or reset. Use this to initialize handler state or store a reference to the parser instance.

```typescript
const handler = {
    onparserinit(parser) {
        console.log("Parser ready");
        // Reset handler state
        this.elements = [];
    },
};
```

--------------------------------

### onattribdata Callback

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-tokenizer.md

Called for attribute value data. May be invoked multiple times for a single attribute if its value contains entities or special sequences. Provides start and end indices.
```typescript
onattribdata(start: number, endIndex: number): void
```

--------------------------------

### resume()

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-parser.md

Resumes parsing after being paused. If more data was buffered while paused, it will be processed.

```APIDOC
## resume()

### Description
Resumes parsing after a previous call to `pause()`. If more data was buffered while paused, it will be processed.

### Method
resume

### Parameters
#### Path Parameters
- None

#### Query Parameters
- None

#### Request Body
- None

### Request Example
```typescript
parser.resume();
```

### Response
#### Success Response (void)
- Returns void.

#### Response Example
- None
```

--------------------------------

### Custom Void Element Handling

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-parser.md

Example of extending the Parser class to add custom void elements. This is useful for frameworks or specific HTML dialects that introduce new self-closing tags.

```typescript
import { Parser } from "htmlparser2";

class CustomParser extends Parser {
    protected isVoidElement(name: string): boolean {
        if (name === "custom-void") return true;
        return super.isVoidElement(name);
    }
}
```

--------------------------------

### Advanced User Reading Path

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/STRUCTURE.md

This path is for advanced users interested in the internal architecture, tokenizer, and type definitions, including callback interfaces.

```markdown
module-overview.md
↓
api-reference-tokenizer.md
↓
types.md (Callbacks interface)
```

--------------------------------

### Documentation Directory Structure

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/STRUCTURE.md

This tree displays the organization of the documentation files within the project. It serves as a map to locate specific documentation topics.
```markdown
documentation/
│
├── README.md                    ← START HERE
│   Overview, quick reference, key concepts
│
├── INDEX.md                     ← NAVIGATION
│   Complete index and cross-reference guide
│
├── STRUCTURE.md                 ← THIS FILE
│   Documentation organization
│
├── quick-start.md               ← TUTORIALS
│   Practical examples and patterns
│
├── api-reference-*.md           ← API DOCS
│   ├── api-reference-parser.md
│   ├── api-reference-handler.md
│   ├── api-reference-entry-points.md
│   ├── api-reference-writable-streams.md
│   └── api-reference-tokenizer.md
│
├── types.md                     ← TYPES
│   All exported type definitions
│
├── configuration.md             ← OPTIONS
│   All parser configuration options
│
├── event-reference.md           ← EVENTS
│   Complete event catalog
│
├── errors.md                    ← ERROR HANDLING
│   Error conditions and patterns
│
└── module-overview.md           ← ARCHITECTURE
    Module structure and data flow
```

--------------------------------

### Close Tag Handler with Stack Management

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/api-reference-handler.md

Implement the `onclosetag` callback to handle closing tags, including implicitly closed tags. This example uses a stack to check that tags are properly nested.

```typescript
// Names of the elements that are currently open, innermost last.
const stack: string[] = [];
const handler = {
    onopentag(name: string) {
        stack.push(name);
    },
    onclosetag(name: string, isImplied: boolean) {
        // The most recently opened tag should be the one being closed.
        const popped = stack.pop();
        if (popped !== name) {
            console.warn(`Mismatch: expected </${popped}>, got </${name}>`);
        }
    },
};
```

--------------------------------

### Parser Class

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/COMPLETION_SUMMARY.txt

Documentation for the core Parser class, including its constructor and methods.

```APIDOC
## Parser Class

### Description
The main class for parsing HTML. It processes HTML input and emits events via a handler.

### Constructor
```
new Parser(handler: Handler, options?: ParserOptions)
```

### Methods
- **reset()**: Resets the parser to its initial state.
- **write(chunk: string)**: Writes a chunk of HTML to the parser.
- **end()**: Signals the end of the HTML input.
- **[Other 4 methods]**

### Properties
- **[Property 1]**
- **[Property 2]**

### Protected Methods
- **[Protected Method 1]**
```

--------------------------------

### Default HTML Parser Configuration

Source: https://github.com/fb55/htmlparser2/blob/master/_autodocs/quick-start.md

Instantiate the parser with a handler for default HTML parsing, which includes lowercasing tags and decoding entities.

```typescript
new Parser(handler);
```

--------------------------------

### Parser Options

Source: https://github.com/fb55/htmlparser2/wiki/Parser-options

Configuration options to customize the HTML parsing process.

```APIDOC
## Parser Options

### `xmlMode`
* **Type**: `boolean`
* **Default**: `false`
* **Description**: Indicates whether special tags (`