### Inline Code Example Source: https://github.com/dfop02/html4docx/blob/main/tests/assets/htmls/code.html This snippet shows a single line of code enclosed in backticks, intended for inline display within text. ```code This is a code block. That should be NOT be pre-formatted. It should NOT retain carriage returns, or all white space. or blank lines. Tabs tabs tabs tabs spac spac spac spac ``` -------------------------------- ### Run html4docx with BeautifulSoup HTML fixing Source: https://context7.com/dfop02/html4docx/llms.txt This command runs the html4docx converter with BeautifulSoup enabled for HTML fixing. Ensure you have html4docx installed via pip. ```bash python -m html4docx.h4d input.html output_report --bs ``` -------------------------------- ### Initialize HtmlToDocx Parser Source: https://context7.com/dfop02/html4docx/llms.txt Create an instance of the parser. Sensible defaults are applied if parameters are omitted. Custom style maps and tag overrides can be provided for advanced styling. ```python from html4docx import HtmlToDocx # Minimal — use all defaults parser = HtmlToDocx() # With CSS-class-to-Word-style mapping and tag overrides style_map = { 'code-block': 'Code Block', 'finding-critical': 'Finding Critical', } tag_overrides = { 'h1': 'Custom Heading 1', 'pre': 'Code Block', } parser = HtmlToDocx( style_map=style_map, tag_style_overrides=tag_overrides, default_paragraph_style='Body Text', ) ``` -------------------------------- ### Clone Parser Settings Source: https://context7.com/dfop02/html4docx/llms.txt Demonstrates using `HtmlToDocx.copy_settings_from` to propagate parser configurations like table style, `style_map`, `tag_style_overrides`, and `default_paragraph_style` from one parser instance to another. This is useful for reusing configurations across multiple documents. 
```python from docx import Document from html4docx import HtmlToDocx # Master parser with full configuration master_parser = HtmlToDocx( style_map={'highlight': 'Intense Quote'}, tag_style_overrides={'h1': 'Title'}, default_paragraph_style='Body Text', ) master_parser.table_style = 'Light Grid' # Child parser shares settings child_parser = HtmlToDocx() child_parser.copy_settings_from(master_parser) doc = Document() child_parser.add_html_to_document('

Copied Settings

', doc) doc.save('copied_settings.docx') ``` -------------------------------- ### Command-Line HTML to DOCX Conversion Source: https://context7.com/dfop02/html4docx/llms.txt Shows how to use the `h4d` module directly from the command line for basic HTML to DOCX file conversions. You can specify input and output filenames. ```bash # Basic conversion python -m html4docx.h4d input.html # Specify output filename (extension .docx added automatically) python -m html4docx.h4d input.html output_report ``` -------------------------------- ### HtmlToDocx.__init__ Source: https://context7.com/dfop02/html4docx/llms.txt Constructor for the HtmlToDocx class. Allows for optional customization of CSS class-to-Word style mappings, HTML tag overrides, and the default paragraph style. ```APIDOC ## HtmlToDocx.__init__ — Constructor with optional style customization Create a parser instance. All three parameters are optional; omitting them applies sensible defaults (`Normal` paragraph style, no custom class/tag mappings). ```python from html4docx import HtmlToDocx # Minimal — use all defaults parser = HtmlToDocx() # With CSS-class-to-Word-style mapping and tag overrides style_map = { 'code-block': 'Code Block', 'finding-critical': 'Finding Critical', } tag_overrides = { 'h1': 'Custom Heading 1', 'pre': 'Code Block', } parser = HtmlToDocx( style_map=style_map, tag_style_overrides=tag_overrides, default_paragraph_style='Body Text', ) ``` ``` -------------------------------- ### Convert HTML File to DOCX File Source: https://context7.com/dfop02/html4docx/llms.txt Reads an HTML file, converts its content, and saves the result as a DOCX file. An alternative encoding can be specified for legacy files. If the output filename is omitted, a default name is generated. 
```python from html4docx import HtmlToDocx parser = HtmlToDocx() # Basic usage — output saved alongside input file parser.parse_html_file('report.html', 'output/report.docx') # With explicit encoding parser.parse_html_file('legacy_report.html', 'output/legacy_report.docx', encoding='latin-1') # Output filename is optional; defaults to new_docx_file_ in same directory parser.parse_html_file('report.html', None) # Saves as: new_docx_file_report.html (alongside report.html) ``` -------------------------------- ### Convert HTML with Lists to DOCX Source: https://context7.com/dfop02/html4docx/llms.txt Demonstrates converting HTML with nested unordered and ordered lists to a DOCX document. Nesting beyond level 3 is capped at level 3. Each new top-level ordered list resets its counter independently. ```python from html4docx import HtmlToDocx html = """
  1. First ordered item
  2. Second ordered item
    1. Sub-step A
    2. Sub-step B
  3. Third ordered item
  1. New list resets to 1
  2. Item 2
"""
parser = HtmlToDocx()
doc = parser.parse_html_string(html)
doc.save('lists.docx')
# Note: Nesting beyond level 3 is capped at level 3.
```

--------------------------------

### Embed Images from URLs, Local Files, and Base64

Source: https://context7.com/dfop02/html4docx/llms.txt

The `<img>` tag supports remote URLs (with a 5-second timeout), local file paths, and inline base64-encoded images. Use `width` and `height` attributes for dimensions and the `style` attribute for alignment (e.g., `display: block; margin-left: auto; margin-right: auto;` for centering, `float: right;` for right-alignment). If an image cannot be processed, placeholder text is inserted.

```python
from html4docx import HtmlToDocx

html = """

See diagram: for details.

"""
parser = HtmlToDocx()
doc = parser.parse_html_string(html)
doc.save('images.docx')
# If an image cannot be fetched/found, a placeholder text is inserted.
```

--------------------------------

### Create Hyperlinks and Internal Anchors

Source: https://context7.com/dfop02/html4docx/llms.txt

Use the `<a>` tag to create external hyperlinks (blue, underlined, opens in browser) or internal anchor links that navigate within the document. Elements with an `id` attribute automatically generate Word bookmarks, which internal links can target.

```python
from html4docx import HtmlToDocx

html = """

Introduction

Visit our website at example.com for more information.

Jump to the introduction above.

"""
parser = HtmlToDocx()
doc = parser.parse_html_string(html)
doc.save('links.docx')
# External links: blue (#0000EE) underlined text with optional tooltip
# Internal anchors: href="#bookmark-id" links to elements with matching id=""
```

--------------------------------

### Configure Conversion Options

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Customize the HTML to DOCX conversion process by enabling or disabling features such as images, tables, styles, and HTML fixing. The options are set as boolean values in the parser's `options` dictionary.

```python
from html4docx import HtmlToDocx

parser = HtmlToDocx()
parser.options['images'] = False         # Default True
parser.options['tables'] = False         # Default True
parser.options['styles'] = False         # Default True
parser.options['fix-html'] = False       # Default True
parser.options['html-comments'] = False  # Default False
parser.options['style-map'] = False      # Default True
parser.options['tag-override'] = False   # Default True
docx = parser.parse_html_string(input_html_file_string)
```

--------------------------------

### Apply Semantic Inline Formatting in DOCX

Source: https://context7.com/dfop02/html4docx/llms.txt

Illustrates mapping of HTML semantic inline tags like `<strong>`, `<em>`, `<u>`, `<ins>`, `<s>`, `<del>`, `<mark>`, `<sub>`, `<sup>`, `<code>`, and `<pre>` to Word character formatting. `<mark>` applies a yellow background highlight, and `<code>`/`<pre>` use Courier font.

```python
from html4docx import HtmlToDocx

html = """

Bold, also bold, italic, also italic, underlined, inserted (underlined), strikethrough, deleted (strikethrough), highlighted in yellow, H2O and E=mc2.

Inline code snippet uses Courier font.

def block_code():
    return "pre block also uses Courier"
""" parser = HtmlToDocx() doc = parser.parse_html_string(html) doc.save('inline_tags.docx') ``` -------------------------------- ### Inline Code Element Source: https://github.com/dfop02/html4docx/blob/main/tests/assets/htmls/code.html Demonstrates the use of inline code formatting for short code fragments within a sentence. ```text code ``` -------------------------------- ### Set Default Paragraph Style Source: https://github.com/dfop02/html4docx/blob/main/README.md Configure the default paragraph style for the document. Use 'Body' for the default behavior or None to use Word's 'Normal' style. ```python # Use 'Body' as default (default behavior) parser = HtmlToDocx(default_paragraph_style='Body') # Use Word's default 'Normal' style parser = HtmlToDocx(default_paragraph_style=None) ``` -------------------------------- ### Use Custom Styles from a Word Template Source: https://github.com/dfop02/html4docx/blob/main/README.md Apply custom styles defined in a Word template (.docx) by passing the template document to the HtmlToDocx parser. Save the output document to preserve these styles. ```python from docx import Document from html4docx import HtmlToDocx doc = Document("path/to/template.docx") # template has Code Block, Custom Markdown, etc. parser = HtmlToDocx(tag_style_overrides={"code": "Custom Markdown", "pre": "Code Block"}) parser.add_html_to_document(html, doc) doc.save("output.docx") # save the template-based doc so custom styles are preserved ``` -------------------------------- ### Map CSS Classes to Word Styles Source: https://context7.com/dfop02/html4docx/llms.txt Use the `style_map` parameter to map HTML classes to specific Word paragraph styles. This requires a document template that defines these styles. 
```python from docx import Document from html4docx import HtmlToDocx style_map = { 'note': 'Quote', 'warning': 'Intense Quote', 'code-block': 'No Spacing', } doc = Document('path/to/branded_template.docx') parser = HtmlToDocx(style_map=style_map) html = """

This is a note paragraph.

Warning: data loss may occur.

def hello(): pass
"""
parser.add_html_to_document(html, doc)
doc.save('styled_output.docx')
```

--------------------------------

### Save Document to Path or BytesIO

Source: https://context7.com/dfop02/html4docx/llms.txt

Saves the underlying document to a file path or an in-memory BytesIO buffer. The '.docx' extension is automatically appended if not present.

```python
from io import BytesIO
from docx import Document
from html4docx import HtmlToDocx
```

--------------------------------

### Save Document to File or Buffer

Source: https://context7.com/dfop02/html4docx/llms.txt

Demonstrates saving a generated Word document to a file path or an in-memory BytesIO buffer. The buffer can be used for web responses.

```python
from io import BytesIO

from docx import Document  # required for Document() below
from html4docx import HtmlToDocx

document = Document()
parser = HtmlToDocx()
parser.add_html_to_document('

Hello

', document) parser.save('output/hello') # Saves as output/hello.docx buffer = BytesIO() document2 = Document() parser2 = HtmlToDocx() parser2.add_html_to_document('

Report content

', document2) parser2.save(buffer) buffer.seek(0) ``` -------------------------------- ### Configure Parser Options Source: https://context7.com/dfop02/html4docx/llms.txt Control parser behavior by modifying the `options` dictionary. Disable image embedding, table rendering, and style application for plain text output, or enable HTML comment rendering. ```python from html4docx import HtmlToDocx parser = HtmlToDocx() # Default values: # parser.options['images'] = True — embed images # parser.options['tables'] = True — render HTML tables # parser.options['styles'] = True — apply CSS styles # parser.options['fix-html'] = True — run BeautifulSoup HTML cleanup # parser.options['html-comments'] = False — render as visible green text # parser.options['style-map'] = True — apply CSS-class → Word style mapping # parser.options['tag-override'] = True — apply tag → Word style overrides # Strip all styling and images for a plain-text-only docx parser.options['images'] = False parser.options['styles'] = False parser.options['tables'] = False # Render HTML comments as visible italic green text parser.options['html-comments'] = True doc = parser.parse_html_string('

Content

') doc.save('plain.docx') ``` -------------------------------- ### Read and Set Document Metadata Source: https://context7.com/dfop02/html4docx/llms.txt Explains how to access and modify the document's built-in metadata (author, title, subject, etc.) using the `parser.metadata` property. Invalid revision or datetime strings will print a warning and be skipped. ```python from docx import Document from html4docx import HtmlToDocx document = Document() parser = HtmlToDocx() parser.set_initial_attrs(document) metadata = parser.metadata # Read all metadata as a dict props = metadata.get_metadata() print(props.get('author')) # e.g., '' (empty on new document) print(props.get('created')) # datetime object # Print all metadata to stdout as formatted JSON metadata.get_metadata(print_result=True) # Set metadata fields metadata.set_metadata( author='Jane Smith', title='Q4 Financial Report', subject='Finance', keywords='finance, quarterly, 2024', description='Official Q4 2024 report', revision='3', created='2024-01-01T00:00:00', modified='2024-12-31T23:59:59', ) parser.add_html_to_document('

Q4 Report

Content here.

', document) document.save('q4_report.docx') # Invalid revision (non-integer) or datetime string prints a warning and skips that field. ``` -------------------------------- ### Apply Table Styles Source: https://github.com/dfop02/html4docx/blob/main/README.md Set a specific table style for all tables converted from HTML. The 'table_style' attribute must be set on the parser instance before conversion. Supported styles can be found in the python-docx documentation. ```python from html4docx import HtmlToDocx parser = HtmlToDocx() parser.table_style = 'Light Shading Accent 4' docx = parser.parse_html_string(input_html_file_string) ``` ```python parser.table_style = 'Table Grid' ``` -------------------------------- ### Add HTML to Existing DOCX Document Source: https://github.com/dfop02/html4docx/blob/main/README.md Use this method to add HTML-formatted content to an existing .docx document. Requires the HtmlToDocx parser and a filename for the output. ```python from html4docx import HtmlToDocx parser = HtmlToDocx() html_string = '

Hello world

' parser.add_html_to_document(html_string, filename_docx) ``` -------------------------------- ### Apply Word Table Style Source: https://context7.com/dfop02/html4docx/llms.txt Set the `table_style` attribute to apply a specific Word table style to all tables generated from HTML. Ensure the style exists in your document template. ```python from html4docx import HtmlToDocx parser = HtmlToDocx() parser.table_style = 'Table Grid' # Bordered grid style html = """
NameScore
Alice95
Bob87
""" doc = parser.parse_html_string(html) doc.save('scores.docx') ``` -------------------------------- ### HtmlToDocx.parse_html_file Source: https://context7.com/dfop02/html4docx/llms.txt Reads an HTML file from disk, converts its content, and saves the result as a .docx file. An optional encoding parameter can be specified for handling different file encodings. ```APIDOC ## HtmlToDocx.parse_html_file — Convert an HTML file to a `.docx` file Reads an HTML file from disk, converts it, and saves the result as a `.docx` file. Supports specifying an alternative encoding. ```python from html4docx import HtmlToDocx parser = HtmlToDocx() # Basic usage — output saved alongside input file parser.parse_html_file('report.html', 'output/report.docx') # With explicit encoding parser.parse_html_file('legacy_report.html', 'output/legacy_report.docx', encoding='latin-1') # Output filename is optional; defaults to new_docx_file_ in same directory parser.parse_html_file('report.html', None) # Saves as: new_docx_file_report.html (alongside report.html) ``` ``` -------------------------------- ### Apply Inline CSS for Text and Paragraph Properties Source: https://context7.com/dfop02/html4docx/llms.txt Use inline style attributes on HTML elements to control typography, color, spacing, and decoration for runs or paragraph formats in the DOCX. Supported properties include text-align, line-height, margin-left, font-family, font-size, color, font-weight, text-indent, text-decoration, background-color, font-style, and text-transform. ```python from html4docx import HtmlToDocx html = """

Centered, spaced heading-like paragraph

First line indented with wavy red underline and yellow highlight.

Blue italic paragraph using RGB color.

this text will be uppercased in courier via serif generic mapping.

"""
parser = HtmlToDocx()
doc = parser.parse_html_string(html)
doc.save('styled_text.docx')
```

--------------------------------

### Convert HTML File Directly to DOCX

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Convert an HTML file to a DOCX file using the `parse_html_file` method. You can specify the input HTML file path, the output DOCX file path, and optionally the file encoding (defaults to 'utf-8').

```python
from html4docx import HtmlToDocx

parser = HtmlToDocx()
parser.parse_html_file(input_html_file_path, output_docx_file_path)
# You can also pass an encoding; the default is utf-8
parser.parse_html_file(input_html_file_path, output_docx_file_path, 'utf-8')
```

--------------------------------

### Incrementally Add HTML to Document and Save

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Add multiple HTML snippets to a document incrementally. Each snippet is appended to the end of the document. Saving can be done with either python-docx's `document.save()` or html4docx's `parser.save()`.

```python
from docx import Document
from html4docx import HtmlToDocx

document = Document()
parser = HtmlToDocx()

for part in ['First', 'Second', 'Third']:
    parser.add_html_to_document(f'

{part} Part

', document) parser.save('your_file_name.docx') ``` -------------------------------- ### Map CSS Classes to Word Styles Source: https://github.com/dfop02/html4docx/blob/main/README.md Define a mapping between HTML CSS classes and Word document styles to control the appearance of specific HTML elements. Pass the style map as an argument during HtmlToDocx instantiation or use add_html_to_document. ```python from html4docx import HtmlToDocx style_map = { 'code-block': 'Code Block', 'numbered-heading-1': 'Heading 1 Numbered', 'finding-critical': 'Finding Critical' } parser = HtmlToDocx(style_map=style_map) parser.add_html_to_document(html, document) ``` -------------------------------- ### Convert HTML String to a New Document Source: https://context7.com/dfop02/html4docx/llms.txt Parses an HTML string and returns a new python-docx Document object. This is suitable for one-shot conversions. ```python from html4docx import HtmlToDocx html = """

Product Specification

Model: X-500

Note: Subject to change without notice.

  1. Step one: Unbox the unit
  2. Step two: Connect to power
    1. Use the provided cable
    2. Verify LED indicator
"""
parser = HtmlToDocx()
doc = parser.parse_html_string(html)
doc.save('spec_sheet.docx')
# Returns: docx.document.Document
```

--------------------------------

### Manage Document Metadata

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Read and set document metadata such as author and creation date using the parser's metadata attributes. Available attributes can be found in the python-docx documentation.

```python
from docx import Document
from html4docx import HtmlToDocx

document = Document()
parser = HtmlToDocx()
parser.set_initial_attrs(document)
metadata = parser.metadata

# You can get metadata as a dict
metadata_json = metadata.get_metadata()
print(metadata_json['author'])  # prints the current author

# or just print all metadata if you want
metadata.get_metadata(print_result=True)

# Set new metadata
metadata.set_metadata(author="Jane", created="2025-07-18T09:30:00")

document.save('your_file_name.docx')
```

--------------------------------

### HtmlToDocx.save

Source: https://context7.com/dfop02/html4docx/llms.txt

Saves the underlying document to a specified path or a BytesIO buffer. If a path is provided, the '.docx' extension is automatically appended. This method is useful for file persistence or streaming document data.

```APIDOC
## HtmlToDocx.save: Save the document to a path or BytesIO buffer

Saves the underlying document either to a file path (`.docx` extension appended automatically) or to an in-memory `BytesIO` buffer for streaming use cases (e.g., HTTP responses).

```python
from io import BytesIO
from docx import Document
from html4docx import HtmlToDocx
```
```

--------------------------------

### Pre-formatted Text Block

Source: https://github.com/dfop02/html4docx/blob/main/tests/assets/htmls/code.html

This snippet represents a pre-formatted text block, retaining all whitespace and line breaks exactly as they appear in the source.

```text
This is a pre-formatted block. That should be pre-formatted. Retaining any carriage returns, and all white space. 
And blank lines. Tabs tabs tabs tabs spac spac spac spac ``` -------------------------------- ### Save Document to In-Memory Buffer Source: https://github.com/dfop02/html4docx/blob/main/README.md Utilize BytesIO to save the DOCX document in memory. This is useful for applications that need to handle the document data without writing to a physical file immediately. Remember to reset the buffer's position after saving if you intend to read from it. ```python from io import BytesIO from docx import Document from html4docx import HtmlToDocx buffer = BytesIO() document = Document() parser = HtmlToDocx() html_string = '

Hello world

'
parser.add_html_to_document(html_string, document)

# Save the document to the in-memory buffer
parser.save(buffer)

# If you need to read from the buffer again after saving,
# you might need to reset its position to the beginning
buffer.seek(0)
```

--------------------------------

### Style HTML Table Cells with CSS

Source: https://context7.com/dfop02/html4docx/llms.txt

Apply CSS properties to HTML table cells (`<td>`, `<th>`) for borders, background color, dimensions, and text alignment. Supported properties include border shorthand/longhand, background-color, width, height, color, and vertical-align. The 'Table Grid' style can be applied to the document for consistent table formatting.

```python
from html4docx import HtmlToDocx

html = """
Header A Header B
Top-aligned dashed cell Left accent border cell
Merged cell spanning 2 columns
""" parser = HtmlToDocx() parser.table_style = 'Table Grid' doc = parser.parse_html_string(html) doc.save('styled_table.docx') # Supported border keywords: thin (1px), medium (3px), thick (5px) # Supported border styles: solid, dashed, dotted, double, inset, outset ``` -------------------------------- ### Override Default Tag Styles Source: https://github.com/dfop02/html4docx/blob/main/README.md Customize the styles applied to specific HTML tags like 'h1' and 'pre'. Ensure the target styles exist in your Word document. ```python tag_overrides = { 'h1': 'Custom Heading 1', 'pre': 'Code Block' } parser = HtmlToDocx(tag_style_overrides=tag_overrides) ``` -------------------------------- ### Apply Inline CSS Styles Source: https://github.com/dfop02/html4docx/blob/main/README.md Utilize inline CSS styles directly within HTML tags for precise formatting of text and paragraphs. Supported properties include color, font-size, font-weight, and more. ```html

Red 14pt paragraph

Bold blue text
```

--------------------------------

### Convert HTML String Directly to DOCX

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Convert an HTML string into a DOCX document object using `parse_html_string`. The method returns the DOCX object, which can then be saved.

```python
from html4docx import HtmlToDocx

parser = HtmlToDocx()
docx = parser.parse_html_string(input_html_file_string)
```

--------------------------------

### Insert Page Breaks and Horizontal Rules in DOCX

Source: https://context7.com/dfop02/html4docx/llms.txt

Shows how to insert page breaks using CSS page-break properties and horizontal rules using the `<hr>` tag when converting HTML to DOCX. The `<hr>` tag renders as a paragraph-bottom border line.

```python
from html4docx import HtmlToDocx

html = """

Chapter 1

Content for chapter one.

Chapter 2

Content for chapter two starts on a new page.


Section below the horizontal rule.

Chapter 3

"""
parser = HtmlToDocx()
doc = parser.parse_html_string(html)
doc.save('paged_document.docx')
```

--------------------------------

### Override HTML Tag to Word Style Mapping

Source: https://context7.com/dfop02/html4docx/llms.txt

Use the `tag_style_overrides` parameter to replace default tag-to-style mappings with custom Word styles. This is useful for structural tags like headings and preformatted text.

```python
from docx import Document
from html4docx import HtmlToDocx

tag_overrides = {
    'h1': 'Report Title',
    'h2': 'Section Header',
    'pre': 'Code Block',
}

doc = Document('template_with_custom_styles.docx')
parser = HtmlToDocx(tag_style_overrides=tag_overrides)

html = """

Executive Summary

Background

SELECT * FROM reports WHERE year = 2024;
""" parser.add_html_to_document(html, doc) doc.save('executive_summary.docx') ``` -------------------------------- ### Add HTML to python-docx Document Object Source: https://github.com/dfop02/html4docx/blob/main/README.md Integrate HTML content directly into a python-docx Document object. This allows for further manipulation of the document before saving. ```python from docx import Document from html4docx import HtmlToDocx document = Document() parser = HtmlToDocx() html_string = '

Hello world

' parser.add_html_to_document(html_string, document) document.save('your_file_name.docx') ``` -------------------------------- ### Add HTML to an Existing Document Source: https://context7.com/dfop02/html4docx/llms.txt Append parsed HTML content to a python-docx Document object. This method can be called multiple times to build a document incrementally. Ensure the input is a string and the document is a valid Document or _Cell object. ```python from docx import Document from html4docx import HtmlToDocx document = Document() parser = HtmlToDocx() html_parts = [ '

Annual Report 2024

', '

This report covers all fiscal quarters.

', '
  • Q1: $1.2M revenue
  • Q2: $1.5M revenue
', '' '' '
QuarterRevenue
Q1$1.2M
Q2$1.5M
',
]

for part in html_parts:
    parser.add_html_to_document(part, document)

document.save('annual_report.docx')
# Raises ValueError if html is not str or document is not a Document/_Cell
```

--------------------------------

### HtmlToDocx.parse_html_string

Source: https://context7.com/dfop02/html4docx/llms.txt

Converts an HTML string into a new python-docx Document object. This method is suitable for single, self-contained HTML conversions.

```APIDOC
## HtmlToDocx.parse_html_string: Convert HTML string to a new Document

Parses an HTML string and returns a brand-new `python-docx` `Document` object. Ideal for one-shot conversions.

```python
from html4docx import HtmlToDocx

html = """

Product Specification

Model: X-500

Note: Subject to change without notice.

  1. Step one: Unbox the unit
  2. Step two: Connect to power
    1. Use the provided cable
    2. Verify LED indicator
""" parser = HtmlToDocx() doc = parser.parse_html_string(html) doc.save('spec_sheet.docx') # Returns: docx.document.Document ``` ``` -------------------------------- ### HtmlToDocx.add_html_to_document Source: https://context7.com/dfop02/html4docx/llms.txt Appends parsed HTML content to an existing python-docx Document object. This method is the primary interface for programmatic use and can be called multiple times to build a document incrementally. ```APIDOC ## HtmlToDocx.add_html_to_document — Append HTML to an existing Document object The primary method for programmatic use. Appends the parsed HTML content at the end of a `python-docx` `Document` object. Can be called multiple times to build a document incrementally. ```python from docx import Document from html4docx import HtmlToDocx document = Document() parser = HtmlToDocx() html_parts = [ '

Annual Report 2024

', '

This report covers all fiscal quarters.

', '
  • Q1: $1.2M revenue
  • Q2: $1.5M revenue
', '' '' '
QuarterRevenue
Q1$1.2M
Q2$1.5M
', ] for part in html_parts: parser.add_html_to_document(part, document) document.save('annual_report.docx') # Raises ValueError if html is not str or document is not a Document/_Cell ``` ``` -------------------------------- ### Utilize !important Flag in Inline CSS Source: https://github.com/dfop02/html4docx/blob/main/README.md Ensure the highest CSS precedence for inline styles by using the '!important' flag. This overrides other style declarations. ```html Gray text with red important. ``` -------------------------------- ### Handle !important CSS Flag for Style Overrides Source: https://context7.com/dfop02/html4docx/llms.txt Styles marked with !important on a child element will override any parent-level styles for the same property, mimicking CSS cascade behavior. This is useful for ensuring specific styles take precedence. ```python from html4docx import HtmlToDocx html = """

Normal gray text, important red override back to gray.

"""
parser = HtmlToDocx()
doc = parser.parse_html_string(html)
doc.save('important_styles.docx')
# The span overrides the paragraph's gray color and 11pt size with red and 14pt.
```
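The `!important` precedence described in the last snippet can be illustrated with a small standalone model. This is a minimal sketch, not html4docx's implementation: `parse_declarations` and `resolve_styles` are hypothetical helper names, and the rule that a parent's `!important` declaration survives a plain child declaration is an assumption made for illustration. It uses only the standard library.

```python
from typing import Dict, Tuple


def parse_declarations(style: str) -> Dict[str, Tuple[str, bool]]:
    """Parse an inline style string into {property: (value, is_important)}."""
    declarations = {}
    for chunk in style.split(';'):
        if ':' not in chunk:
            continue  # skip empty or malformed fragments
        prop, _, value = chunk.partition(':')
        value = value.strip()
        important = value.lower().endswith('!important')
        if important:
            value = value[:-len('!important')].strip()
        declarations[prop.strip().lower()] = (value, important)
    return declarations


def resolve_styles(parent_style: str, child_style: str) -> Dict[str, str]:
    """Merge parent and child inline styles into the child's effective style.

    Assumed rule (hypothetical, for illustration): the child wins on a tie,
    but a non-important child declaration never replaces an important parent one.
    """
    resolved = parse_declarations(parent_style)
    for prop, (value, important) in parse_declarations(child_style).items():
        _, parent_important = resolved.get(prop, ('', False))
        if parent_important and not important:
            continue  # keep the parent's !important value
        resolved[prop] = (value, important)
    return {prop: value for prop, (value, _) in resolved.items()}


merged = resolve_styles(
    'color: gray; font-size: 11pt',
    'color: red !important; font-size: 14pt !important',
)
print(merged)  # {'color': 'red', 'font-size': '14pt'}
```

With the paragraph and span styles from the comment above (gray 11pt on the parent, red 14pt `!important` on the child), the merged result is red at 14pt, matching the behavior the snippet describes.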