### Inline Code Example

Source: https://github.com/dfop02/html4docx/blob/main/tests/assets/htmls/code.html

This snippet shows a single line of code enclosed in backticks, intended for inline display within text.

```code
This is a code block.
That should be NOT be pre-formatted.
It should NOT retain carriage returns,
or all white space.
or blank lines.
Tabs tabs tabs tabs
spac spac spac spac
```

--------------------------------

### Run html4docx with BeautifulSoup HTML fixing

Source: https://context7.com/dfop02/html4docx/llms.txt

This command runs the html4docx converter with BeautifulSoup enabled for HTML fixing. Ensure you have html4docx installed via pip.

```bash
python -m html4docx.h4d input.html output_report --bs
```

--------------------------------

### Initialize HtmlToDocx Parser

Source: https://context7.com/dfop02/html4docx/llms.txt

Create an instance of the parser. Sensible defaults are applied if parameters are omitted. Custom style maps and tag overrides can be provided for advanced styling.

```python
from html4docx import HtmlToDocx

# Minimal: use all defaults
parser = HtmlToDocx()

# With CSS-class-to-Word-style mapping and tag overrides
style_map = {
    'code-block': 'Code Block',
    'finding-critical': 'Finding Critical',
}
tag_overrides = {
    'h1': 'Custom Heading 1',
    'pre': 'Code Block',
}

parser = HtmlToDocx(
    style_map=style_map,
    tag_style_overrides=tag_overrides,
    default_paragraph_style='Body Text',
)
```

--------------------------------

### Clone Parser Settings

Source: https://context7.com/dfop02/html4docx/llms.txt

Demonstrates using `HtmlToDocx.copy_settings_from` to propagate parser configurations like table style, `style_map`, `tag_style_overrides`, and `default_paragraph_style` from one parser instance to another. This is useful for reusing configurations across multiple documents.

```python
from docx import Document
from html4docx import HtmlToDocx

# Master parser with full configuration
master_parser = HtmlToDocx(
    style_map={'highlight': 'Intense Quote'},
    tag_style_overrides={'h1': 'Title'},
    default_paragraph_style='Body Text',
)
master_parser.table_style = 'Light Grid'

# Child parser shares settings
child_parser = HtmlToDocx()
child_parser.copy_settings_from(master_parser)

doc = Document()
child_parser.add_html_to_document('<h1>Copied Settings</h1>', doc)
doc.save('copied_settings.docx')
```

--------------------------------

### Command-Line HTML to DOCX Conversion

Source: https://context7.com/dfop02/html4docx/llms.txt

Shows how to use the `h4d` module directly from the command line for basic HTML to DOCX file conversions. You can specify input and output filenames.

```bash
# Basic conversion
python -m html4docx.h4d input.html

# Specify output filename (extension .docx added automatically)
python -m html4docx.h4d input.html output_report
```

--------------------------------

### HtmlToDocx.__init__

Source: https://context7.com/dfop02/html4docx/llms.txt

Constructor for the HtmlToDocx class. Allows for optional customization of CSS class-to-Word style mappings, HTML tag overrides, and the default paragraph style.

```APIDOC
## HtmlToDocx.__init__: Constructor with optional style customization

Create a parser instance. All three parameters are optional; omitting them applies sensible defaults (`Normal` paragraph style, no custom class/tag mappings).

```python
from html4docx import HtmlToDocx

# Minimal: use all defaults
parser = HtmlToDocx()

# With CSS-class-to-Word-style mapping and tag overrides
style_map = {
    'code-block': 'Code Block',
    'finding-critical': 'Finding Critical',
}
tag_overrides = {
    'h1': 'Custom Heading 1',
    'pre': 'Code Block',
}

parser = HtmlToDocx(
    style_map=style_map,
    tag_style_overrides=tag_overrides,
    default_paragraph_style='Body Text',
)
```
```

--------------------------------

### Convert HTML File to DOCX File

Source: https://context7.com/dfop02/html4docx/llms.txt

Reads an HTML file, converts its content, and saves the result as a DOCX file. An alternative encoding can be specified for legacy files. If the output filename is omitted, a default name is generated.

```python
from html4docx import HtmlToDocx

parser = HtmlToDocx()

# Basic usage: output saved alongside input file
parser.parse_html_file('report.html', 'output/report.docx')

# With explicit encoding
parser.parse_html_file('legacy_report.html', 'output/legacy_report.docx', encoding='latin-1')

# Output filename is optional; defaults to new_docx_file_ in same directory
parser.parse_html_file('report.html', None)
# Saves as: new_docx_file_report.html (alongside report.html)
```

--------------------------------

### Convert HTML with Lists to DOCX

Source: https://context7.com/dfop02/html4docx/llms.txt

Demonstrates converting HTML with nested unordered and ordered lists to a DOCX document. Nesting beyond level 3 is capped at level 3. Each new top-level ordered list resets its counter independently.

```python
from html4docx import HtmlToDocx

html = """

<ul>
  <li>Unordered item one</li>
  <li>Unordered item two
    <ul>
      <li>Nested bullet level 2</li>
      <li>Another nested item
        <ul>
          <li>Level 3 bullet</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<ol>
  <li>First ordered item</li>
  <li>Second ordered item
    <ol>
      <li>Sub-step A</li>
      <li>Sub-step B</li>
    </ol>
  </li>
  <li>Third ordered item</li>
</ol>

<ol>
  <li>New list resets to 1</li>
  <li>Item 2</li>
</ol>

"""

parser = HtmlToDocx()
doc = parser.parse_html_string(html)
doc.save('lists.docx')
# Note: Nesting beyond level 3 is capped at level 3.
```

--------------------------------

### Embed Images from URLs, Local Files, and Base64

Source: https://context7.com/dfop02/html4docx/llms.txt

The `<img>` tag supports remote URLs (with a 5-second timeout), local file paths, and inline base64-encoded images. Use `width` and `height` attributes for dimensions and the `style` attribute for alignment (e.g., `display: block; margin-left: auto; margin-right: auto;` for centering, `float: right;` for right-alignment). If an image cannot be processed, a placeholder text is inserted.

```python
from html4docx import HtmlToDocx

html = """

See diagram: for details.

"""

parser = HtmlToDocx()
doc = parser.parse_html_string(html)
doc.save('images.docx')
# If an image cannot be fetched/found, a placeholder text is inserted.
```

--------------------------------

### Create Hyperlinks and Internal Anchors

Source: https://context7.com/dfop02/html4docx/llms.txt

Use the `<a>` tag to create external hyperlinks (blue, underlined, opens in browser) or internal anchor links that navigate within the document. Elements with an `id` attribute will automatically generate Word bookmarks, allowing for internal navigation.

```python
from html4docx import HtmlToDocx

html = """

<h1 id="introduction">Introduction</h1>

<p>Visit our website at <a href="https://example.com">example.com</a> for more information.</p>

<p><a href="#introduction">Jump to the introduction</a> above.</p>

"""

parser = HtmlToDocx()
doc = parser.parse_html_string(html)
doc.save('links.docx')
# External links: blue (#0000EE) underlined text with optional tooltip
# Internal anchors: href="#bookmark-id" links to elements with matching id=""
```

--------------------------------

### Configure Conversion Options

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Customize the HTML to DOCX conversion process by enabling or disabling various features like images, tables, styles, and HTML fixing. The options are set as boolean values in the parser's 'options' dictionary.

```python
from html4docx import HtmlToDocx

parser = HtmlToDocx()
parser.options['images'] = False         # Default True
parser.options['tables'] = False         # Default True
parser.options['styles'] = False         # Default True
parser.options['fix-html'] = False       # Default True
parser.options['html-comments'] = False  # Default False
parser.options['style-map'] = False      # Default True
parser.options['tag-override'] = False   # Default True
docx = parser.parse_html_string(input_html_file_string)
```

--------------------------------

### Apply Semantic Inline Formatting in DOCX

Source: https://context7.com/dfop02/html4docx/llms.txt

Illustrates mapping of HTML semantic inline tags like `<b>`, `<strong>`, `<i>`, `<em>`, `<u>`, `<s>`, `<mark>`, `<sup>`, `<sub>`, `<code>`, and `<pre>` to Word character formatting. `<mark>` applies a yellow background highlight, and `<code>`/`<pre>` use Courier font.

```python
from html4docx import HtmlToDocx

html = """

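<p>
  <b>Bold</b>, <strong>also bold</strong>,
  <i>italic</i>, <em>also italic</em>,
  <u>underlined</u>, <ins>inserted (underlined)</ins>,
  <s>strikethrough</s>, <del>deleted (strikethrough)</del>,
  <mark>highlighted in yellow</mark>,
  H<sub>2</sub>O and E=mc<sup>2</sup>.
</p>

<p>Inline <code>code snippet</code> uses Courier font.</p>
<pre>def block_code():
    return "pre block also uses Courier"</pre>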

"""

parser = HtmlToDocx()
doc = parser.parse_html_string(html)
doc.save('inline_tags.docx')
```
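
The tag-to-formatting idea described above can be sketched with the standard library alone. This is only an illustration of the concept, not html4docx's implementation: `TAG_FORMATS`, `RunCollector`, and the property names are hypothetical, and the sketch only records which formatting would apply to each text run.

```python
from html.parser import HTMLParser

# Formatting implied by each tag, following the description above.
# Assumption: the property names are illustrative, not html4docx internals.
TAG_FORMATS = {
    "b": {"bold": True}, "strong": {"bold": True},
    "i": {"italic": True}, "em": {"italic": True},
    "u": {"underline": True}, "s": {"strike": True},
    "mark": {"highlight": "yellow"},
    "sub": {"subscript": True}, "sup": {"superscript": True},
    "code": {"font": "Courier"}, "pre": {"font": "Courier"},
}


class RunCollector(HTMLParser):
    """Collect (text, formatting) pairs, one per text run."""

    def __init__(self):
        super().__init__()
        self.stack = []  # open formatting tags, innermost last
        self.runs = []

    def handle_starttag(self, tag, attrs):
        if tag in TAG_FORMATS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Close the innermost open occurrence of this tag.
            idx = len(self.stack) - 1 - self.stack[::-1].index(tag)
            del self.stack[idx]

    def handle_data(self, data):
        fmt = {}
        for tag in self.stack:
            fmt.update(TAG_FORMATS[tag])
        self.runs.append((data, fmt))


collector = RunCollector()
collector.feed("H<sub>2</sub>O is <b>bold <i>and italic</i></b>")
collector.close()
for text, fmt in collector.runs:
    print(repr(text), fmt)
```

Nested tags accumulate, so the innermost run in the sample carries both bold and italic.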

--------------------------------

### Inline Code Element

Source: https://github.com/dfop02/html4docx/blob/main/tests/assets/htmls/code.html

Demonstrates the use of inline code formatting for short code fragments within a sentence.

```text
code
```

--------------------------------

### Set Default Paragraph Style

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Configure the default paragraph style for the document. Use 'Body' for the default behavior or None to use Word's 'Normal' style.

```python
from html4docx import HtmlToDocx

# Use 'Body' as default (default behavior)
parser = HtmlToDocx(default_paragraph_style='Body')

# Use Word's default 'Normal' style
parser = HtmlToDocx(default_paragraph_style=None)
```

--------------------------------

### Use Custom Styles from a Word Template

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Apply custom styles defined in a Word template (.docx) by passing the template document to the HtmlToDocx parser. Save the output document to preserve these styles.

```python
from docx import Document
from html4docx import HtmlToDocx

doc = Document("path/to/template.docx") # template has Code Block, Custom Markdown, etc.
parser = HtmlToDocx(tag_style_overrides={"code": "Custom Markdown", "pre": "Code Block"})
parser.add_html_to_document(html, doc)
doc.save("output.docx") # save the template-based doc so custom styles are preserved
```

--------------------------------

### Map CSS Classes to Word Styles

Source: https://context7.com/dfop02/html4docx/llms.txt

Use the `style_map` parameter to map HTML classes to specific Word paragraph styles. This requires a document template that defines these styles.

```python
from docx import Document
from html4docx import HtmlToDocx

style_map = {
    'note': 'Quote',
    'warning': 'Intense Quote',
    'code-block': 'No Spacing',
}

doc = Document('path/to/branded_template.docx')
parser = HtmlToDocx(style_map=style_map)

html = """
<p class="note">This is a note paragraph.</p>
<p class="warning">Warning: data loss may occur.</p>
<pre class="code-block">def hello(): pass</pre>
"""
parser.add_html_to_document(html, doc)
doc.save('styled_output.docx')
```
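
Conceptually, a style map is a lookup from an element's CSS classes to a Word style name. The sketch below shows that lookup in isolation; `resolve_paragraph_style` is a hypothetical helper, and both the "first mapped class wins" rule and the `Normal` fallback are assumptions rather than documented html4docx behavior.

```python
def resolve_paragraph_style(class_attr, style_map, default="Normal"):
    """Pick a Word style name from an element's class attribute.

    Hypothetical helper: "first mapped class wins" is an assumption.
    """
    for class_name in class_attr.split():
        if class_name in style_map:
            return style_map[class_name]
    return default


style_map = {
    "note": "Quote",
    "warning": "Intense Quote",
    "code-block": "No Spacing",
}

print(resolve_paragraph_style("note", style_map))
print(resolve_paragraph_style("lead warning", style_map))
print(resolve_paragraph_style("unmapped", style_map))
```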

--------------------------------

### Save Document to Path or BytesIO

Source: https://context7.com/dfop02/html4docx/llms.txt

Saves the underlying document to a file path or an in-memory BytesIO buffer. The '.docx' extension is automatically appended if not present.

```python
from io import BytesIO
from docx import Document
from html4docx import HtmlToDocx

```

--------------------------------

### Save Document to File or Buffer

Source: https://context7.com/dfop02/html4docx/llms.txt

Demonstrates saving a generated Word document to a file path or an in-memory BytesIO buffer. The buffer can be used for web responses.

```python
from io import BytesIO

from docx import Document
from html4docx import HtmlToDocx

document = Document()
parser = HtmlToDocx()
parser.add_html_to_document('Hello', document)
parser.save('output/hello') # Saves as output/hello.docx

buffer = BytesIO()
document2 = Document()
parser2 = HtmlToDocx()
parser2.add_html_to_document('Report content', document2)
parser2.save(buffer)
buffer.seek(0)
```
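
The extension rule mentioned above (append `.docx` only when it is missing) can be sketched as follows. `ensure_docx_extension` is a hypothetical helper written for illustration, not part of html4docx, and the case-insensitive comparison is an assumption.

```python
import os


def ensure_docx_extension(path: str) -> str:
    """Return path with a '.docx' suffix, adding it only when missing.

    Hypothetical helper illustrating the documented rule.
    """
    _, ext = os.path.splitext(path)
    # Assumption: the extension check ignores case.
    if ext.lower() == ".docx":
        return path
    return path + ".docx"


print(ensure_docx_extension("output/hello"))
print(ensure_docx_extension("output/hello.docx"))
```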

--------------------------------

### Configure Parser Options

Source: https://context7.com/dfop02/html4docx/llms.txt

Control parser behavior by modifying the `options` dictionary. Disable image embedding, table rendering, and style application for plain text output, or enable HTML comment rendering.

```python
from html4docx import HtmlToDocx

parser = HtmlToDocx()

# Default values:
# parser.options['images'] = True — embed images
# parser.options['tables'] = True — render HTML tables
# parser.options['styles'] = True — apply CSS styles
# parser.options['fix-html'] = True — run BeautifulSoup HTML cleanup
# parser.options['html-comments'] = False — render as visible green text
# parser.options['style-map'] = True — apply CSS-class → Word style mapping
# parser.options['tag-override'] = True — apply tag → Word style overrides

# Strip all styling and images for a plain-text-only docx
parser.options['images'] = False
parser.options['styles'] = False
parser.options['tables'] = False

# Render HTML comments as visible italic green text
parser.options['html-comments'] = True

doc = parser.parse_html_string('Content')
doc.save('plain.docx')
```
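
Because a plain dictionary accepts any key, a misspelled option name would be stored without complaint. A small guard like the one below can catch that before the values are applied. `DEFAULT_OPTIONS` copies the defaults listed in the comments above; `build_options` is a hypothetical helper, not part of html4docx.

```python
# Defaults as listed in the snippet above.
DEFAULT_OPTIONS = {
    "images": True,
    "tables": True,
    "styles": True,
    "fix-html": True,
    "html-comments": False,
    "style-map": True,
    "tag-override": True,
}


def build_options(overrides: dict) -> dict:
    """Return the defaults with overrides applied, rejecting unknown keys."""
    unknown = set(overrides) - set(DEFAULT_OPTIONS)
    if unknown:
        raise KeyError(f"unknown option(s): {sorted(unknown)}")
    return {**DEFAULT_OPTIONS, **overrides}


# Preset matching the "plain-text-only" configuration shown above.
PLAIN_TEXT = build_options({"images": False, "styles": False, "tables": False})
print(PLAIN_TEXT)
```

Assuming `parser.options` behaves like a regular dict, the preset could then be applied with `parser.options.update(PLAIN_TEXT)`.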

--------------------------------

### Read and Set Document Metadata

Source: https://context7.com/dfop02/html4docx/llms.txt

Explains how to access and modify the document's built-in metadata (author, title, subject, etc.) using the `parser.metadata` property. Invalid revision or datetime strings will print a warning and be skipped.

```python
from docx import Document
from html4docx import HtmlToDocx

document = Document()
parser = HtmlToDocx()
parser.set_initial_attrs(document)

metadata = parser.metadata

# Read all metadata as a dict
props = metadata.get_metadata()
print(props.get('author')) # e.g., '' (empty on new document)
print(props.get('created')) # datetime object

# Print all metadata to stdout as formatted JSON
metadata.get_metadata(print_result=True)

# Set metadata fields
metadata.set_metadata(
    author='Jane Smith',
    title='Q4 Financial Report',
    subject='Finance',
    keywords='finance, quarterly, 2024',
    description='Official Q4 2024 report',
    revision='3',
    created='2024-01-01T00:00:00',
    modified='2024-12-31T23:59:59',
)

parser.add_html_to_document('<h1>Q4 Report</h1><p>Content here.</p>', document)
document.save('q4_report.docx')
# Invalid revision (non-integer) or datetime string prints a warning and skips that field.
```
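
The "warn and skip" behavior described for invalid values can be pictured with a small stand-alone validator. This is a sketch under stated assumptions, not the library's code: `clean_metadata` is a hypothetical function, and ISO 8601 parsing via `datetime.fromisoformat` is assumed for the datetime fields.

```python
from datetime import datetime


def clean_metadata(**fields):
    """Validate metadata fields, skipping invalid ones with a printed warning.

    Hypothetical helper; the accepted datetime format is an assumption.
    """
    cleaned = {}
    for key, value in fields.items():
        try:
            if key == "revision":
                value = int(value)
            elif key in ("created", "modified"):
                value = datetime.fromisoformat(value)
        except (TypeError, ValueError):
            print(f"Warning: skipping invalid {key}: {value!r}")
            continue
        cleaned[key] = value
    return cleaned


result = clean_metadata(
    author="Jane Smith",
    revision="3",
    created="2024-01-01T00:00:00",
    modified="not-a-date",  # invalid, so this field is dropped
)
print(result)
```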

--------------------------------

### Apply Table Styles

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Set a specific table style for all tables converted from HTML. The 'table_style' attribute must be set on the parser instance before conversion. Supported styles can be found in the python-docx documentation.

```python
from html4docx import HtmlToDocx

parser = HtmlToDocx()
parser.table_style = 'Light Shading Accent 4'
docx = parser.parse_html_string(input_html_file_string)
```

```python
parser.table_style = 'Table Grid'
```

--------------------------------

### Add HTML to Existing DOCX Document

Source: https://github.com/dfop02/html4docx/blob/main/README.md

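Use this method to add HTML-formatted content to an existing document. Despite its name in the snippet, `filename_docx` must be a python-docx `Document` (or `_Cell`) object rather than a file path; `add_html_to_document` raises `ValueError` otherwise.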

```python
from html4docx import HtmlToDocx

parser = HtmlToDocx()
html_string = 'Hello world'
parser.add_html_to_document(html_string, filename_docx)
```

--------------------------------

### Apply Word Table Style

Source: https://context7.com/dfop02/html4docx/llms.txt

Set the `table_style` attribute to apply a specific Word table style to all tables generated from HTML. Ensure the style exists in your document template.

```python
from html4docx import HtmlToDocx

parser = HtmlToDocx()
parser.table_style = 'Table Grid' # Bordered grid style

html = """

<table>
  <tr><th>Name</th><th>Score</th></tr>
  <tr><td>Alice</td><td>95</td></tr>
  <tr><td>Bob</td><td>87</td></tr>
</table>

"""
doc = parser.parse_html_string(html)
doc.save('scores.docx')
```

--------------------------------

### HtmlToDocx.parse_html_file

Source: https://context7.com/dfop02/html4docx/llms.txt

Reads an HTML file from disk, converts its content, and saves the result as a .docx file. An optional encoding parameter can be specified for handling different file encodings.

```APIDOC
## HtmlToDocx.parse_html_file — Convert an HTML file to a `.docx` file

Reads an HTML file from disk, converts it, and saves the result as a `.docx` file. Supports specifying an alternative encoding.

```python
from html4docx import HtmlToDocx

parser = HtmlToDocx()

# Basic usage — output saved alongside input file
parser.parse_html_file('report.html', 'output/report.docx')

# With explicit encoding
parser.parse_html_file('legacy_report.html', 'output/legacy_report.docx', encoding='latin-1')

# Output filename is optional; defaults to new_docx_file_ in same directory
parser.parse_html_file('report.html', None)
# Saves as: new_docx_file_report.html (alongside report.html)
```
```

--------------------------------

### Apply Inline CSS for Text and Paragraph Properties

Source: https://context7.com/dfop02/html4docx/llms.txt

Use inline style attributes on HTML elements to control typography, color, spacing, and decoration for runs or paragraph formats in the DOCX. Supported properties include text-align, line-height, margin-left, font-family, font-size, color, font-weight, text-indent, text-decoration, background-color, font-style, and text-transform.

```python
from html4docx import HtmlToDocx

html = """

Centered, spaced heading-like paragraph

First line indented with
wavy red underline
and yellow highlight.

Blue italic paragraph using RGB color.

this text will be uppercased in courier via serif generic mapping.

"""

parser = HtmlToDocx()
doc = parser.parse_html_string(html)
doc.save('styled_text.docx')
```

--------------------------------

### Convert HTML File Directly to DOCX

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Convert an HTML file to a DOCX file using the parse_html_file method. You can specify the input HTML file path, output DOCX file path, and optionally the file encoding (defaults to 'utf-8').

```python
from html4docx import HtmlToDocx

parser = HtmlToDocx()
parser.parse_html_file(input_html_file_path, output_docx_file_path)
# You can also specify an encoding; the default is utf-8
parser.parse_html_file(input_html_file_path, output_docx_file_path, 'utf-8')
```

--------------------------------

### Incrementally Add HTML to Document and Save

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Add multiple HTML snippets to a document incrementally. The content is appended to the end of the document. Saving can be done using either python-docx's document.save() or html4docx's parser.save().

```python
from docx import Document
from html4docx import HtmlToDocx

document = Document()
parser = HtmlToDocx()

for part in ['First', 'Second', 'Third']:
    parser.add_html_to_document(f'{part} Part', document)

parser.save('your_file_name.docx')
```

--------------------------------

### Map CSS Classes to Word Styles

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Define a mapping between HTML CSS classes and Word document styles to control the appearance of specific HTML elements. Pass the style map when instantiating HtmlToDocx, then add content with add_html_to_document.

```python
from html4docx import HtmlToDocx

style_map = {
    'code-block': 'Code Block',
    'numbered-heading-1': 'Heading 1 Numbered',
    'finding-critical': 'Finding Critical'
}

parser = HtmlToDocx(style_map=style_map)
parser.add_html_to_document(html, document)
```

--------------------------------

### Convert HTML String to a New Document

Source: https://context7.com/dfop02/html4docx/llms.txt

Parses an HTML string and returns a new python-docx Document object. This is suitable for one-shot conversions.

```python
from html4docx import HtmlToDocx

html = """
Product Specification
Model: X-500
Note: Subject to change without notice.

Step one: Unbox the unit
Step two: Connect to power

Use the provided cable
Verify LED indicator

"""

parser = HtmlToDocx()
doc = parser.parse_html_string(html)
doc.save('spec_sheet.docx')
# Returns: docx.document.Document
```

--------------------------------

### Manage Document Metadata

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Read and set document metadata such as author and creation date using the parser's metadata attributes. Available attributes can be found in the python-docx documentation.

```python
from docx import Document
from html4docx import HtmlToDocx

document = Document()
parser = HtmlToDocx()
parser.set_initial_attrs(document)
metadata = parser.metadata

# You can get metadata as dict
metadata_json = metadata.get_metadata()
print(metadata_json['author']) # Jane
# or just print all metadata if you want
metadata.get_metadata(print_result=True)

# Set new metadata
metadata.set_metadata(author="Jane", created="2025-07-18T09:30:00")
document.save('your_file_name.docx')
```

--------------------------------

### HtmlToDocx.save

Source: https://context7.com/dfop02/html4docx/llms.txt

Saves the underlying document to a specified path or a BytesIO buffer. If a path is provided, the '.docx' extension is automatically appended. This method is useful for file persistence or streaming document data.

```APIDOC
## HtmlToDocx.save — Save the document to a path or BytesIO buffer

Saves the underlying document either to a file path (`.docx` extension appended automatically) or to an in-memory `BytesIO` buffer for streaming use cases (e.g., HTTP responses).

```python
from io import BytesIO
from docx import Document
from html4docx import HtmlToDocx
```
```

--------------------------------

### Pre-formatted Text Block

Source: https://github.com/dfop02/html4docx/blob/main/tests/assets/htmls/code.html

This snippet represents a pre-formatted text block, retaining all whitespace and line breaks exactly as they appear in the source.

```text
This is a pre-formatted block.
That should be pre-formatted.
Retaining any carriage returns, and all white space.

And blank lines.
Tabs tabs tabs tabs
spac spac spac spac
```

--------------------------------

### Save Document to In-Memory Buffer

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Utilize BytesIO to save the DOCX document in memory. This is useful for applications that need to handle the document data without writing to a physical file immediately. Remember to reset the buffer's position after saving if you intend to read from it.

```python
from io import BytesIO
from docx import Document
from html4docx import HtmlToDocx

buffer = BytesIO()
document = Document()
parser = HtmlToDocx()

html_string = 'Hello world'
parser.add_html_to_document(html_string, document)

# Save the document to the in-memory buffer
parser.save(buffer)

# If you need to read from the buffer again after saving,
# you might need to reset its position to the beginning
buffer.seek(0)
```
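
The need for `seek(0)` can be seen with `BytesIO` alone. In this sketch a few placeholder bytes stand in for the saved document, so no html4docx call is involved.

```python
from io import BytesIO

buffer = BytesIO()
buffer.write(b"PK\x03\x04 placeholder bytes")  # stand-in for parser.save(buffer)

# After writing, the position sits at the end, so read() returns nothing.
at_end = buffer.read()

# Rewind before handing the buffer to a reader (for example an HTTP response).
buffer.seek(0)
from_start = buffer.read()

# getvalue() returns the full contents regardless of the current position.
everything = buffer.getvalue()

print(at_end, from_start, everything)
```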

--------------------------------

### Style HTML Table Cells with CSS

Source: https://context7.com/dfop02/html4docx/llms.txt

Apply CSS properties to HTML table cells (`<td>`, `<th>`) for borders, background color, dimensions, and text alignment. Supported properties include border shorthand/longhand, background-color, width, height, color, and vertical-align. The 'Table Grid' style can be applied to the document for consistent table formatting.

```python
from html4docx import HtmlToDocx

html = """

Header A

Header B

Top-aligned dashed cell

Left accent border cell

Merged cell spanning 2 columns

"""

parser = HtmlToDocx()
parser.table_style = 'Table Grid'
doc = parser.parse_html_string(html)
doc.save('styled_table.docx')
# Supported border keywords: thin (1px), medium (3px), thick (5px)
# Supported border styles: solid, dashed, dotted, double, inset, outset
```
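
The keyword widths listed in the comments above amount to a small lookup table. The sketch below resolves a border-width token to pixels; `border_width_px` is a hypothetical helper, and handling of explicit `px` values is an assumption added for illustration.

```python
# Keyword widths as listed in the snippet above.
BORDER_KEYWORD_PX = {"thin": 1, "medium": 3, "thick": 5}


def border_width_px(value: str) -> float:
    """Resolve a CSS border-width token to pixels (hypothetical helper)."""
    token = value.strip().lower()
    if token in BORDER_KEYWORD_PX:
        return float(BORDER_KEYWORD_PX[token])
    if token.endswith("px"):
        return float(token[:-2])
    raise ValueError(f"unsupported border width: {value!r}")


print(border_width_px("thin"))
print(border_width_px("2px"))
```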

--------------------------------

### Override Default Tag Styles

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Customize the styles applied to specific HTML tags like 'h1' and 'pre'. Ensure the target styles exist in your Word document.

```python
from html4docx import HtmlToDocx

tag_overrides = {
    'h1': 'Custom Heading 1',
    'pre': 'Code Block'
}

parser = HtmlToDocx(tag_style_overrides=tag_overrides)
```

--------------------------------

### Apply Inline CSS Styles

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Utilize inline CSS styles directly within HTML tags for precise formatting of text and paragraphs. Supported properties include color, font-size, font-weight, and more.

```html
<p style="color: red; font-size: 14pt">Red 14pt paragraph</p>
<span style="font-weight: bold; color: blue">Bold blue text</span>
```

--------------------------------

### Convert HTML String Directly to DOCX

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Convert an HTML string into a DOCX document object using parse_html_string. The method returns the DOCX object, which can then be saved.

```python
from html4docx import HtmlToDocx

parser = HtmlToDocx()
docx = parser.parse_html_string(input_html_file_string)
```

--------------------------------

### Insert Page Breaks and Horizontal Rules in DOCX

Source: https://context7.com/dfop02/html4docx/llms.txt

Shows how to insert page breaks using CSS page-break properties and horizontal rules using the `<hr>` tag when converting HTML to DOCX. The `<hr>` tag renders as a paragraph-bottom border line.

```python
from html4docx import HtmlToDocx

html = """
Chapter 1
Content for chapter one.

Chapter 2
Content for chapter two starts on a new page.

Section below the horizontal rule.

Chapter 3
"""

parser = HtmlToDocx()
doc = parser.parse_html_string(html)
doc.save('paged_document.docx')
```

--------------------------------

### Override HTML Tag to Word Style Mapping

Source: https://context7.com/dfop02/html4docx/llms.txt

Use the `tag_style_overrides` parameter to replace default tag-to-style mappings with custom Word styles. This is useful for structural tags like headings and preformatted text.

```python
from docx import Document
from html4docx import HtmlToDocx

tag_overrides = {
    'h1': 'Report Title',
    'h2': 'Section Header',
    'pre': 'Code Block',
}

doc = Document('template_with_custom_styles.docx')
parser = HtmlToDocx(tag_style_overrides=tag_overrides)

html = """
<h1>Executive Summary</h1>
<h2>Background</h2>
<pre>SELECT * FROM reports WHERE year = 2024;</pre>
"""
parser.add_html_to_document(html, doc)
doc.save('executive_summary.docx')
```

--------------------------------

### Add HTML to python-docx Document Object

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Integrate HTML content directly into a python-docx Document object. This allows for further manipulation of the document before saving.

```python
from docx import Document
from html4docx import HtmlToDocx

document = Document()
parser = HtmlToDocx()

html_string = 'Hello world'
parser.add_html_to_document(html_string, document)

document.save('your_file_name.docx')
```

--------------------------------

### Add HTML to an Existing Document

Source: https://context7.com/dfop02/html4docx/llms.txt

Append parsed HTML content to a python-docx Document object. This method can be called multiple times to build a document incrementally. Ensure the input is a string and the document is a valid Document or _Cell object.

```python
from docx import Document
from html4docx import HtmlToDocx

document = Document()
parser = HtmlToDocx()

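html_parts = [
    '<h1>Annual Report 2024</h1>',
    '<p>This report covers all fiscal quarters.</p>',
    '<ul><li>Q1: $1.2M revenue</li><li>Q2: $1.5M revenue</li></ul>',
    '<table>'
    '<tr><th>Quarter</th><th>Revenue</th></tr>'
    '<tr><td>Q1</td><td>$1.2M</td></tr>'
    '<tr><td>Q2</td><td>$1.5M</td></tr>'
    '</table>',
]

for part in html_parts:
    parser.add_html_to_document(part, document)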

document.save('annual_report.docx')
# Raises ValueError if html is not str or document is not a Document/_Cell
```

--------------------------------

### HtmlToDocx.parse_html_string

Source: https://context7.com/dfop02/html4docx/llms.txt

Converts an HTML string into a new python-docx Document object. This method is suitable for single, self-contained HTML conversions.

```APIDOC
## HtmlToDocx.parse_html_string — Convert HTML string to a new Document

Parses an HTML string and returns a brand-new `python-docx` `Document` object. Ideal for one-shot conversions.

```python
from html4docx import HtmlToDocx

html = """
Product Specification
Model: X-500
Note: Subject to change without notice.

Step one: Unbox the unit
Step two: Connect to power

Use the provided cable
Verify LED indicator

"""

parser = HtmlToDocx()
doc = parser.parse_html_string(html)
doc.save('spec_sheet.docx')
# Returns: docx.document.Document
```
```

--------------------------------

### HtmlToDocx.add_html_to_document

Source: https://context7.com/dfop02/html4docx/llms.txt

Appends parsed HTML content to an existing python-docx Document object. This method is the primary interface for programmatic use and can be called multiple times to build a document incrementally.

```APIDOC
## HtmlToDocx.add_html_to_document — Append HTML to an existing Document object

The primary method for programmatic use. Appends the parsed HTML content at the end of a `python-docx` `Document` object. Can be called multiple times to build a document incrementally.

```python
from docx import Document
from html4docx import HtmlToDocx

document = Document()
parser = HtmlToDocx()

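html_parts = [
    '<h1>Annual Report 2024</h1>',
    '<p>This report covers all fiscal quarters.</p>',
    '<ul><li>Q1: $1.2M revenue</li><li>Q2: $1.5M revenue</li></ul>',
    '<table>'
    '<tr><th>Quarter</th><th>Revenue</th></tr>'
    '<tr><td>Q1</td><td>$1.2M</td></tr>'
    '<tr><td>Q2</td><td>$1.5M</td></tr>'
    '</table>',
]

for part in html_parts:
    parser.add_html_to_document(part, document)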

document.save('annual_report.docx')
# Raises ValueError if html is not str or document is not a Document/_Cell
```
```

--------------------------------

### Utilize !important Flag in Inline CSS

Source: https://github.com/dfop02/html4docx/blob/main/README.md

Ensure the highest CSS precedence for inline styles by using the '!important' flag. This overrides other style declarations.

```html

Gray text with red important.

```

--------------------------------

### Handle !important CSS Flag for Style Overrides

Source: https://context7.com/dfop02/html4docx/llms.txt

Styles marked with !important on a child element will override any parent-level styles for the same property, mimicking CSS cascade behavior. This is useful for ensuring specific styles take precedence.

```python
from html4docx import HtmlToDocx

html = """

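<p style="color: gray; font-size: 11pt;">
  Normal gray text,
  <span style="color: red !important; font-size: 14pt !important;">important red override</span>
  back to gray.
</p>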

"""

parser = HtmlToDocx()
doc = parser.parse_html_string(html)
doc.save('important_styles.docx')
# The span overrides the paragraph's gray color and 11pt size with red and 14pt.
```
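
The precedence rule can be sketched as a merge of two inline style strings. This is an illustration only: `parse_style` and `cascade` are hypothetical helpers, and the tie-breaking rule (an `!important` declaration beats a normal one, otherwise the child wins) is an assumption about how the cascade is emulated.

```python
def parse_style(style: str) -> dict:
    """Parse 'a: b; c: d !important' into {property: (value, important)}."""
    result = {}
    for decl in style.split(";"):
        if ":" not in decl:
            continue
        prop, value = decl.split(":", 1)
        value = value.strip()
        important = value.lower().endswith("!important")
        if important:
            value = value[: -len("!important")].strip()
        result[prop.strip().lower()] = (value, important)
    return result


def cascade(parent: str, child: str) -> dict:
    """Merge parent and child inline styles into final property values.

    Assumption: !important beats normal; at equal importance the child wins.
    """
    merged = parse_style(parent)
    for prop, (value, important) in parse_style(child).items():
        inherited = merged.get(prop)
        if inherited is None or important or not inherited[1]:
            merged[prop] = (value, important)
    return {prop: value for prop, (value, _) in merged.items()}


print(cascade("color: gray; font-size: 11pt",
              "color: red !important; font-size: 14pt !important"))
```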
