### Add Custom Properties to DOCX

Source: https://github.com/onizet/html2openxml/wiki/Custom-document-properties-in-a-docx

Example of how to initialize a dictionary of custom properties and call the helper method to update them in a DOCX package.

```csharp
// Add all our custom properties in the property bag
Dictionary<string, string> customProperties = new Dictionary<string, string>() {
    { "MyProperty", "Some Value" }
};
OpenXmlHelper.UpdateDocumentProperties(package, customProperties);
```

--------------------------------

### HTML Content Example

Source: https://github.com/onizet/html2openxml/wiki/Home

This is an example of HTML content that can be converted. Ensure your HTML is well-formed.

```html
Looks how cool is Open Xml. Now with HtmlToOpenXml, it nevers been so easy to convert html.

If you like it, add me a rating on github


```

--------------------------------

### HTML Pre Tag Example

Source: https://github.com/onizet/html2openxml/wiki/Preformatted-Text-(-pre-)

Demonstrates how spaces and line breaks are preserved within an HTML `<pre>` tag. This is useful for displaying code or preformatted text.

```html
Text in a pre element
is displayed in a fixed-width
font, and it preserves
both      spaces and
line breaks
```

--------------------------------

### Reset or Override List Starting Number

Source: https://github.com/onizet/html2openxml/wiki/Numbering-List

Use the 'start' attribute on the 'ol' tag to reset the numbering of a list or start it from a specific value.

```html
  1. Item
  1. Item
```

--------------------------------

### Embedding Clickable Images in HTML Links

Source: https://github.com/onizet/html2openxml/wiki/Hyperlinks

Shows how to create clickable links that contain images. The first example links text, while the second embeds an image within the anchor tag.

```html
HtmlToOpenXml

Wikipedia, the Free Encyclopedia
```

--------------------------------

### Enable Tracing in app.config/web.config

Source: https://github.com/onizet/html2openxml/wiki/Logging

Configure .NET's tracing mechanism in your application's configuration file to capture silent error messages from html2openxml. This example sets up a text trace listener to log errors to 'error.log'.

```xml
```

--------------------------------

### Subclass DefaultWebRequest for Image Format Conversion

Source: https://context7.com/onizet/html2openxml/llms.txt

Extend `DefaultWebRequest` to handle unsupported image formats. This example demonstrates converting WebP images to PNG using SixLabors.ImageSharp during the download process.

```csharp
// Convert unsupported WebP to PNG using SixLabors.ImageSharp
class WebPWebRequest : HtmlToOpenXml.IO.DefaultWebRequest
{
    // The return type must match the base method (a nullable HtmlToOpenXml.IO.Resource)
    protected override async Task<HtmlToOpenXml.IO.Resource?> DownloadHttpFile(
        Uri requestUri, CancellationToken cancellationToken)
    {
        var resource = await base.DownloadHttpFile(requestUri, cancellationToken);
        if (resource?.Content is not null &&
            requestUri.OriginalString.EndsWith(".webp", StringComparison.OrdinalIgnoreCase))
        {
            using var img = await SixLabors.ImageSharp.Image.LoadAsync(resource.Content, cancellationToken);
            var convertedStream = new MemoryStream();
            await img.SaveAsPngAsync(convertedStream, cancellationToken);
            convertedStream.Position = 0L;
            resource.Content.Dispose();
            resource.Content = convertedStream;
        }
        return resource;
    }
}

var converter2 = new HtmlConverter(mainPart, new WebPWebRequest());
await converter2.ParseBody("");
```

--------------------------------

### Inline Base64 Image HTML

Source: https://github.com/onizet/html2openxml/wiki/ImageProcessing

Example of an inline image using the data URI format with base64 encoding for PNG images.
```html
Red dot
```

--------------------------------

### HTML Table with Combined Rowspan and Colspan

Source: https://github.com/onizet/html2openxml/wiki/Table

Provides an example of a complex table cell spanning multiple rows and columns simultaneously. This requires careful definition of `rowspan` and `colspan` attributes.

```html
Cell 1 Value 1
Cell 1
Cell 2 Value 2 Value 3
```

--------------------------------

### Nested Lists with Custom Styling and Numbering

Source: https://context7.com/onizet/html2openxml/llms.txt

Convert ordered and unordered lists with up to 8 nesting levels. Customize list appearance using `list-style-type` CSS or the `type` attribute. The `start` attribute controls the initial number, and `ContinueNumbering` affects sequential list numbering.

```csharp
var converter = new HtmlConverter(mainPart)
{
    ContinueNumbering = false,        // always restart numbering at 1
    SupportsHeadingNumbering = false  // disable auto-detect of "1. Heading" patterns
};
await converter.ParseBody("""
  1. Alpha
  2. Beta
    • Sub-item one
    • Sub-item two
  1. Starts at ten
  2. Eleven
""");
```

--------------------------------

### Build the Project

Source: https://github.com/onizet/html2openxml/blob/dev/examples/Benchmark/README.md

Build the project in Release configuration before running benchmarks.

```bash
dotnet build -c Release
```

--------------------------------

### Manual Image Provisioning Event Handler

Source: https://github.com/onizet/html2openxml/wiki/ImageProcessing

Implement manual image provisioning by setting ImageProcessing to ManualProvisioning and handling the ProvisionImage event. This allows custom logic for reading image data, such as from a file system.

```csharp
converter.ImageProcessing = ImageProcessing.ManualProvisioning;
converter.ProvisionImage += converter_ProvisionImage;

...

private void converter_ProvisionImage(object sender, ProvisionImageEventArgs e)
{
    // Read the image from the file system:
    e.Data = File.ReadAllBytes(@"c:\inetpub\wwwroot\mysite\images\" + e.ImageUrl);
}
```

--------------------------------

### Serve Generated DOCX from ASP.NET Core

Source: https://context7.com/onizet/html2openxml/llms.txt

Generate a DOCX document in memory and stream it to an HTTP client with appropriate headers for download.

```csharp
// ASP.NET Core minimal API example
app.MapGet("/generate", async (HttpResponse response) =>
{
    string html = "

Report

Generated at " + DateTime.UtcNow + "

";

    using var stream = new MemoryStream();
    using (var package = WordprocessingDocument.Create(stream, WordprocessingDocumentType.Document))
    {
        var mainPart = package.AddMainDocumentPart();
        new Document(new Body()).Save(mainPart);

        var converter = new HtmlConverter(mainPart);
        await converter.ParseBody(html);
        mainPart.Document.Save();
    }

    response.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
    response.Headers["Content-Disposition"] = "attachment; filename=\"report.docx\"";
    response.Headers["Content-Length"] = stream.Length.ToString();

    stream.Position = 0;
    await stream.CopyToAsync(response.Body);
});
```

--------------------------------

### Convert .dotx to .docx using Open XML SDK

Source: https://github.com/onizet/html2openxml/wiki/Convert-.dotx-to-.docx

This snippet demonstrates the core logic for converting a .dotx file to a .docx document. It requires a writable WordprocessingDocument and involves changing the document type and adding an attached template relationship.

```csharp
using (WordprocessingDocument template = WordprocessingDocument.Open(documentStream, true))
{
    template.ChangeDocumentType(DocumentFormat.OpenXml.WordprocessingDocumentType.Document);

    MainDocumentPart mainPart = template.MainDocumentPart;
    mainPart.DocumentSettingsPart.AddExternalRelationship(
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/attachedTemplate",
        new Uri(@"C:\temp.dotx", UriKind.Absolute));

    ...
}
```

--------------------------------

### SVG Image XML Structure

Source: https://github.com/onizet/html2openxml/wiki/ImageProcessing

Example of an SVG XML structure that can be embedded as an image resource. The 'title' and 'desc' elements within the SVG can be used for image descriptions in Word.

```xml
Illustration of a Kiwi
Kiwi (/ˈkiːwiː/ KEE-wee)[4] are flightless birds endemic to New Zealand of the order Apterygiformes. [...]
```

--------------------------------

### Complete .dotx to .docx Conversion Program

Source: https://github.com/onizet/html2openxml/wiki/Convert-.dotx-to-.docx

This is a complete C# program that reads a .dotx template, converts it to a .docx document in memory, saves it to a file, and then opens it with Microsoft Word. It includes a helper method for copying streams.

```csharp
using System;
using System.IO;
using DocumentFormat.OpenXml.Packaging;

namespace TemplateConverter
{
    class Program
    {
        static void Main(string[] args)
        {
            MemoryStream documentStream;
            String templatePath = Path.Combine(Environment.CurrentDirectory, "template.dotx");
            using (Stream tplStream = File.OpenRead(templatePath))
            {
                documentStream = new MemoryStream((int)tplStream.Length);
                CopyStream(tplStream, documentStream);
                documentStream.Position = 0L;
            }

            using (WordprocessingDocument template = WordprocessingDocument.Open(documentStream, true))
            {
                // We convert our template to a docx. These 2 lines of code do the trick. The attached template
                // link refers to the location of the template used to build the document. I just hard-code a
                // dummy link (it's useless)
                template.ChangeDocumentType(DocumentFormat.OpenXml.WordprocessingDocumentType.Document);

                MainDocumentPart mainPart = template.MainDocumentPart;
                mainPart.DocumentSettingsPart.AddExternalRelationship(
                    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/attachedTemplate",
                    new Uri(templatePath, UriKind.Absolute));

                mainPart.Document.Save();
            }

            // The converted package is a .docx, so save it with that extension
            string path = "test.docx";
            File.WriteAllBytes(path, documentStream.ToArray());

            // Run Word to open the document:
            System.Diagnostics.Process.Start(path);
        }

        /// <summary>
        /// Write the content of a stream into another.
        /// </summary>
        public static void CopyStream(Stream source, Stream target)
        {
            if (source != null)
            {
                MemoryStream mstream = source as MemoryStream;
                if (mstream != null) mstream.WriteTo(target);
                else
                {
                    byte[] buffer = new byte[2048];
                    int length = buffer.Length, size;
                    while ((size = source.Read(buffer, 0, length)) != 0)
                        target.Write(buffer, 0, size);
                }
            }
        }
    }
}
```

--------------------------------

### HTML Structure with Anchors and Links

Source: https://github.com/onizet/html2openxml/wiki/Hyperlinks

Demonstrates how to create a table of contents using HTML anchors and links, including a link to the top of the document using the #_top alias.

```html

Table of Contents

Heading 1

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis dictum leo quis ipsum tempor nec ultrices sapien elementum.

Heading 2

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis dictum leo quis ipsum tempor nec ultrices sapien elementum.
  • Back to ToC
  •
```

--------------------------------

### Custom WebP Image Handling in C#

Source: https://github.com/onizet/html2openxml/wiki/ImageProcessing

Demonstrates how to extend DefaultWebRequest to handle unsupported image formats like WebP by converting them to PNG. Requires the SixLabors.ImageSharp NuGet package.

```csharp
class WebPWebRequest : DefaultWebRequest
{
    // The return type must match the base method (a nullable HtmlToOpenXml.IO.Resource)
    protected override async Task<Resource?> DownloadHttpFile(Uri requestUri, CancellationToken cancellationToken)
    {
        var resource = await base.DownloadHttpFile(requestUri, cancellationToken);
        // The download can fail and yield no resource, so guard against null
        if (resource?.Content is not null &&
            requestUri.OriginalString.EndsWith(".webp", StringComparison.OrdinalIgnoreCase))
        {
            using var img = await Image.LoadAsync(resource.Content, cancellationToken);
            var convertedStream = new System.IO.MemoryStream();
            await img.SaveAsPngAsync(convertedStream, cancellationToken);
            convertedStream.Position = 0L;
            resource.Content.Dispose();
            resource.Content = convertedStream;
        }
        return resource;
    }
}

HtmlConverter converter = new HtmlConverter(mainPart, new WebPWebRequest());
await converter.ParseBody(@"");
```

--------------------------------

### Configure Web Proxy Credentials and Proxy

Source: https://github.com/onizet/html2openxml/wiki/ImageProcessing

Set custom credentials and proxy for the WebClient used in automatic image downloading. This is useful when anonymous access is not sufficient.

```csharp
converter.WebProxy.Credentials = new NetworkCredential("john", "123456", "codeplex");
converter.WebProxy.Proxy = new System.Net.WebProxy("http://proxy-isa:8080");
```

--------------------------------

### Update Custom Document Properties in DOCX

Source: https://github.com/onizet/html2openxml/wiki/Custom-document-properties-in-a-docx

This C# code updates or adds custom properties to a DOCX file. It handles finding the highest existing property ID to ensure new properties get unique IDs. It also manages updating existing properties and adding new ones to the CustomFilePropertiesPart.
```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using DocumentFormat.OpenXml.Packaging;
using System.Xml.XPath;
using op = DocumentFormat.OpenXml.CustomProperties;
using vt = DocumentFormat.OpenXml.VariantTypes;

namespace OpenXmlDemo
{
    static class OpenXmlHelper
    {
        /// <summary>
        /// Update or add all the specified custom properties inside the document.
        /// </summary>
        public static void UpdateDocumentProperties(WordprocessingDocument package, IDictionary<string, string> properties)
        {
            int nextPropertyId = 2; // 2 is the minimum ID set by MS Office. Don't have a clue why they don't start at 0.
            CustomFilePropertiesPart customPropertiesPart = package.CustomFilePropertiesPart;
            MainDocumentPart mainPart = package.MainDocumentPart;

            if (customPropertiesPart == null)
            {
                customPropertiesPart = package.AddCustomFilePropertiesPart();
                new op.Properties().Save(customPropertiesPart);
            }
            else
            {
                // To add properties to the document, we need to assign a unique id
                // to each Property object. So we loop through all of the existing elements
                // and start one past the highest id already in use (this also covers
                // the case where the highest existing id is exactly 2).
                foreach (var p in package.CustomFilePropertiesPart.Properties.Elements<op.Property>())
                {
                    if (p.PropertyId.Value >= nextPropertyId) nextPropertyId = p.PropertyId.Value + 1;
                }
            }

            // Get back all the custom properties contained in this document.
            var knownCustomProperties = new Dictionary<string, op.Property>();
            foreach (var p in customPropertiesPart.Properties.Elements<op.Property>())
            {
                knownCustomProperties.Add(p.Name, p);
            }

            // For each of the properties specified in parameters, ensure the property exists
            // and update its value
            Queue<KeyValuePair<string, string>> propertiesToUpdate = new Queue<KeyValuePair<string, string>>();
            foreach (var p in properties)
            {
                op.Property propertyValue;
                if (knownCustomProperties.TryGetValue(p.Key, out propertyValue))
                {
                    propertyValue.RemoveAllChildren();
                    propertyValue.Append(new vt.VTLPWSTR(p.Value));
                    propertiesToUpdate.Enqueue(p);
                }
                else
                {
                    customPropertiesPart.Properties.Append(new op.Property(
                        new vt.VTLPWSTR(p.Value)
                    ) {
                        FormatId = "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}",
                        PropertyId = nextPropertyId++,
                        Name = p.Key
                    });
                }
            }
            customPropertiesPart.Properties.Save();

            // None of the properties existed beforehand, so there is no chance of finding them in the footer/header/document
            if (propertiesToUpdate.Count == 0) return;

            List<ContentDocumentParts> parts = new List<ContentDocumentParts>(3);
            parts.Add(new ContentDocumentParts { Part = mainPart });

            IEnumerator<HeaderPart> enHeader = mainPart.HeaderParts.GetEnumerator();
            if (enHeader.MoveNext()) parts.Add(new ContentDocumentParts { Part = enHeader.Current });
            IEnumerator<FooterPart> enFooter = mainPart.FooterParts.GetEnumerator();
            if (enFooter.MoveNext()) parts.Add(new ContentDocumentParts { Part = enFooter.Current });

            for (int i = 0; i < parts.Count; i++)
            {
                parts[i].PartStream = parts[i].Part.GetStream(FileMode.Open, FileAccess.ReadWrite);
                parts[i].XmlDocument = new XmlDocument();
                parts[i].XmlDocument.Load(parts[i].PartStream);
            }

            try
            {
                XmlNamespaceManager nsMgr = new XmlNamespaceManager(new NameTable());
                nsMgr.AddNamespace("w", "http://schemas.openxmlformats.org/wordprocessingml/2006/main");
                nsMgr.AddNamespace("op", "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties");

                // Otherwise, update all the field codes.
                while (propertiesToUpdate.Count > 0)
                {
                    var prop = propertiesToUpdate.Dequeue();
                    parts.ForEach(p => {
                        // Quoted from http://openxmldeveloper.org/forums/thread/962.aspx
                        // The reason I do a RemoveAll is because if some fool half formatted the field in the document there will be multiple tags.
```

--------------------------------

### Run Performance Tests

Source: https://github.com/onizet/html2openxml/blob/dev/examples/Benchmark/README.md

Execute the performance test targeting multiple .NET runtimes.

```bash
dotnet run -c Release -f net8.0 --runtimes net48 net8.0
```

--------------------------------

### Customize HTTP Image Downloader with HttpClient

Source: https://context7.com/onizet/html2openxml/llms.txt

Provide a custom `HttpClient` to the `DefaultWebRequest` for scenarios requiring authentication headers or proxy configurations.

```csharp
// Custom HttpClient with auth token
var httpClient = new HttpClient();
httpClient.DefaultRequestHeaders.Authorization =
    new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", "my-token");

var authRequest = new HtmlToOpenXml.IO.DefaultWebRequest(httpClient);
var secureConverter = new HtmlConverter(mainPart, authRequest);
```

--------------------------------

### Customize HTTP Image Downloader with Base URL

Source: https://context7.com/onizet/html2openxml/llms.txt

Resolve relative image URLs by setting a `BaseImageUrl`. This is useful for consolidating assets from a CDN or specific directory.

```csharp
// Resolve relative image URLs against a base URL
var webRequest = new HtmlToOpenXml.IO.DefaultWebRequest()
{
    BaseImageUrl = new Uri("https://cdn.example.com/assets/")
};
var converter = new HtmlConverter(mainPart, webRequest);

// The relative src below resolves to https://cdn.example.com/assets/logo.png
await converter.ParseBody("<img src=\"logo.png\">");
```

--------------------------------

### Preformatted Text Rendering Options

Source: https://context7.com/onizet/html2openxml/llms.txt

Configure how `<pre>` tags are rendered. By default, they are rendered within a single-cell table for a code-block appearance. Set `RenderPreAsTable = false` to render them as plain paragraphs while preserving whitespace and indentation.
    
```csharp
// Default: pre rendered as a table (code-block appearance)
var converter = new HtmlConverter(mainPart) { RenderPreAsTable = true };
await converter.ParseBody("""
    <pre>
    function greet(name) {
        return "Hello, " + name + "!";
    }
    </pre>
""");

// Plain paragraphs (no table box)
converter.RenderPreAsTable = false;
await converter.ParseBody("<pre>Plain\n  preserved\n    indentation</pre>");
```

--------------------------------

### Generate and Serve DOCX in Classic ASP.NET

Source: https://github.com/onizet/html2openxml/wiki/Serves-a-generated-docx-from-the-server

Use this handler to generate and serve DOCX files in a classic ASP.NET application. It includes logic for filename encoding and handling browser-specific issues, particularly for Internet Explorer.

```csharp
public class GenerateDocument : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        HttpRequest request = context.Request;
        HttpResponse response = context.Response;

        String documentName = "My Document";
        if (documentName.Length > 128) documentName = documentName.Substring(0, 128);
        string encodedFilename = documentName.Replace(';', ' ');

        // Avoid accent encoding bug
        if (request.Browser.Browser.Contains("IE"))
        {
            encodedFilename = Uri.EscapeDataString(Path.GetFileNameWithoutExtension(encodedFilename)).Replace("%20", " ");
        }

        // IE cannot download an MS Office document from a website using SSL if the response
        // contains HTTP headers such as: Pragma: no-cache and/or Cache-control: no-cache,max-age=0,must-revalidate
        // http://support.microsoft.com/kb/316431/
        if (!(request.IsSecureConnection && request.Browser.Browser.Contains("IE")))
            response.Cache.SetCacheability(System.Web.HttpCacheability.NoCache);

        using (MemoryStream mstream = GenerateDocument())
        {
            context.Response.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
            context.Response.AppendHeader("Content-Disposition", String.Concat("attachment;filename=\"", encodedFilename, ".docx\""));
            context.Response.AddHeader("Content-Length", mstream.Length.ToString());
            mstream.WriteTo(context.Response.OutputStream);
        }

        // use IsClientConnected to avoid an HttpException
        // see http://stackoverflow.com/questions/1556073/response-flush-throws-system-web-httpexception
        if (context.Response.IsClientConnected)
            try { context.Response.Flush(); }
            catch (System.Web.HttpException) { }
    }

    private MemoryStream GenerateDocument()
    {
        ...
    }

    public bool IsReusable { get { return false; } }
}
```

--------------------------------

### Convert .dotx Template to .docx with Injected Content

Source: https://context7.com/onizet/html2openxml/llms.txt

Open a Word template, change its type, inject HTML content using `HtmlConverter`, and save as a DOCX, preserving template styles.

```csharp
MemoryStream documentStream;
// An absolute Uri is built from this path below, so resolve it to a full path first
string templatePath = Path.GetFullPath("template.dotx");
using (var tplStream = File.OpenRead(templatePath))
{
    documentStream = new MemoryStream((int)tplStream.Length);
    await tplStream.CopyToAsync(documentStream);
    documentStream.Position = 0L;
}

using (var template = WordprocessingDocument.Open(documentStream, isEditable: true))
{
    template.ChangeDocumentType(WordprocessingDocumentType.Document);

    var mainPart = template.MainDocumentPart!;
    mainPart.DocumentSettingsPart!.AddExternalRelationship(
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/attachedTemplate",
        new Uri(templatePath, UriKind.Absolute));

    var converter = new HtmlConverter(mainPart);
    await converter.ParseBody("

    From Template

    Content injected via HtmlToOpenXml.

    ");

    mainPart.Document.Save();
}

await File.WriteAllBytesAsync("output.docx", documentStream.ToArray());
```

--------------------------------

### Nested Lists with Tables

Source: https://github.com/onizet/html2openxml/wiki/Numbering-List

Demonstrates how nested lists containing tables are rendered, with table alignment following the list's hanging indent.

```html
    1. Item 1
      Cell1
      1. Item 1.1
        Cell1.1
```

--------------------------------

### Integrate ILogger for Image Download Diagnostics

Source: https://context7.com/onizet/html2openxml/llms.txt

Pass an `ILogger` instance to `DefaultWebRequest` to capture detailed logs of image download operations, useful for debugging.

```csharp
using Microsoft.Extensions.Logging;

var loggerFactory = LoggerFactory.Create(builder =>
    builder.AddConsole().SetMinimumLevel(LogLevel.Debug));
// The category type is an assumption; any type or category name works here
var logger = loggerFactory.CreateLogger<HtmlToOpenXml.IO.DefaultWebRequest>();

var webRequest = new HtmlToOpenXml.IO.DefaultWebRequest(httpClient: null, logger: logger);
var converter = new HtmlConverter(mainPart, webRequest);

// Logs: "Downloading remote file: https://example.com/image.png"
await converter.ParseBody("<img src=\"https://example.com/image.png\">");
```

--------------------------------

### Set Base URL for Relative Image Links

Source: https://github.com/onizet/html2openxml/wiki/ImageProcessing

Configure a BaseImageUrl to enable the converter to resolve relative image URLs, such as '/_layouts/images/pic.gif'. This can be a web URI or a local file system path.

```csharp
converter.BaseImageUrl = new Uri("http://myserver:8080/");
```

```csharp
converter.BaseImageUrl = new Uri(@"C:\inetpub\wwwroot\static-assets\");
```

--------------------------------

### Provide ILogger for WebRequest

Source: https://github.com/onizet/html2openxml/wiki/Logging

Instantiate HtmlConverter with a custom ILogger to troubleshoot IO access issues when downloading images. A null HttpClient will use the default implementation.

```csharp
HtmlConverter converter = new HtmlConverter(mainPart, new DefaultWebRequest(null, logger));
```

--------------------------------

### Convert HTML to OpenXml Document Body

Source: https://context7.com/onizet/html2openxml/llms.txt

Demonstrates the basic usage of HtmlConverter to parse an HTML string and append it to the main document part of a Word document. Ensure the OpenXml SDK and HtmlToOpenXml libraries are referenced.
```csharp
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using HtmlToOpenXml;

string html = """

    Welcome

    This is bold and italic text.

    • Item A
    • Item B
""";

using var generatedDocument = new MemoryStream();
using (var package = WordprocessingDocument.Create(generatedDocument, WordprocessingDocumentType.Document))
{
    var mainPart = package.AddMainDocumentPart();
    new Document(new Body()).Save(mainPart);

    var converter = new HtmlConverter(mainPart);
    await converter.ParseBody(html);

    mainPart.Document.Save();
}

await File.WriteAllBytesAsync("output.docx", generatedDocument.ToArray());
```

--------------------------------

### Generate and Serve DOCX in SharePoint

Source: https://github.com/onizet/html2openxml/wiki/Serves-a-generated-docx-from-the-server

This handler is designed for SharePoint environments to generate and serve DOCX files. It utilizes SharePoint's utility for URL encoding filenames and includes headers to control browser behavior based on farm administrator settings.

```csharp
public class GenerateDocument : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        SPContext spcontext = SPContext.GetContext(context);
        SPWebApplication webApplication = spcontext.Site.WebApplication;

        String documentName = "My Document";
        if (documentName.Length > 128) documentName = documentName.Substring(0, 128);
        string encodedFilename = SPHttpUtility.UrlEncodeFilenameForHttpHeader(documentName.Replace(';', ' '));

        // Check whether the farm admin forces the user to download the file or allows it to be displayed inside the browser.
        if (webApplication.BrowserFileHandling == SPBrowserFileHandling.Strict)
        {
            context.Response.AppendHeader("X-Content-Type-Options", "nosniff");
            context.Response.AppendHeader("X-Download-Options", "noopen");
        }

        using (MemoryStream mstream = GenerateDocument())
        {
            context.Response.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
            context.Response.AppendHeader("Content-Disposition", String.Concat("attachment;filename=\"", encodedFilename, ".docx\""));
            context.Response.AddHeader("Content-Length", mstream.Length.ToString());
            mstream.WriteTo(context.Response.OutputStream);
        }

        if (context.Response.IsClientConnected)
            try { context.Response.Flush(); }
            catch (System.Web.HttpException) { }
    }

    private MemoryStream GenerateDocument()
    {
        ...
    }

    public bool IsReusable { get { return false; } }
}
```

--------------------------------

### Manually Set Image Size

Source: https://github.com/onizet/html2openxml/wiki/ImageProcessing

Optionally set the image size manually within the ProvisionImage event handler. If left empty (0,0), the converter will automatically detect the image dimensions.

```csharp
e.ImageSize = new System.Drawing.Size(50, 50);
```

--------------------------------

### Control Image Parallelism with ParallelOptions

Source: https://context7.com/onizet/html2openxml/llms.txt

Use `ParallelOptions` to limit the number of concurrent image downloads and enable cancellation during HTML parsing.
```csharp
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
var parallelOptions = new ParallelOptions
{
    MaxDegreeOfParallelism = 4,
    CancellationToken = cts.Token
};

var converter = new HtmlConverter(mainPart);
var elements = await converter.ParseAsync(
    "",
    parallelOptions);

var body = mainPart.Document.Body!;
foreach (var el in elements)
    body.AppendChild(el);
```

--------------------------------

### C# HTML to DOCX Conversion

Source: https://github.com/onizet/html2openxml/wiki/Home

This C# code demonstrates how to convert an HTML string to a DOCX file using the HtmlConverter class. It sets up a new WordprocessingDocument, parses the HTML, and saves the result.

```csharp
using System.IO;
using System.Threading.Tasks;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using HtmlToOpenXml;

...

// Main must be async because ParseBody is awaited below
static async Task Main(string[] args)
{
    const string filename = "test.docx";
    string html = Properties.Resources.DemoHtml;
    if (File.Exists(filename)) File.Delete(filename);

    using (MemoryStream generatedDocument = new MemoryStream())
    {
        using (WordprocessingDocument package = WordprocessingDocument.Create(
            generatedDocument, WordprocessingDocumentType.Document))
        {
            MainDocumentPart mainPart = package.MainDocumentPart;
            if (mainPart == null)
            {
                mainPart = package.AddMainDocumentPart();
                new Document(new Body()).Save(mainPart);
            }

            HtmlConverter converter = new HtmlConverter(mainPart);
            await converter.ParseBody(html);

            mainPart.Document.Save();
        }

        File.WriteAllBytes(filename, generatedDocument.ToArray());
    }

    System.Diagnostics.Process.Start(filename);
}
```

--------------------------------

### Inline CSS with PreMailer.Net

Source: https://github.com/onizet/html2openxml/wiki/Style

Use PreMailer.Net to move CSS from external stylesheets or style tags into inline style attributes on HTML elements before conversion.
```csharp
string html = ResourceHelper.GetString("Resources.CompleteRunTest.html");
string css = ResourceHelper.GetString("Resources.style.css");

var result = PreMailer.Net.PreMailer.MoveCssInline(html, css: css);
html = result.Html;

HtmlConverter converter = new HtmlConverter(mainPart);
await converter.ParseHtml(html);
```

--------------------------------

### HtmlConverter - Core Conversion

Source: https://context7.com/onizet/html2openxml/llms.txt

Demonstrates the basic usage of the HtmlConverter class to parse an HTML string and append it to the main document body.

```APIDOC
## HtmlConverter

### Description
The entry point for all conversion work. Constructed with a `MainDocumentPart` (and optionally a custom `IWebRequest` for image downloading), it parses HTML strings into OpenXml elements and appends them to the document body, header, or footer.

### Method Signature
```csharp
public HtmlConverter(MainDocumentPart mainDocumentPart, IWebRequest webRequest = null)
```

### Usage Example
```csharp
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using HtmlToOpenXml;

string html = """

    Welcome

    This is bold and italic text.

    • Item A
    • Item B
""";

using var generatedDocument = new MemoryStream();
using (var package = WordprocessingDocument.Create(generatedDocument, WordprocessingDocumentType.Document))
{
    var mainPart = package.AddMainDocumentPart();
    new Document(new Body()).Save(mainPart);

    var converter = new HtmlConverter(mainPart);
    await converter.ParseBody(html);

    mainPart.Document.Save();
}

await File.WriteAllBytesAsync("output.docx", generatedDocument.ToArray());
```
```

--------------------------------

### HTML Table with Column Definitions

Source: https://github.com/onizet/html2openxml/wiki/Table

Demonstrates the use of `<colgroup>` and `<col>` tags to define styles for entire columns within a table. The `span` attribute can apply styles to multiple columns.

```html
    AttendeePresent?Diet
    JaneYesGluten free
    RickNo
    MattYesVeggie
```

--------------------------------

### Nested HTML Table

Source: https://github.com/onizet/html2openxml/wiki/Table

Illustrates the creation of nested tables within an HTML document. Ensure proper nesting and styling for correct rendering in the output.

```html
    The containing table
    A nested table
```

--------------------------------

### HTML Table with Rowspan and Colspan

Source: https://github.com/onizet/html2openxml/wiki/Table

Demonstrates the use of rowspan and colspan attributes in HTML tables for creating complex cell layouts. Ensure correct HTML structure for proper rendering.

```html
    Anime Studio Pixar
    Studio Ghibli
    Studio Animes
    Pixar The incredibles Ratatouille
    Studio Ghibli Grave of the Fireflies Spirited Away
```

--------------------------------

### Anchor Links and Bookmarks

Source: https://context7.com/onizet/html2openxml/llms.txt

Enable anchor link support to convert internal HTML links (`<a href="#...">`) into Word bookmarks. Use the `id` or `data-bookmark` attribute to define bookmark targets. Special anchors `#_top` and `#top` link to the document's beginning.

```csharp
var converter = new HtmlConverter(mainPart)
{
    SupportsAnchorLinks = true // default; set false to skip internal anchors
};
await converter.ParseBody("""

    Table of Contents

    Introduction

    Lorem ipsum dolor sit amet.

    Details

    More detail here.

    Back to top

    """); ``` -------------------------------- ### Registering Bookmarks with data-bookmark Attribute Source: https://github.com/onizet/html2openxml/wiki/Hyperlinks Use the data-bookmark attribute on an HTML element to explicitly register a bookmark in the OpenXml document, which can then be linked to. ```html

    Heading 1.1

    ```

    --------------------------------

    ### Helper Class for Document Parts

    Source: https://github.com/onizet/html2openxml/wiki/Custom-document-properties-in-a-docx

    Defines a structure to hold a document part's stream, its Open XML part, and its XML document representation.

    ```csharp
    sealed class ContentDocumentParts
    {
        public Stream PartStream;
        public OpenXmlPart Part;
        public XmlDocument XmlDocument;
    }
    ```

    --------------------------------

    ### Parse HTML and Return Elements for Manual Insertion

    Source: https://context7.com/onizet/html2openxml/llms.txt

    Demonstrates using `HtmlConverter.ParseAsync` to parse HTML into OpenXml elements without automatically appending them. This allows for fine-grained control over element insertion points within the document body.

    ```csharp
    using var stream = new MemoryStream();
    using (var package = WordprocessingDocument.Create(stream, WordprocessingDocumentType.Document))
    {
        var mainPart = package.AddMainDocumentPart();
        new Document(new Body()).Save(mainPart);

        var converter = new HtmlConverter(mainPart);

        // Parse without appending, then insert the returned elements manually
        // (here they are simply added at the end of the body)
        var elements = await converter.ParseAsync("<p>Injected paragraph</p>");

        var body = mainPart.Document.Body!;
        foreach (var el in elements)
            body.AppendChild(el);

        mainPart.Document.Save();
    }
    ```

    --------------------------------

    ### Set Base Image URL for External Downloads in C#

    Source: https://github.com/onizet/html2openxml/wiki/ImageProcessing

    Configures the default web request handler to resolve relative image URLs by providing a base URL. This is useful when images are hosted on a specific path.

    ```csharp
    HtmlConverter converter = new(mainPart, new HtmlToOpenXml.IO.DefaultWebRequest()
    {
        BaseImageUrl = "http://web.site/path/"
    });
    ```

    --------------------------------

    ### Parse HTML into Document Header or Footer

    Source: https://context7.com/onizet/html2openxml/llms.txt

    Illustrates how to use `HtmlConverter.ParseHeader` and `HtmlConverter.ParseFooter` to insert HTML content into different types of document headers and footers. This is necessary because images and hyperlinks require relationships scoped to their respective parts.

    ```csharp
    var converter = new HtmlConverter(mainPart);

    // Default header (shown on all pages)
    await converter.ParseHeader("<p>Company Confidential</p>", HeaderFooterValues.Default);

    // First-page-only footer
    await converter.ParseFooter("<p>Page 1</p>", HeaderFooterValues.First);

    // Even-page footer
    await converter.ParseFooter("<p>Even Page Footer</p>", HeaderFooterValues.Even);
    ```

    --------------------------------

    ### Read Custom Properties using OpenXml

    Source: https://github.com/onizet/html2openxml/wiki/Custom-document-properties-in-a-docx

    Iterates through custom properties in a DOCX file and prints their names and values to the console. Requires access to the `CustomFilePropertiesPart`.

    ```csharp
    foreach (var p in package.CustomFilePropertiesPart.Properties.Elements<CustomDocumentProperty>())
    {
        Console.WriteLine("{0}: {1}", p.Name, p.InnerText);
    }
    ```

    --------------------------------

    ### Configure Image Embedding Modes

    Source: https://context7.com/onizet/html2openxml/llms.txt

    Control how images are processed during conversion. Choose between embedding all images, linking them externally, or only embedding base64 data URIs.

    ```csharp
    // Embed all images (default; produces a self-contained document)
    var converter = new HtmlConverter(mainPart)
    {
        ImageProcessing = ImageProcessingMode.Embed
    };

    // External link mode: images are loaded by the viewer at open time
    var converterExternal = new HtmlConverter(mainPart)
    {
        ImageProcessing = ImageProcessingMode.LinkExternal
    };

    // Only embed inline base64 images; skip http/https/file images
    var converterDataUri = new HtmlConverter(mainPart)
    {
        ImageProcessing = ImageProcessingMode.EmbedDataUriOnly
    };

    // Base64 inline image always supported regardless of mode
    await converter.ParseBody("<img src=\"data:image/png;base64,...\" alt=\"Red dot\" />");
    ```

    --------------------------------

    ### Correct Content Type for DOCX Files

    Source: https://github.com/onizet/html2openxml/wiki/Serves-a-generated-docx-from-the-server

    Specifies the correct MIME type for DOCX files to ensure they are downloaded and not corrupted. This is a common issue when serving OpenXML SDK generated documents.

    ```text
    application/vnd.openxmlformats-officedocument.wordprocessingml.document
    ```

    --------------------------------

    ### Configure Footnotes as Document Notes using C#

    Source: https://github.com/onizet/html2openxml/wiki/Footnotes

    To replace standard footnotes with document notes appearing at the end of the document, set the `AcronymPosition` property of the converter to `AcronymPosition.DocumentEnd`.

    ```csharp
    converter.AcronymPosition = AcronymPosition.DocumentEnd;
    ```

    --------------------------------

    ### Apply RTL to Document Body

    Source: https://github.com/onizet/html2openxml/wiki/Right-to-Left-(Rtl)-and-Left-to-Right

    Use the `dir="rtl"` attribute on the `body` tag to apply RTL to the entire OpenXml document settings.

    ```html
    <body dir="rtl">
      <h1>Heading</h1>
    </body>
    ```

    --------------------------------

    ### Append Multiple HTML Chunks to Document Body

    Source: https://context7.com/onizet/html2openxml/llms.txt

    Shows how to sequentially append multiple HTML strings to the document body using `HtmlConverter.ParseBody`. This is useful for building documents from different content sources.

    ```csharp
    using var stream = new MemoryStream();
    using (var package = WordprocessingDocument.Create(stream, WordprocessingDocumentType.Document))
    {
        var mainPart = package.AddMainDocumentPart();
        new Document(new Body()).Save(mainPart);

        var converter = new HtmlConverter(mainPart);

        // Append multiple HTML chunks sequentially
        await converter.ParseBody("<h1>Chapter 1</h1><p>Introduction text.</p>");
        await converter.ParseBody("<h2>Section 1.1</h2><p>More content here.</p>");

        mainPart.Document.Save();
    }
    ```

    --------------------------------

    ### Set Default OpenXml Style

    Source: https://github.com/onizet/html2openxml/wiki/Style

    Assign a specific OpenXml style to be used as a default for elements like headings when the corresponding style is missing in the Word document. This ensures consistent formatting.

    ```csharp
    converter.HtmlStyles.DefaultStyle = converter.HtmlStyles.GetStyle("Intense Quote");
    ```

    --------------------------------

    ### Complex Table Rendering with Spans and Column Definitions

    Source: https://context7.com/onizet/html2openxml/llms.txt

    Render HTML tables with advanced features like `rowspan`, `colspan`, vertical text, nested tables, and `<colgroup>`/`<col>` definitions. The converter automatically reorders `thead`, `tbody`, and `tfoot` sections.

    ```csharp
    await converter.ParseBody("""
    StudioFilm 1Film 2
    Pixar The Incredibles Ratatouille
    Up
    End of table
    """);
    ```

    --------------------------------

    ### Disable Table Rendering for Pre Tags in C#

    Source: https://github.com/onizet/html2openxml/wiki/Preformatted-Text-(-pre-)

    Configure the converter to render `<pre>` tags as plain text instead of a table. Set the `RenderPreAsTable` property to `false`.
    
    ```csharp
    converter.RenderPreAsTable = false;
    ```
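
    As a rough illustration of where this setting fits in a full conversion, the sketch below assumes the `HtmlToOpenXml` and `DocumentFormat.OpenXml` packages are referenced. `PreDemo` and `ConvertPreAsync` are hypothetical names introduced only for this example; the package setup follows the pattern used by the other samples on this page.

    ```csharp
    using System.IO;
    using System.Threading.Tasks;
    using DocumentFormat.OpenXml;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;
    using HtmlToOpenXml;

    static class PreDemo
    {
        // Hypothetical helper: converts a small <pre> block and returns the DOCX bytes.
        public static async Task<byte[]> ConvertPreAsync(bool renderAsTable)
        {
            using var stream = new MemoryStream();
            using (var package = WordprocessingDocument.Create(stream, WordprocessingDocumentType.Document))
            {
                var mainPart = package.AddMainDocumentPart();
                new Document(new Body()).Save(mainPart);

                // Choose between table rendering and plain-text rendering of <pre>
                var converter = new HtmlConverter(mainPart) { RenderPreAsTable = renderAsTable };
                await converter.ParseBody("<pre>line one\n  indented line two</pre>");
                mainPart.Document.Save();
            }
            return stream.ToArray();
        }
    }
    ```

    Calling `PreDemo.ConvertPreAsync(false)` and `PreDemo.ConvertPreAsync(true)` and opening both results in Word is one way to compare the two renderings side by side.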
    
    --------------------------------
    
    ### Globally Reset List Numbering
    
    Source: https://github.com/onizet/html2openxml/wiki/Numbering-List
    
    Set `ContinueNumbering` to `false` so that numbering always restarts between consecutive lists.
    
    ```csharp
    converter.ContinueNumbering = false;
    ```
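
    The following sketch shows the flag applied before two separate ordered lists are appended. It assumes `ContinueNumbering` can be set on an `HtmlConverter` instance and that `mainPart` already contains a `Document` with a `Body`; `ListNumberingDemo` and `AppendListsAsync` are hypothetical names used only for illustration.

    ```csharp
    using System.Threading.Tasks;
    using DocumentFormat.OpenXml.Packaging;
    using HtmlToOpenXml;

    static class ListNumberingDemo
    {
        // Hypothetical helper: appends two ordered lists that should each start at 1.
        public static async Task AppendListsAsync(MainDocumentPart mainPart)
        {
            var converter = new HtmlConverter(mainPart)
            {
                // Assumption: instance-level property; restart numbering for every new list
                ContinueNumbering = false
            };

            await converter.ParseBody("<ol><li>First list, item one</li><li>First list, item two</li></ol>");
            await converter.ParseBody("<ol><li>Second list, item one</li></ol>");

            mainPart.Document.Save();
        }
    }
    ```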
    
    --------------------------------
    
    ### Parse HTML into Document Parts
    
    Source: https://github.com/onizet/html2openxml/wiki/Header-and-Footer
    
    Use these methods to parse HTML content and append it to the header, footer, or body of a Word document. The `headerType` parameter in `ParseHeader` and `ParseFooter` specifies on which pages the content should appear.
    
    ```csharp
    public Task ParseHeader(string html, HeaderFooterValues? headerType = null, CancellationToken cancellationToken = default)
    public Task ParseFooter(string html, HeaderFooterValues? headerType = null, CancellationToken cancellationToken = default)
    public Task ParseBody(string html, CancellationToken cancellationToken = default)
    ```
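
    Since every method above accepts a `CancellationToken`, a caller can bound the total conversion time, which may matter when the HTML references remote images. The sketch below is one possible way to do that, not the library's prescribed pattern: `TimedConversion` and `TryFillAsync` are hypothetical names, and it assumes cancellation surfaces as an `OperationCanceledException`.

    ```csharp
    using System;
    using System.Threading;
    using System.Threading.Tasks;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;
    using HtmlToOpenXml;

    static class TimedConversion
    {
        // Hypothetical helper: fills the header and body, giving up after the timeout.
        public static async Task<bool> TryFillAsync(
            MainDocumentPart mainPart, string headerHtml, string bodyHtml, TimeSpan timeout)
        {
            // The token is cancelled automatically once the timeout elapses
            using var cts = new CancellationTokenSource(timeout);
            var converter = new HtmlConverter(mainPart);

            try
            {
                await converter.ParseHeader(headerHtml, HeaderFooterValues.Default, cts.Token);
                await converter.ParseBody(bodyHtml, cts.Token);
                return true;
            }
            catch (OperationCanceledException)
            {
                // Assumption: a cancelled conversion is reported through this exception type
                return false;
            }
        }
    }
    ```

    A caller would typically save the document only when the helper returns `true`, and otherwise discard the partially filled package.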
    
    --------------------------------
    
    ### HtmlConverter.ParseHeader / ParseFooter
    
    Source: https://context7.com/onizet/html2openxml/llms.txt
    
    Parses HTML content and appends it into the document's header or footer, supporting different types of headers/footers like default, first page, and even pages.
    
    ```APIDOC
    ## HtmlConverter.ParseHeader / ParseFooter
    
    ### Description
    Parses HTML and appends it into the document header or footer. Because images and hyperlinks require relationships scoped to their container part, these dedicated methods must be used (rather than inserting header/footer content via `ParseBody`).
    
    ### Method Signatures
    ```csharp
    public Task ParseHeader(string html, HeaderFooterValues? headerType = null, CancellationToken cancellationToken = default)
    public Task ParseFooter(string html, HeaderFooterValues? headerType = null, CancellationToken cancellationToken = default)
    ```
    
    ### Usage Example
    ```csharp
    var converter = new HtmlConverter(mainPart);
    
    // Default header (shown on all pages)
    await converter.ParseHeader("<p>Company Confidential</p>", HeaderFooterValues.Default);

    // First-page-only footer
    await converter.ParseFooter("<p>Page 1</p>", HeaderFooterValues.First);

    // Even-page footer
    await converter.ParseFooter("<p>Even Page Footer</p>", HeaderFooterValues.Even);
    ```
    ```