# MuPDF

MuPDF is a lightweight, open-source software framework for viewing, converting, and manipulating PDF, XPS, and e-book documents (EPUB, MOBI, FB2, CBZ). Written in portable C, it powers command-line tools (`mutool`), desktop and mobile viewers, and a comprehensive library accessible from C, JavaScript/WebAssembly, Java (Android), and Python (via PyMuPDF). The library provides high-quality rendering at any resolution, full text extraction and search, annotation creation and editing, digital signature support, and low-level PDF object manipulation. MuPDF 1.28 is the current release, available under AGPL v3 or a commercial license from Artifex Software.

The JavaScript/WASM binding (`mupdf` on npm) exposes the full C library through an object-oriented API that works identically in Node.js, Bun, and modern web browsers. The same API is available through the `mutool run` command-line interpreter (ES5 only). Core classes include `Document` / `PDFDocument` for file I/O, `Page` / `PDFPage` for rendering and text extraction, `Pixmap` for raster images, `StructuredText` for analyzed text, `PDFAnnotation` for PDF markup, `DisplayList` for cached rendering, and low-level helpers such as `Matrix`, `Path`, `Text`, and `Font`. The C API uses an `fz_context` for thread-local state, a `fz_try/fz_catch` exception model, and reference-counted objects dropped with `fz_drop_*` calls.

---

## JavaScript API

### Install and import the mupdf module

Install the npm package and import it as an ES module; all classes are exported from the top-level namespace.

```javascript
// npm install mupdf
import * as mupdf from "mupdf"

// Verify available exports
console.log(Object.keys(mupdf))
```

---

### `Document.openDocument()` — Open any supported document

The static factory opens PDF, XPS, EPUB, MOBI, FB2, CBZ, and image files from a path, a Buffer, or an ArrayBuffer. It also accepts an optional accelerator file for faster EPUB loading and an Archive for resource lookup.

```javascript
import * as mupdf from "mupdf"
import * as fs from "fs"

// Open by file path (MIME type inferred from extension)
const doc1 = mupdf.Document.openDocument("report.pdf")

// Open from a Node Buffer with explicit MIME type
const buf = fs.readFileSync("report.pdf")
const doc2 = mupdf.Document.openDocument(buf, "application/pdf")

// Check document type
console.log(doc2.isPDF()) // true
console.log(doc2.countPages()) // e.g. 42
console.log(doc2.getMetaData("info:Title"))
console.log(doc2.getMetaData("info:Author"))

// Password-protected documents
if (doc2.needsPassword()) {
  const result = doc2.authenticatePassword("secret")
  // 0=failed, 1=no password needed, 2=user ok, 4=owner ok, 6=both ok
  if (result === 0) throw new Error("Wrong password")
}

// Reflowable documents (EPUB) can be re-laid-out
const epub = mupdf.Document.openDocument("book.epub")
if (epub.isReflowable()) {
  epub.layout(400, 600, 16) // pageWidth, pageHeight, fontSize
}
```

---

### `Document.prototype.loadPage()` — Load a page for rendering or inspection

Returns a `Page` object (or `PDFPage` for PDF files) that provides all rendering, text extraction, search, and link operations for that page. Pages are zero-indexed.

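Because pages are zero-indexed while people usually quote 1-based page numbers, a small conversion step is often useful before calling `loadPage()`. The sketch below is a hypothetical helper, not part of the `mupdf` module: the name `parsePageRange` and the `"1-3,5"` range syntax are assumptions made for illustration only.

```javascript
// Hypothetical helper (not a mupdf API): turn a 1-based page range such as
// "1-3,5" into the zero-based indices that loadPage() expects.
function parsePageRange(spec, pageCount) {
  const indices = []
  for (const part of spec.split(",")) {
    const [startText, endText] = part.trim().split("-")
    const start = Number.parseInt(startText, 10)
    const end = endText === undefined ? start : Number.parseInt(endText, 10)
    const valid =
      Number.isInteger(start) && Number.isInteger(end) &&
      start >= 1 && end >= start && end <= pageCount
    if (!valid) {
      throw new RangeError(`Invalid page range "${part}" for a ${pageCount}-page document`)
    }
    for (let pageNumber = start; pageNumber <= end; pageNumber++) {
      indices.push(pageNumber - 1) // 1-based page number -> 0-based index
    }
  }
  return indices
}
```

Each returned index could then be passed to `doc.loadPage(index)`; validating against `doc.countPages()` up front avoids asking the library for a page that does not exist.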
```javascript
import * as mupdf from "mupdf"
import * as fs from "fs"

const doc = mupdf.Document.openDocument("input.pdf")
const page = doc.loadPage(0) // first page

// Bounding box of the page
const bounds = page.getBounds()
console.log("Page size:", bounds) // [x0, y0, x1, y1]

// Render at 150 dpi (72 dpi is 1x scale)
const scale = 150 / 72
const matrix = mupdf.Matrix.scale(scale, scale)
const pixmap = page.toPixmap(matrix, mupdf.ColorSpace.DeviceRGB, false, true)
fs.writeFileSync("page1.png", pixmap.asPNG())

// Text extraction
const stext = page.toStructuredText("preserve-whitespace")
console.log(stext.asText())

// Search
const hits = page.search("important keyword")
hits.forEach(quads => console.log("Hit quads:", quads))

// Links
const links = page.getLinks()
links.forEach(link => {
  if (link.uri.startsWith("#")) {
    console.log("Internal link to page", doc.resolveLink(link))
  } else {
    console.log("External link:", link.uri)
  }
})
```

---

### `Page.prototype.toPixmap()` — Render a page to a raster image

Renders the full page (with or without annotations/widgets) into a `Pixmap` using the given transformation matrix and colorspace. The pixmap can then be exported as PNG, JPEG, PAM, or PSD.

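The transformation matrix is what decides the output resolution: page coordinates are in points (72 per inch), so a scale factor of `dpi / 72` yields the requested dpi. The following sketch estimates the resulting pixel dimensions in plain JavaScript. The helper names `scaleForDpi` and `renderedSize` are illustrative assumptions, and the library's own rounding may differ by a pixel.

```javascript
// Illustrative helpers (not mupdf APIs): estimate the raster size a page
// will have when rendered at a given resolution.
const POINTS_PER_INCH = 72

function scaleForDpi(dpi) {
  return dpi / POINTS_PER_INCH
}

function renderedSize(bounds, dpi) {
  const [x0, y0, x1, y1] = bounds // same layout as page.getBounds()
  const scale = scaleForDpi(dpi)
  return {
    width: Math.ceil((x1 - x0) * scale),
    height: Math.ceil((y1 - y0) * scale),
  }
}

// An A4 page (595 x 842 pt) at 300 dpi
const a4At300 = renderedSize([0, 0, 595, 842], 300)
console.log(a4At300) // { width: 2480, height: 3509 }
```

An estimate like this is handy for checking memory use before rendering: an RGB pixmap without alpha needs roughly `width * height * 3` bytes.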
```javascript
import * as mupdf from "mupdf"
import * as fs from "fs"

const doc = mupdf.Document.openDocument("slides.pdf")
const pageCount = doc.countPages()

for (let i = 0; i < pageCount; i++) {
  const page = doc.loadPage(i)

  // 300 dpi render, RGB, no alpha, include annotations
  const dpi = 300
  const matrix = mupdf.Matrix.scale(dpi / 72, dpi / 72)
  const pixmap = page.toPixmap(matrix, mupdf.ColorSpace.DeviceRGB, false, true)

  // Save as PNG
  fs.writeFileSync(`page-${i + 1}.png`, pixmap.asPNG())

  // Or as JPEG at quality 85
  // fs.writeFileSync(`page-${i + 1}.jpg`, pixmap.asJPEG(85, false))

  console.log(`Page ${i + 1}: ${pixmap.getWidth()}x${pixmap.getHeight()} px`)
}
```

---

### `Page.prototype.toStructuredText()` — Extract structured text from a page

Extracts all text on a page, organized into blocks, lines, and characters. The returned `StructuredText` provides plain text, HTML, JSON, search, and walker interfaces.

```javascript
import * as mupdf from "mupdf"

const doc = mupdf.Document.openDocument("article.pdf")
const page = doc.loadPage(0)

// Basic plain text extraction
const stext = page.toStructuredText("preserve-whitespace")
const plainText = stext.asText()
console.log(plainText)

// Structured JSON output (requires "preserve-spans" option)
const json = JSON.parse(page.toStructuredText("preserve-spans").asJSON())
json.blocks.forEach(block => {
  if (block.type === "text") {
    block.lines.forEach(line => {
      console.log(`[${line.font.name} ${line.font.size}pt] ${line.text}`)
    })
  }
})

// Full-featured walker
stext.walk({
  beginTextBlock(bbox) { console.log("Block at", bbox) },
  onChar(utf, origin, font, size, quad, argb) { process.stdout.write(utf) },
  endTextBlock() { console.log() },
  onImageBlock(bbox, transform, image) { console.log("Image at", bbox) }
})

// Copy text in a selection rectangle
const selected = stext.copy([50, 100], [400, 200])
console.log("Selected text:", selected)
```

---

### `PDFDocument` constructor and `PDFDocument.prototype.save()` — Create and save PDF files

`new mupdf.PDFDocument()` creates an empty PDF from scratch. A page object is first created with `addPage()` and then placed in the page tree with `insertPage()`. Documents are saved with a rich set of options controlling compression, encryption, garbage collection, and incremental updates.

```javascript
import * as mupdf from "mupdf"

// Create a new PDF with one page
const pdf = new mupdf.PDFDocument()

// Add a font resource
const fontObj = pdf.addSimpleFont(new mupdf.Font("Helvetica"), "Latin")
const fonts = pdf.newDictionary()
fonts.put("F1", fontObj)
const resources = pdf.addObject(pdf.newDictionary())
resources.put("Font", fonts)

// Create a page (A4: 595x842 pts)
const pageObj = pdf.addPage(
  [0, 0, 595, 842],
  0, // rotation
  resources,
  "BT /F1 24 Tf 72 750 Td (Hello from MuPDF!) Tj ET"
)
pdf.insertPage(-1, pageObj) // -1 = append

// Save options: pretty-print, compress images, garbage-collect
pdf.save("output.pdf", "pretty,compress-images,garbage")

// Or save to a buffer for streaming
const buffer = pdf.saveToBuffer("compress,garbage=deduplicate")
// buffer.asUint8Array() or buffer.asString()

// Incremental save (for appending changes to existing file)
// pdf.save("output.pdf", "incremental")
```

---

### `PDFPage.prototype.createAnnotation()` — Add annotations to PDF pages

Creates a new PDF annotation of the specified type on a page. Annotation types include `Text`, `FreeText`, `Line`, `Square`, `Circle`, `Highlight`, `Underline`, `Stamp`, `Ink`, `Redaction`, and more.

```javascript
import * as mupdf from "mupdf"

const doc = mupdf.Document.openDocument("input.pdf")
const pdf = doc.asPDF()
const page = pdf.loadPage(0)

// FreeText annotation (a text box drawn directly on the page)
const freeText = page.createAnnotation("FreeText")
freeText.setRect([50, 700, 300, 750])
freeText.setContents("Review this section!")
freeText.setDefaultAppearance("Helv", 12, [1, 0, 0]) // red Helvetica 12pt
freeText.setColor([1, 1, 0]) // yellow border
freeText.update()

// Highlight annotation with quad points
const highlight = page.createAnnotation("Highlight")
highlight.setColor([1, 1, 0]) // yellow
highlight.setOpacity(0.5)
highlight.setQuadPoints([
  // corner order: upper-left, upper-right, lower-left, lower-right
  [72, 720, 300, 720, 72, 736, 300, 736]
])
highlight.update()

// Ink (freehand) annotation
const ink = page.createAnnotation("Ink")
ink.setColor([0, 0, 1]) // blue
ink.setBorderWidth(2)
ink.setInkList([
  [[100, 500], [150, 480], [200, 500], [250, 480]], // stroke 1
  [[100, 460], [200, 460]] // stroke 2
])
ink.update()

// Save
pdf.save("annotated.pdf", "incremental")
```

---

### `PDFPage.prototype.applyRedactions()` — Permanently redact content

Redaction removes content from a PDF permanently. First create `Redaction` annotations to mark areas, then call `applyRedactions()` to destructively apply them.

```javascript
import * as mupdf from "mupdf"

const pdf = mupdf.Document.openDocument("sensitive.pdf").asPDF()
const page = pdf.loadPage(0)

// Mark two areas for redaction
const r1 = page.createAnnotation("Redaction")
r1.setRect([50, 700, 400, 730]) // header area
r1.update()

const r2 = page.createAnnotation("Redaction")
r2.setRect([50, 100, 300, 130]) // footer area
r2.update()

// Apply redactions (irreversible)
page.applyRedactions(
  true, // black boxes at redacted areas
  mupdf.PDFPage.REDACT_IMAGE_PIXELS, // redact covered image pixels
  mupdf.PDFPage.REDACT_LINE_ART_REMOVE_IF_COVERED,
  mupdf.PDFPage.REDACT_TEXT_REMOVE
)

pdf.save("redacted.pdf", "garbage,compress")
```

---

### `PDFDocument.prototype.graftPage()` — Merge pages across documents

Copies a page (and all its resources) from one `PDFDocument` into another. Use a graft map (`newGraftMap`) when copying multiple objects to avoid duplicating shared resources.

```javascript
import * as mupdf from "mupdf"

// Merge multiple PDFs into one
const output = new mupdf.PDFDocument()
const inputs = ["file1.pdf", "file2.pdf", "file3.pdf"]

for (const path of inputs) {
  const src = mupdf.Document.openDocument(path).asPDF()
  const n = src.countPages()
  for (let i = 0; i < n; i++) {
    output.graftPage(-1, src, i) // append each page
  }
}

output.save("merged.pdf", "garbage,compress")

// Low-level merge preserving shared resources with a graft map
function mergeWithGraftMap(dstDoc, srcDoc) {
  const graftMap = dstDoc.newGraftMap()
  const n = srcDoc.countPages()
  for (let k = 0; k < n; k++) {
    const srcPage = srcDoc.findPage(k)
    const dstPage = dstDoc.newDictionary()
    dstPage.put("Type", dstDoc.newName("Page"))
    if (srcPage.get("MediaBox"))
      dstPage.put("MediaBox", graftMap.graftObject(srcPage.get("MediaBox")))
    if (srcPage.get("Resources"))
      dstPage.put("Resources", graftMap.graftObject(srcPage.get("Resources")))
    if (srcPage.get("Contents"))
      dstPage.put("Contents", graftMap.graftObject(srcPage.get("Contents")))
    dstDoc.insertPage(-1, dstDoc.addObject(dstPage))
  }
}
```

---

### `PDFDocument.prototype.rearrangePages()` — Reorder, subset, or duplicate pages

Rearranges the page tree to match the supplied array of page indices. Pages omitted are removed; pages listed multiple times are duplicated. Save with `garbage` to physically remove orphaned objects.

```javascript
import * as mupdf from "mupdf"

const pdf = mupdf.Document.openDocument("book.pdf").asPDF()

// Reverse all pages
const n = pdf.countPages()
const reversed = Array.from({ length: n }, (_, i) => n - 1 - i)
pdf.rearrangePages(reversed)
pdf.save("reversed.pdf", "garbage")

// Keep only even-indexed pages (0, 2, 4, ...) of the already reversed
// document; the page count is still n at this point
const evens = Array.from({ length: Math.ceil(n / 2) }, (_, i) => i * 2)
pdf.rearrangePages(evens)
pdf.save("even-pages.pdf", "garbage")
```

---

### `PDFDocument` journaling — Undo/redo support

Enable journaling on a document before making changes to support undo/redo. Each named operation becomes one undo step.

```javascript
import * as mupdf from "mupdf"

const pdf = mupdf.Document.openDocument("form.pdf").asPDF()
pdf.enableJournal()

// Make a change as a named undoable operation
pdf.beginOperation("Add annotation")
try {
  const page = pdf.loadPage(0)
  const annot = page.createAnnotation("Text")
  annot.setRect([100, 100, 150, 150])
  annot.setContents("TODO")
  annot.update()
  pdf.endOperation()
} catch (e) {
  pdf.abandonOperation()
  throw e
}

console.log(pdf.canUndo()) // true
console.log(pdf.getJournal()) // { position: 1, steps: ["Add annotation"] }

pdf.undo()
console.log(pdf.canRedo()) // true
pdf.redo()

pdf.save("edited.pdf", "incremental")
```

---

### `DisplayList` — Cache page rendering for multiple uses

A `DisplayList` records all device calls for a page so they can be replayed multiple times (e.g., rendering at different scales, or searching while also rendering) without re-parsing the file.

```javascript
import * as mupdf from "mupdf"
import * as fs from "fs"

const doc = mupdf.Document.openDocument("large.pdf")
const page = doc.loadPage(5)

// Record page into a display list once
const displayList = page.toDisplayList(true) // true = include annotations

// Render at half scale (36 dpi thumbnail)
const small = displayList.toPixmap(
  mupdf.Matrix.scale(0.5, 0.5),
  mupdf.ColorSpace.DeviceRGB,
  false
)
fs.writeFileSync("thumb.png", small.asPNG())

// Render at 300 dpi (high quality)
const big = displayList.toPixmap(
  mupdf.Matrix.scale(300 / 72, 300 / 72),
  mupdf.ColorSpace.DeviceRGB,
  false
)
fs.writeFileSync("hires.png", big.asPNG())

// Search the cached display list
const hits = displayList.search("contract terms")
console.log(`Found ${hits.length} match(es)`)

// Extract text from cached display list
const stext = displayList.toStructuredText("preserve-whitespace")
console.log(stext.asText())
```

---

### `Pixmap` — Raster image manipulation

A `Pixmap` holds a raster image with a specific colorspace and optional alpha channel. It can be created from scratch, obtained from page rendering, or derived from image files. Pixel-level access, color conversion, gamma correction, tinting, warping, deskewing, and barcode encoding/decoding are all supported.

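To make the "colorspace plus optional alpha" description concrete, the sketch below works out how large a sample buffer is and where a given pixel would sit in it. It is plain JavaScript under two stated assumptions: samples are 8 bits each, and rows are tightly packed with no padding. The helper name `pixmapLayout` is hypothetical, and a real pixmap should be asked for its own stride rather than relying on this arithmetic.

```javascript
// Illustrative only (not a mupdf API): describe the memory layout of a
// tightly packed, 8-bit-per-sample raster image.
function pixmapLayout(width, height, colorants, hasAlpha) {
  // e.g. RGB has 3 colorants; alpha adds one more component per pixel
  const componentsPerPixel = colorants + (hasAlpha ? 1 : 0)
  const stride = width * componentsPerPixel // bytes per row (assumes no padding)
  return {
    componentsPerPixel,
    stride,
    byteLength: stride * height,
    // Byte offset of the first component of pixel (x, y)
    offsetOf(x, y) {
      return y * stride + x * componentsPerPixel
    },
  }
}

// A 500 x 600 RGB image without alpha
const layout = pixmapLayout(500, 600, 3, false)
console.log(layout.stride, layout.byteLength) // 1500 900000
```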
```javascript
import * as mupdf from "mupdf"
import * as fs from "fs"

// Create a blank 500x600 RGB pixmap (white background)
const pixmap = new mupdf.Pixmap(mupdf.ColorSpace.DeviceRGB, [0, 0, 500, 600], false)
pixmap.clear(255)

// Draw on it using a DrawDevice
const device = new mupdf.DrawDevice(mupdf.Matrix.identity, pixmap)
const path = new mupdf.Path()
path.moveTo(50, 50)
path.lineTo(450, 50)
path.lineTo(450, 550)
path.lineTo(50, 550)
path.closePath()
device.fillPath(path, false, mupdf.Matrix.identity, mupdf.ColorSpace.DeviceRGB, [0.8, 0.9, 1.0], 1)
device.close()

// Color convert to CMYK
const cmyk = pixmap.convertToColorSpace(mupdf.ColorSpace.DeviceCMYK, false)

// Image manipulation
pixmap.gamma(1.4) // lighten
pixmap.invert() // invert colors
pixmap.tint(0x000000, 0xffffff) // tint

// Save in various formats
fs.writeFileSync("out.png", pixmap.asPNG())
fs.writeFileSync("out.jpg", pixmap.asJPEG(90, false))

// Deskew a scanned document pixmap
const angle = pixmap.detectSkew()
if (Math.abs(angle) > 0.5) {
  const deskewed = pixmap.deskew(angle, "increase")
  fs.writeFileSync("deskewed.png", deskewed.asPNG())
}

// Encode a QR code
const qr = mupdf.Pixmap.encodeBarcode("qrcode", "https://mupdf.com", 200, 2, true, false)
fs.writeFileSync("qr.png", qr.asPNG())
```

---

### `Matrix` — 2D transformation matrices

Matrices are plain six-element arrays `[a, b, c, d, e, f]` representing 2D affine transforms. Static helpers on `mupdf.Matrix` create common transforms; `concat` combines them.

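Since a matrix is just a six-element array, the arithmetic behind combining transforms can be written out by hand. The sketch below is a standalone illustration, not the library's source: the function names are local helpers, and it assumes the usual PDF convention in which a point is treated as a row vector, so the first argument to `concat` is applied first.

```javascript
// Standalone illustration of [a, b, c, d, e, f] affine matrices.
// These are local helpers, not the mupdf.Matrix implementation.
function scale(sx, sy) {
  return [sx, 0, 0, sy, 0, 0]
}

function translate(tx, ty) {
  return [1, 0, 0, 1, tx, ty]
}

// Apply `one` first, then `two` (row-vector convention, assumed here)
function concat(one, two) {
  const [a1, b1, c1, d1, e1, f1] = one
  const [a2, b2, c2, d2, e2, f2] = two
  return [
    a1 * a2 + b1 * c2,
    a1 * b2 + b1 * d2,
    c1 * a2 + d1 * c2,
    c1 * b2 + d1 * d2,
    e1 * a2 + f1 * c2 + e2,
    e1 * b2 + f1 * d2 + f2,
  ]
}

function transformPoint([x, y], [a, b, c, d, e, f]) {
  return [x * a + y * c + e, x * b + y * d + f]
}

const scaleThenMove = concat(scale(2, 2), translate(100, 50))
console.log(scaleThenMove) // [2, 0, 0, 2, 100, 50]
console.log(transformPoint([10, 10], scaleThenMove)) // [120, 70]

// Order matters: translating first means the offset is scaled too
const moveThenScale = concat(translate(100, 50), scale(2, 2))
console.log(moveThenScale) // [2, 0, 0, 2, 200, 100]
```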
```javascript
import * as mupdf from "mupdf"

// Identity matrix
const id = mupdf.Matrix.identity // [1, 0, 0, 1, 0, 0]

// Scale: 2× in both axes (e.g., for 144 dpi from 72 dpi base)
const scale2x = mupdf.Matrix.scale(2, 2)

// Translate 100 points right, 50 points down
const translate = mupdf.Matrix.translate(100, 50)

// Rotate 45 degrees clockwise
const rotate45 = mupdf.Matrix.rotate(45)

// Combine: first scale, then translate
const combined = mupdf.Matrix.concat(scale2x, translate)

// Invert a matrix
const inv = mupdf.Matrix.invert(combined)

// Typical use: render a page at 150 dpi
const doc = mupdf.Document.openDocument("doc.pdf")
const page = doc.loadPage(0)
const dpiMatrix = mupdf.Matrix.scale(150 / 72, 150 / 72)
const pixmap = page.toPixmap(dpiMatrix, mupdf.ColorSpace.DeviceRGB, false)
```

---

### `Story` and `DocumentWriter` — Flow HTML text into PDF pages

`Story` takes an HTML string (or a programmatic DOM tree) and flows it into rectangular areas across multiple pages using `DocumentWriter`. This is the high-level API for generating PDFs from formatted content.

```javascript
import * as mupdf from "mupdf"

const mediabox = [0, 0, 595, 842] // A4
const margin = 40

const html = `
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris.
`

const writer = new mupdf.DocumentWriter("report.pdf", "PDF", "")
const buf = new mupdf.Buffer()
buf.write(html)
const story = new mupdf.Story(buf, "", 12) // HTML, user-CSS, default font size

let placed
do {
  const where = [mediabox[0] + margin, mediabox[1] + margin, mediabox[2] - margin, mediabox[3] - margin]
  const dev = writer.beginPage(mediabox)
  placed = story.place(where)
  story.draw(dev, mupdf.Matrix.identity)
  writer.endPage()
} while (placed.more)

writer.close()
```

---

### `mupdf.setLog()` — Global logging and cache management

Configure a custom logger for MuPDF warnings and errors, and manage the resource store cache size.

```javascript
import * as mupdf from "mupdf"

// Custom logger
mupdf.setLog({
  error(msg) { console.error("[MuPDF ERROR]", msg) },
  warning(msg) { console.warn("[MuPDF WARN]", msg) },
})

// Enable ICC color management
mupdf.enableICC()

// Set a CSS stylesheet for all reflowable documents (EPUB, HTML)
mupdf.setUserCSS("body { font-size: 18pt; font-family: Georgia; }", true)

// Manage resource store
mupdf.shrinkStore(75) // shrink to 75% of current size
mupdf.emptyStore() // free all cached resources

// Install a system font loader
mupdf.installLoadFontFunction((fontName, scriptName, isBold, isItalic) => {
  // Return a Font object, or null to continue with fallbacks
  return null
})
```

---

## C API

### Basic C rendering example — `fz_new_context`, `fz_open_document`, `fz_new_pixmap_from_page_number`

The C API is the foundation of all language bindings. A context (`fz_context`) carries global state; documents and pages are opened with `fz_open_document` / `fz_load_page`; pixmaps are rendered with `fz_new_pixmap_from_page_number`. All objects are reference-counted and must be freed with matching `fz_drop_*` calls. Exceptions are handled with `fz_try` / `fz_catch` macros.

```c
#include