Question
Convert HTML and CSS to PDF in PHP: Options, Limits, and Practical Approaches
Question
I have an HTML document, not XHTML, that renders correctly in Firefox 3 and Internet Explorer 7. It uses fairly basic CSS, and the page displays properly in the browser.
I now need to convert that HTML into a PDF.
I have tried several tools:
- DOMPDF: It had major problems with tables. After simplifying some large nested tables, memory usage improved, but it still consumed a lot of memory and sometimes failed around the 128M PHP memory limit. It also produced poor table layout and had trouble rendering images.
- HTML2PDF / HTML2PS: These worked somewhat better. Some images rendered correctly, and table formatting was better, but the conversion failed with unknown node_type() errors that I could not diagnose.
- HTMLDOC: This worked for basic HTML, but CSS support was extremely limited, so it was not suitable for my needs.
I also tested a Windows application called Html2Pdf Pilot, which produced better results, but I need a solution that runs on Linux and ideally can be triggered on demand from PHP on the web server.
What is the practical way to handle HTML and CSS to PDF conversion in this kind of setup, and why do these tools behave so differently?
Short Answer
By the end of this page, you will understand why converting HTML and CSS to PDF is harder than it looks, why browser output does not automatically match PDF output, and how PHP developers usually approach this problem on Linux servers. You will also learn the trade-offs between HTML-to-PDF libraries, when CSS support breaks down, and how to structure HTML for more reliable PDF generation.
Concept
HTML-to-PDF conversion is not just a file format change. It is a rendering problem.
A browser like Firefox or Chrome has a powerful layout engine that understands modern HTML, CSS, images, fonts, and complex page layout rules. A PDF generator must take your HTML and CSS, calculate the layout, split content into pages, place text and images precisely, and then write a PDF file.
That means an HTML-to-PDF tool needs its own rendering engine. Some tools support only a small subset of CSS. Others support tables but not floating layouts well. Some can render images but fail on external URLs, unsupported formats, or memory-heavy pages.
Why this matters
In real applications, PDF generation is common for:
- invoices
- reports
- tickets
- receipts
- downloadable exports
- printable summaries
If the tool's rendering engine does not match your HTML structure, the result may be:
- broken tables
- missing images
- incorrect page breaks
- high memory usage
- unsupported CSS rules
The key idea
You should not assume:
"If it looks correct in the browser, it will look correct in PDF."
That is usually false.
Instead, think of PDF generation as using a different layout engine with different rules and limits.
Why tools behave differently
Different libraries have different strengths:
- some parse HTML and CSS in pure PHP
- some rely on external rendering engines
- some support only older HTML/CSS features
- some are better at simple documents than complex nested tables
Mental Model
Imagine you wrote a document for a modern office printer, but now you hand it to three different machines:
- one only understands plain text and simple formatting
- one understands tables but gets confused by nested layouts
- one understands more styling but runs out of memory on large pages
All three are trying to produce the same final paper, but each has different abilities.
That is what HTML-to-PDF tools are like.
A web browser is like a highly trained reader of HTML and CSS. Many PDF libraries are more like simplified readers. They can handle some structures well, but not everything a browser can.
So the mental model is:
- Browser rendering = full-featured interpreter
- PDF rendering library = partial interpreter with constraints
- Your job = write HTML/CSS that matches the renderer's strengths, or choose a different approach
Syntax and Examples
In PHP, HTML-to-PDF generation usually follows one of these patterns.
1. Pass HTML into a renderer
<?php
$html = '
<h1>Invoice</h1>
<table border="1" cellpadding="6" cellspacing="0">
<tr><th>Item</th><th>Price</th></tr>
<tr><td>Book</td><td>$10</td></tr>
</table>
';
// Pseudocode: exact API depends on the library you choose
$renderer = new PdfRenderer();
$renderer->loadHtml($html);
$renderer->render();
$renderer->output('invoice.pdf');
This approach is convenient if you already have HTML templates.
2. Generate PDF directly with a PDF library
<?php
// Pseudocode showing the idea
$pdf = new PdfDocument();
$pdf->addPage();
$pdf->setFont('Helvetica', 12);
$pdf->writeText(20, 20, 'Invoice');
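// The original calls were lost here; the method names below are hypothetical stand-ins.
$pdf->drawCell(20, 40, 80, 10, 'Item');
$pdf->drawCell(100, 40, 40, 10, 'Price');
$pdf->output();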
Step by Step Execution
Consider this PHP example:
<?php
$html = '
<h1>Report</h1>
<table border="1" cellpadding="4" cellspacing="0">
<tr><th>Name</th><th>Score</th></tr>
<tr><td>Ava</td><td>92</td></tr>
</table>';
$renderer = new PdfRenderer();
$renderer->loadHtml($html);
$renderer->render();
$pdfBytes = $renderer->output();
file_put_contents('report.pdf', $pdfBytes);
Here is what happens step by step:
- $html stores the document as a string.
- new PdfRenderer() creates the PDF conversion object.
- loadHtml($html) gives the renderer the source markup.
- The renderer parses the HTML tags such as <h1>, <table>, <tr>, <th>, and <td>.
- It parses any supported styling rules.
- It calculates layout: where each block of text and each table cell goes, and where the content splits into pages.
Real World Use Cases
HTML-to-PDF conversion is used in many common backend workflows.
Invoices and receipts
A web app displays an invoice as HTML, then exports the same data as a PDF for download or email.
Reports and dashboards
Analytics pages often need printable PDF exports. Tables, charts, and summary sections are common here.
Booking confirmations
Travel, events, and appointment systems generate PDFs containing customer details, dates, and QR or barcode images.
Certificates and formal documents
Training platforms and internal company systems produce PDFs for records, completion certificates, or signed summaries.
Server-generated exports
A Linux server may generate PDFs in scheduled jobs or in response to user actions, such as clicking Download PDF.
In all of these cases, developers must decide whether to:
- render from existing HTML templates
- maintain a separate print-friendly HTML template
- build the PDF directly with a PDF API
Real Codebase Usage
In real projects, developers rarely send full browser-oriented pages directly into a PDF converter without adjustment.
Common patterns
Print-specific templates
Teams often create a separate HTML template just for PDF output.
Why:
- avoids unsupported CSS
- removes interactive UI elements
- simplifies layout
- controls page size better
Guard clauses for external resources
Before generating the PDF, code may validate that images, fonts, and data exist.
<?php
if (empty($reportRows)) {
    throw new RuntimeException('Cannot generate PDF: no report data found.');
}
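The same guard-clause idea can cover files on disk. The sketch below is a minimal illustration, not part of any library: the function name assertResourcesReadable and the error message are assumptions made for this example, and it only checks local paths, not remote URLs.

```php
<?php
/**
 * Throws if any local file the PDF depends on is missing or unreadable.
 * Hypothetical helper for illustration only.
 *
 * @param string[] $paths Local filesystem paths (images, fonts, templates).
 */
function assertResourcesReadable(array $paths): void
{
    foreach ($paths as $path) {
        // is_file() rejects directories; is_readable() catches permission problems.
        if (!is_file($path) || !is_readable($path)) {
            throw new RuntimeException("Cannot generate PDF: missing resource {$path}");
        }
    }
}
```

Calling it before rendering turns a silent "image missing in PDF" into an explicit error that can be logged.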
Early returns for invalid state
<?php
if (!is_readable($templatePath)) {
    return 'Template file not found.';
}
Preprocessing HTML
Developers often clean HTML before passing it to the renderer:
- remove unsupported tags
- flatten nested tables
- convert relative image paths to absolute paths
- reduce CSS complexity
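As one example of the preprocessing steps above, relative image paths can be rewritten before the HTML reaches the renderer. This is a rough sketch under stated assumptions: absolutizeImagePaths is a made-up name, the base directory is whatever your application uses, and the regular expression only handles quoted src attributes on img tags. A real HTML parser would be more robust for arbitrary markup.

```php
<?php
/**
 * Rewrites relative <img src="..."> values to paths under $baseDir.
 * Hypothetical helper; leaves absolute paths and URLs with a scheme untouched.
 */
function absolutizeImagePaths(string $html, string $baseDir): string
{
    $base = rtrim($baseDir, '/');

    return preg_replace_callback(
        '/(<img\b[^>]*?\bsrc=)(["\'])(.*?)\2/i',
        function (array $m) use ($base): string {
            $src = $m[3];
            // Skip empty values, "/absolute" paths, and anything with a scheme (http:, data:, file:).
            if ($src === '' || preg_match('#^([a-z][a-z0-9+.-]*:|/)#i', $src)) {
                return $m[0];
            }
            return $m[1] . $m[2] . $base . '/' . $src . $m[2];
        },
        $html
    );
}
```

Whether the renderer wants a filesystem path or a full URL depends on the tool, so the base value is left to the caller.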
Fallback strategy
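One possible shape for a fallback, sketched with hypothetical names: the HtmlToPdfRenderer interface and renderWithFallback function below are assumptions for illustration, not the API of any real library. Each real tool would be wrapped in a small adapter that implements the interface.

```php
<?php
/** Hypothetical adapter interface; each PDF tool gets its own implementation. */
interface HtmlToPdfRenderer
{
    /** Returns the PDF as a binary string, or throws on failure. */
    public function render(string $html): string;
}

/**
 * Tries each renderer in order and returns the first successful result.
 *
 * @param HtmlToPdfRenderer[] $renderers Ordered from preferred to last resort.
 */
function renderWithFallback(string $html, array $renderers): string
{
    $lastError = null;
    foreach ($renderers as $renderer) {
        try {
            return $renderer->render($html);
        } catch (Throwable $e) {
            // Log the failure and move on to the next renderer.
            error_log('PDF renderer ' . get_class($renderer) . ' failed: ' . $e->getMessage());
            $lastError = $e;
        }
    }
    throw new RuntimeException('All PDF renderers failed.', 0, $lastError);
}
```

Note that this only helps with failures that surface as exceptions. Exhausting the PHP memory limit is a fatal error and cannot be caught this way, so very heavy documents are better handed to a separate process.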
Common Mistakes
1. Assuming browser support equals PDF support
Broken assumption:
<?php
// Looks great in a browser, so it should work in PDF too.
That is not reliable. PDF engines often support only part of HTML/CSS.
2. Using overly complex layout
Broken example:
<table>
  <tr>
    <td>
      <table>
        <tr>
          <td>
            <div style="float:left; position:relative;">
              Complex content
            </div>
          </td>
        </tr>
      </table>
    </td>
  </tr>
</table>
Why it fails:
- nested tables increase layout complexity
- floats and positioning may be partially supported or unsupported
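A quick way to spot this kind of markup before rendering is to measure how deeply tables are nested. The sketch below is only a heuristic under stated assumptions: maxTableDepth is a made-up name, and it counts opening and closing table tags with a regular expression rather than parsing the document.

```php
<?php
/**
 * Returns the deepest level of <table> nesting found in the markup.
 * 0 means no tables, 1 means only top-level tables, 2 or more means nested tables.
 */
function maxTableDepth(string $html): int
{
    // Group 1 is "/" for a closing tag and "" for an opening tag.
    preg_match_all('#<(/?)table\b#i', $html, $matches);

    $depth = 0;
    $max = 0;
    foreach ($matches[1] as $slash) {
        if ($slash === '/') {
            $depth = max(0, $depth - 1);
        } else {
            $depth++;
            $max = max($max, $depth);
        }
    }
    return $max;
}
```

A template that reports a depth of 2 or more is a candidate for flattening before it is sent to a pure-PHP renderer.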
Better approach:
Comparisons
| Approach | Strengths | Weaknesses | Best for |
|---|---|---|---|
| HTML-to-PDF in pure PHP | Easy to integrate into PHP apps | Limited CSS support, memory issues on complex layouts | Simple reports, invoices, basic tables |
| Direct PDF generation | Precise control, predictable output | More manual work, not based on existing HTML | Structured business documents |
| Browser-based PDF rendering | Best visual match to browser output | Requires external tooling, heavier setup | Complex modern HTML/CSS |
| Basic HTML converters with low CSS support | Fast for simple markup | Poor styling support | Legacy or very plain documents |
HTML-to-PDF vs direct PDF generation
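| Option | Use when |
|---|---|
| HTML-to-PDF | You already have HTML templates and the document is simple: headings, flat tables, basic CSS |
| Direct PDF generation | You need strict, predictable layout and can accept more manual work |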
Cheat Sheet
Quick reference
- HTML-to-PDF is a rendering problem, not a simple conversion.
- Browser output does not guarantee matching PDF output.
- Simpler HTML usually produces better PDFs.
- Tables are often supported better than advanced CSS layout tricks.
- Nested tables can increase memory use and break layout.
- Use accessible image paths: absolute URLs or valid local paths.
- Keep HTML well formed, even if browsers are forgiving.
- Consider a separate print/PDF template.
Good practices
<?php
$html = '<h1>Invoice</h1><table border="1"><tr><td>Item</td></tr></table>';
- prefer simple headings and tables
- keep CSS basic
- test with real data volume
- log rendering failures
Warning signs
- converter crashes on large tables
- images appear in browser but not in PDF
- page breaks split rows badly
- memory usage grows rapidly
- unsupported CSS properties are ignored
Decision rule
- Simple document? Use HTML-to-PDF.
- Strict layout? Generate PDF directly.
- Complex modern HTML/CSS? Use a browser-based renderer.
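For the browser-based route, PHP typically shells out to an external program. The sketch below assumes headless Chromium is installed on the server; the binary name varies by distribution (chromium, chromium-browser, google-chrome), and buildChromiumPdfCommand is a hypothetical helper that only builds the command string.

```php
<?php
/**
 * Builds a shell command asking headless Chromium to print a local HTML file to PDF.
 * The default binary name is an assumption; adjust it for your server.
 */
function buildChromiumPdfCommand(string $htmlPath, string $pdfPath, string $binary = 'chromium'): string
{
    return implode(' ', [
        escapeshellarg($binary),
        '--headless',
        '--disable-gpu',
        '--print-to-pdf=' . escapeshellarg($pdfPath),
        escapeshellarg('file://' . $htmlPath),
    ]);
}
```

The resulting string can be passed to exec() together with an exit-code variable, so the caller can detect a failed conversion. Escaping every path matters here because the command runs in a shell.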
FAQ
Why does my HTML look correct in the browser but break in PDF?
Browsers and PDF libraries use different rendering engines. Many PDF tools support only part of HTML and CSS.
Why do tables often cause problems in HTML-to-PDF tools?
Tables require complex layout calculations, especially when nested, wide, or split across pages. Some libraries struggle with that.
Do I need XHTML instead of HTML?
Some older tools are stricter and work better with clean, well-formed markup. Even if XHTML is not required, valid HTML helps.
Why are my images missing in the PDF?
Common causes include invalid paths, blocked remote URLs, unsupported image formats, or filesystem permission issues.
Is using the same template for screen and PDF a good idea?
Sometimes for simple pages, yes. For anything non-trivial, a separate PDF-friendly template is usually more reliable.
Should I increase the PHP memory limit to fix PDF generation?
Only if needed, and only after simplifying the document. More memory can help, but it does not fix unsupported layout features.
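Before raising the limit, it helps to measure how close a render actually gets to it. This is a minimal sketch: both function names are made up for the example, and the 80% threshold is an arbitrary assumption rather than a recommended value.

```php
<?php
/** Converts php.ini shorthand such as "128M" to bytes; returns -1 for "no limit". */
function iniSizeToBytes(string $value): int
{
    $value = trim($value);
    if ($value === '-1') {
        return -1;
    }
    $number = (int) $value; // "128M" casts to 128
    switch (strtoupper(substr($value, -1))) {
        case 'G': return $number * 1024 ** 3;
        case 'M': return $number * 1024 ** 2;
        case 'K': return $number * 1024;
        default:  return $number;
    }
}

/** Runs $render and reports peak memory next to the configured memory_limit. */
function renderWithMemoryReport(callable $render): array
{
    $result = $render();
    $peak = memory_get_peak_usage(true);
    $limit = iniSizeToBytes((string) ini_get('memory_limit'));

    return [
        'result' => $result,
        'peak_bytes' => $peak,
        'limit_bytes' => $limit,
        // Flag renders that used more than 80% of the limit (arbitrary threshold).
        'near_limit' => $limit > 0 && $peak > 0.8 * $limit,
    ];
}
```

Logging near_limit for real documents shows whether the template needs simplifying before the limit is ever touched.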
What is the most reliable strategy for production apps?
Use simple print-focused HTML for PDF generation, validate resources before rendering, and choose a tool that matches your document complexity.
Mini Project
Description
Build a small PHP report exporter that turns a simple HTML report into a PDF-friendly document structure. The focus is not on a specific third-party library API, but on preparing HTML in a way that PDF tools usually handle more reliably: clear headings, simple tables, and accessible image URLs.
Goal
Create a PHP script that builds a clean HTML report template ready for PDF conversion and saves it as an HTML file for testing.
Requirements
- Create a PHP array containing report rows with a product name and total.
- Generate HTML for a report with a heading and a single simple table.
- Include basic embedded CSS for fonts, borders, and spacing.
- Save the generated HTML to a file named report.html.
- Use only simple, PDF-friendly markup without nested tables or advanced CSS.
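One way to meet these requirements is sketched below. It is an illustrative starting point, not a reference solution: the function name buildReportHtml, the sample rows, and the CSS values are all assumptions chosen for the example.

```php
<?php
/**
 * Builds a simple, PDF-friendly HTML report: one heading and one flat table.
 *
 * @param array<int, array{name: string, total: float|int}> $rows
 */
function buildReportHtml(string $title, array $rows): string
{
    $body = '';
    foreach ($rows as $row) {
        // Escape text so characters like & and < cannot break the markup.
        $body .= sprintf(
            "<tr><td>%s</td><td class=\"num\">%s</td></tr>\n",
            htmlspecialchars($row['name'], ENT_QUOTES, 'UTF-8'),
            number_format($row['total'], 2)
        );
    }
    $safeTitle = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

    return <<<HTML
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>{$safeTitle}</title>
<style>
body { font-family: Helvetica, Arial, sans-serif; font-size: 12px; }
table { border-collapse: collapse; width: 100%; }
th, td { border: 1px solid #333; padding: 4px 8px; text-align: left; }
td.num { text-align: right; }
</style>
</head>
<body>
<h1>{$safeTitle}</h1>
<table>
<tr><th>Product</th><th>Total</th></tr>
{$body}</table>
</body>
</html>
HTML;
}

// Sample data for the exercise.
$rows = [
    ['name' => 'Notebook', 'total' => 12.5],
    ['name' => 'Pens & Pencils', 'total' => 4.0],
];
file_put_contents('report.html', buildReportHtml('Sales Report', $rows));
```

Opening report.html in a browser first, then feeding the same file to a PDF tool, makes it easy to compare the two renderings side by side.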