Reputation: 2208

Wkhtmltopdf Characters in single line partially cut between pages

I am working on a project using Ruby on Rails (3.1). My requirement is to produce a PDF from HTML content, so I use the pdfkit gem.

On some pages, the characters in a single line are partially cut between pages when I convert the HTML to PDF using the pdfkit gem.

version of wkhtmltopdf: wkhtmltopdf -- 0.11.0 rc1

operating system: Linux CentOS 5.5

The images below show characters partially cut between pages.

Please suggest a solution.

Example 1

[Image: Example 1]

Example 2

[Image: Example 2]

Upvotes: 48

Answers (12)

Atul sen

Reputation: 1

When converting HTML to PDF using wkhtmltopdf, using display: table; and display: table-cell; instead of vertical-align and display: inline or display: inline-block in your CSS code helps prevent word-cutting issues. This approach ensures that words are not cut off at the end of lines in the generated PDF.

Try this; it works.
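A minimal sketch of that idea, assuming a simple row of side-by-side boxes. The class names `.row` and `.cell` are hypothetical and only stand in for whatever elements currently use inline-block:

```css
/* Hypothetical class names, for illustration only. */
.row {
  display: table;       /* container that previously held inline-block children */
  width: 100%;
}
.cell {
  display: table-cell;  /* instead of display: inline-block on each child */
}
```

Whether this helps probably depends on the layout and the wkhtmltopdf version, so treat it as something to try rather than a guaranteed fix.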

Upvotes: 0

Geralt43

Reputation: 1

I was able to find a workaround to this issue by installing wkhtmltox_0.12.6-1.bionic_amd64.deb (for Ubuntu) from https://github.com/wkhtmltopdf/packaging/releases/0.12.6-1

After updating to this wkhtmltox package, tables and text are no longer cut off at the end of the page. The fix introduced a different issue for me, though: the generated PDF has no styling. Font-family, font-size and even text alignment are all gone, replaced by some default setting.

Upvotes: 0

jpmc

Reputation: 1193

Have been putting up with this for months and finally found a fix for my situation. I'm using the GitHub CSS stylesheet in the HTML file I'm converting, and code blocks that span multiple pages get their text cut off. Nothing is missing; the line is just cut in half.

Bottom of a page:

Start of next page:

So in the github stylesheet overflow is set to auto for <pre> tags.

.markdown-body .highlight pre,
.markdown-body pre {
  padding: 16px;
  overflow: auto;
...

Switching the overflow property to hidden solved it for me!

.markdown-body .highlight pre,
.markdown-body pre {
  padding: 16px;
  overflow: hidden;

I think I tried all the other answers on this page, but this is what solved it for me. Hope it helps someone else out :)

Upvotes: 0

Ben Wong

Reputation: 701

https://github.com/ArthurHub/HTML-Renderer/issues/38

// Add a rule that keeps common elements from being split across pages.
var head = "<head><style type=\"text/css\"> td, h1, h2, h3, p, b, div, i, span, label, ul, li, tr, table { page-break-inside: avoid; } </style></head>";

PdfDocument pdf = PdfGenerator.GeneratePdf("<html>" + head + "<body>" + m42Notes + "</body></html>", configurationOptions);

Upvotes: 2

Pawel Kolodziejuk

Reputation: 31

I solved the problem by adding margin-top and margin-bottom, like this:

$this->get('knp_snappy.pdf')->generateFromHtml($html, $pdfFilepath, [
        'default-header' => false,
        'header-line' => false,
        'footer-line' => false,
        'disable-javascript' => true,
        'margin-top' => '3mm',
        'margin-bottom' => '3mm',
        'margin-right' => '5mm',
        'margin-left' => '5mm',
        'orientation' => 'Landscape',
    ], true);

Upvotes: 1

Rob

Reputation: 479

This is old, but hopefully it will help someone. I was having this issue too and tried everything, even reverting to the old version mentioned (0.12.1), to no avail. I kept tweaking the CSS, throwing page-break avoids in everywhere, without much progress. Then I tweaked the CSS on the root div of my HTML, and that fixed it. I made so many tweaks that I can't be 100% sure, but I believe the issue was that the main outer div was set to 'display:table' with margin: 0 auto and a specific width. Once I removed that, it stopped cutting off images and tables mid-row, and page-break-inside: avoid started working as expected.

I believe ultimately the code is trying to guess as best as it can exactly how many pixels high each page is, and where exactly (down to the pixel) is your content. We have to make it easy for the library to detect this by removing as much odd css in there as possible, so it's as simple as possible to calculate down to the pixel where the content lies. That's my guess.
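Roughly what the change looked like, as a sketch from memory. The `#page-root` id is hypothetical and stands in for the outermost wrapper div:

```css
/* #page-root is a hypothetical id for the outermost wrapper div. */
#page-root {
  display: block;  /* was: display: table */
  margin: 0;       /* was: margin: 0 auto, combined with a fixed width */
  width: auto;
}

/* With the wrapper simplified, this started behaving as expected. */
table, img {
  page-break-inside: avoid;
}
```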

Upvotes: 2

Mike Caputo

Reputation: 36

I scoured the internet for a couple of weeks, trying to overcome this issue. None of the solutions I found worked for me, but something else did.

I had a two column layout where the text was getting cut off mid-text. In the broken state, my basic structure looked like this:

@media print {
  * {
    page-break-inside: avoid;
    page-break-after: avoid;
    page-break-before: avoid;
  }
}
.col-9{
  display: inline-block;
  width: 70%;
}
.col-3{
  display: inline-block;
  width: 25%;
}

<div class="col-9">
  [a lot of text here, that would spill over multiple pages]
</div>
<div class="col-3">
  [a short sidebar here]
</div>

I fixed it by changing it to:

@media print {
  * {
    page-break-inside: avoid;
    page-break-after: avoid;
    page-break-before: avoid;
  }
}

.col-9{
  display: block;
  float: left;
  width: 70%;
}
.col-3{
  display: block;
  float: left;
  width: 25%;
}
.clear{
  clear: both;
}

<div class="col-9">
  [a lot of text here, that no longer split mid-line.]
</div>
<div class="col-3">
  [a short sidebar here]
</div>
<div class="clear"></div>

For some reason, the tool could not handle the display: inline-block setup. It works with floats. I'm running version 0.12.4.

Upvotes: 1

Pedro M Duarte

Reputation: 28093

In my case, the issue was resolved by commenting out the following css:

html, body {
  overflow-x: hidden;
}

In general, check if any tags have overflow set as hidden and remove it or set it to visible.

Btw, I am using wkhtmltopdf version 0.12.2.1 on Windows 8.
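If you can't edit the original stylesheet, an override loaded after it might do the same job. This is only a sketch: `.content` is a hypothetical wrapper class, and it assumes nothing later in the cascade sets overflow back to hidden:

```css
/* Load this after the stylesheet that sets overflow: hidden. */
html, body {
  overflow-x: visible;
}

/* .content is a hypothetical class for any other wrapper that hides overflow. */
.content {
  overflow: visible;
}
```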

Upvotes: 5

Besi

Reputation: 22939

I did have this problem with a table:

[Image: table before the fix]

Then I added this to my CSS:

table, img, blockquote {page-break-inside: avoid;}

This fixed the problem:

[Image: table after the fix]

Upvotes: 17

Dragos Rusu

Reputation: 1568

The cut-text problem is a known WebKit problem, and it seems the developers found a fix inside wkhtmltopdf. Updating to 0.12.1 will fix it (if you don't want to spend time compiling, you can just take the binary from here: https://github.com/h4cc/wkhtmltopdf-amd64 ).

Upvotes: 0

nvahalik

Reputation: 559

I just ran across this and found something that resolved the issue for me. In my particular case, there were divs with display: inline-block; margin-bottom: -20px;. Once I changed them to block and reset the margin-bottom, the line splitting disappeared. YMMV.
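As a sketch of the change, with `.box` as a hypothetical class name standing in for the affected divs:

```css
/* .box is a hypothetical class name for the affected divs. */
.box {
  display: block;    /* was: display: inline-block */
  margin-bottom: 0;  /* was: margin-bottom: -20px */
}
```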

Upvotes: 12

Peter Brown

Reputation: 51707

According to some documentation I found (see Page Breaking), this is a known issue. The documentation suggests using CSS page-break properties to control where pages split (assuming you are using the patched version of Qt):

The current page breaking algorithm of WebKit leaves much to be desired. Basically webkit will render everything into one long page, and then cut it up into pages. This means that if you have two columns of text where one is vertically shifted by half a line. Then webkit will cut a line into to pieces display the top half on one page. And the bottom half on another page. It will also break image in two and so on. If you are using the patched version of QT you can use the CSS page-break-inside property to remedy this somewhat. There is no easy solution to this problem, until this is solved try organising your HTML documents such that it contains many lines on which pages can be cut cleanly.

See also: http://code.google.com/p/wkhtmltopdf/issues/detail?id=9, http://code.google.com/p/wkhtmltopdf/issues/detail?id=33 and http://code.google.com/p/wkhtmltopdf/issues/detail?id=57.
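A minimal sketch of the kind of CSS the documentation is hinting at, assuming the patched Qt build. The selector list and the `.new-page` class are my own illustration, not something the documentation prescribes:

```css
/* Keep small blocks whole so a page cut falls between them, not through them. */
p, li, tr, img, blockquote {
  page-break-inside: avoid;
}

/* .new-page is a hypothetical class: force a fresh page before a major section. */
.new-page {
  page-break-before: always;
}
```

Applying the avoid rule to small elements rather than large containers gives the renderer more places where it can cut cleanly.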

Upvotes: 9
