Charles Roper

Reputation: 20625

Is it possible to combine a series of PDFs into one using Ruby?

I have a series of PDFs named sequentially like so:

Using Ruby, is it possible to combine these into one big PDF while keeping them in sequence? I don't mind installing any necessary gems to do the job.

If this isn't possible in Ruby, how about another language? No commercial components, if possible.


Update: Jason Navarrete's suggestion led to the perfect solution:

Place the PDF files needing to be combined in a directory along with pdftk (or make sure pdftk is in your PATH), then run the following script:

pdfs = Dir["[0-9][0-9]_*"].sort.join(" ")
`pdftk #{pdfs} output combined.pdf`

Or I could even do it as a one-liner from the command-line:

ruby -e '`pdftk #{Dir["[0-9][0-9]_*"].sort.join(" ")} output combined.pdf`'
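One caveat with the backtick version: the file names are interpolated into a single shell command string, so a name containing spaces or shell metacharacters would be split or misread. Below is a minimal sketch that avoids the shell by passing each file as a separate argument to system. The helper names pdftk_merge_command and merge_pdfs are hypothetical; the glob pattern and the pdftk arguments are the ones used in the script above.

```ruby
# Build the pdftk argument list for every numbered file in `dir`.
# Hypothetical helper; the glob pattern is the one from the script above.
def pdftk_merge_command(dir, output = "combined.pdf")
  inputs = Dir[File.join(dir, "[0-9][0-9]_*")].sort
  ["pdftk", *inputs, "output", output]
end

# Run the merge. Passing separate arguments to system bypasses the shell,
# so a file name containing spaces stays a single argument.
# Returns true on success, false on a non-zero exit, nil if pdftk is missing.
def merge_pdfs(dir, output = "combined.pdf")
  system(*pdftk_merge_command(dir, output))
end
```

Calling merge_pdfs(".") from the directory holding the PDFs should behave like the original script, assuming pdftk is on your PATH.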

Great suggestion Jason, perfect solution, thanks. Give him an up-vote people.

Upvotes: 10

Views: 3614

Answers (7)

Gordon Isnor

Reputation: 2105

I tried the pdftk solution and had problems on both Snow Leopard and Tiger. Installing on Tiger actually wreaked havoc on my system and left me unable to run script/server; fortunately, it’s a machine retired from web development.

I subsequently found another option: joinPDF. It was an absolutely painless and fast install, and it works perfectly.

I also tried GhostScript, and it failed miserably (it could not read the fonts, and I ended up with PDFs that had images only).

But if you’re looking for a solution to this problem, you might want to try joinPDF.

Upvotes: 2

Steve Hanov

Reputation: 11574

If you have ghostscript on your platform, shell out and execute this command:

gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=finished.pdf <your source pdf files>
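Shelling out from Ruby could look roughly like the sketch below. It only wraps the gs flags shown in the command above; the helper names ghostscript_merge_command and merge_with_ghostscript are hypothetical, and it assumes gs is on your PATH.

```ruby
# Assemble the Ghostscript command from the answer above as an argument list.
# Hypothetical helper name; the flags are copied from the gs command line.
def ghostscript_merge_command(inputs, output = "finished.pdf")
  ["gs", "-dBATCH", "-dNOPAUSE", "-q", "-sDEVICE=pdfwrite",
   "-sOutputFile=#{output}", *inputs]
end

# Run Ghostscript without going through a shell.
# Returns true on success, false on a non-zero exit, nil if gs is missing.
def merge_with_ghostscript(inputs, output = "finished.pdf")
  raise ArgumentError, "no input files given" if inputs.empty?
  system(*ghostscript_merge_command(inputs, output))
end
```

The input files are merged in the order they appear in the array, so sort them first if the order matters.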

Upvotes: 2

Dan Harper

Reputation: 1130

Any Ruby code to do this in a real application is probably going to be painfully slow. I would try to hunt down Unix tools to do the job. This is one of the beauties of using Mac OS X: it has very fast PDF capabilities built in. The next best thing is probably a Unix tool.

Actually, I've had some success with rtex. If you look here you'll find some information about it. It is much faster than any Ruby library that I've used, and I'm pretty sure LaTeX has a function to bring in PDF data from other sources.

Upvotes: -1

Jason Navarrete

Reputation: 7523

A Ruby-Talk post suggests using the pdftk toolkit to merge the PDFs.

It should be relatively straightforward to call pdftk as an external process and have it handle the merging. PDF::Writer may be overkill because all you're looking to accomplish is a simple append.

Upvotes: 14

JasonTrue

Reputation: 19599

I'd suggest looking at the code for PDFCreator (VB, if I'm not mistaken, but that shouldn't matter since you'd just be implementing similar code in another language), which uses GhostScript (GNU license). Or just dig straight into GhostScript itself; there's also a facade layer available called GhostPDF, which may do what you want.

If you can control GhostScript with VB, you can do it with C, which means you can do it with Ruby.

Ruby also has IO.popen, which allows you to call out to external programs that can do this.
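As a rough illustration of the IO.popen route, the sketch below runs an external command given as an argument array and captures whatever it prints, with stderr folded into stdout. The helper name run_and_capture is hypothetical, and which merge tool you pass to it is up to you.

```ruby
# Run an external command (an array of program name plus arguments) and
# return [combined stdout/stderr text, success flag].
# Hypothetical helper; raises Errno::ENOENT if the program cannot be found.
def run_and_capture(argv)
  # The array form bypasses the shell; err: [:child, :out] merges stderr
  # into the pipe we read from.
  output = IO.popen(argv, err: [:child, :out]) { |io| io.read }
  # After the block form of IO.popen returns, $? holds the child's status.
  [output, $?.success?]
end
```

For example, run_and_capture(["pdftk", "01_foo.pdf", "02_bar.pdf", "output", "combined.pdf"]) would return the tool's messages together with a flag telling you whether it exited cleanly.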

Upvotes: 0

Adam Rosenfield

Reputation: 400146

You can do this by converting to PostScript and back. PostScript files can be concatenated trivially. For example, here's a Bash script that uses the Ghostscript tools ps2pdf and pdf2ps:

#!/bin/bash
for file in 01_foo.pdf 02_bar.pdf 03_baz.pdf; do
    pdf2ps "$file" - >> temp.ps
done

ps2pdf temp.ps output.pdf
rm temp.ps

I'm not familiar with Ruby, but there's almost certainly some function (possibly called system(), but that's just a guess) that will invoke a given command line.
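Ruby does have Kernel#system, so the Bash script above could be ported along the following lines. This is only a sketch under the same assumption the answer makes, namely that the PostScript output of pdf2ps can simply be appended file after file. The method name merge_via_postscript and its run: parameter are hypothetical; run: exists only so the command sequence can be stubbed out or inspected.

```ruby
require "tempfile"

# Ruby sketch of the Bash script above. `run` defaults to Kernel#system and
# is a parameter only so the commands can be stubbed (hypothetical design).
def merge_via_postscript(files, output = "output.pdf", run: method(:system))
  Tempfile.create(["merged", ".ps"]) do |ps|
    ps.close # the external tools write to the path, not to this handle
    files.each do |file|
      # "pdf2ps FILE -" writes PostScript to stdout; the :out redirect
      # appends it to the temp file, like ">> temp.ps" in the Bash version.
      run.call("pdf2ps", file, "-", out: [ps.path, "a"]) or
        raise "pdf2ps failed for #{file}"
    end
    run.call("ps2pdf", ps.path, output) or raise "ps2pdf failed"
  end # the temp file is removed here, like "rm temp.ps"
  output
end
```

Something like merge_via_postscript(Dir["[0-9][0-9]_*"].sort) would then combine the numbered files in order, provided pdf2ps and ps2pdf are installed.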

Upvotes: 2

akauppi

Reputation: 18036

I don't think Ruby has tools for that. You might check ImageMagick and Cairo. ImageMagick can be used for binding multiple pictures/documents together, but I'm not sure about the PDF case.

Then again, there are surely commercial Windows tools that do this kind of thing.

I use Cairo myself for generating PDFs. If the PDFs are coming from you, maybe that would be a solution (it does support multiple pages). Good luck!

Upvotes: 0
