Reputation: 268
Given a directory with several jpg files (photos), I would like to create a single pdf file with one photo per page.
However, I would like the photos to be stored in the pdf file unchanged; i.e., I would like to avoid decoding and recoding.
So ideally I would like to be able to extract the original jpg files (maybe minus the metadata) from the pdf file, using, e.g., a Linux command-line tool like pdfimages.
My ideas so far:
- imagemagick convert. However, I am confused by the compression options: if I choose 100% quality, does it mean that the jpg is internally decoded and then re-encoded losslessly? (Which is obviously not what I want.)
- pdflatex. Some people claim that the graphics package includes images losslessly, while others dispute that. In any case, pdflatex would be slightly more cumbersome (I would first have to find out the dimensions of the photos, then set the page size accordingly, and make sure that there are no margins, headers, etc.).
Upvotes: 8
Views: 3787
Reputation: 11939
Depending on what you want to do with the files: on Windows, if the images are in one of the simpler formats (jpeg/gif/tif/png), you can store them in a cbz, a zip, a folder, or a zipped folder and view that with SumatraPDF, which has a Save As PDF option, so everything is done with one exe.
It will fail with files that are viewable but not acceptable as PDF inputs, such as webp or heic, so check the filename extension in the viewer beforehand.
It should be lossless in practically all cases; however, you should round-trip with pdfimages -all and do a file compare between input and output to check that no bytes had to be converted.
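The file compare mentioned above can be sketched in a few lines of Ruby using only the standard library. This is a minimal sketch, not part of SumatraPDF or pdfimages: the helper names `digest_index` and `changed_originals` are made up for illustration. It reports every original whose exact bytes do not show up among the extracted files.

```ruby
require 'digest'

# Map each SHA-256 digest to the list of files that have it.
def digest_index(paths)
  paths.group_by { |path| Digest::SHA256.file(path).hexdigest }
end

# Return the originals whose exact bytes are NOT found among the
# extracted files (names are ignored, only content is compared).
def changed_originals(originals, extracted)
  index = digest_index(extracted)
  originals.reject do |path|
    index.key?(Digest::SHA256.file(path).hexdigest)
  end
end
```

Assuming the images were extracted with something like `pdfimages -all out.pdf extracted/img`, calling `changed_originals(Dir.glob('photos/*.jpg'), Dir.glob('extracted/*'))` returns an empty array when every photo survived byte for byte. A non-empty result does not necessarily mean the pixels were re-encoded; a tool that only strips metadata also changes the bytes.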
Upvotes: 1
Reputation: 55
Another possibility for storing jpg images in a pdf file in a "lossless" way is provided by PoDoFo: podofoimg2pdf is able to perform lossless conversion from JPEG to PDF by embedding the jpg file into the pdf container.
podofoimg2pdf
Usage: podofoimg2pdf [output.pdf] [-useimgsize] [image1 image2 image3 ...]
Options:
-useimgsize Use the imagesize as page size, instead of A4
Upvotes: 3
Reputation: 5656
img2pdf losslessly converts raster images to PDF without re-encoding PNG, JPEG, and JPEG2000 images. This leads to a lossless conversion of PNG, JPEG and JPEG2000 images with the only added file size coming from the PDF container itself. Other raster graphics formats are losslessly stored using the same encoding that PNG uses. Since PDF does not support images with transparency and since img2pdf aims to never be lossy, input images with an alpha channel are not supported.
(pdfimages -all does the exact opposite: it extracts the embedded images from the PDF in their native format.)
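To illustrate why no re-encoding is needed: PDF can hold a JPEG stream as-is in an image object marked with the DCTDecode filter. The following is a minimal sketch of that mechanism in plain Ruby, not img2pdf's actual code. The names `jpeg_info` and `jpeg_to_pdf` are made up for illustration, and the sketch assumes a single 8-bit grayscale or RGB JPEG, one pixel per PDF point, and no fill bytes between JPEG markers.

```ruby
require 'stringio'

# Start-of-frame markers (everything in C0..CF except DHT, JPG, DAC).
SOF_MARKERS = ((0xC0..0xCF).to_a - [0xC4, 0xC8, 0xCC]).freeze

# Read width, height and component count from the JPEG headers only.
def jpeg_info(data)
  io = StringIO.new(data)
  raise ArgumentError, 'not a JPEG' unless io.read(2)&.bytes == [0xFF, 0xD8]
  loop do
    header = io.read(4)
    raise ArgumentError, 'no SOF segment' if header.nil? || header.bytesize < 4
    ff, code, length = header.unpack('CCn')
    raise ArgumentError, 'unexpected byte' unless ff == 0xFF
    if SOF_MARKERS.include?(code)
      _precision, height, width, components = io.read(6).unpack('CnnC')
      return { width: width, height: height, components: components }
    end
    io.seek(length - 2, IO::SEEK_CUR) # skip the rest of this segment
  end
end

# Wrap the untouched JPEG bytes in a one-page PDF.
def jpeg_to_pdf(jpeg)
  info = jpeg_info(jpeg)
  w, h = info[:width], info[:height]
  colorspace = { 1 => '/DeviceGray', 3 => '/DeviceRGB' }.fetch(info[:components])
  content = "q #{w} 0 0 #{h} 0 0 cm /Im0 Do Q" # scale image to the page
  objects = [
    '<< /Type /Catalog /Pages 2 0 R >>',
    '<< /Type /Pages /Kids [3 0 R] /Count 1 >>',
    "<< /Type /Page /Parent 2 0 R /MediaBox [0 0 #{w} #{h}] " \
      '/Resources << /XObject << /Im0 4 0 R >> >> /Contents 5 0 R >>',
    "<< /Type /XObject /Subtype /Image /Width #{w} /Height #{h} " \
      "/ColorSpace #{colorspace} /BitsPerComponent 8 /Filter /DCTDecode " \
      "/Length #{jpeg.bytesize} >>\nstream\n".b + jpeg.b + "\nendstream",
    "<< /Length #{content.bytesize} >>\nstream\n#{content}\nendstream"
  ]
  pdf = "%PDF-1.4\n".b
  offsets = []
  objects.each_with_index do |body, i|
    offsets << pdf.bytesize # byte offset of each object for the xref table
    pdf << "#{i + 1} 0 obj\n" << body.b << "\nendobj\n"
  end
  xref_pos = pdf.bytesize
  pdf << "xref\n0 #{objects.size + 1}\n0000000000 65535 f \n"
  offsets.each { |offset| pdf << format("%010d 00000 n \n", offset) }
  pdf << "trailer\n<< /Size #{objects.size + 1} /Root 1 0 R >>\n"
  pdf << "startxref\n#{xref_pos}\n%%EOF\n"
end
```

With this sketch, `File.binwrite('photo.pdf', jpeg_to_pdf(File.binread('photo.jpg')))` writes a PDF whose image stream is the original file, which is why the only size overhead is the container. A real tool additionally handles multiple pages, CMYK and progressive edge cases, resolution, and the other input formats.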
Upvotes: 15
Reputation: 830
You could use the following small script which relies on HexaPDF (note: I'm the author of HexaPDF) to do this.
Note: Make sure you have Ruby 2.4 installed, then run gem install hexapdf to install hexapdf.
Here is the script:
require 'hexapdf'

doc = HexaPDF::Document.new

# One page per image file given on the command line
ARGV.each do |image_file|
  image = doc.images.add(image_file)
  page = doc.pages.add

  # Image size and page (media box) size
  iw = image.info.width.to_f
  ih = image.info.height.to_f
  pw = page.box(:media).width.to_f
  ph = page.box(:media).height.to_f

  # Scale the image to fit the page while keeping its aspect ratio
  rw, rh = pw / iw, ph / ih
  ratio = [rw, rh].min
  iw, ih = iw * ratio, ih * ratio

  # Center the scaled image on the page
  x, y = (pw - iw) / 2, (ph - ih) / 2
  page.canvas.image(image, at: [x, y], width: iw, height: ih)
end

doc.write('images.pdf')
Just supply the images as arguments on the command line; the output file will be named images.pdf. Most of the code deals with centering and scaling the images to nicely fit onto the pages.
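To make the scaling and centering step concrete, here is the same arithmetic pulled out into a standalone function that runs without HexaPDF. The name `fit_and_center` is made up for illustration, and the page size in the usage note (595 x 842 points, roughly A4) is an assumption.

```ruby
# Scale an iw x ih image to fit a pw x ph page, keeping the aspect
# ratio, and return the lower-left position that centers it.
def fit_and_center(iw, ih, pw, ph)
  ratio = [pw / iw.to_f, ph / ih.to_f].min # the tighter dimension wins
  width = iw * ratio
  height = ih * ratio
  { x: (pw - width) / 2, y: (ph - height) / 2, width: width, height: height }
end
```

For a 4000 x 3000 photo on a 595 x 842 page, `fit_and_center(4000, 3000, 595, 842)` gives a ratio of about 0.149, so the image fills the page width (about 595 x 446 points) and is shifted up by about 198 points to center it vertically. Note that a ratio above 1 enlarges images that are smaller than the page. None of this touches the stored image data; it only changes how large the image is drawn.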
Upvotes: 2