PDF To PNG Or JPG Conversion

Overview

The following programs can be used under Linux to convert PDF files to JPG or PNG.

For the examples below, the input file is named 'HP_HDSP-2112.pdf', and is available here. It is a datasheet for a part and contains text and a few graphics, but no color.

ImageMagick

The -density option renders the PDF to JPG at the specified resolution in dots per inch (DPI). For a letter-sized page (8.5 x 11 inches) at 300 DPI, this results in a JPG of approximately 2550 x 3300 pixels.

convert -density 300 HP_HDSP-2112.pdf HP_HDSP-2112.jpg
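
The pixel dimensions quoted above follow directly from the page size and the density. A small sketch of that arithmetic, assuming the page size is given in PostScript points (72 per inch, so a letter page is 612 x 792); the function name page_pixels is only illustrative:

```shell
#!/bin/sh
# page_pixels WIDTH_PT HEIGHT_PT DPI
# Prints the rendered size in pixels for a page measured in points (1/72 inch).
page_pixels() {
    # Integer arithmetic: pixels = points * dpi / 72
    echo "$(( $1 * $3 / 72 ))x$(( $2 * $3 / 72 ))"
}

page_pixels 612 792 300   # letter page at 300 DPI, prints 2550x3300
```

Halving the density halves each dimension, which is a quick way to estimate output size before converting a long document.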

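To control the dimensions of the output file(s), use -resize with the desired dimensions, placed after the input file. (The -size option sets the size of raw or generated images and is generally not applied to PDF input.) This example scales each page to fit within 1024 x 768 pixels, preserving the aspect ratio.

convert HP_HDSP-2112.pdf -resize 1024x768 HP_HDSP-2112.jpg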

Note that for the examples above, if the source PDF (HP_HDSP-2112.pdf) has 16 pages, 16 individual output files will be created, named HP_HDSP-2112-x.jpg, where 'x' is the page index, counting from 0.

To produce PNG files instead of JPG, replace HP_HDSP-2112.jpg with HP_HDSP-2112.png in the commands above.
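
When several PDFs need converting, the same command can be wrapped in a loop. A rough sketch, assuming the PDFs sit in the current directory and ImageMagick's convert is on the PATH; out_name and convert_all are hypothetical helper names, not part of ImageMagick:

```shell
#!/bin/sh
# out_name FILE EXT: replace a trailing .pdf with the given extension.
out_name() {
    echo "${1%.pdf}.$2"
}

# convert_all EXT: convert every PDF in the current directory at 300 DPI.
# EXT is the output extension, e.g. png or jpg.
convert_all() {
    for pdf in ./*.pdf; do
        [ -e "$pdf" ] || continue    # the glob matched nothing
        convert -density 300 "$pdf" "$(out_name "$pdf" "$1")"
    done
}

# Usage (not run here):
#   convert_all png
```

Multi-page PDFs still produce one numbered output file per page, as described above.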

Ghostscript

Ghostscript does font substitution when it can't find a font used by the PDF file. This will cause pages to render differently than intended. I don't know why, but ImageMagick hasn't had this problem on the pages I've tried converting. Perhaps it's smarter about finding fonts. This needs more research.

The '%03d' in the output file name is replaced by the page number, zero-padded to three digits, so the pages are named 'out-001.jpg' for the first one, 'out-010.jpg' for the tenth, etc. This results in names that 'ls' sorts correctly. '%d' can be used instead, which names the pages without leading zeros ('out-1.jpg', 'out-10.jpg'), but 'ls' will not sort them in numerical order.

The '-dFirstPage' and '-dLastPage' options can be dropped if the full document is to be converted.

gs -sDEVICE=jpeg -dFirstPage=2 -dLastPage=11 -o out-%03d.jpg HP_HDSP-2112.pdf
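
Ghostscript expands the page-number placeholder in the output name in the style of C's printf, so the shell's own printf can be used to preview the names a given pattern will produce. A quick sketch (page_name is just an illustrative helper):

```shell
#!/bin/sh
# page_name PATTERN N: show the file name a printf-style pattern gives for page N.
page_name() {
    # The pattern is deliberately used as the printf format string.
    printf "$1\n" "$2"
}

page_name 'out-%03d.jpg' 1    # out-001.jpg
page_name 'out-%03d.jpg' 10   # out-010.jpg
page_name 'out-%d.jpg' 10     # out-10.jpg
```

With the unpadded pattern, 'out-10.jpg' sorts before 'out-2.jpg', which is why the zero-padded form is preferable for anything over nine pages.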

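To specify the output size of the image in pixels, use the '-g' argument, as shown below. The value is given as width x height, so a portrait page is 768x1024 rather than the 1024x768 you might expect. Note that '-g' sets the size of the output canvas; if the page does not fill it, adding '-dPDFFitPage' should scale the page to fit.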

gs -sDEVICE=jpeg -g768x1024 -dFirstPage=1 -dLastPage=1 -o out-%d.jpg HP_HDSP-2112.pdf

This produces the following output, which shows the font substitution:

GPL Ghostscript 8.63 (2008-08-01)
Copyright (C) 2008 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
Substituting font Helvetica-Narrow-Bold for AgilentCond-Bold.
Loading NimbusSanL-BoldCond font from /usr/share/fonts/default/ghostscript/n019044l.pfb... 2579384 1094648 11389452 10100772 3 done.
Substituting font Helvetica-Narrow for AgilentCond-Regular.
Loading NimbusSanL-ReguCond font from /usr/share/fonts/default/ghostscript/n019043l.pfb... 2683852 1283689 16460520 15160316 3 done.
Substituting font Helvetica-Narrow-Bold for Myriad-CnBold.
Substituting font Helvetica-Narrow for Myriad-Condensed.
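
When checking many files, it can help to pull just the substitutions out of this output. A sketch using sed, assuming the messages keep the 'Substituting font X for Y.' form shown above (the wording may differ between Ghostscript versions); list_substitutions is a hypothetical helper that reads the log on standard input:

```shell
#!/bin/sh
# list_substitutions: read Ghostscript output on stdin and print one
# "requested -> substitute" line per font substitution found.
list_substitutions() {
    sed -n 's/^Substituting font \(.*\) for \(.*\)\.$/\2 -> \1/p'
}

# Example with lines copied from the output above.
printf '%s\n' \
    'Page 1' \
    'Substituting font Helvetica-Narrow-Bold for AgilentCond-Bold.' \
    'Substituting font Helvetica-Narrow for Myriad-Condensed.' |
    list_substitutions
```

To run it against a real conversion, pipe both output streams of the gs command into the function, for example by appending '2>&1 | list_substitutions' to the command line.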

Poppler

http://poppler.freedesktop.org/
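
Poppler ships command-line utilities, including pdftoppm, which can write PNG or JPEG files directly. A sketch of the equivalent conversion, assuming the Poppler utilities are installed and recent enough to support the -png and -jpeg options; poppler_convert is a hypothetical wrapper, and the output prefix 'out' is arbitrary:

```shell
#!/bin/sh
# poppler_convert FORMAT DPI INPUT PREFIX
# FORMAT is png or jpeg. pdftoppm writes one file per page, named from PREFIX.
poppler_convert() {
    case "$1" in
        png|jpeg) ;;
        *) echo "unsupported format: $1" >&2; return 1 ;;
    esac
    pdftoppm "-$1" -r "$2" "$3" "$4"
}

# Usage (not run here):
#   poppler_convert png 300 HP_HDSP-2112.pdf out
```

pdftoppm also accepts -f and -l to limit the conversion to a range of pages, much like Ghostscript's '-dFirstPage' and '-dLastPage'.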

Adobe SDK

http://www.adobe.com/devnet/pdf/library/