Get your legacy laserjet (pcl5) or postscript output converted to PDF format.
Legacy systems are still out there! Some are still generating reams of paper and this can be easily avoided with a variety of Linux tools to convert legacy output into PDF format.
This article discusses some simple Linux tools that can be used to change print output into more useful PDF files. Let’s begin with a quick look at some of the simple Linux tools that can be used to convert text/html to PDF format.
Converting to PDF format
Perhaps the easiest way to get your legacy print jobs into PDF format is to pipe your text (or laserjet coded) print file through one of the many Linux utilities available in most distributions (or easily installed). Let’s take a very quick look at a few of your options.
Postscript output and Ghostscript
Most Linux distributions pre-install Ghostscript (the binary is ‘gs’) or make it readily available via their package management systems. If you prefer you can also install it yourself from source obtained from the ghostscript web site.
Ghostscript will accept postscript input and will output to PDF format. If you’ve been printing to postscript printers you’ll be able to preserve all formatting.
Most Linux print jobs are piped to a spooler; you can pipe them to gs instead, something like this:
cat samplePS.ps | gs -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=OutputFromPS.pdf -
or, if you’re dealing with a finished file:
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=OutputFromPS.pdf samplePS.ps
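For repeated use, the file-based conversion can be wrapped in a small shell function. This is only a minimal sketch: the function name ps_to_pdf is hypothetical, and -q and -dBATCH are added on the assumption that you want gs to run quietly and exit when it has finished the file.

```shell
# Hypothetical helper: convert one PostScript file to PDF with Ghostscript.
# Usage: ps_to_pdf input.ps [output.pdf]
ps_to_pdf() {
    in=$1
    # Default output name: the input name with .ps replaced by .pdf.
    out=${2:-${in%.ps}.pdf}
    # -dBATCH makes gs exit after the last file instead of waiting at its prompt.
    gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile="$out" "$in"
}
```

Calling ps_to_pdf samplePS.ps would then write samplePS.pdf next to the input file.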
PCL (laserjet) Output and GhostPCL (pcl6)
GhostPCL is developed by the same project that publishes the familiar ghostscript. The great feature of ghostpcl is that it accepts laserjet print codes. This makes it simple to take your print jobs formatted for laserjet printers and convert them to PDF without losing the formatting (or graphics or other special print features).
Ghostpcl is available from the ghostscript website. The binary is ‘pcl6’.
Redefine your printer, piping the output through pcl6 rather than to a printer:
cat samplePCL.prn | pcl6 -dNOPAUSE -sDEVICE=pdfwrite \
pcl6 -dNOPAUSE -sDEVICE=pdfwrite \
PJL commands in the file seem to cause pcl6 some issues. There is a command line flag for PJL commands.
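A complete file-based pcl6 invocation might look like the sketch below. The function name pcl_to_pdf is hypothetical, and the flags simply mirror the gs examples earlier in this article; it does nothing about PJL commands in the input.

```shell
# Hypothetical helper: convert one PCL (laserjet) print file to PDF.
# Usage: pcl_to_pdf input.prn output.pdf
pcl_to_pdf() {
    # Same device and output switches as the gs examples, applied to pcl6.
    pcl6 -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile="$2" "$1"
}
```

For example, pcl_to_pdf samplePCL.prn OutputFromPCL.pdf, where the output file name is just an assumed example.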
Text Output and enscript and PS2PDF
In legacy systems you’ll no doubt encounter many print jobs that are plain text — not postscript or pcl (laserjet). Linux has a number of tools to convert text to PDF.
enscript -p - -r -f Courier7 -M Letter -B -L 60 \
-c sampleTEXT.txt | gs -dNOPAUSE \
-sDEVICE=pdfwrite -sOutputFile=OutputFromTXT.pdf -
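The same conversion can also be done with ps2pdf, the wrapper script that ships with Ghostscript, in place of the explicit gs call. The sketch below assumes ps2pdf is installed and uses a hypothetical function name, txt_to_pdf; the enscript options are the ones used above.

```shell
# Hypothetical helper: plain text to PDF via enscript and ps2pdf.
# Usage: txt_to_pdf input.txt output.pdf
txt_to_pdf() {
    # enscript writes PostScript to stdout (-p -); ps2pdf reads it from stdin (-).
    enscript -p - -r -f Courier7 -M Letter -B -L 60 -c "$1" | ps2pdf - "$2"
}
```

For example: txt_to_pdf sampleTEXT.txt OutputFromTXT.pdf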
Libreoffice to convert to PDF
libreoffice --headless --convert-to pdf sampleDocForPDFArticle.odt
This creates a file ‘sampleDocForPDFArticle.pdf’ in the current directory. The output directory can be specified on the command line.
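As a sketch of specifying the output directory, the function below passes LibreOffice’s --outdir option along with one or more input files. The function name docs_to_pdf and the directory layout are assumptions made for illustration.

```shell
# Hypothetical helper: convert documents to PDF in a chosen directory.
# Usage: docs_to_pdf OUTDIR file...
docs_to_pdf() {
    outdir=$1
    shift
    # Create the target directory if it does not exist yet.
    mkdir -p "$outdir" || return 1
    libreoffice --headless --convert-to pdf --outdir "$outdir" "$@"
}
```

For example: docs_to_pdf pdf_out sampleDocForPDFArticle.odt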
Note that Libreoffice can convert to a number of other formats, including html.
In most Linux distributions the ‘headless’ version of Libreoffice is a separate installation from the more common version. If you find this command doesn’t work, search for ‘libreoffice headless’ in your distribution’s repository.
Joining PDF files
gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite \
gs -dNOPAUSE -sDEVICE=pdfwrite \
-dBATCH tmp_1,1.pdf tmp_1,2.pdf
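A complete join needs an output file as well as the inputs. The sketch below is one way to write it, using a hypothetical function name (join_pdfs) and taking the output name as the first argument; the pages appear in the order the input files are listed.

```shell
# Hypothetical helper: join PDFs in the order given.
# Usage: join_pdfs output.pdf input1.pdf input2.pdf ...
join_pdfs() {
    if [ "$#" -lt 3 ]; then
        echo "usage: join_pdfs output.pdf input1.pdf input2.pdf ..." >&2
        return 2
    fi
    out=$1
    shift
    # All remaining arguments are input files, written to one output PDF.
    gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile="$out" "$@"
}
```

For example: join_pdfs joined.pdf tmp_1,1.pdf tmp_1,2.pdf (the output name joined.pdf is an assumption).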
See also: ‘pdftk’ which can join PDFs and much more, though I’ve found using ‘gs’ to be much more robust.
Splitting PDF files
gs -dBATCH -sOutputFile="$4"
gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -dFirstPage=22 \
-dLastPage=36 -sOutputFile=outfile_p22-p36.pdf 100p \
See also: ‘pdftk’ which can split PDFs (and much more), though I’ve found ‘gs’ to be much more robust.
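The page-range extraction above can be parameterised so the first page, last page, and file names are passed as arguments. This is a sketch with a hypothetical function name (extract_pages); the switches are the same ones used in the gs command above, plus -q.

```shell
# Hypothetical helper: copy a range of pages from a PDF into a new PDF.
# Usage: extract_pages input.pdf FIRST LAST output.pdf
extract_pages() {
    gs -q -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER \
       -dFirstPage="$2" -dLastPage="$3" -sOutputFile="$4" "$1"
}
```

For example, extract_pages report.pdf 22 36 outfile_p22-p36.pdf, where report.pdf stands in for your own input file.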
Rotating PDF files
This can be useful if you find your PDFs aren’t being oriented properly.
It can also rotate PDFs made from PCL (by converting to postscript first).
The desired gs command argument ‘AutoRotatePages’ requires Postscript input. The method converts the original file to postscript and then applies the rotation.
pcl6 -dNOPAUSE -dBATCH -sDEVICE=ps2write \
-sOutputFile=- original.pcl | gs -dNOPAUSE -dBATCH \
-dAutoRotatePages=/All -q -sDEVICE=pdfwrite \
Also consider ‘pdftk’, available in most distributions, which can easily rotate PDF files.
Note that the pdftk method for rotating is faster.
pdftk original_wrong.pdf cat 1-endeast output new_orientation.pdf
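The pdftk rotation can be wrapped so the direction is passed as an argument and checked before pdftk runs. This is a sketch, not part of pdftk itself: the function name rotate_pdf is hypothetical, and the accepted directions are pdftk’s page rotation keywords.

```shell
# Hypothetical helper: rotate every page of a PDF with pdftk.
# Usage: rotate_pdf input.pdf DIRECTION output.pdf
rotate_pdf() {
    # Accept only pdftk's rotation keywords; reject anything else early.
    case $2 in
        north|south|east|west|left|right|down) ;;
        *) echo "rotate_pdf: unknown direction: $2" >&2; return 2 ;;
    esac
    # "1-end" selects all pages; the keyword is appended to the range.
    pdftk "$1" cat "1-end$2" output "$3"
}
```

For example: rotate_pdf original_wrong.pdf east new_orientation.pdf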
Extract text from PDF
pdftotext -layout OutputFromTXT.pdf
Produces a file ‘OutputFromTXT.txt’
pdftotext -layout OutputFromTXT.pdf test.txt
Produces the file ‘test.txt’ in the current directory.
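To extract text from several PDFs in one go, the pdftotext call can be put in a loop. The sketch below uses a hypothetical function name (pdfs_to_text) and names each text file after its PDF.

```shell
# Hypothetical helper: run pdftotext -layout on each PDF given.
# Usage: pdfs_to_text file.pdf ...
pdfs_to_text() {
    for f in "$@"; do
        # Write the text next to the PDF, replacing .pdf with .txt.
        pdftotext -layout "$f" "${f%.pdf}.txt" || return 1
    done
}
```

For example: pdfs_to_text OutputFromTXT.pdf OutputFromPS.pdf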
I’ve come to prefer using ‘gs’ (and its sibling ‘pcl6’) over ‘pdftk’ as it’s likely to already be installed and it seems more reliable and produces a better PDF.
However, ‘pdftk’ can do many useful things not covered in this article. ‘pdftk’ is worth considering if you need to:
- manipulate PDF metadata,
- set file encryption passwords,
- attach files to your PDF (not all PDF readers will recognize the attached files),
- include X/FDF Data, and
- set ‘stamps’ on the background of your PDF.
We’ll save ‘pdftk’ for a future article.