Political scientists increasingly use data visualizations to communicate statistical results. These visualizations make those results clear and intuitive, and perhaps beautiful. Yet the quality of the data story is sometimes lost in pixelated, low-resolution images. While the visualization itself might be strong, the physical rendering of that visualization is less than ideal. Though we are trained to make statistical graphics, the instructions we receive from publishers about how to make those graphics look their best in published form can be vague or confusing. This post discusses the formatting requirements for image files in journal publishing, points out some examples of high- and low-resolution graphics in a recent journal issue, and describes how to produce high-resolution image files from R, Stata, and Excel (possibly with the help of one or more simple add-on utilities).
File Formats and Resolution
Journal publishers have reached an industry consensus on how image files should be formatted for academic journals. As a few examples, Cambridge, Elsevier, Oxford, SAGE, Taylor and Francis, and Wiley agree that all image files should be in one of three formats: TIFF (Tagged Image File Format), EPS (Encapsulated PostScript), or PDF (Portable Document Format). All other formats are discouraged or not allowed, typically because of lossy image compression (as with JPEG) or poor suitability for print.
Two of the recommended file formats (EPS and PDF) are vector graphics formats, which store an image as a series of lines of specified shape, color, and size between relative points in a space. Thus the image is rendered by drawing each of those vectors from the instructions encoded in the file. Vector graphics look sharp at any size because the vectors are simply drawn larger or smaller, depending on the requested size of the image. This differs from raster graphics, like TIFF, BMP, JPEG, and GIF. Unlike vector graphics, raster graphics are stored as a matrix of colored pixels, so they look blurry or pixelated when enlarged beyond the resolution at which they were saved.
Due to the scalability of vector graphics, they are the logical choice for producing high resolution graphics in academic publishing, and R, Stata, and Excel all produce these formats. When a vector graphic format is not available or for some reason not accepted, graphics need to be saved as high-resolution TIFF files of an appropriate size in order to avoid the grainy or pixelated images sometimes found in journals.
Academic publishers have a clear and shared standard for raster image resolution: a minimum of 600 dots per inch (DPI) for most images, and 300 DPI for “halftone” figures, such as multicolor photographs (rare in political science journals). Many computer applications output image files at a default DPI far below this, often 72-96 DPI, which is the standard resolution for internet graphics. Thus, the default file settings for saving to TIFF are almost never appropriate for academic publishing.
Exporting Graphics from R
R offers drivers for all three of the required file formats in the pdf, postscript, and tiff functions. In RGui, one can also save images to any of these formats (or other formats) using the File > Save as menu on the graphics device window.
For LaTeX users, the PDF format is probably the most familiar due to its easy incorporation into .tex files. Indeed, outputting an R plot to PDF is simple. For all of the examples below, we’ll consider the simple case of a scatterplot. To produce a PDF, one simply needs to precede any plotting function(s) with a call to pdf and then close the device with dev.off() once the plot is drawn.
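As a minimal sketch of that workflow (the data here are simulated purely for illustration, and the variable names x and y are assumptions rather than anything from a real dataset):

```r
# Simulated data for illustration only
set.seed(1)
x <- rnorm(100)
y <- x + rnorm(100)

pdf("image.pdf")   # open the PDF device; subsequent plotting goes to the file
plot(x, y)         # draw the scatterplot
dev.off()          # close the device so the file is written to disk
```

Forgetting dev.off() is a common source of empty or unreadable output files, since the file is only finalized when the device is closed.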
Because PDF is a vector format, one need not specify height and width arguments in order for the image to have desirable resolution (the device defaults to 7 by 7 inches). That said, the plot as displayed on screen in R will not map directly onto the plot saved to file. Specifying a larger image size (with height and width) will yield a PDF with smaller text and more space between points. These arguments can also be used to control the aspect ratio (i.e., to produce a wider or taller rectangular plotting region).
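To illustrate the point about aspect ratio, here is a sketch that saves the same kind of scatterplot to a wide, short PDF. The 8 by 4 inch size and the filename are arbitrary choices for the example, and the data are again simulated:

```r
# Simulated data for illustration only
set.seed(1)
x <- rnorm(100)
y <- x + rnorm(100)

# width and height are given in inches for the pdf device
pdf("image_wide.pdf", width = 8, height = 4)
plot(x, y)
dev.off()
```

Comparing this file with one saved at the default size shows how the text and points shrink relative to the plotting region as the page gets larger.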
The other approved vector format is EPS, which we can produce by replicating the above commands almost verbatim, replacing pdf with postscript. The R documentation recommends, however, that you first call setEPS() in order to instruct the postscript device to render a single-page image suitable for inclusion in a document (i.e., postscript can produce multi-page files and using setEPS() restricts the result to a single page).
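A sketch of the EPS workflow just described, using simulated data (the variable names and filename are assumptions for the example):

```r
# Simulated data for illustration only
set.seed(1)
x <- rnorm(100)
y <- x + rnorm(100)

setEPS()                  # set postscript defaults suitable for a single-page EPS figure
postscript("image.eps")   # open the PostScript device
plot(x, y)
dev.off()                 # close the device to finish writing the file
```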
Another strategy is to use the Cairo device (cairo_ps) without the call to setEPS(). Again, because EPS is a vector format, one need not specify a resolution, but both postscript and cairo_ps accept height and width arguments for controlling aspect ratio.
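The Cairo route might look like the sketch below. It assumes that R was built with Cairo support, which the code checks with capabilities("cairo") before opening the device; the data and filename are made up for the example:

```r
# Simulated data for illustration only
set.seed(1)
x <- rnorm(100)
y <- x + rnorm(100)

# cairo_ps is only usable when R has Cairo support, so check first
if (capabilities("cairo")) {
  cairo_ps("image_cairo.eps", height = 5, width = 5)  # size in inches
  plot(x, y)
  dev.off()
} else {
  message("Cairo is not available in this R installation")
}
```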
Producing TIFF images with the tiff device is similar, but here we must explicitly specify a resolution (with the res argument), as well as height and width in order to yield an appropriately high resolution image at the intended output size (and for height and width to be interpretable, one should also specify units). To save space, these files can be compressed using lossless compression (i.e., with no damage to the resulting image). The preferred compression algorithm is called LZW, which can be specified by the compression argument to tiff. Here’s an example of the device call, which, as before, is followed by the plotting commands and dev.off():
tiff("image.tif", res=600, compression = "lzw", height=5, width=5, units="in")
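Put together as a complete script, the TIFF workflow might look like the following sketch. The data are simulated, and the check on capabilities("tiff") is an added precaution for R builds without TIFF support rather than something the workflow requires:

```r
# Simulated data for illustration only
set.seed(1)
x <- rnorm(100)
y <- x + rnorm(100)

if (capabilities("tiff")) {
  # 5 x 5 inches at 600 DPI gives a 3000 x 3000 pixel image
  tiff("image.tif", res = 600, compression = "lzw",
       height = 5, width = 5, units = "in")
  plot(x, y)
  dev.off()
} else {
  message("TIFF output is not available in this R installation")
}
```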
An alternative workflow to the above methods is to use dev.print to send a currently open plot window to an image file:
dev.print(tiff, "image.tiff", res=600, height=5, width=5, units="in")
This can be especially useful for saving the same plot to multiple formats or for separately saving layers of a multi-layer plot (e.g., to use in a presentation where parts of the figure are revealed sequentially). This is the same general approach used by the ggplot2 graphics package, which offers a ggsave function that is called after the plotting call. For TIFF formats, the dpi argument should also be specified to produce an appropriately high resolution. Here are some examples:
ggsave("image.pdf", height=5, width=5, units='in')
ggsave("image.eps", height=5, width=5, units='in')
ggsave("image.tiff", height=5, width=5, units='in', dpi=600)
Exporting Graphics from Stata
Saving graphics interactively in Stata is straightforward. From the graphics window, one can simply press the save icon or select File > Save and be presented with a simple menu to save the file. PDF, EPS, and TIFF formats are all available by default. Plots can also be saved from the console using graph export after a plotting command:
graph twoway scatter x y
graph export image.pdf
graph export image.eps
graph export image.tif
Using the menu or icon, the TIFF format prints only at 96 DPI by default. Thus we should turn to the console and use the width option in our call to graph export. Unfortunately, it is not possible to specify a resolution directly, so we need to do some hackery. You need to calculate the width of the graph (in pixels) necessary to produce the desired resolution at final printed size. In a full-page two-column journal, a one-column width is about 3 inches and two-column width is about 6.5 inches, whereas in a small-format journal, a one-column width is about 4.25 inches. Thus, if we want 600 DPI, we would need pixel widths of 1800, 3900, or 2550, respectively:
graph twoway scatter x y
graph export image1.tif, width(1800)
graph export image2.tif, width(3900)
graph export image3.tif, width(2550)
The result of each of these is, however, a very large image with 96 DPI resolution (i.e., Stata simply produces a larger image with the same low resolution). Of course, because size and resolution are related, the images are visually equivalent, but you will need to modify the file with another utility to actually see the 600 DPI resolution at the intended output size.
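The pixel widths passed to graph export above are simply the printed width in inches multiplied by the target DPI. For other column widths or resolutions, the arithmetic can be checked quickly in R; the labels below are just descriptive names chosen for this sketch:

```r
target_dpi <- 600

# Approximate printed widths in inches, as described in the text
widths_in <- c(large_one_column = 3,
               large_two_column = 6.5,
               small_one_column = 4.25)

# Required pixel width = width in inches * dots per inch
widths_px <- widths_in * target_dpi
print(widths_px)
```

The resulting values are the numbers to supply to the width() option.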
Exporting Graphics from Excel
Versions of Microsoft Excel prior to 2007 allowed users to save charts directly as image files, but this is no longer possible. Instead, one needs to use the File > Save As menu to output a chart to PDF (the only built-in file format for printing charts). To do this, select the chart (it need not be on its own tab), then follow File > Save As. In the pop-up menu, select PDF from the Save as type drop-down menu and specify an appropriate filename. Clicking the Options... menu will open another small window that allows one to confirm that only the Selected chart will be output to PDF. Attempting to save a chart (as an object in a spreadsheet tab) without first selecting the chart will cause the Options... pop-up to display a different set of options, none of which is Selected chart. Another method is to move the chart to its own sheet, then under File > Save As, select Options..., and choose Active sheet(s) to save just the chart to a PDF file. If, for some reason, a publisher will not accept a PDF file, one can use any of the options described in the next section for converting the PDF to TIFF.
With our files exported from R, Stata, Excel, or another software application, we may still need to make changes. For example, because Excel can only output PDF, we may need to convert our image to TIFF; or, because Stata can only output TIFF at 96 DPI, we may want to rescale the image to the appropriate size and resolution. Most publishers recommend using Adobe Photoshop to do this. Unfortunately, Adobe Photoshop is proprietary, expensive, and may not be readily accessible. Luckily, several free, open-source, and easy-to-use alternatives exist.
For the most direct analogue to Photoshop, one should try GIMP (GNU Image Manipulation Program). For command line manipulation of images, GhostScript works well for PostScript (EPS) and PDF formats and ImageMagick offers diverse functionality for manipulating almost all image formats in addition to EPS and PDF. All of these programs should work on all modern operating systems.
GIMP: GNU Image Manipulation Program
GIMP allows you to easily adjust the resolution of images as well as save images into other file formats. For example, to convert a Stata graph saved at 96 DPI to 600 DPI, we can open GIMP and choose File > Open to select and import the TIFF file. With the TIFF open, we can choose Image > Print Size... and a small window will open describing the size and resolution of the file. From that menu, changing the resolution from its default to 600 makes GIMP adjust the image size accordingly.
We can then save the file using File > Export..., specifying a filename, and selecting TIFF image from the file format drop-down. (We could also save the file in any other format.) If saving to TIFF, a small pop-up window will offer the option to choose file compression such as LZW. The new file will have the intended dimensions and resolution. GIMP is a fully featured image manipulation program, so it can also be helpful for converting color to grayscale, cropping, etc.
Because GIMP can read almost any image format, if we have files in other formats, we can easily open them in GIMP and export them to any of the supported formats (e.g., to convert a TIFF to a PDF or vice versa). These features are straightforward and simply require opening the input file and using File > Export... to save in an appropriate output format. Note, however, that GIMP does not – by default – support EPS format. To convert to or from EPS, we need to have the GhostScript command line utility installed first. Then, when opening an EPS or PDF in GIMP, you can specify the size and resolution at which to render the vector image (because GIMP will coerce the vector graphics to a raster).
Command-line utilities: GhostScript and ImageMagick
File conversion operations can also be performed on the command line. To convert PDF or EPS to TIFF, one can use GhostScript, a command-line utility for working with PDF and PostScript files. To use GhostScript, it must be installed and its directory must be on the system path. On Windows, you will probably have to manually add GhostScript to the system path after it is installed. You can check your path by opening Command Prompt (run it as an administrator) and typing:
echo %PATH%
That will output a long string of delimited directories that point to particular applications. The directory for GhostScript should be among them. For a current version of GhostScript on Windows 7, this directory is listed as C:\Program Files\gs\gs9.10\bin. If this (or a similar directory) is not listed in the path, one can easily add it to the Windows path for the current Command Prompt session by typing:
set PATH=%PATH%;C:\Program Files\gs\gs9.10\bin
You can also update the Windows system path permanently by visiting Control Panel > System, clicking on Advanced System Settings, and pressing the Environment Variables... button. This will open a small pop-up window where you can edit the PATH variable for your user account and the Path variable for all users on the machine. You can simply select the variable, press the Edit... button, and paste the path to GhostScript (preceded by a semicolon) at the end of the current path.
With GhostScript on the path, we can reopen Command Prompt (or Terminal, on a UNIX-alike) and navigate to the directory containing our images. Let’s say we want to create a TIFF file from image.pdf. We can simply type the following:
gs -r600 -sDEVICE=tiffg4 -sOutputFile=image.tif image.pdf -dBATCH
The result is a high resolution TIFF file called image.tif created in our working directory. Here’s a breakdown of the command:
gs refers to GhostScript. (On Windows, gs may instead need to be gswin32, referring to the name of the actual GhostScript application; on 64-bit installations the executable is typically named gswin64.)
-r600 requests a 600 DPI resolution.
-sDEVICE=tiffg4 says what image device to use (in our case, one for monochrome TIFF files, though if working with color graphics, another device might be appropriate, such as tiff24nc for 24-bit color or tiff12nc for 12-bit color).
-sOutputFile=image.tif names the output file.
image.pdf is the input file.
-dBATCH tells GhostScript to exit once the file has been processed, rather than waiting at its interactive prompt.
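For those who would rather stay inside R, the same command can be assembled and, optionally, run with system2. This is only a sketch: the helper gs_tiff_args and the run_conversion switch are hypothetical names invented for this example, and the executable name passed to Sys.which may need to be changed on Windows, as noted above.

```r
# Hypothetical helper: builds the GhostScript arguments from the breakdown above
gs_tiff_args <- function(input, output, dpi = 600, device = "tiffg4") {
  c(paste0("-r", dpi),               # resolution in DPI
    paste0("-sDEVICE=", device),     # output device, e.g. tiffg4 or tiff24nc
    paste0("-sOutputFile=", output), # output filename
    input,                           # input PDF or EPS file
    "-dBATCH")                       # exit when processing is finished
}

args <- gs_tiff_args("image.pdf", "image.tif")
print(args)

# Set to TRUE to actually run the conversion; left FALSE so the sketch has no side effects
run_conversion <- FALSE
gs_bin <- Sys.which("gs")  # returns "" if no executable named gs is on the path
if (run_conversion && nzchar(gs_bin) && file.exists("image.pdf")) {
  system2(gs_bin, args)
}
```

Switching to a color device is then a matter of calling gs_tiff_args with device = "tiff24nc". Filenames containing spaces would need quoting (for example with shQuote) before being passed to system2.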
If we instead have an EPS file, image.eps, we can do the same conversion from EPS to TIFF:
gs -r600 -sDEVICE=tiffg4 -sOutputFile=image.tif image.eps -dBATCH
Thus both GIMP and GhostScript can convert between relevant formats. By installing both GhostScript and ImageMagick, you should also be able to do the conversion even more easily. To use ImageMagick, it must also be on the system path (and you can follow the above directions to ensure it is available on the path).
The following ImageMagick commands are equivalent to the GhostScript lines above, converting PDF to TIFF or EPS to TIFF. (In ImageMagick 7 and later, the convert command is invoked as magick.)
convert -density 600 image.pdf image.tif
convert -density 600 image.eps image.tif
We can also reverse the process to turn an EPS or TIFF into a PDF:
convert -density 600 image.eps image.pdf
convert -density 600 image.tif image.pdf
Note that this final conversion will not improve the resolution of a TIFF or convert it to a vector image. Instead it will simply encode the original raster into a PDF file. ImageMagick can also be used to perform a large number of other image manipulations.