CUG #381: JPEG Software by the Independent JPEG Group

Abstracted from documentation by Tom Lane

INTRODUCTION

The CUG Library volume #381 contains the fourth public release of the Independent JPEG Group's free JPEG software. You are welcome to redistribute this software and to use it for any purpose, subject to the conditions under LEGAL ISSUES, below. Serious users of this software (particularly those incorporating it into larger programs) should contact jpeg-info@uunet.uu.net to be added to our electronic mailing list. Mailing list members are notified of updates and have a chance to participate in technical discussions, etc. This software is the work of Tom Lane, Philip Gladstone, Luis Ortiz, Lee Crocker, Ge' Weijers, and other members of the Independent JPEG Group.

The JPEG Software distribution source code is written entirely in C. You can compile it on many platforms including IBM compatibles, Amiga, Macintosh, Atari ST, DEC VAX/VMS, Cray Y/MP, and most Unix platforms. Supported Unix platforms include, but are not limited to, Apollo, HP-UX, SGI Indigo, and Sun SPARCstation. The make system even includes a utility to convert the ANSI-style C code back to the older K&R style.

WHAT IS JPEG?

This software implements JPEG image compression and decompression. JPEG (pronounced "jay-peg") is a standardized compression method for full-color and gray-scale images. JPEG is intended for compressing "real-world" scenes; cartoons and other non-realistic images are not its strong suit. JPEG is lossy, meaning that the output image is not necessarily identical to the input image. Hence you must not use JPEG if you have to have identical output bits. However, on typical images of real-world scenes, very good compression levels can be obtained with no visible change, and amazingly high compression levels are possible if you can tolerate a low-quality image. For more details, see the references, or just experiment with various compression settings.
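To make "lossy" concrete, here is a toy sketch of the kind of rounding step that discards information: a value is divided by a step size, rounded, and later multiplied back. The function names and the step size are assumptions chosen for illustration; this is not code from the IJG distribution, and real JPEG applies such steps to transformed image data rather than to raw sample values.

```c
/* Toy illustration of a lossy quantization step (hypothetical helpers,
 * not part of the IJG software). */

/* Divide by the step size, rounding to the nearest integer.
 * Negative values are rounded symmetrically. */
int quantize(int value, int step)
{
    if (value >= 0)
        return (value + step / 2) / step;
    return -((-value + step / 2) / step);
}

/* Reconstruct an approximation of the original value. */
int dequantize(int level, int step)
{
    return level * step;
}
```

With a step of 10, the value 37 is stored as level 4 and comes back as 40: close to the input, but not bit-identical. A larger step gives fewer distinct levels (better compression) and larger errors (lower quality).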

The software implements JPEG baseline and extended-sequential compression processes. Provision is made for supporting all variants of these processes, although some uncommon parameter settings aren't implemented yet. For legal reasons, we are not distributing code for the arithmetic-coding process; see LEGAL ISSUES. At present we have made no provision for supporting the progressive, hierarchical, or lossless processes defined in the standard.

In order to support file conversion and viewing software, we have included considerable functionality beyond the bare JPEG coding/decoding capability; for example, the color quantization modules are not strictly part of JPEG decoding, but they are essential for output to colormapped file formats or colormapped displays. These extra functions can be compiled out if not required for a particular application.
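As a rough idea of what color quantization involves, the sketch below maps a 24-bit RGB pixel to an 8-bit index into a fixed palette with 3 bits of red, 3 of green, and 2 of blue. This is only a minimal illustration under that fixed-palette assumption; the helper name is hypothetical, and the distribution's own quantization modules are more elaborate than this.

```c
/* Map a 24-bit RGB pixel to an index into a fixed 3-3-2 palette.
 * Hypothetical helper for illustration; not taken from the IJG code. */
unsigned char rgb_to_332(unsigned char r, unsigned char g, unsigned char b)
{
    /* keep the top 3 bits of red, top 3 bits of green, top 2 bits of blue */
    return (unsigned char)((r & 0xE0) | ((g & 0xE0) >> 3) | (b >> 6));
}
```

A fixed palette like this needs no analysis of the image, which keeps memory use low, but it cannot adapt to the colors an image actually contains.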

The emphasis in designing this software has been on achieving portability and flexibility, while also making it fast enough to be useful. In particular, the software is not intended to be read as a tutorial on JPEG. (See the REFERENCES section for introductory material.) While we hope that the entire package will someday be industrial-strength code, much remains to be done in performance tuning and in improving the capabilities of individual modules.

This software can be used on several levels:

(1) As canned software for JPEG compression and decompression. Just edit the Makefile and configuration files as needed (see file SETUP), compile and go. Members of the Independent JPEG Group will improve the out-of-the-box functionality and speed as time goes on.
(2) As the basis for other JPEG programs. For example, you could incorporate the decompressor into a general image viewing package by replacing the output module with write-to-screen functions. For an implementation on specific hardware, you might want to replace some of the inner loops with assembly code. For a non-command-line-driven system, you might want a different user interface. (Members of the group will be producing Macintosh and Amiga versions with more appropriate user interfaces, for example.)
(3) As a toolkit for experimentation with JPEG and JPEG-like algorithms. Most of the individual decisions you might want to mess with are packaged up into separate modules. For example, the details of color-space conversion and subsampling techniques are each localized in one compressor and one decompressor module. You'd probably also want to extend the user interface to give you more detailed control over the JPEG compression parameters.

In particular, we welcome the use of this software as a component of commercial products; no royalty is required.

SUPPORTING SOFTWARE

You will probably want Jef Poskanzer's PBMPLUS image software, which provides many useful operations on PPM-format image files. In particular, it can convert PPM images to and from a wide range of other formats. You can FTP this free software from export.lcs.mit.edu (contrib/pbmplus*.tar.Z) or ftp.ee.lbl.gov (pbmplus*.tar.Z). Unfortunately PBMPLUS is not nearly as portable as the JPEG software is; you are likely to have difficulty making it work on any non-Unix machine.

If you are using X Windows you might want to use the xv or xloadimage viewers to save yourself the trouble of converting PPM to some other format. Both of these can be found in the contrib directory at export.lcs.mit.edu. Actually, xv version 2.00 and up incorporates our software and thus can read and write JPEG files directly. (NOTE: since xv internally reduces all images to 8 bits/pixel, a JPEG file written by xv will not be very high quality; and xv cannot fully exploit a 24-bit display. These problems are expected to go away in the next xv release, planned for early 1993. In the meantime, use xloadimage for 24-bit displays.)

For DOS machines, Lee Crocker's free Piclab program is a useful companion to the JPEG software. The latest version, currently 1.91, is available by FTP from SIMTEL20 and its various mirror sites, file piclb191.zip. CompuServe also has it, in the same library as the JPEG software.

JPEG IMPLEMENTATIONS AND FILE FORMATS

Unlike other file formats that are totally predefined, such as Windows .BMP files, JPEG doesn't specify the entire file format. Handmade Software's shareware PC program GIF2JPG produces files that are totally incompatible with our programs. They use a proprietary format that is an amalgam of GIF and JPEG representations. However, you can force GIF2JPG to produce compatible files with its -j switch, and their decompression program JPG2GIF can read our files (at least ones produced with our default option settings).

Some commercial JPEG implementations are also incompatible as of this writing, especially programs released before summer 1991. The root of the problem is that the ISO JPEG committee failed to specify a concrete file format. Some vendors "filled in the blanks" on their own, creating proprietary formats that no one else could read. (For example, none of the early commercial JPEG implementations for the Macintosh were able to exchange compressed files.)

The file format we have adopted is called JFIF (see REFERENCES). This format has been agreed to by a number of major commercial JPEG vendors, and we expect that it will become the de facto standard. JFIF is a minimal representation; work is also going forward to incorporate JPEG compression into the TIFF 6.0 standard, for use in "high end" applications that need to record a lot of additional data about an image. We intend to support TIFF 6.0 in the future. We hope that these two formats will be sufficient and that other, incompatible JPEG file formats will not proliferate.

Indeed, part of the reason for developing and releasing this free software is to help force rapid convergence to de facto standards for JPEG file formats. SUPPORT STANDARD, NON-PROPRIETARY FORMATS: demand JFIF or TIFF 6.0!

USING JPEG AS A SUBROUTINE IN A LARGER PROGRAM

You can readily incorporate the JPEG compression and decompression routines in a larger program. The file example.c provides a skeleton of the interface routines you'll need for this purpose. Essentially, you replace jcmain.c (for compression) and/or jdmain.c (for decompression) with your own code. Note that the fewer JPEG options you allow the user to twiddle, the less code you need; all the default options are set up automatically. (Alternatively, if you know a lot about JPEG or have a special application, you may want to twiddle the default options even more extensively than jcmain/jdmain do.)

Most likely, you will want the uncompressed image to come from memory (for compression) or go to memory or the screen (for decompression). For this purpose you must provide image reading or writing routines that match the interface used by the image file I/O modules (jrdXXX or jwrXXX); again, example.c shows a skeleton of what is required. In this situation, you won't need any of the non-JPEG image file I/O modules used by cjpeg and djpeg.
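The general shape of an in-memory image source might look like the sketch below: a small state record plus a routine that hands out one row per call. Every name here (mem_source, mem_source_get_row) and the packed-RGB layout are assumptions made for illustration; the actual interface your routines must match is the one shown in example.c and the jrdXXX modules.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory image source. These names are illustrative
 * and do not come from the IJG headers. */
typedef struct {
    const unsigned char *pixels;  /* packed RGB, 3 bytes per pixel */
    size_t width;                 /* pixels per row */
    size_t height;                /* number of rows in the image */
    size_t next_row;              /* index of the next row to deliver */
} mem_source;

/* Copy the next row into row_buf, which must hold at least 3*width bytes.
 * Returns 1 if a row was delivered, 0 once the image is exhausted. */
int mem_source_get_row(mem_source *src, unsigned char *row_buf)
{
    size_t row_bytes = 3 * src->width;

    if (src->next_row >= src->height)
        return 0;
    memcpy(row_buf, src->pixels + src->next_row * row_bytes, row_bytes);
    src->next_row++;
    return 1;
}
```

A writing routine for decompression would mirror this, copying each delivered row into a caller-owned buffer or onto the screen.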

By default, any error detected inside the JPEG routines will cause a message to be printed on stderr, followed by exit(). You can override this behavior by supplying your own message-printing and/or error-exit routines; again, example.c shows how.
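One common C technique for replacing an exit()-style error routine is to have it longjmp back to a recovery point in the caller. The sketch below shows only that technique in isolation: the error_methods struct, fake_decode, and try_decode are hypothetical stand-ins, not the structures or routines of the IJG software, so consult example.c for the real hooks.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical table of error methods; illustrative only. */
typedef struct {
    void (*error_exit)(const char *msg);
} error_methods;

static jmp_buf recover_point;          /* where to resume after an error */
static const char *last_error = NULL;  /* message from the most recent error */

/* Replacement error-exit routine: record the message and jump back
 * to the caller instead of terminating the program. */
static void my_error_exit(const char *msg)
{
    last_error = msg;
    longjmp(recover_point, 1);
}

/* Stand-in for a library routine that may detect an error. */
static void fake_decode(const error_methods *em, int corrupt)
{
    if (corrupt)
        em->error_exit("corrupt input");
}

/* Returns 0 on success, -1 if the error routine was invoked. */
int try_decode(int corrupt)
{
    error_methods em;

    em.error_exit = my_error_exit;
    if (setjmp(recover_point) != 0)
        return -1;                     /* arrived here via longjmp */
    fake_decode(&em, corrupt);
    return 0;
}
```

The surrounding program can then report last_error in its own way (a dialog box, a log entry) and carry on, instead of being terminated.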

We recommend you create libjpeg.a as shown in the Makefile, then link that with your surrounding program. (If your linker is at all reasonable, only the code you actually need will get loaded.) Include the files jconfig.h and jpegdata.h in C files that need to call the JPEG routines.

USING JPEG AS CANNED SOFTWARE

By default, the JPEG Software distribution builds a command-line driven translator. The currently supported image file formats are: PPM (PBMPLUS color format), PGM (PBMPLUS gray-scale format), GIF (up to 256 colors), Targa (up to 24-bit color), and RLE (Utah Raster Toolkit format). RLE is supported only if the URT library is available. The compression program "cjpeg" recognizes the input image format automatically, with the exception of some Targa-format files. Of course for the decompression program "djpeg", you have to tell it what file format to generate. The only JPEG file format currently supported is the JFIF format. Support for the TIFF 6.0 JPEG format will probably be added at some future date.

HINTS FOR ALL USAGE

Color GIF files are not the ideal input for JPEG; JPEG is really intended for compressing full-color (24-bit) images. In particular, don't try to convert cartoons, line drawings, and other images that have only a few distinct colors. GIF works great on these; JPEG does not. If you want to convert a GIF to JPEG, you should experiment with cjpeg's -quality and -smooth options to get a satisfactory conversion. -smooth 10 or so is often helpful.

Avoid running an image through a series of JPEG compression/decompression cycles. Image quality loss will accumulate; after ten or so cycles the image may be noticeably worse than it was after one cycle. It's best to use a lossless format while manipulating an image, then convert to JPEG format when you are ready to file the image away.

The -optimize option to cjpeg is worth using when you are making a "final" version for posting or archiving. It's also a win when you are using low quality settings to make very small JPEG files; the percentage improvement is often a lot more than it is on larger files.

The default memory usage limit (-maxmemory) is set when the software is compiled. If you get an "insufficient memory" error, try specifying a smaller -maxmemory value, even -maxmemory 0 to use the absolute minimum space. You may want to recompile with a smaller default value if this happens often.

On MS-DOS machines, -maxmemory is the amount of main (conventional) memory to use. (Extended or expanded memory is also used if available.) Most DOS-specific versions of this software do their own memory space estimation and do not need -maxmemory.

djpeg with two-pass color quantization requires a good deal of memory; on MS-DOS machines it may run out of memory even with -maxmemory 0. In that case you can still decompress, with some loss of image quality, by specifying -onepass for one-pass quantization.

If more space is needed than will fit in the available main memory (as determined by -maxmemory), temporary files will be used. (MS-DOS versions will try to get extended or expanded memory first.) The temporary files are often rather large: in typical cases they occupy three bytes per pixel, for example 3*800*600 = 1,440,000 bytes (1.44 MB) for an 800x600 image. If you don't have enough free disk space, leave out -optimize (for cjpeg) or specify -onepass (for djpeg). On MS-DOS, the temporary files are created in the directory named by the TMP or TEMP environment variable, or in the current directory if neither of those exists. Amiga implementations put the temp files in the directory named by JPEGTMP:, so be sure to assign JPEGTMP: to a disk partition with adequate free space.
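The disk-space arithmetic is easy to check before starting a large job. The helper below is a hypothetical convenience, not part of the distribution; it simply applies the typical three-bytes-per-pixel figure, so treat the result as an estimate.

```c
/* Estimate temporary-file size in bytes, assuming the typical figure of
 * three bytes per pixel. Hypothetical helper; not part of the IJG code. */
unsigned long temp_file_bytes(unsigned long width, unsigned long height)
{
    return 3UL * width * height;
}
```

For an 800x600 image this gives 1,440,000 bytes; a 1024x768 image would need 2,359,296 bytes.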

FOR MORE INFORMATION...

We highly recommend reading one or more of these references before trying to understand the innards of any JPEG software.

The JPEG FAQ (Frequently Asked Questions) article is a useful source of general information about JPEG. It is updated constantly and therefore is not included in this distribution. The FAQ is posted every two weeks to Usenet newsgroups comp.graphics, news.answers, and other groups. You can always obtain the latest version from the news.answers archive at rtfm.mit.edu (18.172.1.27). By FTP, fetch /pub/usenet/news.answers/jpeg-faq. If you don't have FTP, send e-mail to mail-server@rtfm.mit.edu with body "send usenet/news.answers/jpeg-faq".

The best short technical introduction to the JPEG compression algorithm is:

Wallace, Gregory K. "The JPEG Still Picture Compression Standard", Communications of the ACM, April 1991 (vol. 34 no. 4), pp. 30-44.

(Adjacent articles in that issue discuss MPEG motion picture compression, applications of JPEG, and related topics.) If you don't have the CACM issue handy, a PostScript file containing a revised version of the article is available at ftp.uu.net, graphics/jpeg/wallace.ps.Z. The file (actually a preprint for an article to appear in IEEE Trans. Consumer Electronics) omits the sample images that appeared in CACM, but it includes corrections and some added material. Note: the Wallace article is copyright ACM and IEEE, and it may not be used for commercial purposes.

A somewhat less technical, more leisurely introduction to JPEG can be found in "The Data Compression Book" by Mark Nelson, published by M&T Books (Redwood City, CA), 1991, ISBN 1-55851-216-0. This book provides good explanations and example C code for a multitude of compression methods including JPEG. It is an excellent source if you are comfortable reading C code but don't know much about data compression in general. The book's JPEG sample code is far from industrial-strength, but when you are ready to look at a full implementation, you've got one here...

A new textbook about JPEG is "JPEG Still Image Data Compression Standard" by William B. Pennebaker and Joan L. Mitchell, published by Van Nostrand Reinhold, 1993, ISBN 0-442-01272-1. Price US$59.95. This book includes the complete text of the ISO JPEG standards (DIS 10918-1 and draft DIS 10918-2). This is by far the most complete exposition of JPEG in existence, and I highly recommend it. If you read the entire book, you will probably know more about JPEG than I do.

The JPEG standard itself is not available electronically; you must order a paper copy through ISO. (Unless you are concerned about having a certified official copy, I recommend buying the Pennebaker and Mitchell book instead; it's much cheaper and includes a great deal of useful explanatory material.) In the US, copies of the standard may be ordered from ANSI Sales at (212) 642-4900. It's not cheap: as of 1992, Part 1 is $95 and Part 2 is $47, plus 7% shipping/handling. The standard is divided into two parts, Part 1 being the actual specification, while Part 2 covers compliance testing methods. As of early 1992, Part 1 has Draft International Standard status. It is titled "Digital Compression and Coding of Continuous-tone Still Images, Part 1: Requirements and guidelines" and has document number ISO/IEC DIS 10918-1. Part 2 is still at Committee Draft status. It is titled "Digital Compression and Coding of Continuous-tone Still Images, Part 2: Compliance testing" and has document number ISO/IEC CD 10918-2. (NOTE: I'm told that the final version of Part 2 will differ considerably from the CD draft.)

The JPEG standard does not specify all details of an interchangeable file format. For the omitted details we follow the "JFIF" conventions, revision 1.02. A copy of the JFIF spec is available from:

Literature Department
C-Cube Microsystems, Inc.
399A West Trimble Road
San Jose, CA 95131
(408) 944-6300

A PostScript version of this document is available at ftp.uu.net, file graphics/jpeg/jfif.ps.Z. It can also be obtained by e-mail from the C-Cube mail server, netlib@c3.pla.ca.us. Send the message "send jfif_ps from jpeg" to the server to obtain the JFIF document; send the message "help" if you have trouble.

The TIFF 6.0 file format specification can be obtained by FTP from sgi.com (192.48.153.1), file graphics/tiff/TIFF6.ps.Z; or you can order a printed copy from Aldus Corp. at (206) 628-6593. It should be noted that the TIFF 6.0 spec of 3-June-92 has a number of serious problems in its JPEG features. A clarification note will probably be needed to ensure that TIFF JPEG files are compatible across different implementations. The IJG does not intend to support TIFF 6.0 until these problems are resolved.

If you want to understand this implementation, start by reading the "architecture" documentation file. Please read "codingrules" if you want to contribute any code.

LEGAL ISSUES

The authors make NO WARRANTY or representation, either express or implied, with respect to this software, its quality, accuracy, merchantability, or fitness for a particular purpose. This software is provided "AS IS", and you, its user, assume the entire risk as to its quality and accuracy.

Permission is hereby granted to use, copy, modify, and distribute this software (or portions thereof) for any purpose, without fee, subject to these conditions:

(1) If any part of the source code for this software is distributed, then this README file must be included, with this copyright and no-warranty notice unaltered; and any additions, deletions, or changes to the original files must be clearly indicated in accompanying documentation.

(2) If only executable code is distributed, then the accompanying documentation must state that "this software is based in part on the work of the Independent JPEG Group".

(3) Permission for use of this software is granted only if the user accepts full responsibility for any undesirable consequences; the authors accept NO LIABILITY for damages of any kind.

Permission is NOT granted for the use of any IJG author's name or company name in advertising or publicity relating to this software or products derived from it. This software may be referred to only as "the Independent JPEG Group's software".

We specifically permit and encourage the use of this software as the basis of commercial products, provided that all warranty or liability claims are assumed by the product vendor.

It appears that the arithmetic coding option of the JPEG spec is covered by patents owned by IBM and AT&T, as well as a pending Japanese patent of Mitsubishi. Hence arithmetic coding cannot legally be used without obtaining one or more licenses. For this reason, support for arithmetic coding has been removed from the free JPEG software. (Since arithmetic coding provides only a marginal gain over the unpatented Huffman mode, it is unlikely that very many implementors will support it. If you do obtain the necessary licenses, contact jpeg-info@uunet.uu.net for a copy of our arithmetic coding modules.) So far as we are aware, there are no patent restrictions on the remaining code.

FUTURE PROJECTS

The next major release will probably be a significant rewrite to allow use of this code in conjunction with Sam Leffler's free TIFF library (assuming the bugs in the TIFF 6.0 specification get resolved).

Many of the modules need fleshing out to provide more complete implementations, or to provide faster paths for common cases. Speeding things up is still high on our priority list.

We'd appreciate it if people would compile and check out the code on as wide a variety of systems as possible, and report any portability problems encountered (with solutions, if possible). Checks of file compatibility with other JPEG implementations would also be of interest. Finally, we would appreciate code profiles showing where the most time is spent, especially on unusual systems.

Please send bug reports, offers of help, etc. to jpeg-info@uunet.uu.net.