This paper is a revised version of an article by the same title and author which appeared in the April 1991 issue of Communications of the ACM.

Abstract

For the past few years, a joint ISO/CCITT committee known as JPEG (Joint Photographic Experts Group) has been working to establish the first international compression standard for continuous-tone still images, both grayscale and color. JPEG’s proposed standard aims to be generic, to support a wide variety of applications for continuous-tone images. To meet the differing needs of many applications, the JPEG standard includes two basic compression methods, each with various modes of operation. A DCT-based method is specified for “lossy’’ compression, and a predictive method for “lossless’’ compression. JPEG features a simple lossy technique known as the Baseline method, a subset of the other DCT-based modes of operation. The Baseline method has been by far the most widely implemented JPEG method to date, and is sufficient in its own right for a large number of applications. This article provides an overview of the JPEG standard, and focuses in detail on the Baseline method.

1 Introduction

Advances over the past decade in many aspects of digital technology - especially devices for image acquisition, data storage, and bitmapped printing and display - have brought about many applications of digital imaging. However, these applications tend to be specialized due to their relatively high cost. With the possible exception of facsimile, digital images are not commonplace in general-purpose computing systems the way text and geometric graphics are. The majority of modern business and consumer usage of photographs and other types of images takes place through more traditional analog means. The key obstacle for many applications is the vast amount of data required to represent a digital image directly. A digitized version of a single, color picture at TV resolution contains on the order of one million bytes; 35mm resolution requires ten times that amount. Use of digital images often is not viable due to high storage or transmission costs, even when image capture and display devices are quite affordable.

Modern image compression technology offers a possible solution. State-of-the-art techniques can compress typical images from 1/10 to 1/50 their uncompressed size without visibly affecting image quality. But compression technology alone is not sufficient. For digital image applications involving storage or transmission to become widespread in today’s marketplace, a standard image compression method is needed to enable interoperability of equipment from different manufacturers. The CCITT recommendation for today’s ubiquitous Group 3 fax machines [17] is a dramatic example of how a standard compression method can enable an important image application. The Group 3 method, however, deals with bilevel images only and does not address photographic image compression.

For the past few years, a standardization effort known by the acronym JPEG, for Joint Photographic Experts Group, has been working toward establishing the first international digital image compression standard for continuous-tone (multilevel) still images, both grayscale and color. The “joint” in JPEG refers to a collaboration between CCITT and ISO. JPEG convenes officially as the ISO committee designated JTC1/SC2/WG10, but operates in close informal collaboration with CCITT SGVIII. JPEG will be both an ISO Standard and a CCITT Recommendation. The text of both will be identical.

The JPEG Still Picture Compression Standard

Gregory K. Wallace

Multimedia Engineering

Digital Equipment Corporation

Maynard, Massachusetts

Submitted in December 1991 for publication in IEEE Transactions on Consumer Electronics

Photovideotex, desktop publishing, graphic arts, color facsimile, newspaper wirephoto transmission, medical imaging, and many other continuous-tone image applications require a compression standard in order to develop significantly beyond their present state. JPEG has undertaken the ambitious task of developing a general-purpose compression standard to meet the needs of almost all continuous-tone still-image applications.

If this goal proves attainable, not only will individual applications flourish, but exchange of images across application boundaries will be facilitated. This latter feature will become increasingly important as more image applications are implemented on general-purpose computing systems, which are themselves becoming increasingly interoperable and internetworked. For applications which require specialized VLSI to meet their compression and decompression speed requirements, a common method will provide economies of scale not possible within a single application.

This article gives an overview of JPEG’s proposed image-compression standard. Readers without prior knowledge of JPEG or compression based on the Discrete Cosine Transform (DCT) are encouraged to study first the detailed description of the Baseline sequential codec, which is the basis for all of the DCT-based decoders. While this article provides many details, many more are necessarily omitted. The reader should refer to the ISO draft standard [2] before attempting implementation.

Some of the earliest industry attention to the JPEG proposal has been focused on the Baseline sequential codec as a motion image compression method - of the “intraframe” class, where each frame is encoded as a separate image. This class of motion image coding, while providing less compression than “interframe” methods like MPEG, has greater flexibility for video editing. While this paper focuses only on JPEG as a still picture standard (as ISO intended), it is interesting to note that JPEG is likely to become a “de facto” intraframe motion standard as well.

2 Background: Requirements and Selection Process

JPEG’s goal has been to develop a method for continuous-tone image compression which meets the following requirements:

1) be at or near the state of the art with regard to compression rate and accompanying image fidelity, over a wide range of image quality ratings, and especially in the range where visual fidelity to the original is characterized as “very good” to “excellent”; also, the encoder should be parameterizable, so that the application (or user) can set the desired compression/quality tradeoff;

2) be applicable to practically any kind of continuous-tone digital source image (i.e., for most practical purposes not be restricted to images of certain dimensions, color spaces, pixel aspect ratios, etc.) and not be limited to classes of imagery with restrictions on scene content, such as complexity, range of colors, or statistical properties;

3) have tractable computational complexity, to make feasible software implementations with viable performance on a range of CPUs, as well as hardware implementations with viable cost for applications requiring high performance;

4) have the following modes of operation:

• Sequential encoding: each image component is encoded in a single left-to-right, top-to-bottom scan;

• Progressive encoding: the image is encoded in multiple scans for applications in which transmission time is long, and the viewer prefers to watch the image build up in multiple coarse-to-clear passes;

• Lossless encoding: the image is encoded to guarantee exact recovery of every source image sample value (even though the result is low compression compared to the lossy modes);

• Hierarchical encoding: the image is encoded at multiple resolutions so that lower-resolution versions may be accessed without first having to decompress the image at its full resolution.

In June 1987, JPEG conducted a selection process based on a blind assessment of subjective picture quality, and narrowed 12 proposed methods to three. Three informal working groups formed to refine them, and in January 1988, a second, more rigorous selection process [19] revealed that the “ADCT” proposal [11], based on the 8x8 DCT, had produced the best picture quality.

At the time of its selection, the DCT-based method was only partially defined for some of the modes of operation. From 1988 through 1990, JPEG undertook the sizable task of defining, documenting, simulating, testing, validating, and simply agreeing on the plethora of details necessary for genuine interoperability and universality. Further history of the JPEG effort is contained in [6, 7, 9, 18].

3 Architecture of the Proposed Standard

The proposed standard contains the four “modes of operation” identified previously. For each mode, one or more distinct codecs are specified. Codecs within a mode differ according to the precision of source image samples they can handle or the entropy coding method they use. Although the word codec (encoder/decoder) is used frequently in this article, there is no requirement that implementations must include both an encoder and a decoder. Many applications will have systems or devices which require only one or the other.

The four modes of operation and their various codecs have resulted from JPEG’s goal of being generic and from the diversity of image formats across applications. The multiple pieces can give the impression of undesirable complexity, but they should actually be regarded as a comprehensive “toolkit” which can span a wide range of continuous-tone image applications. It is unlikely that many implementations will utilize every tool -- indeed, most of the early implementations now on the market (even before final ISO approval) have implemented only the Baseline sequential codec.

The Baseline sequential codec is inherently a rich and sophisticated compression method which will be sufficient for many applications. Getting this minimum JPEG capability implemented properly and interoperably will provide the industry with an important initial capability for exchange of images across vendors and applications.

4 Processing Steps for DCT-Based Coding

Figures 1 and 2 show the key processing steps which are the heart of the DCT-based modes of operation. These figures illustrate the special case of single-component (grayscale) image compression. The reader can grasp the essentials of DCT-based compression by thinking of it as essentially compression of a stream of 8x8 blocks of grayscale image samples. Color image compression can then be approximately regarded as compression of multiple grayscale images, which are either compressed entirely one at a time, or are compressed by alternately interleaving 8x8 sample blocks from each in turn.

For DCT sequential-mode codecs, which include the Baseline sequential codec, the simplified diagrams indicate how single-component compression works in a fairly complete way. Each 8x8 block is input, makes its way through each processing step, and yields output in compressed form into the data stream. For DCT progressive-mode codecs, an image buffer exists prior to the entropy coding step, so that an image can be stored and then parceled out in multiple scans with successively improving quality. For the hierarchical mode of operation, the steps shown are used as building blocks within a larger framework.

4.1 8x8 FDCT and IDCT

At the input to the encoder, source image samples are grouped into 8x8 blocks, shifted from unsigned integers with range [0, 2^P - 1] to signed integers with range [-2^(P-1), 2^(P-1) - 1], and input to the Forward DCT (FDCT). At the output from the decoder, the Inverse DCT (IDCT) outputs 8x8 sample blocks to form the reconstructed image. The following equations are the idealized mathematical definitions of the 8x8 FDCT and 8x8 IDCT:

F(u,v) = (1/4) C(u) C(v) [ Σ_{x=0}^{7} Σ_{y=0}^{7} f(x,y) × cos((2x+1)uπ/16) × cos((2y+1)vπ/16) ]   (1)

f(x,y) = (1/4) [ Σ_{u=0}^{7} Σ_{v=0}^{7} C(u) C(v) F(u,v) × cos((2x+1)uπ/16) × cos((2y+1)vπ/16) ]   (2)

where C(u), C(v) = 1/√2 for u, v = 0; C(u), C(v) = 1 otherwise.

The DCT is related to the Discrete Fourier Transform (DFT). Some simple intuition for DCT-based compression can be obtained by viewing the FDCT as a harmonic analyzer and the IDCT as a harmonic synthesizer. Each 8x8 block of source image samples is effectively a 64-point discrete signal which is a function of the two spatial dimensions x and y. The FDCT takes such a signal as its input and decomposes it into 64 orthogonal basis signals. Each contains one of the 64 unique two-dimensional (2D) “spatial frequencies” which comprise the input signal’s “spectrum.” The output of the FDCT is the set of 64 basis-signal amplitudes or “DCT coefficients” whose values are uniquely determined by the particular 64-point input signal.

The DCT coefficient values can thus be regarded as the relative amount of the 2D spatial frequencies contained in the 64-point input signal. The coefficient with zero frequency in both dimensions is called the “DC coefficient” and the remaining 63 coefficients are called the “AC coefficients.” Because sample values typically vary slowly from point to point across an image, the FDCT processing step lays the foundation for achieving data compression by concentrating most of the signal in the lower spatial frequencies. For a typical 8x8 sample block from a typical source image, most of the spatial frequencies have zero or near-zero amplitude and need not be encoded.

At the decoder the IDCT reverses this processing step. It takes the 64 DCT coefficients (which at that point have been quantized) and reconstructs a 64-point output image signal by summing the basis signals. Mathematically, the DCT is a one-to-one mapping for 64-point vectors between the image and the frequency domains. If the FDCT and IDCT could be computed with perfect accuracy and if the DCT coefficients were not quantized as in the following description, the original 64-point signal could be exactly recovered. In principle, the DCT introduces no loss to the source image samples; it merely transforms them to a domain in which they can be more efficiently encoded.
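As a concrete illustration, equations (1) and (2) can be transcribed directly into code. This is a naive sketch of the idealized definitions (the function names and the use of NumPy are mine, not part of the standard); practical codecs use the fast algorithms discussed below rather than this quadruple loop.

```python
import numpy as np

def fdct_8x8(block):
    """Idealized 8x8 FDCT per Eq. (1); `block` holds level-shifted
    (signed, zero-centered) samples."""
    F = np.zeros((8, 8))
    for u in range(8):
        for v in range(8):
            cu = 1 / np.sqrt(2) if u == 0 else 1.0
            cv = 1 / np.sqrt(2) if v == 0 else 1.0
            s = 0.0
            for x in range(8):
                for y in range(8):
                    s += (block[x, y]
                          * np.cos((2 * x + 1) * u * np.pi / 16)
                          * np.cos((2 * y + 1) * v * np.pi / 16))
            F[u, v] = 0.25 * cu * cv * s
    return F

def idct_8x8(F):
    """Idealized 8x8 IDCT per Eq. (2): sum the 64 cosine basis signals."""
    f = np.zeros((8, 8))
    for x in range(8):
        for y in range(8):
            s = 0.0
            for u in range(8):
                for v in range(8):
                    cu = 1 / np.sqrt(2) if u == 0 else 1.0
                    cv = 1 / np.sqrt(2) if v == 0 else 1.0
                    s += (cu * cv * F[u, v]
                          * np.cos((2 * x + 1) * u * np.pi / 16)
                          * np.cos((2 * y + 1) * v * np.pi / 16))
            f[x, y] = 0.25 * s
    return f

# Level shift for 8-bit samples: [0, 255] -> [-128, 127], then transform.
samples = np.random.default_rng(1).integers(0, 256, (8, 8))
coeffs = fdct_8x8(samples.astype(float) - 128)
```

Note that a constant (flat) block yields only a DC coefficient, and that without quantization the round trip is exact up to floating-point error, matching the invertibility claim above.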

Some properties of practical FDCT and IDCT implementations raise the issue of what precisely should be required by the JPEG standard. A fundamental property is that the FDCT and IDCT equations contain transcendental functions. Consequently, no physical implementation can compute them with perfect accuracy. Because of the DCT’s application importance and its relationship to the DFT, many different algorithms by which the FDCT and IDCT may be approximately computed have been devised [16]. Indeed, research in fast DCT algorithms is ongoing and no single algorithm is optimal for all implementations. What is optimal in software for a general-purpose CPU is unlikely to be optimal in firmware for a programmable DSP and is certain to be suboptimal for dedicated VLSI.

Even in light of the finite precision of the DCT inputs and outputs, independently designed implementations of the very same FDCT or IDCT algorithm which differ even minutely in the precision by which they represent cosine terms or intermediate results, or in the way they sum and round fractional values, will eventually produce slightly different outputs from identical inputs. To preserve freedom for innovation and customization within implementations, JPEG has chosen to specify neither a unique FDCT algorithm nor a unique IDCT algorithm in its proposed standard. This makes compliance somewhat more difficult to confirm, because two compliant encoders (or decoders) generally will not produce identical outputs given identical inputs. The JPEG standard will address this issue by specifying an accuracy test as part of its compliance tests for all DCT-based encoders and decoders; this is to ensure against crudely inaccurate cosine basis functions which would degrade image quality.

Figure 1. DCT-Based Encoder Processing Steps

Figure 2. DCT-Based Decoder Processing Steps (compressed image data → entropy decoder → dequantizer → IDCT → reconstructed image data, with table specifications input to the decoder)

For each DCT-based mode of operation, the JPEG proposal specifies separate codecs for images with 8-bit and 12-bit (per component) source image samples. The 12-bit codecs, needed to accommodate certain types of medical and other images, require greater computational resources to achieve the required FDCT or IDCT accuracy. Images with other sample precisions can usually be accommodated by either an 8-bit or 12-bit codec, but this must be done outside the JPEG standard. For example, it would be the responsibility of an application to decide how to fit or pad a 6-bit sample into the 8-bit encoder’s input interface, how to unpack it at the decoder’s output, and how to encode any necessary related information.

4.2 Quantization

After output from the FDCT, each of the 64 DCT coefficients is uniformly quantized in conjunction with a 64-element Quantization Table, which must be specified by the application (or user) as an input to the encoder. Each element can be any integer value from 1 to 255, which specifies the step size of the quantizer for its corresponding DCT coefficient. The purpose of quantization is to achieve further compression by representing DCT coefficients with no greater precision than is necessary to achieve the desired image quality. Stated another way, the goal of this processing step is to discard information which is not visually significant. Quantization is a many-to-one mapping, and therefore is fundamentally lossy. It is the principal source of lossiness in DCT-based encoders.

Quantization is defined as division of each DCT coefficient by its corresponding quantizer step size, followed by rounding to the nearest integer:

F^Q(u,v) = Integer Round( F(u,v) / Q(u,v) )   (3)
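Equation (3) and its inverse (dequantization) can be sketched in a few lines; `F` is an 8x8 array of DCT coefficients and `Q` the 8x8 quantization table (names are mine). Note that `np.rint` rounds exact halves to the nearest even integer, which is one reasonable reading of "round to nearest"; the standard's exact tie-breaking is not specified here.

```python
import numpy as np

def quantize(F, Q):
    """Eq. (3): divide each coefficient by its step size, round to nearest."""
    return np.rint(F / Q).astype(int)

def dequantize(Fq, Q):
    """Inverse of Eq. (3): remove the normalization by multiplying each
    quantized coefficient back by its step size."""
    return Fq * Q
```

A step size of 16, for example, maps a coefficient of 100 to the integer 6, and dequantization returns 96 rather than 100 - the irreversible information loss described above.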

This output value is normalized by the quantizer step size. Dequantization is the inverse function, which in this case means simply that the normalization is removed by multiplying by the step size, which returns the result to a representation appropriate for input to the IDCT:

F^Q'(u,v) = F^Q(u,v) × Q(u,v)   (4)

When the aim is to compress the image as much as possible without visible artifacts, each step size ideally should be chosen as the perceptual threshold or “just noticeable difference” for the visual contribution of its corresponding cosine basis function. These thresholds are also functions of the source image characteristics, display characteristics, and viewing distance. For applications in which these variables can be reasonably well defined, psychovisual experiments can be performed to determine the best thresholds. The experiment described in [12] has led to a set of Quantization Tables for CCIR-601 [4] images and displays. These have been used experimentally by JPEG members and will appear in the ISO standard as a matter of information, but not as a requirement.

4.3 DC Coding and Zig-Zag Sequence

After quantization, the DC coefficient is treated separately from the 63 AC coefficients. The DC coefficient is a measure of the average value of the 64 image samples. Because there is usually strong correlation between the DC coefficients of adjacent 8x8 blocks, the quantized DC coefficient is encoded as the difference from the DC term of the previous block in the encoding order (defined in the following), as shown in Figure 3. This special treatment is worthwhile, as DC coefficients frequently contain a significant fraction of the total image energy.

Figure 3. Preparation of Quantized Coefficients for Entropy Coding (differential DC encoding: DIFF = DC_i - DC_(i-1); zig-zag sequence over the 8x8 coefficient array from position (0,0) to (7,7))

Finally, all of the quantized coefficients are ordered into the “zig-zag” sequence, also shown in Figure 3. This ordering helps to facilitate entropy coding by placing low-frequency coefficients (which are more likely to be nonzero) before high-frequency coefficients.
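The two preparation steps of Figure 3 can be sketched as follows. The sort-by-anti-diagonal trick is just one compact way to generate the standard zig-zag order; the function names are mine.

```python
def zigzag_order():
    """Return the 64 (u, v) positions of an 8x8 block in zig-zag order:
    coefficients on the same anti-diagonal u+v are grouped together, and
    the traversal direction alternates from one diagonal to the next."""
    return sorted(((u, v) for u in range(8) for v in range(8)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else -p[0]))

def dc_differences(dc_terms):
    """Differential DC coding: DIFF_i = DC_i - DC_(i-1), with the
    predictor initialized to zero at the start of the scan."""
    prev, diffs = 0, []
    for dc in dc_terms:
        diffs.append(dc - prev)
        prev = dc
    return diffs
```

The order begins (0,0), (0,1), (1,0), (2,0), (1,1), (0,2), ... so the DC term comes first and high-frequency coefficients, which are usually zero after quantization, cluster at the end of the sequence.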

4.4 Entropy Coding

The final DCT-based encoder processing step is entropy coding. This step achieves additional compression losslessly by encoding the quantized DCT coefficients more compactly based on their statistical characteristics. The JPEG proposal specifies two entropy coding methods - Huffman coding [8] and arithmetic coding [15]. The Baseline sequential codec uses Huffman coding, but codecs with both methods are specified for all modes of operation.

It is useful to consider entropy coding as a 2-step process. The first step converts the zig-zag sequence of quantized coefficients into an intermediate sequence of symbols. The second step converts the symbols to a data stream in which the symbols no longer have externally identifiable boundaries. The form and definition of the intermediate symbols is dependent on both the DCT-based mode of operation and the entropy coding method.

Huffman coding requires that one or more sets of Huffman code tables be specified by the application. The same tables used to compress an image are needed to decompress it. Huffman tables may be predefined and used within an application as defaults, or computed specifically for a given image in an initial statistics-gathering pass prior to compression. Such choices are the business of the applications which use JPEG; the JPEG proposal specifies no required Huffman tables. Huffman coding for the Baseline sequential encoder is described in detail in section 7.
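For intuition about the statistics-gathering option mentioned above, a generic Huffman-code construction can be sketched with a priority queue. This is the textbook greedy algorithm, not the specific length-limited, canonical table representation the JPEG proposal defines; all names here are mine.

```python
import heapq
from itertools import count

def build_huffman(freqs):
    """Build a prefix code from {symbol: frequency} -> {symbol: bitstring}.
    Classic greedy construction: repeatedly merge the two rarest subtrees,
    prefixing '0' to one side and '1' to the other."""
    tiebreak = count()  # keeps heap comparisons away from unorderable dicts
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]
```

Frequent symbols receive short codes and rare symbols long ones, and no code is a prefix of another, so the decoder can parse the bit stream without symbol boundaries - the property section 4.4 relies on.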

By contrast, the particular arithmetic coding method specified in the JPEG proposal [2] requires no tables to be externally input, because it is able to adapt to the image statistics as it encodes the image. (If desired, statistical conditioning tables can be used as inputs for slightly better efficiency, but this is not required.) Arithmetic coding has produced 5-10% better compression than Huffman for many of the images which JPEG members have tested. However, some feel it is more complex than Huffman coding for certain implementations, for example, the highest-speed hardware implementations. (Throughout JPEG’s history, “complexity” has proved to be most elusive as a practical metric for comparing compression methods.) If the only difference between two JPEG codecs is the entropy coding method, transcoding between the two is possible by simply entropy decoding with one method and entropy recoding with the other.

4.5 Compression and Picture Quality

For color images with moderately complex scenes, all DCT-based modes of operation typically produce the following levels of picture quality for the indicated ranges of compression. These levels are only a guideline - quality and compression can vary significantly according to source image characteristics and scene content. (The units “bits/pixel” here mean the total number of bits in the compressed image - including the chrominance components - divided by the number of samples in the luminance component.)

• 0.25-0.5 bits/pixel: moderate to good quality, sufficient for some applications;

• 0.5-0.75 bits/pixel: good to very good quality, sufficient for many applications;

• 0.75-1.5 bits/pixel: excellent quality, sufficient for most applications;

• 1.5-2.0 bits/pixel: usually indistinguishable from the original, sufficient for the most demanding applications.
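To make the parenthetical definition above concrete, the bits/pixel figure is computed as follows (the image size and byte count in the example are hypothetical):

```python
def bits_per_pixel(compressed_bytes, luma_width, luma_height):
    """Total compressed bits (all components, chrominance included)
    divided by the number of samples in the luminance component."""
    return compressed_bytes * 8 / (luma_width * luma_height)

# e.g. an image with a 640x480 luminance component compressed to
# 28,800 bytes falls in the "excellent quality" band:
bpp = bits_per_pixel(28_800, 640, 480)  # 0.75 bits/pixel
```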

5 Processing Steps for Predictive Lossless Coding

After its selection of a DCT-based method in 1988, JPEG discovered that a DCT-based lossless mode was difficult to define as a practical standard against which encoders and decoders could be independently implemented, without placing severe constraints on both encoder and decoder implementations.

JPEG, to meet its requirement for a lossless mode of operation, has chosen a simple predictive method which is wholly independent of the DCT processing described previously. Selection of this method was not the result of rigorous competitive evaluation as was the DCT-based method. Nevertheless, the JPEG lossless method produces results which, in light of its simplicity, are surprisingly close to the state of the art for lossless continuous-tone compression, as indicated by a recent technical report [5].

Figure 4 shows the main processing steps for a single-component image. A predictor combines the values of up to three neighboring samples (A, B, and C) to form a prediction of the sample indicated by X in Figure 5. This prediction is then subtracted from the actual value of sample X, and the difference is encoded losslessly by either of the entropy coding methods - Huffman or arithmetic. Any one of the eight predictors listed in Table 1 (under “selection-value”) can be used. Selections 1, 2, and 3 are one-dimensional predictors and selections 4, 5, 6, and 7 are two-dimensional predictors. Selection-value 0 can only be used for differential coding in the hierarchical mode of operation. The entropy coding is nearly identical to that used for the DC coefficient as described in section 7.1 (for Huffman coding).

For the lossless mode of operation, two different codecs are specified - one for each entropy coding method. The encoders can use any source image precision from 2 to 16 bits/sample, and can use any of the predictors except selection-value 0. The decoders must handle any of the sample precisions and any of the predictors. Lossless codecs typically produce around 2:1 compression for color images with moderately complex scenes.
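The predictors of Table 1 translate directly into code. I use Python's integer floor division for the halving steps; the standard specifies integer arithmetic, so treat the exact rounding of the divisions here as an assumption of this sketch.

```python
def predict(A, B, C, selection):
    """Table 1 predictors: A = sample to the left, B = sample above,
    C = sample above-left of the current sample X."""
    predictors = {
        1: A,                   # one-dimensional
        2: B,
        3: C,
        4: A + B - C,           # two-dimensional
        5: A + (B - C) // 2,    # // is an assumed integer-divide rounding
        6: B + (A - C) // 2,
        7: (A + B) // 2,
    }
    # selection-value 0 (no prediction) is reserved for differential
    # coding in the hierarchical mode and is omitted here.
    return predictors[selection]
```

The encoder transmits `X - predict(A, B, C, selection)`; the decoder, having already reconstructed A, B, and C exactly, adds the decoded difference back to recover X bit-for-bit.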

Figure 5. 3-Sample Prediction Neighborhood (C and B in the row above, A immediately to the left of the current sample X)

6 Multiple-Component Images

The previous sections discussed the key processing steps of the DCT-based and predictive lossless codecs for the case of single-component source images. These steps accomplish the image data compression. But a good deal of the JPEG proposal is also concerned with the handling and control of color (or other) images with multiple components. JPEG’s aim for a generic compression standard requires its proposal to accommodate a variety of source image formats.

6.1 Source Image Formats

The source image model used in the JPEG proposal is an abstraction from a variety of image types and applications and consists of only what is necessary to compress and reconstruct digital image data. The reader should recognize that the JPEG compressed data format does not encode enough information to serve as a complete image representation. For example, JPEG does not specify or encode any information on pixel aspect ratio, color space, or image acquisition characteristics.

Table 1. Predictors for Lossless Coding

selection-value    prediction
0                  no prediction
1                  A
2                  B
3                  C
4                  A+B-C
5                  A+((B-C)/2)
6                  B+((A-C)/2)
7                  (A+B)/2

Figure 4. Lossless Mode Encoder Processing Steps (source image data → predictor → entropy encoder → compressed image data; table specifications feed the entropy encoder)

Figure 6 illustrates the JPEG source image model. A source image contains from 1 to 255 image components, sometimes called color or spectral bands or channels. Each component consists of a rectangular array of samples. A sample is defined to be an unsigned integer with precision P bits, with any value in the range [0, 2^P - 1]. All samples of all components within the same source image must have the same precision P. P can be 8 or 12 for DCT-based codecs, and 2 to 16 for predictive codecs.

The ith component has sample dimensions x_i by y_i. To accommodate formats in which some image components are sampled at different rates than others, components can have different dimensions. The dimensions must have a mutual integral relationship defined by H_i and V_i, the relative horizontal and vertical sampling factors, which must be specified for each component. Overall image dimensions X and Y are defined as the maximum x_i and y_i for all components in the image, and can be any number up to 2^16. H_i and V_i are allowed only the integer values 1 through 4. The encoded parameters are X, Y, and the H_i and V_i for each component. The decoder reconstructs the dimensions x_i and y_i for each component according to the relationship shown in Equation 5:

x_i = ⌈X × H_i / H_max⌉ and y_i = ⌈Y × V_i / V_max⌉   (5)

where ⌈ ⌉ is the ceiling function.

6.2 Encoding Order and Interleaving

A practical image compression standard must address how systems will need to handle the data during the process of decompression. Many applications need to pipeline the process of displaying or printing multiple-component images in parallel with the process of decompression. For many systems, this is only feasible if the components are interleaved together within the compressed data stream.

To make the same interleaving machinery applicable to both DCT-based and predictive codecs, the JPEG proposal has defined the concept of “data unit.” A data unit is a sample in predictive codecs and an 8x8 block of samples in DCT-based codecs.

The order in which compressed data units are placed in the compressed data stream is a generalization of raster-scan order. Generally, data units are ordered from left-to-right and top-to-bottom according to the orientation shown in Figure 6. (It is the responsibility of applications to define which edges of a source image are top, bottom, left and right.) If an image component is noninterleaved (i.e., compressed without being interleaved with other components), compressed data units are ordered in a pure raster scan as shown in Figure 7.

Figure 6. JPEG Source Image Model: (a) source image with multiple components; (b) characteristics of an image component

Figure 7. Noninterleaved Data Ordering (left-to-right, top-to-bottom raster scan)

When two or more components are interleaved, each component C_i is partitioned into rectangular regions of H_i by V_i data units, as shown in the generalized example of Figure 8. Regions are ordered within a component from left-to-right and top-to-bottom, and within a region, data units are ordered from left-to-right and top-to-bottom. The JPEG proposal defines the term Minimum Coded Unit (MCU) to be the smallest group of interleaved data units. For the example shown, MCU_1 consists of data units taken first from the top-left-most region of C_1, followed by data units from the same region of C_2, and likewise for C_3 and C_4. MCU_2 continues the pattern as shown.

Thus, interleaved data is an ordered sequence of MCUs, and the number of data units contained in an MCU is determined by the number of components interleaved and their relative sampling factors. The maximum number of components which can be interleaved is 4 and the maximum number of data units in an MCU is 10. The latter restriction is expressed as shown in Equation 6, where the summation is over the interleaved components:

Because of this restriction, not every combination of 4 components which can be represented in noninterleaved order within a JPEG-compressed image is allowed to be interleaved. Also, note that the JPEG proposal allows some components to be interleaved and some to be noninterleaved within the same compressed image.

∑ H_i × V_i ≤ 10, summed over all i in the interleave   (6)
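Equations (5) and (6) can be sketched as a pair of helper functions; the 4:2:0-style sampling factors in the example are illustrative, not taken from the text.

```python
from math import ceil

def component_dims(X, Y, factors):
    """Eq. (5): per-component dimensions (x_i, y_i) from overall size
    (X, Y) and relative sampling factors [(H_i, V_i), ...]."""
    h_max = max(h for h, _ in factors)
    v_max = max(v for _, v in factors)
    return [(ceil(X * h / h_max), ceil(Y * v / v_max)) for h, v in factors]

def mcu_interleavable(factors):
    """Eq. (6) plus the component-count limit: at most 4 components,
    and the data units per MCU (sum of H_i * V_i) must not exceed 10."""
    return len(factors) <= 4 and sum(h * v for h, v in factors) <= 10

# Luminance at full resolution, two chrominance components at half
# resolution in both directions:
factors = [(2, 2), (1, 1), (1, 1)]
dims = component_dims(720, 480, factors)  # [(720, 480), (360, 240), (360, 240)]
ok = mcu_interleavable(factors)           # True: 4 + 1 + 1 = 6 <= 10
```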

6.3 Multiple Tables

In addition to the interleaving control discussed previously, JPEG codecs must control application of the proper table data to the proper components. The same quantization table and the same entropy coding table (or set of tables) must be used to encode all samples within a component.

JPEG decoders can store up to 4 different quantization tables and up to 4 different (sets of) entropy coding tables simultaneously. (The Baseline sequential decoder is the exception; it can only store up to 2 sets of entropy coding tables.) This is necessary for switching between different tables during decompression of a scan containing multiple (interleaved) components, in order to apply the proper table to the proper component. (Tables cannot be loaded during decompression of a scan.) Figure 9 illustrates the table-switching control that must be managed in conjunction with multiple-component interleaving for the encoder side. (This simplified view does not distinguish between quantization and entropy coding tables.)

Cs1: H1=2, V1=2    Cs2: H2=2, V2=1    Cs3: H3=1, V3=2    Cs4: H4=1, V4=1

MCU1 = d(1)00 d(1)01 d(1)10 d(1)11  d(2)00 d(2)01  d(3)00 d(3)10  d(4)00
MCU2 = d(1)02 d(1)03 d(1)12 d(1)13  d(2)02 d(2)03  d(3)01 d(3)11  d(4)01
MCU3 = d(1)04 d(1)05 d(1)14 d(1)15  d(2)04 d(2)05  d(3)02 d(3)12  d(4)02
MCU4 = d(1)20 d(1)21 d(1)30 d(1)31  d(2)10 d(2)11  d(3)20 d(3)30  d(4)10

where d(k)rc denotes the data unit in block-row r and block-column c of component Csk.

Figure 8. Generalized Interleaved Data Ordering Example
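The ordering shown in Figure 8 follows mechanically from the sampling factors. The sketch below (the function name and the (component, row, column) tuple convention are ours, not the standard's) generates the data units of one MCU:

```python
def mcu(factors, mcu_row, mcu_col):
    """List the data units of one MCU.

    factors: list of (H, V) sampling-factor pairs, one per component.
    For each component k (1-based), the MCU contains an H x V region of
    data units, taken row-major from the component's data-unit array.
    Each unit is identified as a (component, row, column) triple.
    """
    units = []
    for k, (h, v) in enumerate(factors, start=1):
        for r in range(v):
            for c in range(h):
                units.append((k, mcu_row * v + r, mcu_col * h + c))
    return units

factors = [(2, 2), (2, 1), (1, 2), (1, 1)]   # Cs1..Cs4 of Figure 8
print(mcu(factors, 0, 0))   # MCU1: 4 + 2 + 2 + 1 = 9 data units
```

Called with `mcu_row=0, mcu_col=1` this reproduces MCU2, and with `mcu_row=1, mcu_col=0` it reproduces MCU4 of the figure.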

7 Baseline and Other DCT Sequential Codecs

The DCT sequential mode of operation consists of the FDCT and Quantization steps from section 4, and the multiple-component control from section 6.3. In addition to the Baseline sequential codec, other DCT sequential codecs are defined to accommodate the two different sample precisions (8 and 12 bits) and the two different types of entropy coding methods (Huffman and arithmetic).

Baseline sequential coding is for images with 8-bit samples and uses Huffman coding only. It also differs from the other sequential DCT codecs in that its decoder can store only two sets of Huffman tables (one AC table and one DC table per set). This restriction means that, for images with three or four interleaved components, at least one set of Huffman tables must be shared by two components. This restriction poses no limitation at all for noninterleaved components; a new set of tables can be loaded into the decoder before decompression of a noninterleaved component begins.

For many applications which do need to interleave three color components, this restriction is hardly a limitation at all. Color spaces (YUV, CIELUV, CIELAB, and others) which represent the chromatic ("color") information in two components and the achromatic ("grayscale") information in a third are more efficient for compression than spaces like RGB. One Huffman table set can be used for the achromatic component and one for the chrominance components. DCT coefficient statistics are similar for the chrominance components of most images, and one set of Huffman tables can encode both almost as optimally as two.

The committee also felt that early availability of single-chip implementations at commodity prices would encourage early acceptance of the JPEG proposal in a variety of applications. In 1988 when Baseline sequential was defined, the committee's VLSI experts felt that current technology made the feasibility of crowding four sets of loadable Huffman tables - in addition to four sets of Quantization tables - onto a single commodity-priced codec chip a risky proposition.

Figure 9. Component-Interleave and Table-Switching Control. (The figure shows source image data for components A, B, and C entering the encoding process, which switches between table specifications Spec. 1 and Spec. 2 to produce the compressed image data.)

The FDCT, Quantization, DC differencing, and zig-zag ordering processing steps for the Baseline sequential codec proceed just as described in section 4. Prior to entropy coding, there usually are few nonzero and many zero-valued coefficients. The task of entropy coding is to encode these few coefficients efficiently. The description of Baseline sequential entropy coding is given in two steps: conversion of the quantized DCT coefficients into an intermediate sequence of symbols and assignment of variable-length codes to the symbols.

7.1 Intermediate Entropy Coding Representations

In the intermediate symbol sequence, each nonzero AC coefficient is represented in combination with the "runlength" (consecutive number) of zero-valued AC coefficients which precede it in the zig-zag sequence. Each such runlength/nonzero-coefficient combination is (usually) represented by a pair of symbols:

symbol-1: (RUNLENGTH, SIZE)
symbol-2: (AMPLITUDE)

Symbol-1 represents two pieces of information, RUNLENGTH and SIZE. Symbol-2 represents the single piece of information designated AMPLITUDE, which is simply the amplitude of the nonzero AC coefficient. RUNLENGTH is the number of consecutive zero-valued AC coefficients in the zig-zag sequence preceding the nonzero AC coefficient being represented. SIZE is the number of bits used to encode AMPLITUDE - that is, to encode symbol-2 - by the signed-integer encoding used with JPEG's particular method of Huffman coding.

RUNLENGTH represents zero-runs of length 0 to 15. Actual zero-runs in the zig-zag sequence can be greater than 15, so the symbol-1 value (15, 0) is interpreted as the extension symbol with runlength=16. There can be up to three consecutive (15, 0) extensions before the terminating symbol-1 whose RUNLENGTH value completes the actual runlength. The terminating symbol-1 is always followed by a single symbol-2, except for the case in which the last run of zeros includes the last (63rd) AC coefficient. In this frequent case, the special symbol-1 value (0,0) means EOB (end of block), and can be viewed as an "escape" symbol which terminates the 8x8 sample block.

Thus, for each 8x8 block of samples, the zig-zag sequence of 63 quantized AC coefficients is represented as a sequence of symbol-1, symbol-2 symbol pairs, though each "pair" can have repetitions of symbol-1 in the case of a long run-length, or only one symbol-1 in the case of an EOB.
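This symbol mapping can be made concrete with a short sketch (the function name and tuple layout are ours; a real encoder works on the zig-zag sequence produced by the steps of section 4, and would also enforce the limit of three consecutive ZRL extensions):

```python
def ac_to_symbols(ac):
    """Convert 63 zig-zag-ordered quantized AC coefficients into the
    intermediate symbol sequence.  Each entry is (symbol1, symbol2):
    symbol1 = (RUNLENGTH, SIZE), symbol2 = AMPLITUDE, with None for the
    ZRL extension (15, 0) and the EOB marker (0, 0)."""
    symbols = []
    run = 0
    # index of the last nonzero coefficient; everything after it is EOB
    last = max((i for i, c in enumerate(ac) if c != 0), default=-1)
    for c in ac[:last + 1]:
        if c == 0:
            run += 1
            continue
        while run > 15:                       # runs longer than 15 need ZRL
            symbols.append(((15, 0), None))
            run -= 16
        size = abs(c).bit_length()            # bits needed for AMPLITUDE
        symbols.append(((run, size), c))
        run = 0
    if last < 62:                             # trailing zeros -> EOB
        symbols.append(((0, 0), None))
    return symbols

# The zig-zag AC sequence of the worked example in section 7.3:
ac = [0, -2, -1, -1, -1, 0, 0, -1] + [0] * 55
print(ac_to_symbols(ac))
# [((1, 2), -2), ((0, 1), -1), ((0, 1), -1), ((0, 1), -1), ((2, 1), -1), ((0, 0), None)]
```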

The possible range of quantized AC coefficients determines the range of values which both the AMPLITUDE and the SIZE information must represent. A numerical analysis of the 8x8 FDCT equation shows that, if the 64-point (8x8 block) input signal contains N-bit integers, then the nonfractional part of the output numbers (DCT coefficients) can grow by at most 3 bits. This is also the largest possible size of a quantized DCT coefficient when its quantizer step size has integer value 1.

Baseline sequential has 8-bit integer source samples in the range [-2^7, 2^7-1], so quantized AC coefficient amplitudes are covered by integers in the range [-2^10, 2^10-1]. The signed-integer encoding uses symbol-2 AMPLITUDE codes of 1 to 10 bits in length (so SIZE also represents values from 1 to 10), and RUNLENGTH represents values from 0 to 15 as discussed previously. For AC coefficients, the structure of the symbol-1 and symbol-2 intermediate representations is illustrated in Tables 2 and 3, respectively.

The intermediate representation for an 8x8 sample block’s differential DC coefficient is structured similarly. Symbol-1, however, represents only SIZE information; symbol-2 represents AMPLITUDE information as before:

symbol-1: (SIZE)
symbol-2: (AMPLITUDE)

Because the DC coefficient is differentially encoded, it is covered by twice as many integer values, [-2^11, 2^11-1], as the AC coefficients, so one additional level must be added to the bottom of Table 3 for DC coefficients. Symbol-1 for DC coefficients thus represents a value from 1 to 11.

Table 2. Baseline Huffman Coding Symbol-1 Structure

                        SIZE
                0     1  2  ...  9  10
RUN-       0   EOB
LENGTH    ...         RUN-SIZE values
          15   ZRL

(EOB = (0,0); ZRL = (15,0); entries with SIZE=0 and RUNLENGTH 1 to 14 are unused; all other entries are the ordinary RUN-SIZE values.)

7.2 Variable-Length Entropy Coding

Once the quantized coefficient data for an 8x8 block is represented in the intermediate symbol sequence described above, variable-length codes are assigned. For each 8x8 block, the DC coefficient's symbol-1 and symbol-2 representation is coded and output first. For both DC and AC coefficients, each symbol-1 is encoded with a variable-length code (VLC) from the Huffman table set assigned to the 8x8 block's image component. Each symbol-2 is encoded with a "variable-length integer" (VLI) code whose length in bits is given in Table 3. VLCs and VLIs both are codes with variable lengths, but VLIs are not Huffman codes. An important distinction is that the length of a VLC (Huffman code) is not known until it is decoded, but the length of a VLI is stored in its preceding VLC.

Huffman codes (VLCs) must be specified externally as an input to JPEG encoders. (Note that the form in which Huffman tables are represented in the data stream is an indirect specification with which the decoder must construct the tables themselves prior to decompression.) The JPEG proposal includes an example set of Huffman tables in its information annex, but because they are application-specific, it specifies none for required use. The VLI codes, in contrast, are "hardwired" into the proposal. This is appropriate, because the VLI codes are far more numerous, can be computed rather than stored, and have not been shown to be appreciably more efficient when implemented as Huffman codes.

7.3 Baseline Encoding Example

This section gives an example of Baseline compression and encoding of a single 8x8 sample block. Note that a good deal of the operation of a complete JPEG Baseline encoder is omitted here, including creation of Interchange Format information (parameters, headers, quantization and Huffman tables), byte-stuffing, padding to byte-boundaries prior to a marker code, and other key operations. Nonetheless, this example should help to make concrete much of the foregoing explanation.

Figure 10(a) is an 8x8 block of 8-bit samples, arbitrarily extracted from a real image. The small variations from sample to sample indicate the predominance of low spatial frequencies. After subtracting 128 from each sample for the required level-shift, the 8x8 block is input to the FDCT, equation (1). Figure 10(b) shows (to one decimal place) the resulting DCT coefficients. Except for a few of the lowest frequency coefficients, the amplitudes are quite small.

Figure 10(c) is the example quantization table for luminance (grayscale) components included in the informational annex of the draft JPEG standard Part 1 [2]. Figure 10(d) shows the quantized DCT coefficients, normalized by their quantization table entries, as specified by equation (3). At the decoder these numbers are "denormalized" according to equation (4), producing the coefficients of Figure 10(e), which are input to the IDCT, equation (2). Finally, Figure 10(f) shows the reconstructed sample values, remarkably similar to the originals in 10(a). Of course, the numbers in Figure 10(d) must be Huffman-encoded before transmission to the decoder. The first number of the block to be encoded is the DC term, which must be differentially encoded. If the quantized DC term of the previous block is, for example, 12, then the difference is +3. Thus, the intermediate representation is (2)(3), for SIZE=2 and AMPLITUDE=3.
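The normalization and denormalization of equations (3) and (4) amount to a rounded division and a multiplication by the table entry. A minimal sketch (function names are ours, and Python's `round` stands in for the nearest-integer rounding of equation (3)):

```python
def quantize(coef, q):
    """Equation (3): normalize a DCT coefficient by its quantization
    step size and round to the nearest integer."""
    return round(coef / q)

def dequantize(level, q):
    """Equation (4): denormalize by multiplying by the step size."""
    return level * q

# Top-left entries of Figure 10: DC coefficient 235.6, step size 16
print(quantize(235.6, 16))    # 15
print(dequantize(15, 16))     # 240
print(quantize(-22.6, 12))    # -2
```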

Next, the quantized AC coefficients are encoded. Following the zig-zag order, the first non-zero coefficient is -2, preceded by a zero-run of 1. This yields an intermediate representation of (1,2)(-2). Next encountered in the zig-zag order are three consecutive non-zeros of amplitude -1. This means each is preceded by a zero-run of length zero, for intermediate symbols (0,1)(-1). The last non-zero coefficient is -1 preceded by two zeros, for (2,1)(-1). Because this is the last non-zero coefficient, the final symbol representing this 8x8 block is EOB, or (0,0). Thus, the intermediate sequence of symbols for this example 8x8 block is:

(2)(3), (1,2)(-2), (0,1)(-1), (0,1)(-1),

(0,1)(-1), (2,1)(-1), (0,0)

Next the codes themselves must be assigned. For this example, the VLCs (Huffman codes) from the informational annex of [2] will be used. The differential-DC VLC for this example is:

(2) 011

The AC luminance VLCs for this example are:

(0,0) 1010
(0,1) 00
(1,2) 11011
(2,1) 11100

(a) source image samples

139 144 149 153 155 155 155 155
144 151 153 156 159 156 156 156
150 155 160 163 158 156 156 156
159 161 162 160 160 159 159 159
159 160 161 162 162 155 155 155
161 161 161 161 160 157 157 157
162 162 161 163 162 157 157 157
162 162 161 161 163 158 158 158

(b) forward DCT coefficients

235.6  -1.0 -12.1  -5.2   2.1  -1.7  -2.7   1.3
-22.6 -17.5  -6.2  -3.2  -2.9  -0.1   0.4  -1.2
-10.9  -9.3  -1.6   1.5   0.2  -0.9  -0.6  -0.1
 -7.1  -1.9   0.2   1.5   0.9  -0.1   0.0   0.3
 -0.6  -0.8   1.5   1.6  -0.1  -0.7   0.6   1.3
  1.8  -0.2   1.6  -0.3  -0.8   1.5   1.0  -1.0
 -1.3  -0.4  -0.3  -1.5  -0.5   1.7   1.1  -0.8
 -2.6   1.6  -3.8  -1.8   1.9   1.2  -0.6  -0.4

(c) quantization table

16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99

(d) normalized quantized coefficients

15  0 -1  0  0  0  0  0
-2 -1  0  0  0  0  0  0
-1 -1  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0

(e) denormalized quantized coefficients

240   0 -10   0   0   0   0   0
-24 -12   0   0   0   0   0   0
-14 -13   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0

(f) reconstructed image samples

144 146 149 152 154 156 156 156
148 150 152 154 156 156 156 156
155 156 157 158 158 157 156 155
160 161 161 162 161 159 157 155
163 163 164 163 162 160 158 156
163 164 164 164 162 160 158 157
160 161 162 162 162 161 159 158
158 159 161 161 162 161 159 158

Figure 10. DCT and Quantization Examples

The VLIs specified in [2] are related to the two’s complement representation. They are:

(3)  11
(-2) 01
(-1) 0

Thus, the bit-stream for this 8x8 example block is as follows. Note that 31 bits are required to represent 64 coefficients, which achieves compression of just under 0.5 bits/sample:

0111111011010000000001110001010
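Putting the pieces together, the 31-bit stream above can be reproduced with a short sketch. The VLC table entries are taken from the example codes listed above; `encode_vli` and the table names are ours, and the VLI rule used (negative amplitudes coded as the low-order bits of amplitude + 2^SIZE - 1) is the one that reproduces the three VLI examples listed:

```python
def encode_vli(amplitude):
    """Variable-length integer for a nonzero amplitude: SIZE bits.
    Positive values are coded directly; negative values are coded as
    amplitude + 2**SIZE - 1."""
    size = abs(amplitude).bit_length()
    value = amplitude if amplitude > 0 else amplitude + (1 << size) - 1
    return format(value, '0{}b'.format(size))

# Example VLCs from the informational annex of [2], as listed above
dc_vlc = {2: '011'}
ac_vlc = {(0, 0): '1010', (0, 1): '00', (1, 2): '11011', (2, 1): '11100'}

bits = dc_vlc[2] + encode_vli(3)              # differential DC: (2)(3)
for sym1, amp in [((1, 2), -2), ((0, 1), -1), ((0, 1), -1),
                  ((0, 1), -1), ((2, 1), -1)]:
    bits += ac_vlc[sym1] + encode_vli(amp)    # AC runlength/amplitude pairs
bits += ac_vlc[(0, 0)]                        # EOB
print(bits)   # 0111111011010000000001110001010 (31 bits)
```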

7.4 Other DCT Sequential Codecs

The structure of the 12-bit DCT sequential codec with Huffman coding is a straightforward extension of the entropy coding method described previously. Quantized DCT coefficients can be 4 bits larger, so the SIZE and AMPLITUDE information extend accordingly. DCT sequential with arithmetic coding is described in detail in [2].

8 DCT Progressive Mode

The DCT progressive mode of operation consists of the same FDCT and Quantization steps (from section 4) that are used by DCT sequential mode. The key difference is that each image component is encoded in multiple scans rather than in a single scan. The first scan(s) encode a rough but recognizable version of the image, which can be transmitted quickly in comparison to the total transmission time and is then refined by succeeding scans until reaching the level of picture quality established by the quantization tables.

To achieve this requires the addition of an image-sized buffer memory at the output of the quantizer, before the input to the entropy encoder. The buffer memory must be of sufficient size to store the image as quantized DCT coefficients, each of which (if stored straightforwardly) is 3 bits larger than the source image samples. After each block of DCT coefficients is quantized, it is stored in the coefficient buffer memory. The buffered coefficients are then partially encoded in each of multiple scans.

There are two complementary methods by which a block of quantized DCT coefficients may be partially encoded. First, only a specified "band" of coefficients from the zig-zag sequence need be encoded within a given scan. This procedure is called "spectral selection," because each band typically contains coefficients which occupy a lower or higher part of the spatial-frequency spectrum for that 8x8 block. Secondly, the coefficients within the current band need not be encoded to their full (quantized) accuracy in a given scan. Upon a coefficient's first encoding, the N most significant bits can be encoded first, where N is specifiable. In subsequent scans, the less significant bits can then be encoded. This procedure is called "successive approximation." Both procedures can be used separately, or mixed in flexible combinations.

Some intuition for spectral selection and successive approximation can be obtained from Figure 11. The quantized DCT coefficient information can be viewed as a rectangle for which the axes are the DCT coefficients (in zig-zag order) and their amplitudes. Spectral selection slices the information in one dimension and successive approximation in the other.
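The two partial-encoding procedures can be sketched as simple selections on the buffered coefficients. This is a conceptual sketch only; the actual scan syntax and entropy coding are defined in [2], and the helper names are ours:

```python
def spectral_band(zigzag_coefs, start, end):
    """Spectral selection: a scan encodes only coefficients start..end
    (inclusive) of the zig-zag sequence."""
    return zigzag_coefs[start:end + 1]

def approximation(coef, low_bits):
    """Successive approximation: encode the coefficient with its
    `low_bits` least significant magnitude bits dropped; later scans
    refine those bits."""
    sign = -1 if coef < 0 else 1
    return sign * (abs(coef) >> low_bits)

# Zig-zag sequence of the example block of section 7.3 (DC first):
block = [15, 0, -2, -1, -1, -1, 0, 0, -1] + [0] * 55
print(spectral_band(block, 1, 5))                 # [0, -2, -1, -1, -1]
print([approximation(c, 1) for c in block[:6]])   # [7, 0, -1, 0, 0, 0]
```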

9 Hierarchical Mode of Operation

The hierarchical mode provides a "pyramidal" encoding of an image at multiple resolutions, each differing in resolution from its adjacent encoding by a factor of two in either the horizontal or vertical dimension or both. The encoding procedure can be summarized as follows:

1) Filter and down-sample the original image by the desired number of multiples of 2 in each dimension.

2) Encode this reduced-size image using one of the sequential DCT, progressive DCT, or lossless encoders described previously.

3) Decode this reduced-size image and then interpolate and up-sample it by 2 horizontally and/or vertically, using the identical interpolation filter which the receiver must use.

4) Use this up-sampled image as a prediction of the original at this resolution, and encode the difference image using one of the sequential DCT, progressive DCT, or lossless encoders described previously.

5) Repeat steps 3) and 4) until the full resolution of the image has been encoded.

The encoding in steps 2) and 4) must be done using only DCT-based processes, only lossless processes, or DCT-based processes with a final lossless process for each component.

Table 3. Baseline Entropy Coding Symbol-2 Structure

SIZE   AMPLITUDE
  1    -1, 1
  2    -3..-2, 2..3
  3    -7..-4, 4..7
  4    -15..-8, 8..15
  5    -31..-16, 16..31
  6    -63..-32, 32..63
  7    -127..-64, 64..127
  8    -255..-128, 128..255
  9    -511..-256, 256..511
 10    -1023..-512, 512..1023

Hierarchical encoding is useful in applications in which a very high resolution image must be accessed by a lower-resolution display. An example is an image scanned and compressed at high resolution for a very high-quality printer, where the image must also be displayed on a low-resolution PC video screen.
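The pyramid loop of steps 1) through 5) can be outlined as follows. This is a structural sketch only: the down-sampling filter, interpolation filter, and codecs are stand-ins supplied by the caller, a 1-D list of samples stands in for an image, and `hierarchical_encode` is our name, not the standard's:

```python
def hierarchical_encode(image, levels, downsample2x, upsample2x, encode, decode):
    """Encode `image` as a pyramid of `levels` resolutions (steps 1-5)."""
    # Step 1: build the reduced-resolution versions
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample2x(pyramid[-1]))

    scans = [encode(pyramid[-1])]        # step 2: encode the smallest image
    reference = decode(scans[0])         # track the decoder's reconstruction
    for original in reversed(pyramid[:-1]):
        prediction = upsample2x(reference)                  # step 3
        diff = [o - p for o, p in zip(original, prediction)]
        scans.append(encode(diff))                          # step 4
        # next reference = prediction plus the decoded difference (step 5)
        reference = [p + d for p, d in zip(prediction, decode(scans[-1]))]
    return scans

# A tiny 1-D "image" with a lossless (identity) codec at both levels:
scans = hierarchical_encode(
    [10, 12, 14, 16], 2,
    downsample2x=lambda im: im[::2],
    upsample2x=lambda im: [x for v in im for x in (v, v)],
    encode=lambda d: d, decode=lambda d: d)
print(scans)   # [[10, 14], [0, 2, 0, 2]]
```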

Figure 11. Spectral Selection and Successive Approximation Methods of Progressive Encoding. (The figure contrasts (b) sequential encoding, in which all DCT coefficients of a block are sent in a single scan, with the two progressive procedures: (c) spectral selection, in which the 1st, 2nd, 3rd, ..., nth scans each encode a different band of the DCT coefficients in zig-zag order, and (d) successive approximation, in which the first scan encodes the most significant bits of the coefficients and subsequent scans encode progressively less significant bits down to the LSB.)

10 Other Aspects of the JPEG Proposal

Some key aspects of the proposed standard can only be mentioned briefly. Foremost among these are points concerning the coded representation for compressed image data specified in addition to the encoding and decoding procedures.

Most importantly, an interchange format syntax is specified which ensures that a JPEG-compressed image can be exchanged successfully between different application environments. The format is structured in a consistent way for all modes of operation. The interchange format always includes all quantization and entropy-coding tables which were used to compress the image.

Applications (and application-specific standards) are the “users” of the JPEG standard. The JPEG standard imposes no requirement that, within an application’s environment, all or even any tables must be encoded with the compressed image data during storage or transmission. This leaves applications the freedom to specify default or referenced tables if they are considered appropriate. It also leaves them the responsibility to ensure that JPEG-compliant decoders used within their environment get loaded with the proper tables at the proper times, and that the proper tables are included in the interchange format when a compressed image is “exported” outside the application.

Some of the important applications that are already in the process of adopting JPEG compression or have stated their interest in doing so are Adobe’s PostScript language for printing systems [1], the Raster Content portion of the ISO Office Document Architecture and Interchange Format [13], the future CCITT color facsimile standard, and the European ETSI videotext standard [10].

11 Standardization Schedule

JPEG's ISO standard will be divided into two parts. Part 1 [2] will specify the four modes of operation, the different codecs specified for those modes, and the interchange format. It will also contain a substantial informational section on implementation guidelines. Part 2 [3] will specify the compliance tests which will determine whether an encoder implementation, a decoder implementation, or a JPEG-compressed image in interchange format comply with the Part 1 specifications. In addition to the ISO documents referenced, the JPEG standard will also be issued as CCITT Recommendation T.81.

There are two key balloting phases in the ISO standardization process: a Committee Draft (CD) is balloted to determine promotion to Draft International Standard (DIS), and a DIS is balloted to determine promotion to International Standard (IS).

A CD ballot requires four to six months of processing, and a DIS ballot requires six to nine months of processing. JPEG’s Part 1 began DIS ballot in November 1991, and Part 2 began CD ballot in December 1991.

Though there is no guarantee that the first ballot of each phase will result in promotion to the next, JPEG achieved promotion of CD Part 1 to DIS Part 1 in the first ballot. Moreover, JPEG’s DIS Part 1 has undergone no technical changes (other than some minor corrections) since JPEG’s final Working Draft (WD) [14]. Thus, Part 1 has remained unchanged from the final WD, through CD, and into DIS. If all goes well, Part 1 should receive final approval as an IS in mid-1992, with Part 2 getting final IS approval about nine months later.

12 Conclusions

The emerging JPEG continuous-tone image compression standard is not a panacea that will solve the myriad issues which must be addressed before digital images will be fully integrated within all the applications that will ultimately benefit from them. For example, if two applications cannot exchange uncompressed images because they use incompatible color spaces, aspect ratios, dimensions, etc., then a common compression method will not help. However, a great many applications are "stuck" because of storage or transmission costs, because of argument over which (nonstandard) compression method to use, or because VLSI codecs are too expensive due to low volumes. For these applications, the thorough technical evaluation, testing, selection, validation, and documentation work which JPEG committee members have performed is expected to soon yield an approved international standard that will withstand the tests of quality and time. As diverse imaging applications become increasingly implemented on open networked computing systems, the ultimate measure of the committee's success will be when JPEG-compressed digital images come to be regarded and even taken for granted as "just another data type," as text and graphics are today.

For more information

Information on how to obtain the ISO JPEG (draft) standards can be obtained by writing the author at the following address:

Digital Equipment Corporation

146 Main Street, ML01-2/U44

Maynard, MA 01754-2571


Floppy disks containing uncompressed, compressed, and reconstructed data for the purpose of informally validating whether an encoder or decoder implementation conforms to the proposed standard are available. Thanks to the following JPEG committee member and his company who have agreed to provide these for a nominal fee on behalf of the committee until arrangements can be made for ISO to provide them:

Eric Hamilton

C-Cube Microsystems

1778 McCarthy Blvd.

Milpitas, CA 95035

Acknowledgments

The following longtime JPEG core members have spent untold hours (usually in addition to their "real jobs") to make this collaborative international effort succeed. Each has made specific substantive contributions to the JPEG proposal: Aharon Gill (Zoran, Israel), Eric Hamilton (C-Cube, USA), Alain Leger (CCETT, France), Adriaan Ligtenberg (Storm, USA), Herbert Lohscheller (ANT, Germany), Joan Mitchell (IBM, USA), Michael Nier (Kodak, USA), Takao Omachi (NEC, Japan), William Pennebaker (IBM, USA), Henning Poulsen (KTAS, Denmark), and Jorgen Vaaben (AutoGraph, Denmark). Also acknowledged are the leadership efforts of Hiroshi Yasuda (NTT, Japan), the Convenor of JTC1/SC2/WG8 from which JPEG was spawned, Istvan Sebestyen (Siemens, Germany), the Special Rapporteur from CCITT SGVIII, and Graham Hudson (British Telecom, U.K.), former JPEG chair and founder of the effort which became JPEG. The author regrets that space does not permit recognition of the many other individuals who contributed to JPEG's work.

Thanks to Majid Rabbani of Eastman Kodak for providing the example in section 7.3.

The author's role within JPEG has been supported in a great number of ways by Digital Equipment Corporation.

References

1. Adobe Systems Inc. PostScript Language Reference Manual. Second Ed. Addison Wesley, Menlo Park, Calif., 1990.

2. Digital Compression and Coding of Continuous-tone Still Images, Part 1, Requirements and Guidelines. ISO/IEC JTC1 Draft International Standard 10918-1, Nov. 1991.

3. Digital Compression and Coding of Continuous-tone Still Images, Part 2, Compliance Testing. ISO/IEC JTC1 Committee Draft 10918-2, Dec. 1991.

4. Encoding parameters of digital television for studios. CCIR Recommendations, Recommendation 601, 1982.

5. Howard, P.G., and Vitter, J.S. New methods for lossless image compression using arithmetic coding. Brown University Dept. of Computer Science Tech. Report No. CS-91-47, Aug. 1991.

6. Hudson, G.P. The development of photographic videotex in the UK. In Proceedings of the IEEE Global Telecommunications Conference, IEEE Communications Society, 1983, pp. 319-322.

7. Hudson, G.P., Yasuda, H., and Sebestyén, I. The international standardization of a still picture compression technique. In Proceedings of the IEEE Global Telecommunications Conference, IEEE Communications Society, Nov. 1988, pp. 1016-1021.

8. Huffman, D.A. A method for the construction of minimum redundancy codes. In Proceedings IRE, vol. 40, 1952, pp. 1098-1101.

9. Léger, A. Implementations of fast discrete cosine transform for full color videotex services and terminals. In Proceedings of the IEEE Global Telecommunications Conference, IEEE Communications Society, 1984, pp. 333-337.

10. Léger, A., Omachi, T., and Wallace, G. The JPEG still picture compression algorithm. In Optical Engineering, vol. 30, no. 7 (July 1991), pp. 947-954.

11. Léger, A., Mitchell, M., and Yamazaki, Y. Still picture compression algorithms evaluated for international standardization. In Proceedings of the IEEE Global Telecommunications Conference, IEEE Communications Society, Nov. 1988, pp. 1028-1032.

12. Lohscheller, H. A subjectively adapted image communication system. IEEE Trans. Commun. COM-32 (Dec. 1984), pp. 1316-1322.

13. Office Document Architecture (ODA) and Interchange Format, Part 7: Raster Graphics Content Architectures. ISO/IEC JTC1 International Standard 8613-7.

14. Pennebaker, W.B. JPEG Tech. Specification, Revision 8. Informal Working Paper JPEG-8-R8, Aug. 1990.

15. Pennebaker, W.B., Mitchell, J.L., et al. Arithmetic coding articles. IBM J. Res. Dev., vol. 32, no. 6 (Nov. 1988), pp. 717-774.

16. Rao, K.R., and Yip, P. Discrete Cosine Transform: Algorithms, Advantages, Applications. Academic Press, London, 1990.

17. Standardization of Group 3 facsimile apparatus for document transmission. CCITT Recommendations, Fascicle VII.2, Recommendation T.4, 1980.

18. Wallace, G.K. Overview of the JPEG (ISO/CCITT) still image compression standard. Image Processing Algorithms and Techniques. In Proceedings of the SPIE, vol. 1244 (Feb. 1990), pp. 220-233.

19. Wallace, G., Vivian, R., and Poulsen, H. Subjective testing results for still picture compression algorithms for international standardization. In Proceedings of the IEEE Global Telecommunications Conference, IEEE Communications Society, Nov. 1988, pp. 1022-1027.

Biography

Gregory K. Wallace is currently Manager of Multimedia Engineering, Advanced Development, at Digital Equipment Corporation. Since 1988 he has served as Chair of the JPEG committee (ISO/IEC JTC1/SC2/WG10). For the past five years at DEC, he has worked on efficient software and hardware implementations of image compression and processing algorithms for incorporation in general-purpose computing systems. He received the BSEE and MSEE from Stanford University in 1977 and 1979. His current research interests are the integration of robust real-time multimedia capabilities into networked computing systems.

bmp文件格式详解

b m p文件格式详解 Company Document number:WTUT-WT88Y-W8BBGB-BWYTT-19998

BMP文件格式,又称为Bitmap(位图)或是DIB(Device-IndependentDevice,设备无关位图),是Windows系统中广泛使用的图像文件格式。由于它可以不作任何变换地保存图像像素域的数据,因此成为我们取得RAW数据的重要来源。Windows的图形用户界面(graphicaluserinterfaces)也在它的内建图像子系统GDI中对BMP格式提供了支持。 下面以Notepad++为分析工具,结合Windows的位图数据结构对BMP文件格式进行一个深度的剖析。 BMP文件的数据按照从文件头开始的先后顺序分为四个部分: bmp文件头(bmpfileheader):提供文件的格式、大小等信息 位图信息头(bitmapinformation):提供图像数据的尺寸、位平面数、压缩方式、颜色索引等信息 调色板(colorpalette):可选,如使用索引来表示图像,调色板就是索引与其对应的颜色的映射表 位图数据(bitmapdata):就是图像数据啦^_^ 下面结合Windows结构体的定义,通过一个表来分析这四个部分。 我们一般见到的图像以24位图像为主,即R、G、B三种颜色各用8 个bit来表示,这样的图像我们称为真彩色,这种情况下是不需要调色 板的,也就是所位图信息头后面紧跟的就是位图数据了。因此,我们 常常见到有这样一种说法:位图文件从文件头开始偏移54个字节就是

位图数据了,这其实说的是24或32位图的情况。这也就解释了我们 按照这种程序写出来的程序为什么对某些位图文件没用了。 下面针对一幅特定的图像进行分析,来看看在位图文件中这四个数据 段的排布以及组成。 我们使用的图像显示如下: 这是一幅16位的位图文件,因此它是含有调色板的。 在拉出图像数据进行分析之前,我们首先进行几个约定: 1.在BMP文件中,如果一个数据需要用几个字节来表示的话,那么该数据的存放字节顺序为“低地址村存放低位数据,高地址存放高位数据”。如数据 0x1756在内存中的存储顺序为: 这种存储方式称为小端方式(littleendian),与之相反的是大端方式(bigendian)。对两者的使用情况有兴趣的可以深究一下,其中还是有学问的。 2.以下所有分析均以字节为序号单位进行。 下面我们对从文件中拉出来的数据进行剖析: 一、bmp文件头 Windows为bmp文件头定义了如下结构体: typedef struct tagBITMAPFILEHEADER {?

广告设计常用图像文件格式

广告设计常用图像文件格式 平面设计中我们会接触到很多图像格式,可是你真正地了解它们吗?下面我们就平面设 计中常见的图像格式为大家分别做简单介绍。 BMP格式 BMP是英文Bitmap(位图)的简写,它是Windows操作系统中的标准图像文件格式,能够被多种Windows应用程序所支持。随着Windows操作系统的流行与丰富的Windows 应用程序的开发,BMP位图格式理所当然地被广泛应用。这种格式的特点是包含的图像信息较丰富,几乎不进行压缩,但由此导致了它与生 俱生来的缺点——占用磁盘空间过大。所以,目前BMP在单机上比较流行。 GIF格式 GIF是英文Graphics Interchange Format(图形交换格式)的缩写。顾名思义,这种格式是用来交换图片的。事实上也是如此,上世纪80年代,美国一家著名的在线信息服务机构CompuServe针对当时网络传输带宽的限制,开发出了这种GIF图像格式。 GIF格式的特点是压缩比高,磁盘空间占用较少,所以这种图像格式迅速得到了广泛的应用。最初的GIF只是简单地用来存储单幅静止图像(称为GIF87a),后来随着技术发展,可以同时存储若干幅静止图象进而形成连续的动画,使之成为当时支持2D动画为数不多的格式之一(称为GIF89a),而在GIF89a图像中可指定透明区域,使图像具有非同一般的显示效果,这更使GIF风光十足。目前Internet上大量采用的彩色动画文件多为这种 格式的文件,也称为GIF89a格式文件。 此外,考虑到网络传输中的实际情况,GIF图像格式还增加了渐显方式,也就是说,在图像传输过程中,用户可以先看到图像的大致轮廓,然后随着传输过程的继续而逐步看清图像中的细节部分,从而适应了用户的"从朦胧到清楚"的观赏心理。目前Internet上大量采 用的彩色动画文件多为这种格式的文件。 GIF格式只能保存最大8位色深的数码图像,所以它最多只能用256色来表现物体,对于色彩复杂的物体它就力不从心了。尽管如此,这种格式仍在网络上大行其道应用,这和GIF图像文件短小、下载速度快、可用许多具有同样大小的图像文件组成动画等优势是分不 开的。 JPEG格式 JPEG也是常见的一种图像格式,它由联合照片专家组(Joint Photographic Experts Group)开发并以命名为"ISO 10918-1",JPEG仅仅是一种俗称而已。JPEG文件的扩展名为。jpg或。jpeg,其压缩技术十分先进,它用有损压缩方式去除冗余的图像和彩色数据,获取得极高的压缩率的同时能展现十分丰富生动的图像,换句话说,就是可以用最少的磁盘

BMP格式结构详解

位图文件(B it m a p-File,BMP)格式是Windows采用的图像文件存储格式,在Windows环境下运行的所有图像处理软件都支持这种格式。Windows 3.0以前的BMP位图文件格式与显示设备有关,因此把它称为设备相关位图(d evice-d ependent b itmap,DDB)文件格式。Windows 3.0以后的BMP位图文件格式与显示设备无关,因此把这种BMP位图文件格式称为设备无关位图(d evice-i ndependent b itmap,DIB)格式,目的是为了让Windows能够在任何类型的显示设备上显示BMP位图文件。BMP位图文件默认的文件扩展名是BMP或者bmp。 6.1.2 文件结构 位图文件可看成由4个部分组成:位图文件头(bitmap-file header)、位图信息头(bitmap-information header)、彩色表(color table)和定义位图的字节阵列,它们的名称和符号如表6-01所示。 表6-01 BMP图像文件组成部分的名称和符号 位图文件的组成结构名称符号 位图文件头(bitmap-file header)BITMAPFILEHEADE R bmfh 位图信息头(bitmap-information header)BITMAPINFOHEADE R bmih 彩色表(color table)RGBQUAD aColors[] 图像数据阵列字节BYTE aBitmapBits[ ] 位图文件结构可综合在表6-02中。 表6-02 位图文件结构内容摘要 偏移量域的名称大小内容 图像文件头0000h标识符 (Identifie r) 2 bytes两字节的内容用来识别位图的类型: ‘BM’ : Windows 3.1x, 95, NT, linux ‘BA’ :OS/2 Bitmap Array ‘CI’ :OS/2 Color Icon ‘CP’ :OS/2 Color Pointer ‘IC’ : OS/2 Icon ‘PT’ :OS/2 Pointer 0002h File Size 1 dword用字节表示的整个文件的大小 0006h Reserved 1 dword保留,设置为0 000Ah Bitmap Data Offset 1 dword从文件开始到位图数据开始之间的数据(bitmap data)之间的偏移量 000Eh Bitmap Header Size 1 dword位图信息头(Bitmap Info Header)的长度,用来 描述位图的颜色、压缩方法等。下面的长度表示: 28h - Windows 3.1x, 95, NT, … 0Ch - OS/2 1.x F0h - OS/2 2.x 0012h Width 1 dword位图的宽度,以像素为单位 0016h Height 1 dword位图的高度,以像素为单位 001Ah Planes 1 word位图的位面数 图像001Ch Bits Per Pixel 1 word每个像素的位数 1 - Monochrome bitmap

Freescale S19文件格式

S19文件格式介绍 S-record格式文件是Freescale CodeWarrior编译器生成的后缀名为.S19的程序文件,是一段直接烧写进MCU的ASCII码,英文全称问Motorola format for EEPROM programming。 1、格式定义及含义 S-record每行最大是78个字节,156个字符 S-record form at type(类型):2个字符。用来描述记录的类型(S0,S1,S2,S3,S5,S7,S8,S9)。 count(计数):2个字符。用来组成和说明了一个16进制的值,显示了在记录中剩余成对字符的计数。 address(地址):4或6或8个字节。用来组成和说明了一个16进制的值,显示了数据应该装载的地址,这部分的长度取决于载入地址的字节数。2个字节的地址占用4个字符,3个字节的地址占用6个字符,4个字节的地址占用8个字符。 data(数据):0—64字符。用来组成和说明一个代表了内存载入数据或者描述信息的16进制的值。 checksum(校验和):2个字符。这些字符当被配对并换算成16进制数据的时候形成了一个最低有效字符节,该字符节用来表达作为补充数据,地址和数据库的字符对所代表的(字节的)补码的byte总和。即计数值、地址场和数据场的若干字符以两个字符为一对,将它们相加求和,和的溢出部分不计,只保留最低两位字符NN,checksum =0xFF-0xNN。 S0 Record:记录类型是“S0” (0x5330)。地址场没有被用,用零置位(0x0000)。数据场中的信息被划分为以下四个子域: nam e(名称):20个字符,用来编码单元名称 ver(版本):2个字符,用来编码版本号 rev(修订版本):2个字符,用来编码修订版本号 description(描述):0-36个字符,用来编码文本注释 此行表示程序的开始,不需烧入memory。

Common Picture File Formats

Broadly speaking, there are two very different kinds of image formats: lossy and lossless.

1. Lossy compression

Lossy compression reduces the space an image occupies in memory and on disk, and when the image is viewed on screen you will hardly notice any harm to its appearance. The basis of lossy techniques is that the human eye is more sensitive to luminance than to color: light contributes more to our perception of a scene than color does.

Lossy compression keeps gradual color transitions and removes abrupt ones. Numerous experiments in biology have shown that the human brain fills in missing color with the nearest colors around it. For a white cloud against a blue sky, for example, a lossy method may delete some of the color along the edges of objects in the image; when you look at the picture on screen, the brain fills in the missing parts with the colors it sees in the scene. With lossy compression, some data is deliberately discarded, and the discarded data is never recovered.

Undeniably, lossy compression can shrink a file's data dramatically, but it affects image quality. If a lossily compressed image is only displayed on screen, the impact on quality may be small, at least as far as the human eye can tell. But if an image processed with lossy compression is printed on a high-resolution printer, the damage to image quality becomes clearly visible.

2. Lossless compression

The basic principle of lossless compression is that identical color information needs to be stored only once. The compression software first determines which regions of the image are the same and which differ. An image containing repeated data (such as a blue sky) can be compressed: only the start point and end point of the blue run need to be recorded. The blue may vary in shade, though, and the sky may be interrupted by trees, mountains, or other objects, and these must be recorded separately. In essence, lossless compression removes repeated data and greatly reduces the size of the image to be kept on disk. It does not, however, reduce the image's memory footprint: when the image is read back from disk, the software fills the missing pixels back in with the appropriate color information. To reduce the memory an image occupies, lossy compression must be used.

The advantage of lossless compression is that it preserves image quality well; the drawback is a comparatively low compression ratio. Still, if the image must be printed on a high-resolution printer, lossless compression remains the better choice.

Almost all image files use a shortened form of their format name as the file extension, so the extension tells you how an image is stored, what software should read and write it, and so on.

I. The BMP image file format

BMP is a hardware-independent image file format in very wide use. It uses a bit-mapped storage layout and, apart from a selectable color depth, applies no compression at all, so BMP files occupy a great deal of space. The color depth of a BMP file can be 1 bit, 4 bits, 8 bits, or 24 bits. When storing data, BMP scans the image from left to right and from bottom to top. Because the BMP format is the standard for exchanging image-related data in the Windows environment, graphics and image software running under Windows universally supports the BMP image format.
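The run-length idea behind the blue-sky example can be sketched as follows. This is an illustrative sketch, not the scheme of any particular format; it assumes 8-bit pixel values and caps runs at 255.

```c
#include <stdint.h>
#include <stddef.h>

/* Sketch of lossless run-length encoding: identical neighboring pixels
   are stored once, as a (count, value) pair, so a long run of "sky blue"
   costs only two bytes. Decoding restores every pixel exactly. */
size_t rle_encode(const uint8_t *in, size_t n, uint8_t *out) {
    size_t o = 0;
    for (size_t i = 0; i < n; ) {
        uint8_t run = 1;
        while (i + run < n && in[i + run] == in[i] && run < 255)
            run++;
        out[o++] = run;      /* how many identical pixels follow */
        out[o++] = in[i];    /* the pixel value itself           */
        i += run;
    }
    return o;                /* number of bytes written */
}
```

Note that the encoding only wins when runs are common; an image with no repetition would grow, which is why run-length schemes suit flat-colored regions rather than photographs.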

BMP Image Format in Detail

A detailed analysis of a BMP image file

First note that all numeric values are stored little-endian, low byte first: 12345678h, for example, sits in memory as 78 56 34 12. The hexadecimal data below is the first frame of the exported boot animation with a file header added, and serves as the running example. Images on the T408 are a little odd: viewed on a PC they appear vertically flipped. To simplify the discussion, the analysis numbers the data in words (units of two bytes; 424D, for instance, is one word), and "h" marks hexadecimal numbers.

424D 4690 0000 0000 0000 4600 0000 2800 0000 8000 0000 9000 0000 0100*1000 0300 0000 0090 0000 A00F 0000 A00F 0000 0000 0000 0000 0000*00F8 0000 E007 0000 1F00 0000 0000 0000*02F1 84F1 04F1 84F1 84F1 06F2 84F1 06F2 04F2 86F2 06F2 86F2 86F2 ......

A BMP file divides into four parts: the bitmap file header, the bitmap info header, the color palette, and the image data array; they are separated by * in the dump above.

I. The image file header

1) Word 1: the image file identifier. 424Dh = 'BM', marking a BMP format that Windows supports.

2) Words 2-3: size of the entire file. 4690 0000, i.e. 00009046h = 36934.
3) Words 4-5: reserved, must be set to 0.
4) Words 6-7: offset from the start of the file to the bitmap data. 4600 0000, i.e. 00000046h = 70; the header above is exactly 35 words = 70 bytes.
5) Words 8-9: length of the bitmap info header. 2800 0000, i.e. 00000028h = 40.
6) Words 10-11: bitmap width, in pixels. 8000 0000, i.e. 00000080h = 128.
7) Words 12-13: bitmap height, in pixels. 9000 0000, i.e. 00000090h = 144.
8) Word 14: number of bit planes; this value is always 1. 0100, i.e. 0001h = 1.

II. The bitmap info header

9) Word 15: bits per pixel. Possible values are 1 (monochrome), 4 (16 colors), 8 (256 colors), 16 (64K colors, high color), 24 (16M colors, true color), 32 (4096M colors, enhanced
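BMP stores pixel rows bottom-up, which is why the T408 frames above look vertically flipped when viewed directly on a PC. A sketch of undoing that by reversing the row order in place; the function name flip_rows and the fixed temporary buffer are illustrative assumptions, and `stride` is the length of one row in bytes.

```c
#include <stdint.h>
#include <string.h>

/* Reverse the order of pixel rows in place to convert between the
   bottom-up order stored in the file and top-down display order.
   Assumes stride <= 4096 bytes. */
void flip_rows(uint8_t *pixels, int height, size_t stride) {
    uint8_t tmp[4096];
    for (int top = 0, bot = height - 1; top < bot; top++, bot--) {
        memcpy(tmp, pixels + (size_t)top * stride, stride);
        memcpy(pixels + (size_t)top * stride, pixels + (size_t)bot * stride, stride);
        memcpy(pixels + (size_t)bot * stride, tmp, stride);
    }
}
```

For the 128 x 144 image above at 16 bits per pixel, the stride would be 128 x 2 = 256 bytes and height would be 144.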

User Manual for the Universal Programmer Software

User Manual for the Universal Programmer Software, V1.0

(1) Installation and removal

Run A:\Setup.exe to complete an automatic installation.

Note: the software currently supports Win95/98/2000; it has not been tested on XP.

To remove it, use "Add/Remove Programs" in the Control Panel, find "Universal Programmer", and click "OK".

(2) Running

After installation, a "Universal Programmer" item appears in the Start menu; clicking it opens the Universal Programmer user interface.

(3) Features

The interface uses standard Windows windows. On startup the software automatically probes serial ports COM1 and COM2 to confirm the connection to the programmer, and creates a new buffer file whose contents are all 1s.

The main menu has eight items: File, Buffer, Serial Port Selection, Add Chip Type, Chip Model Selection, Auto-Operation Editing, Operation Selection, and Help.

The File menu contains Open, Save, Save As, Exit, and quick links to recently opened files (at most 4). It handles the exchange of buffer contents with disk files. Only Binary and Hex (H16, S19) disk files are currently supported.

The Buffer menu contains Copy Data, Fill Data, Go To Address, and Find, for working with buffer contents. Copy Data lets the user select a region of the buffer by start and end address and copy it to the region following a specified address. Fill Data lets the user select a region by start and end address and fill it with a value the user enters. Go To Address quickly scrolls the display to a user-specified address. Find searches the buffer for specific data.

Serial Port Selection sets the serial port connected to the programmer.

Add Chip Type adds new chip types, stored in chip.ini in the following format:

  maximum selection number
  selectable functions: 0 chip select, 1 program, 2 blank check, 3 read,
    4 verify, 5 encrypt, 6 erase, 7 lock one, 8 lock two, 9 lock three
  chip name
  start address (hexadecimal)
  length (in K)
  chip selection number

Chip Model Selection selects the chip model in the programmer, stored in type.ini in the format: chip selection number (newline).

Auto-Operation Editing edits the contents of the "auto" operation, stored in auto.ini in the following format:

Common Medical Image Formats

Appendix C: Image Formats

Translator: Synge, published 2012-05-03; translation by xiaoqiao

In the early days of fMRI, most data were acquired with site-specific pulse sequences and then reconstructed offline in bulk; file formats differed from center to center, and most analysis software was written in-house at each institution. As long as the data were not exchanged with other centers, the format did not affect their use, so image formats were a Babel of variety. As the fMRI field developed, a few standard file formats gradually came into use, and shared analysis packages spread those formats across centers and laboratories, though until recently many file formats still coexisted. Over the past decade this situation has improved with the development and wide acceptance of the NIfTI format. This appendix gives an overview of the common issues in storing fMRI data and of the important file formats.

3.1 Data storage

As described in Chapter 2, MRI data are usually stored in binary form, e.g. as 8-bit or 16-bit values, so the size of a data file on disk follows from the image's dimensions and data type: storing a 128 x 128 x 96 volume of 16-bit values takes 25,165,824 bits (3 megabytes). To keep more information about the image, we also want to store metadata alongside the raw data. Metadata describe the image: its dimensions, its data type, and so on. This matters because the binary data alone cannot tell you, for example, whether an image was acquired as a 16-bit 128 x 128 x 96 volume or an 8-bit 128 x 128 x 192 volume. The formats discussed here differ mainly in how much and what kind of metadata they store.
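The size arithmetic above can be checked with a small helper; this is an illustrative sketch, and the function name is chosen here.

```c
#include <stdint.h>

/* Worked example of the storage arithmetic above: bytes needed for an
   nx x ny x nz volume at the given number of bits per voxel. */
uint64_t volume_bytes(uint64_t nx, uint64_t ny, uint64_t nz,
                      uint64_t bits_per_voxel) {
    return nx * ny * nz * bits_per_voxel / 8;  /* 8 bits per byte */
}
```

For the 128 x 128 x 96 volume of 16-bit voxels this gives 3,145,728 bytes (3 MB), i.e. the 25,165,824 bits quoted in the text.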

Structural MRI images are usually stored as three-dimensional data. fMRI data are a series of acquired images and can be stored either as three-dimensional files or as a single four-dimensional file (the fourth dimension being time). We generally prefer the four-dimensional form where possible, since it reduces the number of files, but some analysis packages cannot handle four-dimensional data.

3.2 File formats

Many different image formats appeared over the course of neuroimaging's development; the common ones are listed in Table 1. Here we discuss the three most important: DICOM, Analyze, and NIfTI.

Table 1. Common medical image formats

  Format   Extension          Origin
  Analyze  .img/.hdr          Analyze software, Mayo Clinic
  DICOM    none               ACR/NEMA
  NIfTI    .nii or .img/.hdr  NIH Neuroimaging Informatics Technology Initiative
  MINC     .mnc               Montreal Neurological Institute (MNI; an extension of NetCDF)

3.2.1 The DICOM format

Most MRI scanners today reconstruct acquired data into the DICOM format. The format originated with the American College of Radiology (ACR) and the National Electrical Manufacturers Association (NEMA). DICOM is not merely a storage format for images but a scheme for exchanging data of many kinds between different imaging systems, of which MRI images are only one special case. The DICOM in use today follows the 1993 standard, and all the major MRI vendors support it.

Typically, DICOM stores each image slice as a separate file, and the files are named with numbers reflecting the corresponding slice number (with some variation between systems). The files contain header information, and reading it requires dedicated software

BMP Header Format

The BMP header format

1. Composition of a BMP file

A BMP file consists of four parts: the file header, the bitmap info header, the color information, and the image data.

2. The BMP file header (14 bytes)

The BMP file header structure holds the file's type, file size, and the starting position of the bitmap data, among other things. It is defined as follows:

typedef struct tagBITMAPFILEHEADER
{
    WORD  bfType;        // file type; must be 'BM' (bytes 0-1)
    DWORD bfSize;        // size of the bitmap file, in bytes (bytes 2-5)
    WORD  bfReserved1;   // reserved; must be 0 (bytes 6-7)
    WORD  bfReserved2;   // reserved; must be 0 (bytes 8-9)
    DWORD bfOffBits;     // starting position of the bitmap data, as a byte
                         // offset from the file header (bytes 10-13)
} BITMAPFILEHEADER;

3. The bitmap info header (40 bytes)

The bitmap info header describes the bitmap's dimensions and related information.

typedef struct tagBITMAPINFOHEADER{
    DWORD biSize;        // size of this structure in bytes (bytes 14-17)
    LONG  biWidth;       // width of the bitmap, in pixels (bytes 18-21)

    LONG  biHeight;        // height of the bitmap, in pixels (bytes 22-25)
    WORD  biPlanes;        // planes for the target device; must be 1 (bytes 26-27)
    WORD  biBitCount;      // bits per pixel: must be 1 (monochrome), 4 (16 colors),
                           // 8 (256 colors), or 24 (true color) (bytes 28-29)
    DWORD biCompression;   // compression type: must be 0 (uncompressed),
                           // 1 (BI_RLE8), or 2 (BI_RLE4) (bytes 30-33)
    DWORD biSizeImage;     // size of the bitmap data, in bytes (bytes 34-37)
    LONG  biXPelsPerMeter; // horizontal resolution, pixels per meter (bytes 38-41)
    LONG  biYPelsPerMeter; // vertical resolution, pixels per meter (bytes 42-45)
    DWORD biClrUsed;       // number of colors in the color table actually
                           // used by the bitmap (bytes 46-49)
    DWORD biClrImportant;  // number of colors important for display (bytes 50-53)
} BITMAPINFOHEADER;

4. The color table

The color table lists the colors used in the bitmap. It has a number of entries, each an RGBQUAD structure defining one color. RGBQUAD is defined as follows:

typedef struct tagRGBQUAD {
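Rows of pixel data implied by biWidth and biBitCount are padded in the file to a multiple of 4 bytes. That padding rule is standard BMP behavior rather than something stated above, and the helper below is an illustrative sketch of the usual stride formula.

```c
#include <stdint.h>

/* Bytes per stored pixel row: round width * bits-per-pixel up to a
   whole number of 32-bit words, then convert to bytes. */
uint32_t bmp_stride(uint32_t width, uint32_t bits_per_pixel) {
    return ((width * bits_per_pixel + 31) / 32) * 4;
}
```

So a 127-pixel-wide 8-bit image is stored with 128 bytes per row, and a 1-pixel-wide 24-bit image still occupies 4 bytes per row.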

Several Common Picture File Formats

A brief introduction to picture file formats

I. The BMP format

BMP is short for Bitmap. It is the standard image file format of the Windows operating system and can be supported by many applications. With the spread of Windows and the wealth of applications developed for it, the BMP format naturally came into wide use. Its characteristic is that it holds rich image information and performs almost no compression, which leads to its inherent drawback: it takes up too much disk space. For now it is therefore popular mainly on standalone machines.

II. The GIF format

GIF stands for Graphics Interchange Format. As the name implies, the format was made for exchanging pictures, and indeed that is how it came about: in the late 1980s a well-known American online information service, facing the limited network bandwidth of the time, developed this image format.

The GIF format's strengths are a high compression ratio and a small disk footprint, so it quickly found wide use. Originally, GIF simply stored a single still image (GIF87a); later, as the technology developed, it became able to store several still images and play them back as a continuous animation, making it one of the few formats of its day to support animation (GIF89a). A GIF image can also designate transparent regions, giving it unusual display effects, which made it all the more attractive. The many small color animations on the network are mostly files of this format.

In addition, with real network conditions in mind, GIF gained an interlaced display mode: during transmission the user first sees a rough outline of the image, and the details gradually become clear as the transfer continues, matching the viewer's "from blurry to sharp" way of looking at pictures.

GIF has one small weakness: it cannot store images with more than 256 colors. Even so, the format thrives on the network, thanks to its small files, fast downloads, and the ability to build animations from many images of the same size.

III. The JPEG format

JPEG is another common image format. It was developed by the Joint Photographic Experts Group and named after it; "JPEG" is simply the colloquial name. JPEG files use the extension .jpg or .jpeg. Its compression technology is very advanced: it uses lossy compression to remove redundant image and color data, achieving very high compression ratios while still displaying rich, vivid images. In other words, it gets good image quality out of the least disk space.

JPEG is also a very flexible format, with the ability to adjust image quality: it allows you to compress a file at different ratios, so a large bitmap can be shrunk to a small fraction of its size. You can therefore find the balance point between image quality and file size.

Thanks to its excellent quality and outstanding performance, JPEG is used very widely, especially on networks and in CD-ROM publications, where you are sure to find it. All current browsers support the format, and because JPEG files are small and download quickly, Web pages can offer large numbers of attractive images in a short download time, which has naturally made JPEG the most popular image format on the network.

IV. The JPEG 2000 format

JPEG 2000 is likewise defined by the JPEG committee. It is a next-generation still-image compression technique offering a higher compression ratio and more new features than JPEG. As an upgrade of JPEG, its compression ratio is noticeably higher. Unlike JPEG, it supports both lossy and lossless compression, whereas JPEG can only

The BMP File Format

The BMP file format

Introduction

BMP (Bitmap-File) is the graphics file format used by Windows, and all image-processing software running under Windows supports the BMP image file format. Windows's internal image-drawing operations are all based on BMP. Before Windows 3.0 the BMP file format depended on the display device, so it was called the device-dependent bitmap, DDB (device-dependent bitmap), format. From Windows 3.0 onward, BMP image files are independent of the display device, and the format is called the device-independent bitmap, DIB (device-independent bitmap), format. (Note: after Windows 3.0, DDB bitmaps still exist in the system; functions like BitBlt() are based on DDB bitmaps. But if you want to save an image to a BMP disk file, Microsoft strongly recommends saving it in DIB form.) The purpose is to let Windows display the stored image on any type of display device. The default extension of a BMP file is BMP or bmp (sometimes also .DIB or .RLE).

Opening the example image in WinHex gives the result shown below. (The numbers are analyzed in detail after the description of the BMP format, and a MATLAB analysis of the image follows at the end.) Note: the example is a 24-bit true-color image.

File structure

A bitmap file can be viewed as four parts: the bitmap-file header, the bitmap-information header, the color table, and the byte array defining the bitmap, laid out in the form shown below.

The overall layout of a bitmap file is summarized in Table 01: bitmap file structure overview.

Components in detail

1. The bitmap file header

The bitmap file header holds information about the file's type, size, and data locations. In bitmap files from Windows 3.0 onward it is defined with the BITMAPFILEHEADER structure:

typedef struct tagBITMAPFILEHEADER { /* bmfh */
    UINT  bfType;
    DWORD bfSize;
    UINT  bfReserved1;
    UINT  bfReserved2;
    DWORD bfOffBits;
} BITMAPFILEHEADER;

where:

bfType gives the file's type. (The value must be 0x4D42, i.e. the characters 'BM'. There is no need to check for the OS/2 bitmap identifiers; doing so seems pointless now, and supporting OS/2 bitmaps would make the program very cumbersome. So checking only for the 'BM' identifier is recommended here.)

bfSize gives the size of the file, in bytes.

bfReserved1 is reserved and must be set to 0.
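Why bfType must equal 0x4D42: on a little-endian machine, reading the two bytes 'B' (0x42) and 'M' (0x4D) as a single 16-bit word puts 'M' in the high byte. An illustrative sketch:

```c
#include <stdint.h>

/* Assemble the 'BM' signature bytes into the 16-bit value that a
   little-endian read of bfType would produce. */
uint16_t bmp_magic_word(void) {
    const uint8_t bytes[2] = { 'B', 'M' };   /* 0x42, 0x4D */
    return (uint16_t)bytes[0] | ((uint16_t)bytes[1] << 8);
}
```

This is why checking `bfType == 0x4D42` is equivalent to checking that the file starts with the characters 'BM'.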

Common Image File Formats in Photoshop

Common image file formats

1. PSD

PSD is Photoshop's native format. It preserves every fine detail of the image data, including pixel, layer, channel, mask, and color-mode information, so PSD files are comparatively large. Some of this content is lost when converting to other formats, and saving to another format sometimes merges the image's layers and attached masks, causing no small trouble when you edit the file again. It is therefore best to keep a backup copy in PSD before converting formats.

2. TIFF

TIFF is a general-purpose image file format and, apart from PSD, the only format that can store multiple channels. Almost all scanners and most image software support it. The format supports the RGB, CMYK, Lab, and grayscale color modes, and comes in uncompressed and LZW-compressed variants.

3. JPEG

JPEG is another commonly used image format. Its compression ratio is adjustable, and it is supported by most graphics-processing software. JPEG images are also widely used in Web page production. The format supports the CMYK, RGB, and grayscale color modes, but not alpha channels.

4. BMP

BMP is the standard Windows and OS/2 image file format and the most commonly used bitmap format in Photoshop. Files in this format are saved with almost no compression, so they are large and take up considerable disk space. The format supports the RGB, grayscale, indexed, and bitmap color modes, but not alpha channels. It is the least error-prone format to save under Windows.

5. GIF

GIF was defined by CompuServe. It can save images with transparent backgrounds but handles only 256 colors. It is commonly used for network transmission, travels much faster than files of other formats, and can store multiple images in a single file to form an animation.

6. PNG

PNG was developed as a patent-free replacement for GIF and is widely used in editing images for the network. It differs from GIF images in that besides saving 256 colors, it can also save 24-bit true-color images; it supports transparent backgrounds and anti-aliased edges; and it can compress and save images without loss. PNG may well become a standard image format for Web pages in the future.

PNG files support alpha channels in the RGB and grayscale modes, but not in the indexed-color and bitmap modes.

7. EPS

EPS is the compressed PostScript format, usable for drawing or page layout. Its greatest advantage is that layout software can preview it at low resolution while printing or film output proceeds at high resolution, serving both convenience and output quality. EPS supports all of Photoshop's color modes, additionally supports transparency in bitmap mode, and can store both raster images and vector graphics. It does not support alpha channels.

8. PDF

PDF is Adobe's electronic-publishing document format for the Windows, Mac OS, UNIX, and DOS systems. The format derives from the PostScript Level 2 language, so it can encompass both vector and raster images, and it supports hyperlinks. PDF files are produced by the Adobe Acrobat software; a file in this format can store multiple pages of information and includes graphics and document search and navigation features, so a mixed text-and-image layout can be obtained without separate typesetting. Because the format supports hypertext links, it is a frequent choice for network downloads.

BMP Image Format Analysis

BMP image format analysis

The BMP image file format is the standard image format Microsoft established for its Windows environment, and Windows system software includes a series of API functions supporting BMP image processing. As Windows spread around the world, the BMP format naturally became the prevailing image file format on the PC. Its main characteristics can be summarized as follows: the file structure resembles that of the PCX format; each file can hold only one image; and whether the image data is stored compressed depends on the file's size and format, i.e. compression is an option that the user can choose as needed. The uncompressed form is the general-purpose form of BMP image files. If the user does choose to compress a BMP file, Windows defines two compression schemes: RLE4 for 16-color images and RLE8 for 256-color images. BMP can store monochrome, 16-color, 256-color, and true-color image data. Its data ordering differs from most formats: the image is stored starting from the bottom-left corner rather than the top-left. The BMP format has another peculiarity: in the data structure used for its palette, the order of the red, green, and blue components is exactly the reverse of other image formats. In short, BMP has many features well suited to the Windows environment, and as Windows versions advance, Microsoft keeps improving the format: the current version allows 32-bit color tables, and with the arrival of 32-bit Windows, corresponding API functions keep appearing, all of which has fueled BMP's popularity. But because the BMP format suits only applications on Windows and offers no support for the various applications in the DOS environment, its circulation has been held back from surpassing that of the PCX format.

Windows defines two bitmap file types: the ordinary bitmap format and the device-independent bitmap format. Of these, the device-independent bitmap (DIB) file format has greater flexibility and complete definitions of image data, compression schemes, and so on. The structure of a BMP image file divides into three parts: the file header, the palette data, and the image data. The file header has a fixed length of 54 bytes. Palette data must be supplied for every image mode of 256 colors or fewer, monochrome included; for true-color image modes, however, the BMP file structure contains no palette data. The image data may be processed with a compression algorithm or left uncompressed; this depends not only on the size of the image file but also on whether the image-processing software involved supports compressed BMP files. These three parts of the BMP file structure are described below. Note in particular that the BMP structure is designed to be quite simple, which helps processing speed but also gives the format a limitation: one BMP image file can store only one image.

Definition of the BMP file header

Windows splits the BMP image file header into two data structures. One contains the BMP file's type, size, print-format and similar information and is called BITMAPFILEHEADER; the other contains the BMP file's dimension definitions and similar information and is called BITMAPINFOHEADER. If the image file also needs palette data, that data is stored after the header information.

The BITMAPFILEHEADER structure is defined in Windows.h as:

typedef struct tagBITMAPFILEHEADER
{
    WORD  bfType;
    DWORD bfSize;
    WORD  bfReserved1;
    WORD  bfReserved2;
    DWORD bfOffBits;
} BITMAPFILEHEADER;

where bfType, at data address 0 in the image file, has type unsigned char and the fixed value "BM", marking the file format and indicating that the image file is a BMP file.

bfSize, at data address 2, has type unsigned long and defines the size of the bitmap file, in bytes.

bfReserved1 and bfReserved2, at data addresses 6 and 8 respectively, both of type unsigned int, are reserved words of the BMP file with no meaning; their values must be 0.

bfOffBits, at data address 10, of type unsigned long, gives in bytes the starting address of the image data within the file, i.e. where the image data begins.

Common Prepress Image File Formats

Common prepress image file formats

EPS DCS

A variant of EPS; saves the document as five files, one for each CMYK plate plus a preview image.

Characteristics: the full name is Desktop Color Separation. It is a variant of EPS, and Photoshop can save to this format. Saving as DCS produces five files in total: one for each of the C, M, Y, and K plates, plus a 72 dpi preview image, the so-called "master file". Together these five files make up the format.

Uses: the greatest advantage of EPS DCS is faster output. Because the file is already separated into the four color plates, the image transfer time on a film imagesetter can be cut by up to 75%, so the format suits the separation output of large files. Another advantage is faster production. The DCS format is in fact an important part of the OPI (Open Prepress Interface) workflow concept: during production a low-resolution image is placed, and the high-resolution image is linked in only at output, which speeds up production. This workflow concept particularly suits image-heavy books or large boxed jobs. DCS works in a similar way: the low-resolution image is placed in the document, and at output the imagesetter links in the high-resolution image.

All the common software supports the DCS format. Since the five files together make up one image, take care that the five file names stay consistent: only the C, M, Y, K suffixes follow the original name, and none of the names may be changed.

TIFF

Developed by Aldus; widely adopted not only on the Mac but also by layout software on IBM PC compatibles.

Characteristics: the full name is Tagged Image File Format, developed by Aldus; it is a compressed image format. Because layout software on IBM PC compatibles adopted it as widely as the Mac did, Photoshop lets you choose IBM or Mac when saving a TIFF. The format mainly describes image data, covering black-and-white, color, and grayscale images.

Uses: most software supports the TIFF format. Only Illustrator 5.5 and 5.0C cannot place it; the newer versions 6.0 and 7.0 can, and 7.0 can even save files as RGB or CMYK TIFF for use by other software.

In desktop publishing, TIFF and EPS are the most popular file formats. The author suggests choosing TIFF under normal circumstances, since the files are smaller and transfer is faster. As noted above, a Photoshop file with a clipping path placed into other software should be saved as EPS. Photoshop can in fact save a photo with a clipping path as TIFF, but not all software supports that; only PageMaker 6.0 and 6.0C support TIFF with a clipping path, so if a cut-out is needed, the EPS format may have to be considered.

When you choose the TIFF format you can select IBM PC or Macintosh, and you can also choose LZW Compression, TIFF's compression scheme. LZW compression does not degrade the file's quality, but not all software and output devices support this compressed form, so choose it with care.

JPEG

A highly compressed format; the color quality of the compressed image is lower. A 20 MB TIFF can be saved as a 4.5 MB JPEG.

Characteristics: the full name is Joint Photographic Experts Group, after the committee that defined the standard. JPEG is a highly compressed format. In Photoshop 4.0, when you choose JPEG you can choose the quality of the compressed file from four options: high, medium, low, or maximum. Maximum gives the best color quality but the most megabytes; choosing low color quality compresses the most and gives the fewest megabytes. For example, a 20 MB TIFF compressed at the lowest quality becomes only 814 kilobytes, while compressed at maximum quality it becomes 4.5 MB.

Uses: the JPEG format was initially used mainly to compress images for QuickTime, and was later taken up by layout and design as well, because the compressed files transfer faster. Although transfer is fast, the color quality of the compressed image is lower, so designers do not always use this format. Because ordinary newspapers print at low precision anyway, and the compressed image's color quality is likewise low, the format sees more use among newspaper publishers.

Newer versions of software can place JPEG files, and only output devices using PostScript Level II support the format. Notably, Illustrator 6.0 and 7.0 and FreeHand 7.0 can output the JPEG format, but only in RGB mode. In Photoshop, besides saving with JPEG compression, you can select JPEG under Encoding and use a clipping path at the same time. JPEG is also a common file format on the Internet, though the viewer's computer needs QuickTime.

PICT

Mainly describes grayscale and black-and-white images.

Characteristics: a file format mainly for grayscale (Gray Scale) and black-and-white images. When saving you can choose different resolutions and whether to compress the file.

Uses: all software supports this format, but Illustrator 5.0C can only treat a placed PICT file as a template, i.e. a base layer for drawing over, to help the user trace the image. Several versions of FreeHand can save to the PICT format.

Scitex CT

A file format made for direct interchange of image data with Scitex products.

Characteristics: Scitex is an Israeli company whose main products include a line of high-end prepress systems such as imagesetters, scanners, and high-resolution color printers. Scitex CT is the file format made for direct interchange of image data with Scitex products, usually used for grayscale or CMYK images.

Uses: among the common software, layout programs such as QuarkXPress and PageMaker, and image-retouching programs such as Photoshop and Live Picture, all support this format. Also check whether your output device supports it; Scitex's own output devices naturally do.

The BMP Image Storage Format

Abstract: This article briefly introduces the two storage formats for bitmap files, implements reading the data of a bitmap file under VC++ 6.0 and redrawing the image in a window with the SetPixel() function, and finally implements conversion from one storage format to the other.

Keywords: BMP, grayscale bitmap, 24-bit true-color bitmap, storage format

1. Introduction

A BMP (short for Bitmap) image is a bitmap image whose file name has the BMP extension. Bitmap images are used very widely in computing; in Windows, for example, the text in Notepad and WordPad is rendered as bitmap images. Many images stored in other formats are obtained by optimizing a bitmap image, JPEG images for instance.

In digital image processing, many algorithms are designed specifically for 24-bit true-color or grayscale bitmaps, so it is well worth describing these two storage formats.

2. The 24-bit true-color storage format

Opening the 24-bit true-color image in a hex editor (such as the VC editor) shows the image's binary data. The portion shown here covers three parts: the bitmap file header, the bitmap info header, and the bitmap array.

(1) The bitmap file header

The bitmap file header records information identifying the file and its size; it occupies 14 bytes of the file, stored as follows:

  bytes 1-14: 42 4D CC B4 02 00 00 00 00 00 36 00 00 00

where:

  42 4D is the bitmap signature, i.e. the ASCII characters BM.
  CC B4 02 00 is the total byte count of the bitmap file; in decimal,
    (02B4CC)H = (177356)10, i.e. the image is 177356 bytes.
  00 00 00 00 are reserved bytes and must be set to 0.
  36 00 00 00 is the starting position of the bitmap array: (36)H = (54)10,
    i.e. the bitmap array starts at byte 54.

(2) The bitmap info header

The bitmap info header records information about the bitmap itself and occupies 40 bytes of the file. In this dump, the structure-size field reads 28 00 00 00, i.e. (28)H = 40 bytes; the bits-per-pixel field reads 18 00, i.e. (18)H = 24 bits, matching the 24-bit image; and the horizontal and vertical resolution fields both read 12 0B 00 00, i.e. 2834 pixels per meter (about 72 dpi).
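The file-header bytes just listed can be decoded mechanically; le32 is an illustrative helper written here, and the byte values are copied from the dump above.

```c
#include <stdint.h>

/* Read a little-endian 32-bit value from four consecutive bytes. */
static uint32_t le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* The 14 file-header bytes quoted above. */
static const uint8_t bmp_hdr[14] = {
    0x42, 0x4D,              /* 'B', 'M' signature               */
    0xCC, 0xB4, 0x02, 0x00,  /* total file size = 177356 bytes   */
    0x00, 0x00, 0x00, 0x00,  /* reserved, must be 0              */
    0x36, 0x00, 0x00, 0x00   /* bitmap array starts at byte 54   */
};
```

Decoding bytes 3-6 gives le32(bmp_hdr + 2) = 0x02B4CC = 177356, and bytes 11-14 give le32(bmp_hdr + 10) = 54, matching the analysis above.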
