Using JPEG quantization tables to identify imagery
processed by software
Jesse D. Kornblum
Defense Cyber Crime Institute, United States
Keywords:
JPEG
Quantization
Digital ballistics
Calvin
Image authentication
abstract
The quantization tables used for JPEG compression can also be used to help separate images that have been processed by software from those that have not. This loose classification is sufficient to greatly reduce the number of images an examiner must consider during an investigation. As illicit imagery prosecutions depend on the authenticity of the images involved, this capability is an advantage for forensic examiners. This paper explains how quantization tables work, how they can be used for image source identification, and the implications for computer forensics.
© 2008 Digital Forensic Research Workshop. Published by Elsevier Ltd. All rights reserved.
1. Introduction
Illicit imagery cases became more difficult to prosecute in the
United States in the wake of the Ashcroft v. Free Speech Coalition
Supreme Court decision in 2002 (Supreme Court of the United
States, 2002). In the court’s opinion, the prosecution must
prove that a real child was harmed by the illicit imagery in
order to get a conviction. As such, forensic examiners have
been increasingly asked to prove that suspected illicit imagery
contains real victims and not computer generated people. On
the other hand, the prosecution still only needs to introduce
a handful of such images in order to secure a conviction.
One of the more compelling arguments for proving the
authenticity of a pictured individual is to show that the image
came from a camera and has not been edited. Although a camera could conceivably be used to capture an image of an
artificial person (e.g. photographing a computer screen),
such an image would hopefully be obviously identifiable.
The science of performing digital ballistics, or matching an
image to the individual device that created it, would be an ideal
method for finding real pictures to use in legal proceedings. Unfortunately such identifications are not easy. Instead this paper demonstrates how an examiner can match an image back to the
type of device that last modified it, either hardware or software.
This paper gives a brief overview of JPEG compression and
pays particular attention to the quantization tables used in
that process. Those tables control how much information is
lost during the compression process. The author has categorized the types of tables, and the implications for digital ballistics are discussed. In particular, by eliminating those images
that were most likely last processed by a computer program,
the examiner is left with fewer images to consider for the
remainder of the investigation.
2. JPEG compression
This paper focuses exclusively on images stored in the JPEG File Interchange Format (JFIF) (Wallace, 1992), a method for storing data compressed with the JPEG standard (Joint Photographic Experts Group, 1991; Wallace, 1991). The JFIF is the
most commonly used format for JPEG data. Throughout the
paper, any reference to a JPEG or JPEG file refers to JFIF encoded
data.
A JPEG compressed image takes up considerably less space
than an uncompressed image. Whereas an uncompressed 640 × 480 pixel 24-bit color image would require 900 kB, a JPEG version of the same image can be compressed to
a mere 150 kB. Converting an image into a JPEG is a six-step process that, as a whole, is beyond the scope of this paper (Joint Photographic Experts Group, 1991; Wallace, 1991). First the image is converted from the RGB color space into the YCbCr space, which separates the luminance (brightness) of each pixel from its chrominance (color). Next the image is downsampled, split into blocks of 8 × 8 pixels, and a discrete cosine transform is applied. The next stage is quantization, where the lossy compression occurs and where this paper is focused. Finally an entropy coding (lossless compression) is applied and the image is said to be JPEG compressed.
In the quantization stage, the image creation device must use a table of values known as the quantization table. Each table has 64 values that range from 0 to 65,535 (in practice these values are usually between 0 and 255; some programmers chose to represent them as 8-bit values, but to be correct 16-bit values should be used). A lower number means that less data will be discarded in the compression and a higher quality image should result.
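To make the table's role concrete, the short sketch below (Python, with made-up coefficient values) quantizes a single 8 × 8 block of DCT coefficients the way a baseline JPEG encoder does: each coefficient is divided by the corresponding table entry and rounded, which is the step where information is permanently discarded.

```python
# Illustrative sketch of the quantization step for one 8x8 block of DCT
# coefficients. The coefficient values and the flat table below are made up;
# only the divide-and-round operation itself reflects the JPEG standard.

def quantize_block(dct_block, table):
    """Quantize 64 DCT coefficients with a 64-entry quantization table."""
    return [round(c / q) for c, q in zip(dct_block, table)]

def dequantize_block(quantized, table):
    """Decoder-side inverse; the rounding error is never recovered."""
    return [c * q for c, q in zip(quantized, table)]

if __name__ == "__main__":
    dct = [-415, -33, -58, 35, 58, -51, -15, -12] + [0] * 56   # made-up block
    table = [16] * 64                                          # flat table, for illustration only
    q = quantize_block(dct, table)
    print(q[:8])                             # [-26, -2, -4, 2, 4, -3, -1, -1]
    print(dequantize_block(q, table)[:8])    # coefficients snapped to multiples of 16
```

Larger table entries coarsen the rounding, which is why lower values mean less data is discarded.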
Each image has between one and four quantization tables.
The most commonly used quantization tables are those published by the Independent JPEG Group (IJG) in 1998 and shown in Fig. 1 (libjpeg, 1998). These tables can be scaled to a quality
factor Q. The quality factor allows the image creation device to
choose between larger, higher quality images and smaller,
lower quality images.
The value of Q can range between 0 and 100 and is used to compute the scaling factor, S, as shown in Eq. (1). Each element i in the scaled table T_s is computed using the ith element in the base table T_b as shown in Eq. (2). All of these computations are done in integer math; there are no decimals (hence the floor function in the equation). Any value of T_s that computes to zero is set to one.
For example, we can scale the IJG standard table using Q = 80 by applying Eq. (2) to each element in the table. The resulting values are the scaled quantization tables and are shown in Fig. 2. Note that the numbers in this table are lower than in the standard table, indicating an image compressed with these tables will be of higher quality than ones compressed with the standard table. It should be noted that scaling with Q = 50 does not change the table.
S = (Q < 50) ? 5000 / Q : 200 − 2Q                           (1)

T_s[i] = floor( (S × T_b[i] + 50) / 100 )                    (2)
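The sketch below renders Eqs. (1) and (2) in Python, using the IJG luminance and chrominance base tables that ship with libjpeg. It is an illustrative reimplementation rather than the libjpeg source; running it reproduces the Q = 80 scaling discussed above and confirms that Q = 50 leaves the tables unchanged.

```python
# Sketch of the IJG quality scaling described by Eqs. (1) and (2).
# The two base tables are the standard luminance and chrominance tables
# published with libjpeg; everything else here is illustrative.

IJG_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]

IJG_CHROMA = [
    17, 18, 24, 47, 99, 99, 99, 99,
    18, 21, 26, 66, 99, 99, 99, 99,
    24, 26, 56, 99, 99, 99, 99, 99,
    47, 66, 99, 99, 99, 99, 99, 99,
    99, 99, 99, 99, 99, 99, 99, 99,
    99, 99, 99, 99, 99, 99, 99, 99,
    99, 99, 99, 99, 99, 99, 99, 99,
    99, 99, 99, 99, 99, 99, 99, 99,
]

def scaling_factor(quality):
    """Eq. (1): map a quality factor Q (1-99) to the scaling factor S."""
    return 5000 // quality if quality < 50 else 200 - 2 * quality

def scale_table(base, quality):
    """Eq. (2): integer-scale each base table entry; zero entries become one."""
    s = scaling_factor(quality)
    return [max(1, (s * t + 50) // 100) for t in base]

if __name__ == "__main__":
    # Q = 80 gives S = 40, so every entry shrinks to roughly 40% of its base
    # value; Q = 50 gives S = 100, which leaves the table unchanged.
    print(scale_table(IJG_LUMA, 80)[:8])   # [6, 4, 4, 6, 10, 16, 20, 24]
    assert scale_table(IJG_LUMA, 50) == IJG_LUMA
```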
3. Related work
Using quantization tables to identify the origin of digital images is only one method of conducting digital ballistics. A
good summary of other methods can be found in Sencar and
Memon (2008). The idea of using JPEG quantization tables for
digital ballistics was first proposed by Farid (2006). In that report he showed that the quantization tables from 204 images,
one per camera at the device’s highest quality setting, were for
the most part different from each other. When overlaps did
occur, they were generally among cameras from the same
manufacturer. He also demonstrated that the tables used by
the digital cameras were different from those used by Adobe
Photoshop.
Chandra and Ellis (1999) did some work to determine the
base quantization table used to compute the scaled tables found in an existing JPEG image, but they focused on determining an equivalence to the IJG tables, not divining the true
tables used. Several papers have been written on recovering
the quantization table from previously compressed images
(Fan and de Queiroz, 2000, 2003; Neelamani et al., 2006) or
doubly compressed images (Lukas and Fridrich, 2003). Many
papers and patents have been written regarding quantization
table design such as Beretta et al. (1999), Costa and Veiga
(2005), Onnasch and Ploger (1994), Wang et al. (2001), and
Watson (1993).
4. Classifying JPEGs
For this paper the author examined several thousand images
from a wide variety of image creation devices and programs.
These devices include a Motorola KRZR K1m, a Canon PowerShot 540, a FujiFilm Finepix A200, a Konica Minolta Dimage Xg, and a Nikon Coolpix 7900. The author also examined the images produced by a number of software programs such as libjpeg (libjpeg, 1998), Microsoft Paint, the Gimp (GIMP Team, 2007), Adobe Photoshop (Adobe Systems Incorporated, 2007), and Irfanview (Irfan, 2007). The author also studied images from the camera review web site Digital Photography Review (Askey.Net Consulting Limited, 2007).
Fig. 1. Standard JPEG quantization tables.
Fig. 2. Standard JPEG quantization tables scaled with Q = 80.
The author immediately noted that although some devices
always used the same quantization tables, the majority of
them used a different set of quantization tables in each image.
A further examination of the images allowed the author to
classify images into categories: standard tables, extended
standard tables, custom fixed tables, and custom adaptive
tables.
4.1. Standard tables
Images in this category use scaled versions of the quantization tables published in the Independent JPEG Group (IJG) standard (libjpeg, 1998), shown in Fig. 1. Because many cameras and programs use these tables, there is no way to determine, based on tables alone, which program or camera created an
image.
The base tables shown in Fig. 1 can be scaled using Q = {1, 2, …, 99} to create 99 separate tables. Scaling with Q = 0 would produce grossly unusable images. Scaling with Q = 100 would produce a quantization table filled with all ones. Such a table would be indistinguishable from any other base table scaled with Q = 100.
Any image using one of the 99 tables defined above is said to be using the standard tables. It is certainly possible that another method could be used to generate one of these tables, but there is no way to distinguish that from the image alone. For example, the author found that some images from particular devices matched an IJG table but others did not. It cannot be determined if the devices are using the IJG tables for some images and not others, or whether the devices are using another method that occasionally produces the same tables as the IJG method.
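Testing for standard tables therefore reduces to comparing an image's table pair against the 99 possible scalings. The sketch below assumes the IJG base tables (IJG_LUMA and IJG_CHROMA in the earlier sketch) are supplied as arguments; the function names are illustrative.

```python
# Sketch of the "standard tables" test: a luminance/chrominance pair is
# standard if some Q in 1..99 scales the IJG base tables to exactly that pair.

def scaling_factor(quality):
    """Eq. (1)."""
    return 5000 // quality if quality < 50 else 200 - 2 * quality

def scale_table(base, quality):
    """Eq. (2), with zero entries clamped to one."""
    s = scaling_factor(quality)
    return [max(1, (s * t + 50) // 100) for t in base]

def standard_quality(luma, chroma, base_luma, base_chroma):
    """Return the quality factor whose scaling matches the observed pair,
    or None if the pair does not match any of the 99 standard tables."""
    for q in range(1, 100):
        if luma == scale_table(base_luma, q) and chroma == scale_table(base_chroma, q):
            return q
    return None
```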
4.2. Extended standard tables
These images are a special case of the standard tables. They
use scaled versions of the IJG tables, but have three tables instead of the two in the standard. The third table is a duplicate
of the second. The same methodology used to identify the
standard tables can be used to identify extended standard
tables.
4.3. Custom fixed tables
Some programs have their own non-IJG quantization tables
that do not depend on the image being processed. For example, when Adobe Photoshop saves an image as a JPEG it allows
the user to select one of 12 quality settings (different settings
are used when saving images "for the web"). The quality
setting is used to select one of 12 sets of quantization tables
(Adobe Systems Incorporated, 2007).
Some devices use their own custom base quantization table with the IJG scaling method. Regardless, these tables depend only on a user-selected quality factor and the base quantization table. Unlike the images described in the next section, the image itself is not part of the equation.
One of the more frustrating aspects of these devices is that
there is no provable method for finding the original tables or
scaling method used by an image creation device. The examiner could reverse engineer the device in question, but doing
so is not often practical for an illicit imagery investigation.
Given an image and its quantization tables, it is possible to compute a base table, T'_b, for any assumed value of Q between 1 and 99 and the IJG scaling method. The equation for doing so is shown in Eq. (3). Note that the floor operation used in generating T_s in Eq. (2) means that the values obtained for T'_b may not be equal to the true values of T_b.

T'_b[i] = (100 × T_s[i] − 50) / S                            (3)
The examiner could check each value of T'_b by attempting to use it to scale up to the current value of T_s and the values of T_s for other images made by the same image creation device. If the computed value of T'_b is too small to generate the correct value of T_s, then T'_b must be manually increased until it is sufficient to create T_s when scaled. In the event that T'_b is too large for another image, the image was not created using the IJG scaling method and must be considered as a custom adaptive image described in Section 4.4.
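The estimation and checking procedure can be sketched as follows, assuming IJG-style scaling with a guessed quality factor; the helper names are illustrative. Because Eq. (2) floors its result, the value recovered by Eq. (3) is only a lower bound on the true base entry, which is why the adjustment loop only ever raises it.

```python
# Sketch of Eq. (3) plus the scale-up-and-adjust check described above.

def scaling_factor(quality):
    """Eq. (1)."""
    return 5000 // quality if quality < 50 else 200 - 2 * quality

def scale_entry(value, quality):
    """Eq. (2) applied to a single table entry."""
    return max(1, (scaling_factor(quality) * value + 50) // 100)

def candidate_base(scaled, quality):
    """Eq. (3): T'_b[i] = (100 * T_s[i] - 50) / S, in integer math."""
    s = scaling_factor(quality)
    return [(100 * t - 50) // s for t in scaled]

def fit_base(scaled, quality):
    """Raise each candidate entry until it reproduces the observed table.
    Returns None if an entry overshoots, i.e. the observed table is not
    consistent with IJG-style scaling from any base at this quality."""
    base = candidate_base(scaled, quality)
    for i, target in enumerate(scaled):
        while scale_entry(base[i], quality) < target:
            base[i] += 1
        if scale_entry(base[i], quality) != target:
            return None
    return base
```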
The problem with all of the above calculations, however, is that the examiner does not know the true value of Q used for any of the images from the device. The method described in Chandra and Ellis (1999) can be used to estimate a value for Q', or the quality factor used to scale the IJG standard table to the closest possible value of T_s. That estimate of Q provides a good starting point, but in the end we have a system with many possible solutions. The examiner cannot determine which solution is correct.
The author constructed a program that, given an initial image, accepted a value for Q' from the user. This value was used to compute the T'_b tables, which were then scaled to the values of Q from 1 to 99 and compared to other images generated by the same device. After each image, the tables were adjusted to fit the current image. If the tables no longer fit all of the images, the images were re-categorized as having custom adaptive tables as described in Section 4.4. In general the program was deemed impractical as it did not identify any plausible base tables.
4.4. Custom adaptive tables
These images do not conform to the IJG standard. In addition,
they may change, either in part or as a whole, between images
created by the same device using the same settings. They may
also have constants in the tables: values that do not change regardless of the quality setting or image being processed.
For example, the author examined 21 pictures captured
with a Fuji Finepix A200 camera. Of these pictures, eight
images had identical quantization tables; four pictures shared
one set of tables but had differences in the other two tables.
The remaining 13 images all had unique quantization tables.
What struck the author as odd, however, was that for all 21
images the first value in each of the three quantization tables
was four. The other values in the tables ranged from 1 to 24, a wide range. The author could not devise a set of base quantization tables that would include such a wide variation of values but keep one member of the tables constant. The author has hypothesized that this camera uses one constant value in each table but scales the remainder of them.
It should be noted that the camera's manufacturer, the Fuji Xerox Company Limited, holds several patents regarding 'image creation apparatuses.' These include at least one that describes creating custom quantization tables based on the image being processed (Yokose, 2005). In that particular patent, a base quantization table is modified depending not only on the standard scaling method, but also on the resolution of the original uncompressed image.
5. Using quantization tables for ballistics
An examiner can encounter hundreds of thousands of images
in the course of a single investigation. As noted above, it may
be difficult to prosecute an offender using images that have
been retouched by a computer. Given the large volume of images, it would benefit an examiner to only consider those images that could be used for prosecution, and thus only
consider images that have not been altered.
JPEG quantization tables can be used for digital ballistics to
identify and eliminate from consideration those images that
most likely have been altered by a computer. That is, those images whose quantization tables are the most likely to have
been generated by software can be eliminated from the
investigation.
This method may have some false positives, or images that
were not modified by a computer but are still eliminated from
the investigation. But given the scale of such investigations
and how few images are needed for a successful prosecution,
a few false positives are acceptable.
There will be some special cases, however, where the
quantization tables indicate the image was last modified by
software but it has other indicators that it originally came
from a camera. For example, the image could contain a complete set of EXIF data from a known camera or color signatures of real skin. In this case the examiner could use the quantization tables as part of a larger system to evaluate images.
Ideally, the examiner could use the JPEG quantization
tables to determine exactly what kind of device created each
image and categorize the images accordingly. The program
JPEGsnoop aims to do exactly this (Hass, 2008). The program comes with a database of tables that can be compared against input files. Unfortunately, however, JPEGsnoop assumes that each camera can use only one quantization table. The use of custom adaptive tables means that programs like JPEGsnoop would need to hold an unwieldy number of tables to be practical. Worse, some tables may be used by several devices, including both cameras and software programs, rendering the database inaccurate when attempting to determine an image's origin.
6. Calvin
The author has developed a software library called Calvin to help programmers use quantization tables for digital ballistics. The goal of Calvin is to identify those images that cannot be guaranteed to have been created by a real camera. For our purposes this means any image that could have been last processed by software. The program was named in honor of Dr. Calvin Goddard, the inventor of forensic ballistics (Federal Bureau of Investigation, 2003).
The Calvin library is able to display the quantization tables
from existing images and determine if a new table is in a set of
known tables. By default the library contains the standard tables, extended standard tables, and the tables used by Adobe Photoshop. Additional tables can be loaded from a configuration file. The library can be used to generate these configuration files from existing images.
6.1. Display mode
The user may wish to add more quantization tables to the set
used by Calvin. The program can extract and display the
quantization table for any image. For ease of use, the output
is presented in the library’s configuration file format. The
standard tables scaled with Q = 80 (shown earlier in Fig. 2) are shown as a configuration file entry in Fig. 3.
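Calvin's code and configuration format are not reproduced here, but the extraction step itself is easy to sketch independently: in a JFIF file the quantization tables live in DQT (0xFFDB) marker segments, which can be read straight out of the file. The stand-alone parser below is illustrative and is not Calvin's implementation or its output format.

```python
import struct
import sys

def extract_dqt(path):
    """Return {table_id: 64 values in zig-zag order} from a JFIF file by
    walking its marker segments and decoding every DQT (0xFFDB) segment."""
    with open(path, "rb") as f:
        data = f.read()
    if data[:2] != b"\xff\xd8":                        # SOI marker
        raise ValueError("not a JPEG/JFIF file")
    tables = {}
    i = 2
    while i + 4 <= len(data):
        if data[i] != 0xFF:                            # lost marker sync; give up
            break
        marker = data[i + 1]
        if marker == 0xFF:                             # fill byte
            i += 1
            continue
        if marker in (0xD9, 0xDA):                     # EOI or SOS: tables come before this
            break
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:   # stand-alone markers, no payload
            i += 2
            continue
        (length,) = struct.unpack(">H", data[i + 2:i + 4])
        if marker == 0xDB:                             # DQT segment
            seg = data[i + 4:i + 2 + length]
            j = 0
            while j < len(seg):
                precision, table_id = seg[j] >> 4, seg[j] & 0x0F
                j += 1
                if precision == 0:                     # 8-bit entries
                    values = list(seg[j:j + 64])
                    j += 64
                else:                                  # 16-bit entries
                    values = list(struct.unpack(">64H", seg[j:j + 128]))
                    j += 128
                tables[table_id] = values
        i += 2 + length
    return tables

if __name__ == "__main__":
    for table_id, values in sorted(extract_dqt(sys.argv[1]).items()):
        print("quantization table", table_id)
        for row in range(8):                           # printed in zig-zag (storage) order
            print(" ".join(f"{v:4d}" for v in values[8 * row:8 * row + 8]))
```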
6.2. Comparison mode
An examiner can also use the Calvin library to compare the
quantization table from an unknown image to the set of
known tables. The user presents the library with an unknown
file and is told whether or not the quantization table is
contained in the known set. Presuming that the set of known
signatures contains only the tables used by software programs, a negative response means that the image in question
was possibly created by a hardware device. It could have been
created by a program that uses tables not in the set of knowns.
Fig. 3 Sample Calvin configuration file entry.
Conversely, a positive response from the library means that
the image was most likely last modified by a program. It could
have been created by a hardware device that uses the same
tables as a known software program.
It is also possible that a real picture that has been processed by a software package, for example cropped, would get a positive response from Calvin. Such a picture would most likely have been recompressed with the quantization tables used by the program, not the original quantization tables that the hardware device wrote into the image.
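The comparison itself amounts to a set-membership test followed by careful interpretation of the result, as the caveats above describe. The sketch below is not Calvin's API: `tables` stands for the mapping produced by a DQT parser such as the one sketched in Section 6.1, and `known_software_tables` is a hypothetical collection of table signatures the examiner has gathered from software encoders (for example, the 99 scaled IJG tables plus Photoshop's fixed tables).

```python
# Rough sketch of the comparison-mode decision; names are illustrative.

def classify(tables, known_software_tables):
    """Triage an image by its quantization tables.

    tables: {table_id: 64 values} as extracted from the file.
    known_software_tables: set of flattened signatures from software encoders.
    """
    signature = tuple(tuple(tables[k]) for k in sorted(tables))
    if signature in known_software_tables:
        # Most likely last written by a program -- or by a hardware device
        # that happens to share its tables with one.
        return "likely processed by software"
    # Not in the known set: possibly straight from a camera, but it could be
    # a program whose tables are simply missing from the known set.
    return "possibly unmodified camera output"
```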
7. Conclusion
The author has demonstrated how JPEG quantization tables
can be used for digital ballistics to eliminate images that could
not be used in a prosecution for illicit imagery. The methodology is not perfect, but given the large number of available
images and the small number needed in court, it should be
sufficient. A more elegant solution, however, would be to
combine this kind of digital ballistic information with other
metadata from an image. Other factors, such as the presence
or absence of EXIF data, signatures of known programs, and
color signatures of real skin, could reduce the examiner’s
workload even more. In the meantime, however, using JPEG
quantization tables for digital ballistics is a big step forward
for examiners and should improve their productivity and
success.
Acknowledgments
The author would like to thank the people who provided both
pictures and support during this research: Rik Farrow, Joe
Lewthwaite, Brian Martin, Jennifer Reichwein, and Peiter
"Mudge" Zatko. Invaluable technical support was provided
by Robert J. Hansen. Extra special thanks to S–.
references
Adobe Systems Incorporated. Adobe photoshop. CS3 ed.; 2007.
Askey.Net Consulting Limited. Digital photography review,
<http://www.dpreview.com/>; June 2007.
Beretta Giordano, Bhaskaran Vasudev, Konstantinides
Konstantinos, Natarajan Balas K. US Patent 5,883,979: method
for selecting JPEG quantization tables for low bandwidth
applications; 1999.
Chandra Surendar, Ellis Carla Schlatter. JPEG compression metric
as a quality aware image transcoding. In: Proceedings of the
2nd USENIX symposium on internet technologies & systems;
1999. p. 81–92.
Costa LF, Veiga ACP. A design of JPEG quantization table using
genetic algorithms. In: Proceedings of the ACIT signal and
image processing; 2005.
Fan Zhigang, de Queiroz Ricardo. Maximum likelihood estimation
of JPEG quantization table in the identification of bitmap
compression history. IEEE Transactions on Image Processing
2000;1:948–51.
Fan Zhigang, de Queiroz Ricardo. Identification of bitmap
compression history: JPEG detection and quantizer estimation.
IEEE Transactions on Image Processing 2003;12(2).
Farid Hany. Digital image ballistics from JPEG quantization.
Technical Report TR2006-583, Department of Computer
Science, Dartmouth College; 2006.
Federal Bureau of Investigation. The birth of the FBI’s technical
laboratory, <http://www.fbi.gov/hq/lab/labdedication/
labstory.htm>; 2003.
The GIMP Team. GNU image manipulation program. 2.2.15 ed.,
<http://gimp.org/>; 2007.
Hass Calvin. JPEGsnoop. 1.2.0 ed., <http://www.
impulseadventure.com/photo/jpeg-snoop.html>; 2008.
Irfan Skiljan. IrfanView. 4.0 ed.; 2007.
Joint Photographic Experts Group. Information technology – digital compression and coding of continuous-tone still images: requirements and guidelines. ISO/IEC 10918-1:1994; 1991.
JPEG Group. libjpeg. 6b ed., <http://www.ijg.org/>; 1998.
Lukáš Jan, Fridrich Jessica. Estimation of primary quantization matrix in double compressed JPEG images. In: Proceedings of the 2003 digital forensic research workshop, SUNY Binghamton; 2003.
Microsoft Corporation. Microsoft Paint overview.
Neelamani Ramesh (Neelsh), de Queiroz Ricardo, Fan Zhigang, Dash Sanjeeb, Baraniuk Richard G. JPEG compression history estimation for color images. IEEE Transactions on Image Processing June 2006;15(6).
Onnasch Prause, Ploger. Quantization table design for JPEG
compression of angiocardiographic images. Computers in
Cardiology 1994.
Sencar Husrev T, Memon Nasir. Overview of state-of-the-art in
digital image forensics. Part of Indian Statistical Institute
Platinum Jubilee Monograph series ‘Statistical Science and
Interdisciplinary Research’; 2008.
Supreme Court of the United States. Ashcroft v. Free Speech Coalition; 2002. Case 00-795.
Wallace Gregory K. The JPEG still picture compression standard.
IEEE Transactions on Consumer Electronics 1991;38(1):18–34.
Wallace Gregory K. JPEG file interchange format. C-Cube
Microsystems September 1992.
Wang Ching-Yang, Lee Shiuh-Ming, Chang Long-Wen. Designing
JPEG quantization tables based on human visual system.
Signal Processing: Image Communication 2001;16(5):501–6.
Watson Andrew B. DCT quantization matrices visually optimized
for individual images. In: Proceedings of the society for optical
engineering; 1993. p. 202–16.
Yokose Taro. US Patent 6,968,090: image coding apparatus and
method; 2005.
Jesse D. Kornblum is a Research and Development Engineer
for the Defense Cyber Crime Institute. A contractor with the
ManTech International Corporation, his research focuses on
computer forensics and computer security. He has authored
and maintains a number of computer forensics tools including
foremost, md5deep and ssdeep. When choosing sodas,
Mr. Kornblum prefers cane sugar to high fructose corn syrup.