Photography Asked by Ron Whitley on June 30, 2021
I have a number of .jpg images which are EXACTLY (dpi, frame size, image) the same – except one is more than double the file size (in bytes) of the other. How can this be? When I zoom in by, say, 700% on the same spot on each image I see just about the same amount of pixelation, indicating the compression is the same.
If the files are different sizes, the only way the image data can be identical is if the EXIF data is different (an embedded thumbnail would be a good place to start looking).
Suppose there are two files encoding the same image data. One has no EXIF data attached. The other has a full-size copy of the image data as an embedded thumbnail. The file size difference could be about 2:1.
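One way to test that hypothesis is to strip all metadata from both files and see whether the remaining sizes converge. A minimal sketch, assuming the third-party piexif package is installed and using a.jpg/b.jpg as placeholder filenames:

```python
# Strip EXIF data (including any embedded thumbnail) from both files,
# then compare what's left. Assumes `pip install piexif`; a.jpg and
# b.jpg stand in for the two images being compared.
import os
import piexif

for src in ("a.jpg", "b.jpg"):
    dst = "stripped_" + src
    piexif.remove(src, dst)  # copy the file with all EXIF data removed
    print(src, os.path.getsize(src), "->", dst, os.path.getsize(dst))
```

If the stripped files end up nearly the same size, the metadata was the culprit.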
Most likely, though, the two images are simply different.
A pixel-by-pixel diff can be done using ImageMagick's compare command, as outlined in this StackOverflow answer. It will pick up even minor differences.
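If you'd rather stay in Python, here's a rough equivalent sketch assuming the Pillow library, with a.jpg/b.jpg as placeholder filenames:

```python
# Pixel-by-pixel comparison with Pillow instead of ImageMagick.
# getbbox() returns None when the difference image is all zeros,
# i.e. when the two files decode to identical pixels.
from PIL import Image, ImageChops

a = Image.open("a.jpg").convert("RGB")
b = Image.open("b.jpg").convert("RGB")
diff = ImageChops.difference(a, b)
print("identical" if diff.getbbox() is None else f"differs in {diff.getbbox()}")
```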
Answered by Bob Macaroni McStevens on June 30, 2021
The JPEG compression algorithm works like this:

1. The image is converted from RGB into one luminance channel (Y) and two chroma channels (Cb and Cr).
2. The chroma channels may be subsampled*: kept at full resolution, halved (one chroma sample per 2×1 block of pixels), or quartered (one per 2×2 block).
3. Each channel is then transformed and compressed (DCT, quantization, entropy coding).
So if you start with an image with N pixels, you initially have:
N + N + N = 3×N
pixels to save. But with the chroma halved you have:
N + ½N + ½N = 2×N
pixels, and with the chroma quartered you have:
N + ¼N + ¼N = 1.5×N
So, when the chroma is quartered you have already halved the amount of data before even applying the compression, which directly results in a much smaller file.
Some applications have independent settings for quality and chroma, while others use a given subsampling depending on the quality setting.
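You can see the effect of the chroma setting directly. A small sketch, assuming the Pillow library (whose JPEG encoder accepts a subsampling keyword) and a placeholder input.jpg:

```python
# Re-save one image at the same quality but different chroma subsampling.
# In Pillow, subsampling=0 is 4:4:4 (no subsampling), 1 is 4:2:2
# (chroma halved), 2 is 4:2:0 (chroma quartered).
import io
from PIL import Image

img = Image.open("input.jpg").convert("RGB")
for name, sub in (("4:4:4", 0), ("4:2:2", 1), ("4:2:0", 2)):
    buf = io.BytesIO()
    img.save(buf, "JPEG", quality=90, subsampling=sub)
    print(name, buf.tell(), "bytes")
```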
*This works because our eyes are much more sensitive to luminosity than to color. Try this (in Photoshop or Gimp): decompose the image into its Y, Cb, and Cr components, blur the two chroma channels heavily while leaving the luminance channel untouched, then recompose. You won't see much difference from the initial image.
Answered by xenoid on June 30, 2021
Are these two different images, even if they're of the same scene?
For the same perceptual quality, some images compress better than others.
Does one have a lot of smooth or blurred areas (like the sky without a lot of clouds, or only the subject in focus), while another has lots of fine detail, especially with sharp edges like grass and shrubs where the edges of every blade and leaf are visible, or a print fabric?
The extremes are uniform white or black (very tiny), vs. random noise (large file).
The key building block for JPEG is a 2D DCT (Discrete Cosine Transform), applied to every 8x8 block separately, which transforms the spatial pixel information into the frequency domain. A smooth gradient has only low frequencies, while sharp edges like text, blades of grass, and/or leaves of a shrub have lots of high-frequency components. (The 1D version of this is trying to represent a square wave as a sum of cosines.)
This transformation itself isn't particularly lossy. But it tends to produce a lot of coefficients that are nearly zero, and can be rounded off to exactly zero without a big visual quality loss. And the non-zero coefficients can be rounded off (quantized) to less than 8 bits each, depending on the compression level. (How much you quantize, and how aggressively you discard near-zero high frequency components, is basically what the compression quality setting does.)
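Here's a quick numeric sketch of that rounding step, assuming NumPy and SciPy; the smooth 8x8 block and the single flat quantization step are made up for illustration (real JPEG uses a full 8x8 quantization table):

```python
# 2D DCT of one 8x8 block, then quantization. A smooth block concentrates
# its energy in a few low-frequency coefficients, so most of the 64
# coefficients round to exactly zero.
import numpy as np
from scipy.fft import dctn, idctn

# A smooth horizontal gradient, 0..255, centered around zero.
block = np.tile(np.linspace(0, 255, 8), (8, 1)) - 128

coeffs = dctn(block, norm="ortho")
q = 16  # one flat quantization step, for illustration only
quantized = np.round(coeffs / q)
print("non-zero coefficients:", np.count_nonzero(quantized), "of 64")

# Reconstruct to see how little was lost.
restored = idctn(quantized * q, norm="ortho")
print("max pixel error:", np.abs(restored - block).max())
```

Run the same code on a block of random noise and nearly all 64 coefficients survive quantization, which is exactly why noisy images make big JPEGs.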
If you did similar rounding to the raw RGB or YUV pixel data, you'd kind of posterize / cartoonize the image by making areas of similar colour the same colour. But doing it to DCT frequency coefficients can discard way more bits with only minor visual degradation. e.g. ringing around sharp edges, and other usual JPEG artifacts.
(Choosing how to round off the coefficients is a bit of an art as well, with some consideration to the human visual system and what we notice / don't notice.)
After quantization, the rest of JPEG is IIRC lossless, just turning that sparse set of non-zero coefficients into a bit-stream and doing lossless compression (similar to zip) on it.
Fun fact: this is exactly the same math that MP3 / AAC are based on: transform to frequency coefficients and round them off (and fully discard insignificant ones). It's not a coincidence that MP3 audio artifacts include "ringing", just like JPEG looks.
Note that pixelation is not a JPEG artifact. JPEG tends to smooth things out, preferentially discarding high-frequency coefficients (fine details, including sensor noise), not low-frequency ones (the overall shapes within each 8x8 block).
You're probably familiar with what a JPEG of text looks like when you crank up the compression (discard / round off more information to make the file size smaller). Keeping enough bits to make that not happen is why high-quality images with lots of detail need more bits than smoother images.
Text has lots of sharp edges, and thus a lot of "energy" in the high frequency components of the DCT. Representing it accurately requires keeping them.
Same for a noisy image, e.g. from film grain or low light / high ISO on a digital camera. To JPEG, that noise is signal information that it needs to try to reproduce. That can cost a lot of bits.
Semi-related: Why does preset "veryfast" in FFmpeg generate the most compressed file compared to all other presets? - my answer there is about h.264 video compression, but the same tradeoff of quality vs. bitrate applies to still images, just with less importance attached to encode/decode speed (it's only one image) and a much smaller search space for the encoder, because JPEG has no way to say "this block looks like the pixels over here".
More advanced still-image formats do use intra prediction instead of just a separate DCT for each block. e.g. HEIF is an h.265 I-frame, and is claimed to need only about half the storage space of a JPEG of equivalent visual quality, on average. (And it might do even better on an image with a repeating pattern, where many blocks can be coded as "copy the pixels from over there, then apply this difference" instead of starting from scratch with a checkerboard pattern full of high-frequency energy.)
Answered by Peter Cordes on June 30, 2021
"When I zoom in by say 700% on the same spot on each image I see just about the same amount of pixelation, indicating the compression is the same."
Open a JPEG file with an image editor and save the image at quality 85. Close and then reopen the saved image(1), then save it again at quality 100. Now compare the two output files. The second one will have a bigger file size, even though the two images look almost identical.
So different compression levels can give (almost) identical images.
(1): so that the editor re-reads the saved file instead of reusing the image still in memory.
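The same experiment, scripted. A sketch assuming the Pillow library, with input.jpg as a placeholder:

```python
# Save at quality 85, reopen the saved file (not the in-memory image),
# and re-save the same pixels at quality 100. Then compare sizes.
import os
from PIL import Image

Image.open("input.jpg").convert("RGB").save("q85.jpg", quality=85)
Image.open("q85.jpg").save("q100.jpg", quality=100)

print("q85 :", os.path.getsize("q85.jpg"), "bytes")
print("q100:", os.path.getsize("q100.jpg"), "bytes")  # typically much larger
```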
Answered by A.L on June 30, 2021
Maybe it would help if, instead of thinking of an image file as "being" the image or "containing" the image, you thought of it as instructions that tell your computer how to re-create the image.
Suppose that you have a picture, and you want to tell somebody else over the phone, how to re-create it. You could divide your picture up into "pixels" and then tell the other person how many pixels high, and how many pixels wide, and which exact color to paint every single pixel in their copy. If you did that, you would be sending them "uncompressed" image data.
Depending on the image though, you might be able to save some time if you stopped to think about it. Maybe the image contains a lot of white pixels. You could start by saying, "paint every pixel white," and then you could just tell them the colors of the not-white pixels. That would be a form of "compression." Maybe the image contains a lot of different solid colors. You could give them a partial list of pixels, and then say, for all the ones that I didn't tell you to color, just make them the same as the color to the left.
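As a toy illustration of that idea (run-length encoding, not the actual JPEG scheme), here is a sketch of encoding one row of pixels:

```python
# Run-length encoding of one row of pixel values. A row with long runs
# of one color compresses well; a row that alternates does not.
def rle(row):
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([value, 1])  # start a new run
    return runs

smooth = ["white"] * 95 + ["red"] * 5
noisy = ["white", "red"] * 50
print(len(rle(smooth)), "runs for the smooth row")  # 2 runs
print(len(rle(noisy)), "runs for the noisy row")    # 100 runs
```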
The number of different ways to describe an image is practically infinite. And finding the most compact ones is a Hard Problem. Image file formats like .jpg or .png put some narrow limits on the number of different ways you can describe an image, but they allow the creator of the file some leeway to choose just how much time their computer will spend trying to find a good one. Often there's a setting with a name like "compression level," "quality factor," etc.
In a nutshell, when you create a "compressed" image, if you let your computer work harder on it (e.g., choose a higher "compression level") you'll get a smaller file.
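This is easiest to see with a lossless format, where the pixels are guaranteed identical and only the effort changes. A sketch assuming the Pillow library and a placeholder input.png:

```python
# Same pixels, different effort: PNG's compress_level (0-9) trades
# encoding time for file size without changing the image at all.
import io
from PIL import Image

img = Image.open("input.png")
for level in (1, 6, 9):
    buf = io.BytesIO()
    img.save(buf, "PNG", compress_level=level)
    print("compress_level", level, ":", buf.tell(), "bytes")
```

With a lossy format like JPEG, the "quality" setting additionally changes how much information is thrown away, which is a much bigger lever on file size.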
Answered by Solomon Slow on June 30, 2021
The problem is that we forget all of our digital files are made up of binary data; our computers hide those details from us. We simply see in our file explorer that some whatever.jpg file was last updated on Jan. 1, 2021 and is 14.2 MB in size. Then if we open the file, we see an image in our image viewer software, after the computer reads the instructions on how the file should be displayed.
You can use software such as BeyondCompare to analyze differences between files. Below is a screenshot of two images being analyzed using BeyondCompare. The image on the left is ~2.1 MB and the one on the right is ~1 MB. They are both .jpg formatted files, but the one on the right is a copy of the image on the left saved with stronger compression. Visually they look similar.
But we haven't even looked at what our computer sees in order to create the image for us. Below is a screenshot of the same image files using hexadecimal comparison. The hexadecimal values are basically the 'instructions' that the computer reads in order to generate a representation of the image for you.
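You don't need special software to peek at those bytes; a few lines of Python (with a.jpg/b.jpg as placeholder filenames) will find the first offset where two files diverge:

```python
# Compare two files byte by byte and report the first offset where they
# differ -- the same raw view a hexadecimal comparison tool shows.
with open("a.jpg", "rb") as f:
    a = f.read()
with open("b.jpg", "rb") as f:
    b = f.read()

offset = next((i for i, (x, y) in enumerate(zip(a, b)) if x != y), None)
if offset is None and len(a) == len(b):
    print("files are byte-for-byte identical")
else:
    print("first difference at byte offset",
          offset if offset is not None else min(len(a), len(b)))
```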
Answered by w. Patrick Gale on June 30, 2021
A simple way to compare two images which appear to be identical is to use the "Grain Extract" layer blend mode in Gimp.
If the images are identical, the result will be a uniform 50% grey image. If there are differences in luminosity or colour, they will be visible. You can use curves to amplify the differences.
Answered by Ben on June 30, 2021