Can backup compression take advantage of similarities between photos?

Asked 9/29/2023

I want to shrink my photo backups without losing data or deleting near-duplicates. Many of my images come from bursts or sequences with only small changes in angle, pose, lighting, or camera settings, so I’m wondering whether a backup/archive format can exploit those similarities across multiple files.

Would it help to:

  • put many images into one archive with a strong compressor and large dictionary, or
  • treat a sequence of photos like video frames and use inter-frame compression?

I’m mainly interested in reversible, archival storage rather than keeping each image individually readable. Are there established approaches that work well for JPEGs or RAW files, and are there reasons these ideas may not help much in practice?

Originally by Photography Stack Exchange contributor. Source · Licensed CC BY-SA 4.0

2 Answers

A few suggestions:

  1. Max out the dictionary size option on the compression tool you use. This allows more internal re-use of common sequences found in the data. It can waste space on a few small text documents, but with many big photos, the dictionary size is relatively small compared to the data contained, and will pay for itself quickly.

  2. Compress each scene/batch in a different archive. This relates to #1: it lets the parts of the 16x16 block data that appear frequently be replaced by a dictionary entry. If all the photos in an archive are about the same, the compressor can squeeze out the redundancy much more effectively than with a group of, say, vacation pics with food, people, sunsets, landscapes, and lots of other random subjects.

  3. Like #2, compress JPEG and RAW files separately. This increases the redundancy within each archive.

  4. Try several different compression engines. Don't assume zip, rar, or 7z will work best on everything. If you use command-line tools, it's pretty simple to run all of them on each batch: all you have to do is wait, then keep the smallest archive of each group and delete the rest.

  5. Have reasonable expectations. I doubt you'll do better than halving the footprint, and even that could be a tall order depending on the camera, image, and algorithm. Many RAW files these days are already compressed in-camera, so there's less redundancy to wring out.

  6. Experimental. Try using a batch tool to combine many images into a large image file, then try different compression methods designed for images on that file. As a bonus, this should make non-image compression more efficient as well, but it might be a lot of manual work to save just a little bit; something to consider if you're long on time and low on cash/space.
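
Suggestions 1 and 4 can be tried from a script before committing to a tool. The following is only a minimal sketch using Python's standard-library compressors as stand-ins for whatever archivers you actually have installed. The function names `compare_compressors` and `smallest` and the 64 MiB default dictionary are assumptions made for illustration, not settings from this answer.

```python
import bz2
import lzma
import zlib
from typing import Dict


def compare_compressors(data: bytes, dict_size: int = 64 * 1024 * 1024) -> Dict[str, int]:
    """Compress the same bytes with several engines and return each output size."""
    # LZMA2 filter with an explicit dictionary size; the preset supplies
    # defaults for every option that is not given here.
    xz_filters = [{"id": lzma.FILTER_LZMA2, "preset": 9, "dict_size": dict_size}]
    return {
        "zlib-9": len(zlib.compress(data, 9)),
        "bz2-9": len(bz2.compress(data, 9)),
        "xz-large-dict": len(
            lzma.compress(data, format=lzma.FORMAT_XZ, filters=xz_filters)
        ),
    }


def smallest(sizes: Dict[str, int]) -> str:
    """Return the name of the engine that produced the smallest output."""
    return min(sizes, key=sizes.get)
```

To use it, read one batch into memory (for example by joining the bytes of every file in a scene), call `compare_compressors` on the result, and keep whichever engine `smallest` names. Large dictionaries need a lot of RAM during compression, so lower `dict_size` if memory is tight.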

Originally by user76556. Source · Licensed CC BY-SA 4.0

AI Answer

Generated from our catalog & community — verify before relying on it.

Usually, cross-image similarity helps less than you might expect, especially for JPEGs. JPEG already compresses each image internally, and much of what remains is fine detail and noise, which are less reusable across files than broad scene similarity.

What can help:

  • Archive many related images together with a strong general-purpose compressor and the largest dictionary size available.
  • Group similar shots/scenes into separate archives rather than mixing very different subjects.
  • Compress JPEGs and RAW files separately.
  • For RAW files, consider DNG conversion/compression, which can reduce storage.
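
As a rough illustration of the grouping advice above, this sketch sorts files into JPEG and RAW groups and writes one `.tar.xz` archive per group with Python's `tarfile` module. The extension sets, the helper names `group_by_kind` and `archive_groups`, and the output file names are all hypothetical choices, and the RAW list is far from exhaustive.

```python
import tarfile
from collections import defaultdict
from pathlib import Path

JPEG_EXTS = {".jpg", ".jpeg"}
RAW_EXTS = {".cr2", ".nef", ".arw", ".dng", ".raf"}  # illustrative, not exhaustive


def group_by_kind(paths):
    """Split paths into 'jpeg', 'raw', and 'other' by file extension."""
    groups = defaultdict(list)
    for path in map(Path, paths):
        ext = path.suffix.lower()
        if ext in JPEG_EXTS:
            kind = "jpeg"
        elif ext in RAW_EXTS:
            kind = "raw"
        else:
            kind = "other"
        groups[kind].append(path)
    return dict(groups)


def archive_groups(groups, out_dir, preset=9):
    """Write one solid .tar.xz per group and return the archive paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for kind, files in groups.items():
        target = out_dir / f"{kind}.tar.xz"
        with tarfile.open(target, "w:xz", preset=preset) as tar:
            for path in sorted(files):
                # arcname keeps only the file name; files with the same
                # name from different folders would collide in the archive.
                tar.add(path, arcname=path.name)
        written[kind] = target
    return written
```

The same idea extends to grouping by scene or burst: build one group per shoot folder instead of one per file type.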

What likely won’t help much:

  • Expanding JPEGs to bitmaps first usually won’t be a win: the decoded pixel data is far larger than the JPEG, a general-purpose compressor may not shrink it back below the original file size, and JPEG artifacts/noise still limit gains.
  • Encoding photo sequences as video is awkward for archival use and not an established solution for still-photo backups.

So: yes, solid archiving strategy can find some extra savings, but there isn’t a common still-photo backup format that dramatically exploits similarities between separate images the way video codecs do. The biggest practical gains are usually from better archiving settings and more efficient RAW storage, not cross-photo deduplication.
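
One way to check how much cross-file similarity is worth on your own photos is to compare compressing each file separately against compressing them as a single stream. The sketch below does this with Python's `lzma` module. The function names are hypothetical, and it measures one general-purpose compressor only, so treat the result as a rough indication and not as a benchmark.

```python
import lzma


def per_file_size(blobs, preset=6):
    """Total size when every blob is compressed as its own stream."""
    return sum(len(lzma.compress(blob, preset=preset)) for blob in blobs)


def solid_size(blobs, preset=6):
    """Size when all blobs are concatenated and compressed as one stream."""
    return len(lzma.compress(b"".join(blobs), preset=preset))


def solid_saving(blobs, preset=6):
    """Fraction saved by solid compression relative to per-file compression."""
    blobs = list(blobs)
    if not blobs:
        return 0.0
    separate = per_file_size(blobs, preset)
    return 1 - solid_size(blobs, preset) / separate
```

Feed it the bytes of a burst, for example `solid_saving(p.read_bytes() for p in burst_paths)` with `burst_paths` being your own list of files. A value near zero means the compressor found little to reuse across files. Matches can only reach back as far as the dictionary size, so files larger than the dictionary cannot benefit from each other at this preset.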

UniqueBot