How can I delete RAW files only when a matching JPEG exists?

Asked 12/14/2020

We’re migrating a very large photo archive into a new DAM and want to remove RAW files only when there is a corresponding JPEG version. Most of the RAW files are .cr2. Is there a practical way—such as a script or application—to identify RAW files that have matching JPEGs and then delete only those RAWs? Assume the RAW and JPEG may be matchable by filename, and folder structure may matter.

Originally by Photography Stack Exchange contributor. Source · Licensed CC BY-SA 4.0

2 Answers

This is a mistake. Disk space is cheap. A RAW image is only about twice the size of a good JPEG. For my Nikon D7100 the comparison is about 30 MB vs 15 MB.

Here's what is going to happen: Marketing is going to get a JPEG. They are going to say, "That sky isn't blue enough. Let's saturate it more." And they edit it. And lo and behold, because you're mapping 8 bits of information into 8 bits of information, there are rounding steps, and the sky is banded or becomes mottled. And that expensive model's flawless skin is now pixelated and looks like it's made from coarse sandpaper.
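To make the rounding argument concrete, here is a minimal sketch in Python. It is a toy model, not how any real editor works: the 14-bit source depth, the 2x contrast stretch standing in for the saturation edit, and the 0.40 to 0.50 "sky" ramp are all assumptions chosen for illustration.

```python
def quantize(x, bits):
    """Round a value in [0, 1] to the nearest level representable with `bits` bits."""
    top = (1 << bits) - 1
    return round(x * top) / top


def boost(x, gain=2.0):
    """Hypothetical edit: stretch contrast around mid-gray, clipped to [0, 1]."""
    return min(1.0, max(0.0, 0.5 + gain * (x - 0.5)))


def output_levels(source_bits, samples=1000):
    """Count distinct 8-bit output levels after editing a smooth gradient."""
    levels = set()
    for i in range(samples):
        x = 0.40 + 0.10 * i / (samples - 1)      # smooth "sky" ramp (assumed range)
        stored = quantize(x, source_bits)        # what the source file kept
        levels.add(round(boost(stored) * 255))   # edit, then export as 8-bit
    return len(levels)


from_jpeg = output_levels(8)    # edit starts from an 8-bit source
from_raw = output_levels(14)    # edit starts from an assumed 14-bit source
print(from_jpeg, from_raw)
```

The 8-bit source ends up with roughly half as many distinct tones spread over the same output range, so every other level is missing. Those gaps are the bands.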

Back into Photoshop. Mask the sky. Introduce noise into the saturation channel. Now increase saturation. OK, it worked this time. But it took 15 minutes of an expensive person's time. (Good Photoshop techs don't come cheap.) Or worse, they just blur the sky. No bands, but it loses something. Cloud edges don't pop anymore.

Never throw information away.

46,000 images at 30 MB each would be 1.38 TB. Buy a pair of enterprise quality 2 TB drives, and mirror them. You're set up for a few years.
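The storage arithmetic, as a quick sketch. The per-file sizes are the D7100 figures quoted above, and decimal terabytes (1 TB = 1,000,000 MB) are an assumption:

```python
IMAGES = 46_000
RAW_MB = 30             # approximate RAW size quoted above
JPEG_MB = 15            # approximate JPEG size quoted above
MB_PER_TB = 1_000_000   # decimal terabytes (assumption)


def total_tb(count, mb_each):
    """Total size in TB for `count` files of `mb_each` megabytes."""
    return count * mb_each / MB_PER_TB


raw_tb = total_tb(IMAGES, RAW_MB)     # space the RAW masters need
jpeg_tb = total_tb(IMAGES, JPEG_MB)   # space the JPEGs need
print(f"RAW: {raw_tb:.2f} TB, JPEG: {jpeg_tb:.2f} TB")
```

So deleting every RAW would free about 1.38 TB, less than one of those 2 TB drives.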

A larger problem is keeping the versioning in sync. The JPEG should show up in your system as an image derived from the RAW master, and keywords applied to the master should propagate to the JPEG. Whether you can do this depends on the DAM software you have.

Tips: You need unique IDs for images in the system.

Look at using exiftool and using metadata to rename images. I would suggest naming them:

OriginalCreatedDateTime.hundredths_CameraMake-SerialNumber

So 2020-01-11_10:25:15.72_Canon-1127341.cr2

This guarantees you a unique name even if you are a local newspaper with 11 Canon cameras on staff. Note: Use a naming scheme that does not include spaces or characters that have special meaning to various operating systems. Avoid / @ # & < > ! ? * at least.
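A minimal sketch of building such a name in Python, assuming you have already read the creation time, make, and serial number from the metadata (with exiftool, for example). The function name `unique_name` is hypothetical. It also makes one deliberate change to the pattern: the time uses hyphens rather than colons, because a colon is not allowed in Windows filenames.

```python
from datetime import datetime


def unique_name(created, make, serial, ext):
    """Build DateTime.hundredths_Make-Serial.ext from already-extracted metadata."""
    # Hundredths of a second, taken from the microsecond field.
    hundredths = created.microsecond // 10_000
    # Assumption: hyphens in the time instead of colons, to stay Windows-safe.
    stamp = created.strftime("%Y-%m-%d_%H-%M-%S")
    # Keep only letters and digits so no spaces or special characters slip in.
    clean_make = "".join(c for c in make if c.isalnum())
    clean_serial = "".join(c for c in serial if c.isalnum())
    clean_ext = ext.lower().lstrip(".")
    return f"{stamp}.{hundredths:02d}_{clean_make}-{clean_serial}.{clean_ext}"


example = unique_name(datetime(2020, 1, 11, 10, 25, 15, 720000), "Canon", "1127341", "CR2")
print(example)
```

Two frames from the same camera only collide if they share the same hundredth of a second, so check for an existing file before renaming anyway.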

Note that this fails big time with scanned images. Scanned images in DAMs are a difficult proposition. You need to run a salvage operation for metadata.

Your DAM should be set to write this unique ID into any image on export as a keyword. That way, 2 years from now, when the Marketing department says, "We need a 3000 pixel version of this image for a billboard instead of the 256 pixel version used on our mobile website," you can actually find it. (Yes, this happens. I'm doing it now for my website. For 2000 images.)

Originally by user17873. Source · Licensed CC BY-SA 4.0

AI Answer

Generated from our catalog & community — verify before relying on it.

Yes—if the RAW and JPEG files can be matched reliably, this is straightforward with a script or careful file-management workflow. The key details are:

  • whether the JPEG uses the same basename as the RAW (for example, IMG_1234.CR2 and IMG_1234.JPG)
  • whether they are in the same folder or a consistent folder structure
  • what operating system you’re using

If naming and folders are consistent, a simple script can scan RAW files, check for a matching JPEG, and delete only the RAWs that have a JPEG counterpart. This should run quickly even on a large archive.
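As a rough sketch of that kind of script in Python: it assumes a RAW and its JPEG share the same basename (compared case-insensitively) and sit in the same folder, and the extension sets are assumptions to adjust for your archive. The function names are hypothetical. It defaults to a dry run that only lists what would be deleted.

```python
from pathlib import Path

RAW_EXTS = {".cr2"}             # assumption: add other RAW extensions as needed
JPEG_EXTS = {".jpg", ".jpeg"}   # assumption: JPEG extensions in your archive


def find_redundant_raws(root):
    """Return RAW files under `root` that have a same-named JPEG in the same folder."""
    jpeg_keys = set()
    raws = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        ext = path.suffix.lower()
        key = (path.parent, path.stem.lower())   # folder + case-insensitive basename
        if ext in JPEG_EXTS:
            jpeg_keys.add(key)
        elif ext in RAW_EXTS:
            raws.append((key, path))
    return sorted(path for key, path in raws if key in jpeg_keys)


def delete_redundant_raws(root, dry_run=True):
    """List (dry run) or delete the RAW files that have a matching JPEG."""
    matches = find_redundant_raws(root)
    for raw in matches:
        if dry_run:
            print(f"would delete {raw}")
        else:
            raw.unlink()
    return matches
```

Call `delete_redundant_raws("/path/to/archive")` first and review the printed list, then repeat with `dry_run=False` once the matches look right. If your JPEGs live in a parallel folder tree instead, the key would need to map one folder onto the other, which this sketch does not attempt.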

You could also sort by filename in a file manager and manually remove files with the RAW extension, but that’s only safe if the matches are obvious and consistent.

One caution from the community: deleting RAWs is often irreversible and may limit future editing quality, since RAW files preserve more image data than JPEGs. If there’s any chance you’ll need to re-edit important images later, consider archiving the RAWs instead of deleting them.

UniqueBot
