How can I match unorganized RAW files to an already organized JPEG library?

Asked 1/16/2023

I have a Lightroom library of about 90,000 organized JPEGs. Over the years I renamed files, corrected capture dates, added ratings and keywords, and removed duplicates. I’ve now received many original RAW files that correspond to images already in that JPEG library, but the RAWs are completely unorganized.

The problem is that filenames and capture dates often no longer match, so simple metadata matching may not work. I’m looking for a way to associate each RAW with its corresponding JPEG, ideally by image content if needed.

Are there tools or workflows that can help match a folder of RAW files to an existing organized JPEG collection?

Originally by Photography Stack Exchange contributor. Source · Licensed CC BY-SA 4.0

2 Answers

Not a full answer because I cannot suggest a tool.

If you didn't destroy the EXIF, there is plenty of information in there which, in combination, is possibly fairly unique per picture:

  • Camera model and serial number will discriminate pictures from different cameras.
  • With the same camera, a combination of ISO, Exposure, Aperture, Measured EV, Focus distance, Focal length, and possibly some AF information should be fairly unique and be encoded identically in the JPEG and the raw file.

Experiments:

Generate data for the collection on my hard disk:

exiftool -progress -r -ISO -Aperture -ExposureTime -MeasuredEV -MeasuredEV2 -FocusDistanceLower -FocusDistanceUpper -FocalLength -csv . > allData.csv

Churn all the data with a quick Python script:

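#!/usr/bin/python3

import csv
import sys

# Key is the EXIF data (a tuple of column values), value is the file name
jpegs = {}
raws = {}

collisions = 0

# newline='' lets the csv module handle quoted fields, e.g. paths containing commas
with open(sys.argv[1], newline='') as file:
    reader = csv.reader(file)
    next(reader, None)  # skip the header row written by exiftool -csv
    for row in reader:
        if not row:
            continue
        name, data = row[0], tuple(row[1:])
        if 'IMG_' not in name:
            continue  # only consider files that kept the camera's naming
        if name.endswith('.JPG'):
            if data in jpegs:
                collisions += 1
                print(f'{name} collides with {jpegs[data]}')
            else:
                jpegs[data] = name
        elif name.endswith('.CR2'):
            if data not in raws:
                raws[data] = name
        # other file types are ignored

print(f'Jpegs: {len(jpegs):d}, Raws: {len(raws):d}')
print(f'Collisions: {collisions:d}')

# Count the raws whose EXIF data matches no JPEG
orphanRaws = sum(1 for data in raws if data not in jpegs)

print(f'Orphan raws: {orphanRaws:d}')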

On my collection this simplistic approach yields 12% collisions, but:

  • I found that I had a significant number of duplicates
  • Most collisions come from either bursts or pictures taken in all manual mode during the same session, so these are at least the same subject.

The last test in the script also shows that, given the EXIF, most CR2s can be linked back to their JPG (those that cannot turn out to be culling leftovers that should be erased).
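The linking step could be taken further so that it outputs the actual RAW-to-JPEG pairs. What follows is a minimal sketch, not the script that produced the numbers above. It assumes a CSV written by exiftool -csv (first column SourceFile). The function name pair_raws and the two extension sets are hypothetical and would need adjusting to the cameras involved.

```python
import csv
import os
from collections import defaultdict

# Assumed extension sets; extend them for other cameras.
JPEG_EXTS = {'.jpg', '.jpeg'}
RAW_EXTS = {'.cr2'}


def pair_raws(csv_file):
    """Pair raws with JPEGs that share identical exported EXIF values.

    Returns (unique, ambiguous, orphans):
      unique:    {raw_path: jpeg_path}       exactly one JPEG shares the key
      ambiguous: {raw_path: [jpeg_path...]}  several JPEGs share the key
      orphans:   [raw_path...]               no JPEG shares the key
    """
    jpegs_by_key = defaultdict(list)
    raws = []
    for row in csv.DictReader(csv_file):
        path = row.pop('SourceFile')
        # The key is every exported EXIF value, in column-name order.
        key = tuple(row[col] for col in sorted(row))
        if not any(key):
            continue  # no usable EXIF at all, so nothing to match on
        ext = os.path.splitext(path)[1].lower()
        if ext in JPEG_EXTS:
            jpegs_by_key[key].append(path)
        elif ext in RAW_EXTS:
            raws.append((path, key))

    unique, ambiguous, orphans = {}, {}, []
    for path, key in raws:
        candidates = jpegs_by_key.get(key, [])
        if len(candidates) == 1:
            unique[path] = candidates[0]
        elif candidates:
            ambiguous[path] = candidates  # e.g. bursts or all-manual sessions
        else:
            orphans.append(path)
    return unique, ambiguous, orphans
```

Calling pair_raws(open('allData.csv', newline='')) would return the three groups. Keeping the ambiguous group separate means bursts and manual-mode sessions can be reviewed by eye instead of being paired arbitrarily.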

Originally by user75947. Source · Licensed CC BY-SA 4.0

AI Answer

Generated from our catalog & community — verify before relying on it.

Yes, this is possible, but the best approach depends on what metadata remains.

First, check metadata before relying on image-content matching. JPEGs may still contain fields such as PreservedFileName, RawFileName, DateTimeOriginal, or CreateDate. If so, tools like ExifTool can export metadata from both sets and help you match files in a table.
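As a rough illustration of the filename check, the sketch below assumes the JPEG library has been exported with something like exiftool -r -csv -RawFileName -PreservedFileName . and that a list of RAW paths is available. The function name match_by_preserved_name is hypothetical. Names are compared without extension, because a preserved name may be the original JPEG name and not the RAW name. That comparison is an assumption about how the files were produced.

```python
import csv
import os


def _stem(name):
    """Lower-cased file name without directory or extension."""
    return os.path.splitext(os.path.basename(name))[0].lower()


def match_by_preserved_name(jpeg_csv_file, raw_paths):
    """Map raw path -> JPEG path using RawFileName / PreservedFileName.

    A pair is accepted only when exactly one raw carries that name,
    since camera counters wrap around and names can repeat.
    """
    raws_by_stem = {}
    for path in raw_paths:
        raws_by_stem.setdefault(_stem(path), []).append(path)

    matches = {}
    for row in csv.DictReader(jpeg_csv_file):
        for tag in ('RawFileName', 'PreservedFileName'):
            original = _stem((row.get(tag) or '').strip())
            candidates = raws_by_stem.get(original, [])
            if original and len(candidates) == 1:
                matches[candidates[0]] = row['SourceFile']
                break  # first tag that yields a unique raw wins
    return matches
```

Raws left unmatched by this step can then go through the EXIF-combination comparison.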

Even if filenames and dates differ, other EXIF fields can sometimes uniquely identify a match: camera model/serial, ISO, aperture, shutter speed, focal length, measured EV, focus distance, and AF-related data. ExifTool can extract these for comparison.

If metadata is incomplete, a DAM tool with visual matching can help. IMatch was specifically recommended as capable of matching images by visual content, and ACDSee may also offer content-based comparison.
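How IMatch or ACDSee compare images internally is not stated here. To give a feel for content-based matching in general, the sketch below shows one common generic technique, a difference hash. It is an illustration only, not what those tools do. It assumes each image (for a RAW, typically its embedded preview) has already been decoded and downscaled to a small grayscale grid, such as 9 columns by 8 rows, by an imaging library that is not shown. The names dhash and hamming are hypothetical.

```python
def dhash(gray):
    """Difference hash of a grayscale grid (a list of equal-length rows).

    Each bit records whether a pixel is brighter than its right-hand
    neighbour, so uniform brightness shifts leave the hash unchanged.
    """
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes (0 means identical)."""
    return bin(a ^ b).count('1')
```

A RAW and a JPEG whose hashes differ by only a few bits are candidates for the same picture. Crops and heavy edits will increase the distance, so any threshold would have to be tuned on known pairs.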

So the practical workflow is:

  1. Inspect JPEGs for preserved original/raw filename tags.
  2. If absent, export EXIF from both RAW and JPEG collections and match on a combination of capture settings.
  3. Use a visual-matching DAM tool if metadata matching isn’t enough.

There doesn’t appear to be a widely known automatic one-click tool dedicated specifically to RAW-to-JPEG pairing, but these methods should get you close.

UniqueBot
