How can I detect corrupted Canon CR2 raw files in a large archive?
Asked 12/11/2016
I discovered that some of my Canon CR2 raw files are corrupted: the EXIF and file structure seem intact, and Lightroom or Adobe DNG Converter may import them without errors, but the actual raw image data shows visual distortions once rendered. The embedded preview initially looks fine, which makes me think only the raw data is damaged.
I have a large archive of about 18,000 files and want to know whether CR2 files contain checksums or any built-in integrity mechanism that can be verified. Is there a reliable way to identify corrupted CR2 files without manually inspecting every image?
Originally by user59122. Source · Licensed CC BY-SA 4.0
user59122
9y ago
2 Answers
2
This has gone unanswered for a long time, so I'll take a shot at a summary.
Most raw files, including (as best I know) Canon's, do not include a checksum in the original file as delivered by the camera, so there is no sure way to know that a given copy is correct. Some files may be so damaged that they fail to import, but often they still "work" in most programs and only look wrong once rendered.
I believe the closest most of us come to a workable solution to the issue is to (1) review the files as soon as you can, and fix any that have issues, plus (2) use a workflow that is designed to detect and correct "bit rot" afterwards.
Often, if the problem is detected early, while a copy of the image is still on the card, recovery software or just downloading again may correct the issue. Obviously the problem is noticing quickly, and I have no good solution there other than reviewing soon after shooting.
The second half is, in my opinion, even more important: digital images can "rot" over time. Yes, they can become corrupt just sitting there with you doing nothing to them. Copying to replacement computers or disks, edits that go bad, file system corruption (e.g. index problems that cause file A to be saved over part of file B), and plain disk read errors can all silently corrupt images. Prolific photographers who collect terabytes of data can quickly reach volumes where the expected error rate of hard drives starts to matter. Some such errors are not detected by the hardware, and even when they are, the desktop operating systems we use tend not to handle them reliably (or, at times, even report them).
A number of techniques can address this:
1) Use tools that compute a checksum of each file once it is loaded, and check it against a recalculated checksum later. Image Verifier was a commercial tool for that, and others have contributed public domain tools (I wrote one called LR Validate for Windows/Lightroom).
2) Use file systems that build in such detection, e.g. ZFS and Btrfs, and Microsoft's newer ReFS (I personally think it is too immature). A ZFS scrub (started with "zpool scrub"), for example, will read back all stored data and check that it still matches its checksums.
3) Convert to DNG and use the checksum validation (this is only a partial validation of the file contents; it checks the image portion but not all the metadata, so it may detect image corruption over time, but fail to report that your develop settings were lost). This is the easiest technique for most people (though it comes at a cost of using DNG -- I personally do not do so).
All of these are predicated on the idea you do backups, and when a "rotten" situation is detected can recover from one of (presumably) several copies backed up.
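As a minimal sketch of technique 1, assuming plain Python and SHA-256 (this is not how Image Verifier or LR Validate work internally; the function names and the "*.CR2" pattern are made up for illustration), a manifest of checksums can be built once and rechecked later. Note that it can only detect changes that happen after the manifest was built; it says nothing about files that were already damaged.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large raw files are never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root, pattern="*.CR2"):
    """Map each matching file (path relative to root) to its SHA-256.

    The pattern is case-sensitive on most Linux file systems, so adjust it
    if the archive also uses lower-case extensions.
    """
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob(pattern))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Return (changed, missing) relative paths compared with the manifest."""
    root = Path(root)
    changed, missing = [], []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file():
            missing.append(rel_path)
        elif sha256_of(target) != expected:
            changed.append(rel_path)
    return changed, missing
```

The manifest is an ordinary dictionary, so it can be stored next to the archive (for example as JSON) and passed back to verify_manifest after each copy or backup cycle. Any path it reports as changed is a candidate for restoring from a backup copy.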
However, I realize this does not address the core issue of "if I may have allowed corruption in, how can I find them now". The best suggestion I have is to start with a mass DNG conversion (you need not do this to keep, you can just run the program standalone against folders to an output folder you plan to trash). Image Verifier, if I recall, does this internally. This will not catch all errors, but it may find some.
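The same "try to fully decode everything and see what fails" idea can be sketched in Python. This swaps in a different decoder rather than Adobe DNG Converter: it assumes the third-party rawpy package (a LibRaw wrapper) is installed, and the function names are hypothetical. Like the DNG conversion, it will only flag files the decoder actually rejects; data that decodes cleanly but looks wrong will pass.

```python
from pathlib import Path


def decode_with_rawpy(path):
    """Fully decode one raw file; assumes the third-party rawpy package."""
    import rawpy  # imported lazily so the scanner has no hard dependency

    with rawpy.imread(str(path)) as raw:
        raw.postprocess()  # forces the raw image data to be decoded


def find_undecodable(root, decode=decode_with_rawpy, pattern="*.CR2"):
    """Return a list of (path, error message) for files whose decode raises."""
    failures = []
    for path in sorted(Path(root).rglob(pattern)):
        try:
            decode(path)
        except ImportError:
            raise  # a missing decoder is a setup problem, not a bad file
        except Exception as exc:  # any decoder error marks the file as suspect
            failures.append((path, f"{type(exc).__name__}: {exc}"))
    return failures
```

The decoder is passed in as a parameter so that a different one (for instance, a wrapper that runs a command-line converter) can be substituted without changing the scan loop.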
The rest, I believe, you will need to find by eye. You can just let Lightroom (if you use it) rebuild the previews and then scan through them; gross problems may jump out, minor ones likely will not.
Originally by user28109. Source · Licensed CC BY-SA 4.0
user28109
9y ago
0
Generated from our catalog & community — verify before relying on it.
Generally, no: camera-generated CR2 files typically do not contain a built-in checksum you can use later to verify that the raw image data is still intact. That means a file can have valid metadata and structure, import into Lightroom, and still render with image corruption.
So there usually isn’t a reliable tool that can scan a large CR2 archive and definitively flag every visually corrupted file just from the file itself.
What you can do:
- Review images soon after download, while the originals are still on the memory card.
- If corruption is found early, try copying again from the card or using recovery software.
- Use a storage/backup workflow designed to detect bit rot over time, rather than relying on the CR2 format itself.
In short, for existing CR2s, visual inspection or re-rendering is often the only practical way to catch this kind of corruption, because the embedded JPEG preview may look fine even when the raw data is damaged.
UniqueBot
AI · 9y ago