How should I organize a large photo library with overlapping research projects?
Asked 6/23/2019
I have about 10,000 research photos for a biology PhD, with more coming from future field seasons. The images belong to 4–5 main projects, each with sub-projects, and some photos are relevant to more than one project. I need a structure that makes it easy to find images by project, sub-project, and general topic without duplicating work.
Would you recommend organizing the files primarily by date, by project, or by taxonomy/topic? Is it better to keep a simple folder structure and use keywords/metadata for the overlapping categories? I’m on Linux and have tried digiKam, but I’m mainly looking for a general strategy rather than OS-specific advice.
Originally by Photography Stack Exchange contributor. Source · Licensed CC BY-SA 4.0
2 Answers
The first step should be to conduct a detailed review of the data and establish solid project expectations, so you can make these sorts of decisions with confidence.
The key questions we're looking to answer are:
- How much overlap can an image have between sub-projects?
- How much metadata do we want to embed in the file structure vs an external storage/library file?
- How much work do we want to do ourselves, vs how much would we rather push off onto the computer to do for us?
Personally, I would use Windows with Lightroom as an image library management solution, but this is less than ideal if you intend to remain within a Linux ecosystem.
However, tools like Lightroom are a fairly bloated option, with a lot of extra features that we probably don't need for this kind of project.
In a Linux environment we may be better off scripting much of the handling ourselves rather than relying on ready-made tools. [It is also an excellent skill-building task that gives highly useful experience in data management.]
Manually sorting images into folders is 'less than ideal', prone to error, and awkward to reliably correct. This is especially true if there are large overlaps in sub-projects that any given image is likely to be involved in, or if you decide a major change is required later on.
- When dealing with data, we can waste our time, or we can waste a computer's time. Choose wisely.
Keywording and database interaction are a far more robust option than getting overly complicated with folders. Unless there is effectively zero overlap between the sub-projects an image belongs to, it is far better to let the computer "do the sorting" for us.
Keep the core of the archive simple with a standard timestamp-based file structure.
Project/Year/Month/Day/[timestamped_filename]
or even just
Project/Year_Month_Day/[timestamped_filename]
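As a rough illustration of the first layout, here is a minimal Python sketch that builds the destination path for one image. The function name `archive_path` and the `YYYYMMDD_HHMMSS_` filename prefix are my own assumptions, not a standard; where the timestamp comes from (EXIF data, file modification time, field notes) is left open.

```python
from datetime import datetime
from pathlib import PurePosixPath


def archive_path(project: str, taken_at: datetime, original_name: str) -> PurePosixPath:
    """Build Project/Year/Month/Day/<timestamp>_<original name>.

    The timestamp prefix format is an assumption chosen so that
    filenames sort chronologically within a day folder.
    """
    stamp = taken_at.strftime("%Y%m%d_%H%M%S")
    return PurePosixPath(
        project,
        f"{taken_at:%Y}",
        f"{taken_at:%m}",
        f"{taken_at:%d}",
        f"{stamp}_{original_name}",
    )


# Example with a made-up capture time and filename.
print(archive_path("Project", datetime(2019, 6, 23, 14, 5, 9), "IMG_0042.jpg"))
# Project/2019/06/23/20190623_140509_IMG_0042.jpg
```

Keeping the original camera filename after the timestamp makes it easier to trace an archived file back to the card it came from.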
From here we either want existing software that can act like Lightroom (or some other image cataloging software), or we create scripts to deal with things for us.
The general workflow would be:
- Import images [or their file names] into a database. [flagged as 'new']
- Keyword and add metadata as required by the project. [Once finished, remove the 'new' flag. Even in something like Lightroom, we want a clear indicator of whether the entry for a given image is 'finished' or needs more work before it can move forward within the project.]
- Define 'Views' of the data based on the above keywording and metadata to select the specific images needed for a given state of the project.
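The workflow above could be sketched with SQLite roughly as follows. This is only one possible layout: the table names, the `is_new` column, and the helper functions are all hypothetical, and excluding still-'new' images from a view is my own design choice rather than a requirement.

```python
import sqlite3

# Hypothetical schema: one row per image, one per keyword, and a link
# table so an image can carry any number of keywords (many-to-many).
SCHEMA = """
CREATE TABLE IF NOT EXISTS images (
    id     INTEGER PRIMARY KEY,
    path   TEXT UNIQUE NOT NULL,
    is_new INTEGER NOT NULL DEFAULT 1
);
CREATE TABLE IF NOT EXISTS keywords (
    id   INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE IF NOT EXISTS image_keywords (
    image_id   INTEGER NOT NULL REFERENCES images(id),
    keyword_id INTEGER NOT NULL REFERENCES keywords(id),
    PRIMARY KEY (image_id, keyword_id)
);
"""


def open_catalog(db_path=":memory:"):
    """Open (or create) the catalog; call conn.commit() for on-disk files."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn


def import_image(conn, path):
    """Step 1: register an image; it starts out flagged as 'new'."""
    conn.execute("INSERT OR IGNORE INTO images (path) VALUES (?)", (path,))


def tag_image(conn, path, keyword):
    """Step 2: attach a keyword (does nothing if the image is not imported)."""
    conn.execute("INSERT OR IGNORE INTO keywords (name) VALUES (?)", (keyword,))
    conn.execute(
        """INSERT OR IGNORE INTO image_keywords (image_id, keyword_id)
           SELECT i.id, k.id FROM images i, keywords k
           WHERE i.path = ? AND k.name = ?""",
        (path, keyword),
    )


def mark_finished(conn, path):
    """End of step 2: clear the 'new' flag once keywording is done."""
    conn.execute("UPDATE images SET is_new = 0 WHERE path = ?", (path,))


def images_with_keyword(conn, keyword):
    """Step 3: a simple 'view' of finished images carrying one keyword."""
    rows = conn.execute(
        """SELECT i.path FROM images i
           JOIN image_keywords ik ON ik.image_id = i.id
           JOIN keywords k ON k.id = ik.keyword_id
           WHERE k.name = ? AND i.is_new = 0
           ORDER BY i.path""",
        (keyword,),
    )
    return [row[0] for row in rows]
```

Because keywords live in their own table, one image can belong to several projects and sub-projects without being copied or moved.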
Project file structure then looks like:
\Project\
-\Core Image folder\[Subfolders]
-\Library, database, or Metadata\
-\Temporary Exports or 'views' folders\ {Flexible data generated on the fly as needed}
If you are comfortable with scripting and simple databases, and you don't need the robust image review/editing tools of a more complex piece of software like Lightroom, it is fairly easy to build a basic toolchain yourself that generates view folders containing symlinks back to the original source images.
The specifics of how to implement something like this yourself can vary widely, but the heart of it is to define the target for a specific grouping as a database query, whose results you then pass through a file management script.
For example, select all images that have the keyword Alpha to create a list, and use that list to pull copies/links into a Sub-Project View Folder without changing the data stored in the Core Image Folder.
If requirements of a sub-project change, then the old folder can be deleted and a replacement generated on the fly without having to manually copy and paste files to or from folders.
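A minimal sketch of the file management half, assuming the database query has already produced a list of image paths. The function name `rebuild_view` is hypothetical, and the sketch assumes filenames are unique across the archive (which timestamped filenames should mostly give you); a name clash raises `FileExistsError` rather than being silently resolved.

```python
import shutil
from pathlib import Path


def rebuild_view(view_dir, image_paths):
    """Throw away an old view folder and regenerate it as symlinks.

    Only the links in view_dir are deleted; the files they point to in
    the core image folder are never touched. Be careful to pass a
    dedicated view folder here, since its contents are removed.
    """
    view = Path(view_dir)
    if view.exists():
        shutil.rmtree(view)  # removes the symlinks, not their targets
    view.mkdir(parents=True)
    for src in image_paths:
        src = Path(src).resolve()
        (view / src.name).symlink_to(src)
    return view
```

Usage would be something like `rebuild_view("Views/alpha", paths_from_query)`, rerun whenever the keywording or the sub-project definition changes.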
Remember to back up your core data and metadata! Exports/views can be regenerated on the fly as needed, but you want to make sure the original data is hard to lose or corrupt.
Originally by user82321. Source · Licensed CC BY-SA 4.0
Generated from our catalog & community — verify before relying on it.
For overlapping projects, the safest approach is usually:
- Keep one primary file structure that is simple and consistent — often by date and possibly location, with clear filenames.
- Use metadata/keywords for everything that overlaps, such as project, sub-project, taxonomy, and general-interest categories.
- Let software build virtual groups from those keywords (for example, keyword hierarchies or smart collections).
This avoids forcing each image into only one folder when it may belong to several projects.
If you prefer folder-based browsing, another option on Linux is to keep one master set of files and create hard links to them in additional project folders. That lets the same image appear in multiple organizational views without storing multiple independent copies. Note that hard links only work within a single filesystem.
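As a small illustration, assuming the master files and the project folders live on the same filesystem, a hard link can be created with Python's `os.link`. The helper name `link_into_project` is made up for this sketch.

```python
import os
from pathlib import Path


def link_into_project(master_file, project_dir):
    """Make master_file also appear inside project_dir via a hard link.

    Both names refer to the same data on disk, so no extra space is
    used. os.link raises OSError if the two locations are on different
    filesystems.
    """
    master = Path(master_file)
    target_dir = Path(project_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    link = target_dir / master.name
    if not link.exists():
        os.link(master, link)
    return link
```

One thing to keep in mind is that some editing programs save by writing a new file and replacing the old one, which would quietly turn a hard-linked entry into a separate copy.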
So, in practice: use folders for a stable “physical” organization, and use tags/keywords for the many-to-many relationships. A date-based master structure with metadata for project and subject is likely to stay manageable as your library grows.
UniqueBot
AI7y ago