Don't Sort Photos by Date First

Jul. 1, 2022

Every photographer's organisation system that I've seen online has them sorting photos by date - and only date. This seemed obvious to me until recently, when I sat down to organise my own multi-hundred-GB photo collection before a backup.

TL;DR: Organising photos by location first makes retrieval much faster than organising by date only.

My photo organisation was very bare bones. Every time I emptied my memory card after a day of shooting, I would create a new folder within /home/pawan/Pictures for the day or occasion, and dump both the RAWs and JPEGs into the folder. The name of the folder might carry some hint about the date, or it might not. I was not too concerned about building a more formal process than this because I only take photos in my spare time.

What this left me with over the years, however, is a flat folder with all my collections. As the number of collections grew, it got more difficult to find things because stuff from various times and occasions was mixed together with no further hierarchy. Not too great! I wanted to improve this situation before I started uploading about 300GB to my backup destination - a multi-day commitment on my measly 3MB/s up-link.

After giving it some thought - about 30 minutes of it, to be precise - I realised that a purely date-based organisation system was not going to be very friendly. Without going too deep into things, any organisation problem is also a retrieval problem. We only need organisation and sorting because we think we might need to find things in the future. This is certainly true of my photo collection. With this in mind, you probably want to organise your data in a way that makes the kind of retrieval you expect to do as efficient as possible.

As a very simple example - a sorted list (sorted by some default ordering of the items) makes binary search possible, which makes finding an item in the list very fast. But if you only want to confirm that an item does not exist in a collection, then a probabilistic data structure like a Bloom filter or a Cuckoo filter can answer that using far less memory, with a very different organisation of the same data (at the cost of occasional false positives). One can say similar things about a relational (SQL) database and search engines (inverted indices). It all depends on the pragmatics of retrieval. A good organisation system should be able to reduce the search space as fast as possible given the context.
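To make the contrast concrete, here is a minimal Python sketch of both ideas. It is only an illustration: the file names are made up, and TinyBloomFilter is a hypothetical toy class written for this post (one byte per bit, a salted SHA-256 standing in for several hash functions), not a production-grade filter.

```python
import bisect
import hashlib


def contains_sorted(sorted_items, target):
    """Binary search: O(log n) membership test on an already sorted list."""
    i = bisect.bisect_left(sorted_items, target)
    return i < len(sorted_items) and sorted_items[i] == target


class TinyBloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per "bit", for simplicity

    def _positions(self, item):
        # Derive several positions by salting a single hash function.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely absent"; True only means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


# Hypothetical file names, purely for illustration.
names = sorted(["IMG_0003.RAF", "IMG_0001.RAF", "IMG_0002.RAF"])
bloom = TinyBloomFilter()
for name in names:
    bloom.add(name)
```

The same three names end up organised in two completely different ways: the sorted list keeps the items themselves in order, while the filter keeps only a handful of set bits and can only ever say "definitely not here" or "maybe here".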

So what are the pragmatics of retrieval in my collection of photos?

  • I remember photos by location (where they were taken) much more clearly than the time.
  • I generally also remember the season something was taken in - which roughly corresponds to the month at a given location.

So the best way to organise my photos is to first organise them by location, and then within a location, organise them by season. Further organisation could be year or month based if I really wanted.

With this in mind, I sat down to sort my photos - but then I quickly realised that there was an almost 1:1 correspondence between the location and the season when most of my photos were taken - I'm a very seasonal photographer! (This is related to the kind of photos I like to take - but that's out of scope for this post.) What that meant was that I was not gaining any extra efficiency by adding another layer in the hierarchy for seasons, so I abandoned it.

In the end, my folder organisation looks like this (the locations and dates are fake):

★ 𝞴 tree . -d -L 5
.
└── Fuji
    ├── Canada
    │   ├── Kingston
    │   │   └── 2022
    │   │       └── 09-Kingston
    │   ├── Montreal
    │   │   ├── 2012
    │   │   │   └── 05-Montreal
    │   │   ├── 2013
    │   │   │   └── 07-Montreal
    │   │   └── 2022
    │   │       └── 09-Montreal
    │   ├── Ottawa - Gatineau

The leaf folders are named with the format of <MONTH_ORDINAL>-<NAME> where <NAME> can be free-form and <MONTH_ORDINAL> is the two-digit ordinal position of the month in the calendar (e.g. 07 for July). I chose to keep some temporal metadata in the folder hierarchy because it incurred no additional cost and still allows me to filter roughly by “season” if I want to (e.g. summer = all folders in Canada beginning with 05, 06, 07 or 08).
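If I ever wanted to script that rough season filter instead of eyeballing folder names, something like the following Python sketch would do. The function names (parse_leaf, folders_in_months) are made up for this example, and it assumes every leaf folder follows the <MONTH_ORDINAL>-<NAME> format described above; anything else (year folders, location folders) is simply skipped.

```python
import re
from pathlib import Path

# Two digits, a hyphen, then a free-form name, e.g. "07-Montreal".
LEAF_PATTERN = re.compile(r"^(\d{2})-(.+)$")


def parse_leaf(name):
    """Return (month, free_form_name) for a leaf folder name, else None."""
    match = LEAF_PATTERN.match(name)
    if not match:
        return None
    month = int(match.group(1))
    if not 1 <= month <= 12:
        return None
    return month, match.group(2)


def folders_in_months(root, first_month, last_month):
    """List folders under root whose month ordinal is in the inclusive range."""
    matches = []
    for path in Path(root).rglob("*"):
        if not path.is_dir():
            continue
        parsed = parse_leaf(path.name)
        if parsed and first_month <= parsed[0] <= last_month:
            matches.append(path)
    return sorted(matches)
```

With this, folders_in_months("Fuji/Canada", 5, 8) would return every leaf folder under Canada whose name starts with 05 through 08, regardless of city or year. The year folders (e.g. 2022) are ignored because they do not have a hyphen after the first two digits.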

That's it! I don't know why people choose to categorise by dates first as the default option. Pictures already (generally) have the date in their EXIF data, and most photo management tools can scan a directory and construct a timeline; so if I really need to find something by date first, I have that option at my disposal. I typically turn off geotagging for privacy reasons, so I can't rely on automated sorting by location. This approach gives me the flexibility of both, without inadvertently exposing location through pictures I post online. More than that, it gives me freedom from relying on any photo management tool for finding things fast - just my file-system is enough!

Your use case may be different from mine - you may be a photojournalist and your sorting system may need to be tied to the publication, in which case a date-first system might make more sense - so evaluate your own use case first. Just don't cargo-cult organisation systems. I'm much more pleased with mine, even if it deviates from the usual.


Posted under photography productivity