Processing photos on demand with Photon

  • By Michael Fairley
  • August 3, 2012

Photon in action

I’m excited to announce that we’re open sourcing the core of our on-demand photo processing pipeline: photon-core.

At 1000memories, photos are king. We help families and groups of friends share old paper photographs with one another, and our users expect to see their photos quickly and with high quality. We’ve spent a lot of time perfecting the processing pipeline for the photos on our site, and we’ve found an approach that gives everyone (our users & our engineers) a great experience.

Images play an important role on most websites today, be it user avatars, photos of products, or funny memes. Photon-core makes it easy for sites of any size to do on-the-fly processing of images.

Processing photos (the hard way)

For the first ~2 years of our existence, we processed our photos in delayed jobs, using ImageMagick (via Paperclip) to generate 12 or so different styles. Most of these were different sizes used on our website and in our iPhone app, at both retina and non-retina densities. Processing a photo into these 12 sizes took about 10 seconds, which was painful for a handful of reasons:

  1. Slowness – When our users would upload a photo, they would see a spinner for 10+ seconds. There was a similar delay when rotating and cropping images.
  2. Engineering pain – As we got ready to release the Android version of ShoeBox, we needed some new sizes of the photos, and we had to backfill the new sizes for all the old photos, an annoying process that took 100 Resque processes 1.5 days to finish.
  3. Waste – Even though we generated these 12 different styles, very rarely did all of the sizes ever get displayed (e.g. most of our iPhone users have retina displays, so the non-retina sizes are rarely used), wasting both storage space and CPU.

Enter Photon

We decided that processing our images on the fly would solve all of these problems. We built a small Dropwizard service called Photon that handles almost all of our image processing with a pull-based model, and it lets our users see their photos less than a second after upload.

With our new pipeline, when a user uploads a photo, our web process immediately puts it on S3 without doing any processing.

Our backend then spits out URI templates that look like http://photon-example.herokuapp.com/michaelfairley;w={width}. Each of our frontends (web, iPhone, Android) can fill in the width parameter with the width (in pixels) of the photo it needs, resulting in a URL like http://photon-example.herokuapp.com/michaelfairley;w=200, and use that as the source of an <img> tag (or our custom URLImageView on iOS).
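As a rough illustration of what a frontend does with one of these templates, here is a minimal Python sketch. The function name fill_width and the scale argument (for retina-style displays) are assumptions made for this example; they are not part of photon-core.

```python
def fill_width(template: str, display_width: int, scale: float = 1.0) -> str:
    """Substitute a pixel width into a Photon-style URI template.

    `scale` is a hypothetical device pixel ratio (e.g. 2 for a retina display).
    """
    width = round(display_width * scale)
    if width <= 0:
        raise ValueError("width must be a positive number of pixels")
    return template.replace("{width}", str(width))


template = "http://photon-example.herokuapp.com/michaelfairley;w={width}"

# A 100-point-wide slot on a 2x display needs a 200-pixel-wide image.
print(fill_width(template, 100, scale=2))
```

Because each client asks only for the width it actually displays, no size is generated unless some device requests it.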

Photon can also handle cropping and rotating with additional matrix parameters. Now, when one of our users rotates a photo, we just store the rotation angle in our database and use it when building the URL for Photon: http://photon-example.herokuapp.com/michaelfairley;w=200;r=180.
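A backend could assemble such a URL from the stored rotation along these lines. This is only a sketch: the helper name photon_url and the choice to omit r when the rotation is zero are assumptions for illustration, and only the w and r parameters shown above are used.

```python
def photon_url(base: str, photo_id: str, width: int, rotation: int = 0) -> str:
    """Build a Photon-style URL with matrix parameters (;key=value segments)."""
    params = [("w", width)]
    rotation %= 360
    # Assumption: leave out r entirely for unrotated photos, so that each
    # rendition has exactly one URL.
    if rotation != 0:
        params.append(("r", rotation))
    suffix = "".join(f";{key}={value}" for key, value in params)
    return f"{base}/{photo_id}{suffix}"


base = "http://photon-example.herokuapp.com"
print(photon_url(base, "michaelfairley", 200, rotation=180))
```

Rotating a photo then costs one database write; no image is reprocessed until a client requests the new URL.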

Photon-example

Photon-example is an example app that uses photon-core to process Twitter avatars. Look at the examples, and play around with your own avatar.

Caching

Processing an image file is fairly slow, often taking 500ms, so we want to do it as rarely as possible. The URLs that Photon uses are constructed to be completely deterministic: GETting the same URL will always return exactly the same photo. Consequently, we can cache the heck out of the results.
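To make the reasoning concrete: if the URL fully determines the output, the URL itself can serve as the cache key. The Python sketch below uses a hypothetical UrlCache class and a stand-in fake_process function in place of real image processing; neither is part of Photon.

```python
from typing import Callable, Dict


class UrlCache:
    """Memoize an expensive, deterministic function keyed by request URL."""

    def __init__(self, process: Callable[[str], bytes]) -> None:
        self._process = process
        self._store: Dict[str, bytes] = {}
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url not in self._store:
            # Only a miss pays the (slow) processing cost.
            self.misses += 1
            self._store[url] = self._process(url)
        return self._store[url]


def fake_process(url: str) -> bytes:
    # Stand-in for real image processing; depends on the URL alone.
    return f"image for {url}".encode()


cache = UrlCache(fake_process)
first = cache.get("/michaelfairley;w=200")
second = cache.get("/michaelfairley;w=200")
print(cache.misses)
```

This only works because nothing outside the URL (no cookies, headers, or mutable state) influences the result.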

In our setup, we have a Varnish layer in front of Photon that holds a single cached copy of each image that has been requested in the past few days, as well as a CDN in front of Varnish for an additional layer of caching (and the obvious performance benefits that come from a CDN). This caching strategy gets our photos to our users’ screens much more quickly and significantly reduces the number of machines we need running Photon.
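The layering can be modeled as a chain of lookups that falls through to Photon only when every cache misses. The sketch below is a toy model using plain dictionaries; it does not reflect how Varnish or any particular CDN is configured, and the fetch function and layer names are assumptions.

```python
from typing import Callable, Dict, List


def fetch(url: str, layers: List[Dict[str, bytes]],
          origin: Callable[[str], bytes]) -> bytes:
    """Check cache layers from outermost to innermost, then fall back to origin."""
    for depth, layer in enumerate(layers):
        if url in layer:
            body = layer[url]
            break
    else:
        depth = len(layers)
        body = origin(url)
    # Fill every layer that missed, so the next request is served closer to the user.
    for layer in layers[:depth]:
        layer[url] = body
    return body


origin_calls: List[str] = []


def origin(url: str) -> bytes:
    origin_calls.append(url)  # count how often the slow path runs
    return b"processed image"


cdn: Dict[str, bytes] = {}
varnish: Dict[str, bytes] = {}
url = "/michaelfairley;w=200"

fetch(url, [cdn, varnish], origin)  # misses both layers, hits the origin
fetch(url, [cdn, varnish], origin)  # served by the CDN layer
cdn.clear()                         # simulate the CDN evicting the entry
fetch(url, [cdn, varnish], origin)  # served by the Varnish layer
print(len(origin_calls))
```

In this model the origin does the expensive work once per distinct URL, as long as some layer still holds the result.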

Issues

We’ve bumped into a few rare cases where Photon didn’t work quite as well as the old ImageMagick pipeline, but we found a workaround that we’re happy with.

  1. Formats – The Java ImageIO API doesn’t work for some images (less common formats, weird color spaces, etc.)
  2. Size – Our users upload huge photos. We have photos > 100 megapixels/100MB (yes that’s > 400MB in memory once it’s decompressed into a bitmap). It’s not practical to process photos this big each time they’re needed.

We solved both of these by putting a mechanism in place that uses Paperclip & ImageMagick to process photos that hit these corner cases down into ~1MB JPEGs, which we then use as the input to Photon.
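The memory figure above follows from simple arithmetic, and the routing decision can be sketched the same way. In this Python sketch, the 4 bytes per pixel, the supported-format set, and the 25-megapixel threshold are all assumptions chosen for illustration, not the values we use in production.

```python
BYTES_PER_PIXEL = 4  # assumed: 8 bits each for red, green, blue, and alpha


def decoded_size_bytes(width: int, height: int,
                       bytes_per_pixel: int = BYTES_PER_PIXEL) -> int:
    """Approximate memory needed to hold the fully decoded bitmap."""
    return width * height * bytes_per_pixel


def needs_preprocessing(width: int, height: int, image_format: str,
                        supported_formats=frozenset({"jpeg", "png", "gif"}),
                        max_pixels: int = 25_000_000) -> bool:
    """Decide whether a photo should be shrunk to a small JPEG ahead of time.

    Both the format set and the pixel threshold are hypothetical.
    """
    unsupported = image_format.lower() not in supported_formats
    too_big = width * height > max_pixels
    return unsupported or too_big


# A 10,000 x 10,000 photo is 100 megapixels, or 400,000,000 bytes decoded.
print(decoded_size_bytes(10_000, 10_000))
print(needs_preprocessing(10_000, 10_000, "jpeg"))
```

Photos that pass the check go straight to Photon; the rest take the slower one-time conversion path first.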

Photon-core

We’ve pulled the parts of Photon that aren’t specific to our setup out into photon-core, so that anyone can set up an on-the-fly photo processing service without too much difficulty. Check out photon-example to see how to use it.

Feel free to get in touch with us if you have any questions/comments/feedback, and we certainly welcome any pull requests or issues on the GitHub project.
