The past couple of weeks I've had the good fortune to work with a great client on a Rails app that gets a lot of traffic. His server was groaning, and the wheels were starting to fall off the cart. So we dug in. And when we were done, we had a pretty cool caching setup that I wanted to share.
Here's the scene: The site is getting millions of pageviews per month, the original developer is off doing other things, and the site needs some help. My first question was, "are you caching?" Yes, caching was in place -- action caching, in fact. And the cache was working quite well... except when new items got published and the entire cache needed to get flushed. Then things went boom. It didn't help that the cache was being stored on the file system.
After re-reading Tobi's excellent post on the topic, I decided that trying to untangle the expiry issues was the path to madness, so I set about implementing his tip: do a quick freshness check against the database before serving cached content. But I was using (and wanted to continue to use) action caching, and Tobi's method wasn't a perfect fit for that scenario. I needed to be able to force a cache miss (if necessary) before the action got invoked. Enter the Action Cache plugin.
This plugin is a drop-in replacement for the default action cache in Rails, but it has additional features like sending a 304 response for unchanged content and providing callbacks to determine whether an action should be cached, and what the cache key should be. Bingo. Oh, and I also used the Memcached Fragments plugin to make Rails caching play nicely with the memcache-client gem. Onward.
So, I created my cache key callback method in a few controllers, like so:
class NewsController < ApplicationController
  caches_action :index

  protected

  # Pull the latest fragment based on freshness
  def action_fragment_key(options)
    "#{url_for(options).split('://').last}:#{Post.latest_date.to_i}"
  end
end
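To make that concrete, here's roughly what that callback produces for the news index (the hostname and timestamp below are made up):

# url_for(options).split('://').last  #=> "example.com/news"
# Post.latest_date.to_i               #=> 1196207462
#
# resulting cache key: "example.com/news:1196207462"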
The default cache key is the part of the string before the colon, and the freshness check makes sure that any time a post changes, a new cache entry gets generated. So far this is pretty straightforward, and it looks a lot like what Geoffrey is doing. However, as people have commented on those two blog posts, this still means a database hit for every page, so it's not as fast as it could be. In fact, if that Post.latest_date call is expensive (and it can be, depending on how you define what the latest post is), you may be causing yourself some serious pain. This is when you can use memcached for some additional fun.
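For context, this is what the bare-bones, memcached-free version of that check looks like -- the thing we're about to improve on. A minimal sketch, assuming maximum(:updated_at) is a good-enough stand-in for however you define "the latest post":

class Post < ActiveRecord::Base
  # Naive freshness check: hits the database on every single page view.
  def self.latest_date
    maximum(:updated_at) || Time.now.utc
  end
end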
What if you cache the freshness check too, but for a set period of time? Then you can have the best of both worlds. You don't have to hit the database for the freshness check with every page load, and you can still make sure that when that freshness check fails (albeit a little delayed) you'll get a cache miss and fresh content.
So, as long as memcache_util is required somewhere (the same spot where you set up the memcached connection and the CACHE constant wouldn't hurt), you can do something like this:
class Post < ActiveRecord::Base
  # Keep the timestamp of the latest change in memcached for a short TTL,
  # falling back to the database on a cache miss or expiration.
  def self.latest_date
    Cache.get("latest_post", CACHE_TTL_IN_SECONDS) do
      connection.select_value("some potentially expensive query that returns a datetime column")
    end.to_time rescue Time.now.utc
  end
end
That will check memcached for the timestamp of the latest change, pulling it from the database if there was a cache miss or expiration.
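For reference, the connection setup alluded to above can live in environment.rb (or wherever you configure the app). This is just a sketch -- the server address, namespace, require paths, and TTL value are placeholders, not part of the original setup:

# memcache_util works against a CACHE constant pointing at a memcache-client connection.
require 'memcache'
require 'memcache_util'

CACHE = MemCache.new('localhost:11211', :namespace => 'my_app')

# How long the cached freshness check is trusted before hitting the database again.
CACHE_TTL_IN_SECONDS = 300 # five minutes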
So, a brilliant idea from my client Jed (to use the hybrid approach), plus some excellent plugins and gems, adds up to a really peppy caching configuration.