Matt Coneybeare

Make Your Own Cache System in Ruby on Rails

In a previous post I explained how to set up and use Amazon’s Web Services to find and display album covers in Ruby on Rails. To keep that post simple, I left out the cache code. In a nutshell, the album art process happens in three steps:
  1. I get the XML data for my most recent tracks from Last.fm
  2. I query Amazon with the artist, album and track information to get the URL for the photo
  3. I get the photo from that URL
I cache all three of these steps. I don’t need instant updates of my music, so in step 1 I cache the feed and let it expire after 60 seconds. I don’t expect artists to release new products that often, so in step 2 I cache the data Amazon returns for 30 days. Lastly, an album cover never changes, so in step 3 I cache the picture on disk forever. Let’s dive right in, shall we?
First I create a new model called “Fetcher”:
script/generate model fetcher
All the methods will be class methods, because it makes no sense to have multiple instances of our in-memory cache. The cache itself lives in a global that is created only once, so every caller shares the same one. In my initialization method “get_fetcher”, I also set up the base directories where I will store my downloaded album covers and Flickr pics.
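def self.get_fetcher
  # create the in-memory cache only once; later calls reuse it
  $cache ||= {}
  # absolute paths, used when writing files to disk
  $album_cache_dir = "#{RAILS_ROOT}/public/images/album_cache"
  $photo_cache_dir = "#{RAILS_ROOT}/public/images/flickr_cache"
  # paths relative to public/images, returned to callers
  $fixed_cache_dir = "album_cache"
  $fixed_photo_dir = "flickr_cache"
end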
I need to make sure any controller that might use the fetcher has an initialized one to start with, so I put this code in all the controllers:
before_filter :get_fetcher
And this in application.rb:
helper_method :get_fetcher
def get_fetcher
  Fetcher.get_fetcher
end
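The globals above do the job, but the same "one shared cache" idea can also be kept on the class itself. What follows is only a minimal sketch and not the code this site runs: `FetcherSketch`, `cache`, and `setup` are hypothetical names, and the `root` argument stands in for `RAILS_ROOT`.

```ruby
# Hypothetical alternative to the globals: keep the shared state on the class.
class FetcherSketch
  class << self
    attr_reader :album_cache_dir, :photo_cache_dir

    # The one in-memory cache shared by every caller.
    def cache
      @cache ||= {}
    end

    # Safe to call from a before_filter on every request: the cache hash
    # is only created once, and the directories are simply recomputed.
    def setup(root)
      @album_cache_dir = File.join(root, "public", "images", "album_cache")
      @photo_cache_dir = File.join(root, "public", "images", "flickr_cache")
      cache
    end
  end
end
```

Calling `FetcherSketch.setup` repeatedly hands back the same hash each time, which is the property the before_filter relies on.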

Caching a URL

In config/environment.rb:
require 'net/http'
And in models/fetcher.rb:
def self.url_fetch(url, max_age=0)
  # if the URL exists as a key in cache and the data is still fresh,
  #  we just return the cached body
  if $cache.has_key?(url) && Time.now - $cache[url][0] < max_age
    return $cache[url][1]
  end

  # if the URL does not exist in cache or the data is not fresh,
  #  we fetch again and store in cache
  body = Net::HTTP.get_response(URI.parse(url)).body
  $cache[url] = [Time.now, body]
  body
rescue
  # on any error, return the stale copy if there is one, otherwise nil
  $cache[url] ? $cache[url][1] : nil
end
So the “cache” is really just an in-memory hash whose values are two-element arrays: the time the entry was created and the data. Call this method from anywhere you need to fetch data from a URL. To get the data from the Last.fm XML feed, I call it from my get_recent_covers method:
def get_recent_covers(num)
  url = "http://ws.audioscrobbler.com/1.0/user/coneybeare/recenttracks.xml"
  lastfm_doc = Fetcher.url_fetch(url, 60) # 1 minute
  ...
end
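To make the expiry logic easier to try out without a network connection, here is a small sketch of the same idea: a hash of key => [time stored, data], a freshness check, and a fallback to stale data when the fetch fails. `TtlCache` and its `fetch` method are hypothetical names of mine, and the optional `now` argument exists only so the clock can be controlled when experimenting.

```ruby
# Hypothetical helper: a hash of key => [stored_at, data], as described above.
class TtlCache
  def initialize
    @store = {}
  end

  # Returns the cached data for key if it is younger than max_age seconds.
  # Otherwise runs the block, stores its result with the current time, and
  # returns it. If the block raises, falls back to the stale data (or nil).
  def fetch(key, max_age, now = Time.now)
    entry = @store[key]
    return entry[1] if entry && now - entry[0] < max_age

    data = yield
    @store[key] = [now, data]
    data
  rescue StandardError
    entry ? entry[1] : nil
  end
end
```

In a controller, the block would be the slow part, for example the `Net::HTTP.get_response(URI.parse(url)).body` call, so the network is only touched on a miss or after the entry has expired.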

Caching the Amazon Product Data

Similar to the URL caching, the Amazon product cache is an in-memory cache that expires after 30 days. I would like to change this to an on-disk cache to save resources, but I haven’t had the time to think it through. Maybe a reader will have a good idea (hint, hint). For more information on setting up your code to use Amazon’s web services, see this post. Put this in your models/fetcher.rb:
def self.amazon_fetch(artist, max_age=0)
  # if the artist exists as a key in cache, we just return it
  # we also make sure the data is fresh
  if $cache.has_key? artist
    return $cache[artist][1] if Time.now - $cache[artist][0] < max_age
  end
  # if the artist does not exist in cache or the data is not fresh,
  #  we fetch again and store in cache
  request = Request.new(DEV_TOKEN, ASSOCIATES_ID)
  begin
    response = request.artist_search artist
    products = response.products
  rescue
    # there was no exact match for artist
    products = []
  end
  $cache[artist] = [Time.now, products]
  return products
rescue
  # on any error, return what I had before (or nil if nothing was cached)
  $cache[artist] ? $cache[artist][1] : nil
end
I call this method by doing something like this:
def album_cover_fetch(artist)
  amazon_products = Fetcher.amazon_fetch(artist, 30 * 24 * 60 * 60) # 30 days, in seconds
  ...
end
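As for the on-disk idea mentioned above, one possible approach is sketched here. It is an untested-in-production sketch under a few assumptions: `DiskCache` and `THIRTY_DAYS` are hypothetical names, the file's modification time stands in for the stored timestamp, and the cached data is assumed to be something `Marshal` can serialize (plain arrays, hashes, and strings are; the objects returned by the Amazon library may not be).

```ruby
require "digest/md5"
require "fileutils"

# Hypothetical on-disk variant: one Marshal file per key, with the file's
# modification time standing in for the "time created" array element.
class DiskCache
  THIRTY_DAYS = 30 * 24 * 60 * 60 # 2,592,000 seconds

  def initialize(dir)
    @dir = dir
    FileUtils.mkdir_p(@dir)
  end

  # Returns the data stored for key if its file is younger than max_age
  # seconds; otherwise runs the block, writes the result to disk, returns it.
  def fetch(key, max_age = THIRTY_DAYS)
    path = File.join(@dir, Digest::MD5.hexdigest(key) + ".cache")
    if File.exist?(path) && Time.now - File.mtime(path) < max_age
      return File.open(path, "rb") { |f| Marshal.load(f.read) }
    end

    data = yield
    File.open(path, "wb") { |f| f.write(Marshal.dump(data)) }
    data
  end
end
```

The constant also spells out the arithmetic: 30 days is 30 * 24 * 60 * 60 = 2,592,000 seconds, whereas 108,000 seconds is only 30 hours.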

Caching the Photos

I use my fetcher model to store the photos of my album covers, as well as Flickr photos, on my server so that pages load faster without having to fetch the picture from elsewhere. Put this code in models/fetcher.rb:
def self.get_pic(artist, album, url)
  return nil if url.nil?
  # the file name is an MD5 hexdigest of the artist and album
  #  (needs require 'digest/md5')
  file = Digest::MD5.hexdigest(artist + album) + ".jpg"
  file_path = File.join($album_cache_dir, file)
  # if the file does not exist yet, we make an HTTP request and save
  #  the image; covers never change, so there is no freshness check
  unless File.exist? file_path
    image = Net::HTTP.get_response(URI.parse(url)).body
    File.open(file_path, "wb") do |data|
      data.write(image)
    end
  end
  # either way, return the location relative to public/images
  File.join($fixed_cache_dir, file)
end
This gets called as follows:
def album_cover_fetch(artist)
  ...
  album_cover = Fetcher.get_pic(artist, p.product_name, p.image_url_medium)
  ...
end
This code returns the local location of your cached (or newly downloaded) image. Any identical artist + album string gets the same cover, and any new file is downloaded and saved in the directory of your choosing. As I said before, I use Flickr in the same fashion, but to avoid repeating myself (and maybe to save material for a future post on my Flickr code) I will omit that here. Good luck!
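If you want to experiment with the on-disk part without hitting the network, the file handling can be pulled out on its own. This is a sketch only: `cached_cover_path` is a hypothetical helper, and the download is passed in as a block so that any source of bytes can be used.

```ruby
require "digest/md5"
require "fileutils"

# Hypothetical helper: returns the path of the cached cover for an
# artist + album pair, calling the block to download it only on a miss.
def cached_cover_path(cache_dir, artist, album)
  FileUtils.mkdir_p(cache_dir)
  name = Digest::MD5.hexdigest(artist + album) + ".jpg"
  path = File.join(cache_dir, name)
  unless File.exist?(path)
    bytes = yield # download first, so a failed download leaves no empty file
    File.open(path, "wb") { |f| f.write(bytes) } # "wb": image data is binary
  end
  path
end
```

Fetching the bytes before opening the file matters here: if the file were opened first and the download then failed, an empty file would be left behind and treated as a valid cached cover from then on.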

My name is Matt Coneybeare, I design and develop for iOS (iPhone, iPad and iPod Touch), Mac OS X and the Web out of New York. In 2008 I started a software company called Urban Apps that has made some pretty popular apps such as Ambiance and Hourly News. My current Stack Overflow reputation is about 21k.

I was a Rockstar a decade ago, but then went back to school and collected a Bachelor's Degree in Computer Science from U.C. Berkeley. Now I am settled down with my beautiful wife Di and our three doggies Hamachi, Foxy and Millie. While coding, I walk at least 16 miles/day on my Treadmill Desk. When not at my desk, I love exploring New York City as a Yelp Elite.
