Matt Coneybeare

Amazon Album Covers in Ruby on Rails

Lastfm is a free online service that keeps track of the music you listen to. When you play a song in iTunes or another media player and have a program running that connects to Lastfm, the song information is sent automatically to the Lastfm servers. This process is called “scrobbling”. Besides the usage records and stats that track my listening habits, Lastfm has another great use: it provides an XML feed of your ten most recently played tracks. I use this feed on my website to display those tracks for everybody to see. It updates in near real time, fully automatically.
I used to list just the track title and artist in boring text, but then decided I wanted album covers instead. For a little while I used the picture URL contained in the Lastfm feed, but soon realized that a spelling error, a mislabeled song, or just a rare song would leave a track without a cover. Close to 30% of my songs had no cover… Unacceptable! I decided to use Amazon’s web services (AWS) to get the pictures with a smarter search method… Here’s how.
To get started, install the hpricot gem:
gem install hpricot
Then install the Ruby/Amazon library:
cd /tmp
wget http://www.caliban.org/files/ruby/ruby-amazon-0.9.2.tar.gz
gzip -d ruby-amazon-0.9.2.tar.gz
tar -xvf ruby-amazon-0.9.2.tar
cd ruby-amazon-0.9.2
ruby setup.rb config
ruby setup.rb setup
ruby setup.rb install
The Ruby/Amazon library is a well-documented way to access Amazon services from Ruby. Once it is installed, sign up for your own (free) Web Services account at Amazon. Now on to the good stuff.
First, put these requires in your environment.rb:
require 'hpricot'
require 'open-uri'
require 'md5'
require 'amazon/search'
Next, we get the XML feed from Lastfm. You can use my account if you would like, but I imagine you have your own Lastfm account. To get your own feed, just replace ‘coneybeare’ below with your account name. I use the hpricot gem to extract the data from the XML feed, then call album_cover_fetch to get the album cover art and the URL of the matching Amazon page. My own version uses a cache to save both the URLs and the photos, but that is for another post, so for simplicity I will leave it out.
Put this code in application.rb so all views can use it:
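helper_method :get_recent_covers

# Returns an array of hashes, one per recent track, each with
# :artist, :track, :album, :image and :url keys.
def get_recent_covers(limit = 10)
  url = "http://ws.audioscrobbler.com/1.0/user/coneybeare/recenttracks.xml"
  recent_tracks = []
  # open (from open-uri) downloads the feed; Hpricot.XML parses its contents
  lastfm_doc = Hpricot.XML(open(url))
  (lastfm_doc/:track).first(limit).each do |track|
    recent_track = {}
    recent_track[:artist] = (track/:artist).inner_html
    recent_track[:track] = (track/:name).inner_html
    recent_track[:album] = (track/:album).inner_html
    recent_track[:image], recent_track[:url] = album_cover_fetch(recent_track)
    recent_tracks << recent_track
  end
  recent_tracks # return the collected tracks, not the result of each
rescue
  [] # on any network or parsing error, show no tracks
end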
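To see what the parsing step produces without hitting the network, here is a minimal sketch that runs against a made-up feed. It swaps in REXML from Ruby's standard distribution instead of hpricot, and the sample XML (including the recenttracks root element) and the parse_recent_tracks name are assumptions for illustration; only the track, artist, name and album element names come from the code above.

```ruby
require 'rexml/document'

# Hypothetical sample data; the real feed may carry more elements per track.
SAMPLE_FEED = <<~XML
  <recenttracks>
    <track>
      <artist>Radiohead</artist>
      <name>Karma Police</name>
      <album>OK Computer</album>
    </track>
    <track>
      <artist>Beck</artist>
      <name>Loser</name>
      <album></album>
    </track>
  </recenttracks>
XML

# Turns feed XML into an array of hashes keyed like recent_track above.
def parse_recent_tracks(xml)
  doc = REXML::Document.new(xml)
  doc.get_elements('//track').map do |track|
    {
      artist: track.elements['artist'].text.to_s,
      track:  track.elements['name'].text.to_s,
      album:  track.elements['album'].text.to_s # empty element becomes ""
    }
  end
end

tracks = parse_recent_tracks(SAMPLE_FEED)
```

The second track has an empty album on purpose: that is the case where the cover lookup has to fall back to searching every product for the song.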
Next is the code for album_cover_fetch:
include Amazon::Search
ASSOCIATES_ID = "INSERT YOUR OWN ID HERE" # Your Amazon Affiliate ID
DEV_TOKEN = "INSERT YOUR OWN TOKEN HERE" # Your Amazon Web Services Key

def album_cover_fetch(song)
  artist, album, track = [song[:artist], song[:album], song[:track]]
  
  @request = Request.new(DEV_TOKEN, ASSOCIATES_ID)
  begin
    @response = @request.artist_search artist
    products = @response.products
  rescue
    # there was no exact match for artist
    products = []
  end
  
  if products.empty?
    return [nil, "http://www.last.fm/music/" +
                 CGI::escape(artist) +
                 "/_/" +
                 CGI::escape(track)]
  end

  products.each do |p|
    if !album.nil? && !album.blank? && matches?(album, p.product_name)
      unless p.tracks.nil?
        if p.tracks.include? track
          return [p.image_url_medium, p.url]
        else # song is not an exact match
          p.tracks.each do |t|
            if matches?(t, track)
              return [p.image_url_medium, p.url]
            end 
          end
        end
      end # track matching
    end # album match
  end # No match yet, lets try to find the song on another album

  product_matches = []
  products.each do |p|
    unless p.tracks.nil?
      p.tracks.each do |t|
        if matches?(t, track, 0)
          #found it somewhere else
          return [p.image_url_medium, p.url]
        elsif matches?(t, track, 2)
          product_matches << p
        end
      end
    end
  end

  product = product_matches.sort{ |a,b| 
    a.sales_rank <=> b.sales_rank
  }.first unless product_matches.empty?
  
  if product.nil? 
    return [nil, "http://www.last.fm/music/" +
                 CGI::escape(artist) +
                 "/_/" +
                 CGI::escape(track)]
  else
    return [product.image_url_medium, product.url]
  end
end
  • Lines 8-15: Here is the call to the Amazon service. You will want to cache this, as Amazon asks that you send no more than one request per second. An artist is unlikely to have new products within a month, so I use a cache that keeps the returned products for a month; that code is not part of this example.
  • Lines 17-22: If the artist search returns no products (spelling mistake, no albums, etc.), I return nil and a link to the track’s Lastfm page. When nil is returned for the picture URL, I show a generic blank CD image in place of an album cover.
  • Lines 24-38: If products were returned and the album reported by Lastfm is not blank, I locate that album in the product list and check its track listing. If the track is listed exactly, I return that album’s cover; otherwise I compare it against each listed track, looking for a close match (see below).
  • Lines 40-66: If Lastfm could not give me an album to look for, or the album did not match any returned by Amazon, I loop through all the products and all their tracks. An identical or substring match (tolerance 0) returns immediately; looser matches are added to the product_matches array, which I then sort by sales rank to return the most popular product.
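The final sort compares sales_rank values directly. A slightly more defensive version might look like the sketch below. It assumes that a lower sales rank means a better seller (which is how the sort above treats it) and that a rank could come back as nil or as a numeric string; neither of those is confirmed for the Ruby/Amazon library. Product and most_popular are hypothetical names used only for this illustration.

```ruby
# Hypothetical stand-in for an Amazon product; only the fields used here.
Product = Struct.new(:product_name, :sales_rank)

# Picks the best-selling candidate, treating a lower sales rank as more popular.
# Assumption: sales_rank may be nil or a numeric string, so it is normalized
# first, and products without a rank sort last.
def most_popular(products)
  products.min_by do |p|
    p.sales_rank.nil? ? Float::INFINITY : p.sales_rank.to_i
  end
end

candidates = [
  Product.new('Greatest Hits', '5120'),
  Product.new('Debut Album', 87),
  Product.new('Rare Import', nil)
]
best = most_popular(candidates)
```

Here best is the 'Debut Album' product, and most_popular returns nil for an empty list, which lines up with the product.nil? check at the end of album_cover_fetch.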
Now let’s add the helper method matches? and the string-closeness algorithm, levenshtein:
def matches?(s1, s2, tolerance = 2)
  s1 == s2 ||
  s1 =~ /#{Regexp.escape s2}/i ||
  s2 =~ /#{Regexp.escape s1}/i ||
  (levenshtein(s1, s2) <= tolerance)
end

def levenshtein(str1, str2)
  s, t = [str1.unpack('U*'), str2.unpack('U*')]
  n, m = [s.length, t.length]
  return m if (0 == n)
  return n if (0 == m)
  d = (0..m).to_a
  x = nil
  (0...n).each do |i|
      e = i+1
      (0...m).each do |j|
          cost = (s[i] == t[j]) ? 0 : 1
          x = [d[j+1] + 1, e + 1, d[j] + cost].min
          d[j] = e
          e = x
      end
      d[m] = x
  end
  return x
end
  • Lines 1-6: The matches? method reports a match between two strings if any of four conditions is met:
    • 1) They are identical
    • 2) String 1 is part of String 2 (ignoring case)
    • 3) String 2 is part of String 1 (ignoring case)
    • 4) They have a Levenshtein edit distance of at most the tolerance (which defaults to 2)
  • Lines 8-26: The Levenshtein algorithm is one of the better-known string-closeness algorithms. It calculates how many single-character edits (insertions, deletions, or substitutions) are needed to turn one string into the other. For example, ‘Apple’ and ‘apple’ have a distance of 1, meaning only one edit, as do ‘Apple’ and ‘Applw’, and ‘Apple’ and ‘Aple’. This is a great way to get correct results despite spelling errors.
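The version above keeps only one row of the edit table in memory, which is compact but a little hard to follow. As a reading aid, here is a sketch of the same distance computed with the full table, run on the three examples from the text. The name levenshtein_table is made up for this illustration and is not part of the code above.

```ruby
# Full-table Levenshtein distance; levenshtein_table is a hypothetical name.
def levenshtein_table(a, b)
  s = a.unpack('U*')
  t = b.unpack('U*')
  # d[i][j] = edits needed to turn the first i chars of s into the first j of t
  d = Array.new(s.length + 1) { |i| [i] + [0] * t.length }
  (0..t.length).each { |j| d[0][j] = j }
  (1..s.length).each do |i|
    (1..t.length).each do |j|
      cost = s[i - 1] == t[j - 1] ? 0 : 1
      d[i][j] = [
        d[i - 1][j] + 1,        # delete a character from s
        d[i][j - 1] + 1,        # insert a character into s
        d[i - 1][j - 1] + cost  # substitute, or keep when the characters match
      ].min
    end
  end
  d[s.length][t.length]
end

distances = [
  levenshtein_table('Apple', 'apple'), # one substitution
  levenshtein_table('Apple', 'Applw'), # one substitution
  levenshtein_table('Apple', 'Aple')   # one deletion
]
```

All three distances come out as 1, so each pair would pass the default tolerance of 2 in matches?.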
Now that all the hard work is done, put this in your view:
<% for track in get_recent_covers(10) %>
  <%= album_cover_tag(track) %>
<% end -%>
… and this in your application_helper.rb:
def album_cover_tag(track)
  if track[:image].nil?
    # no cover found: fall back to a generic blank CD image
    link_to(image_tag("/PATH/TO/YOUR/BLANK/CD/IMAGE"), track[:url])
  else
    link_to(image_tag(track[:image]), track[:url])
  end
end
As I said before, you should probably cache both the pictures and the API calls. I cache my recent song list for 1 minute and artist products from Amazon for 30 days, and every album cover I download is cached forever on disk. I will publish a caching tutorial soon. Good luck!
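Until then, the time-based part of that idea can be sketched in a few lines. This is not the cache used on this site, just a minimal in-memory illustration under the assumption that a per-process hash is good enough; TtlCache and its fetch method are hypothetical names, and a real setup would more likely persist to disk or a shared store.

```ruby
# Minimal in-memory cache with per-entry expiry. TtlCache is a hypothetical name.
class TtlCache
  # clock is injectable so expiry can be exercised without waiting.
  def initialize(clock = -> { Time.now })
    @store = {}
    @clock = clock
  end

  # Returns the cached value for key if it has not expired; otherwise runs
  # the block, stores its result for ttl seconds, and returns it.
  def fetch(key, ttl)
    entry = @store[key]
    return entry[:value] if entry && @clock.call < entry[:expires_at]

    value = yield
    @store[key] = { value: value, expires_at: @clock.call + ttl }
    value
  end
end

THIRTY_DAYS = 30 * 24 * 60 * 60 # in seconds

cache = TtlCache.new
lookups = 0
# The block stands in for the Amazon artist search; it runs only once here.
2.times do
  cache.fetch('artist:Radiohead', THIRTY_DAYS) { lookups += 1; ['some product'] }
end
```

After the loop, lookups is 1: the second fetch is served from the cache. The same object with a ttl of 60 would cover the one-minute cache for the recent song list.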

My name is Matt Coneybeare, I design and develop for iOS (iPhone, iPad and iPod Touch), Mac OS X and the Web out of New York. In 2008 I started a software company called Urban Apps that has made some pretty popular apps such as Ambiance and Hourly News. My current Stack Overflow reputation is about 27k.

I was a Rockstar a decade ago, but then went back to school and collected a Bachelor's Degree in Computer Science from U.C. Berkeley. Now I am settled down with my beautiful wife Di and our two doggies Hamachi and Foxy. While coding, I walk several miles/day on my Treadmill Desk. When not at my desk, I love exploring New York City as a Yelp Elite, or training for marathons.
