Wrap your SQL head around Riak's Map-Reduce

One of the conceptual issues you’ll face when switching your persistence layer from SQL over to Riak is grokking map-reduce, so let’s take a pretty predictable query from SQL and convert it to Riak’s Javascript map-reduce.

SELECT addresses.state, COUNT(*)
  FROM people INNER JOIN addresses ON people.id = addresses.person_id
  WHERE people.age < 18
  GROUP BY addresses.state

This query finds how many minors are in each state, using a join and a grouping. Now, assuming you stored this data as JSON in Riak, you’d probably denormalize that belongs_to :person relationship by storing the address(es) in the person record, effectively removing the join. Let’s build a plan for this query:

  1. Query over all the objects in the ‘people’ bucket (input)
  2. Grab the data out of the object as JSON (map)
  3. Filter out the people 18 years or older (map)
  4. Add a counter for the person’s state (map)
  5. Sum those counts for each state (reduce)

Now, you could easily collapse those map phases, but for clarity I’m going to keep them separate. Since each map-reduce job is submitted as a JSON object, I’ll build that object in a couple of snippets which we’ll join together at the end. First the input, which should either be a list of bucket/key pairs or in our case, the name of a bucket:

{"input":"people",

Riak lets you specify each map or reduce phase as a Javascript function literal, the name of a built-in, a bucket-key pair where the function is stored, or the module-function pair of an Erlang function. For our first phase, we’ll use a built-in that extracts the object data and returns its evaluated JSON contents:

"query":[{"map":{"language":"javascript", "name":"Riak.mapValuesJson"}},

Now we’ll return only the objects for minors (line-breaks are added for clarity — you’d need to escape them for it to be valid JSON).

{"map":{
   "language":"javascript",
   "source":"function(value, keyData, arg){ 
     if(value.age && value.age < 18) 
       return [value]; 
     else 
       return []; 
    }"
  }
},

This code might seem familiar if you’ve done CouchDB views, except that you don’t call emit() with key/value data, you just return a list of the values you want. Now let’s extract the state name and count:

{"map":{
   "language":"javascript",
   "source":"function(value, keyData, arg){ 
     if(value.address && value.address.state) 
       return [{value.address.state: 1}]; 
     else 
       return []; 
    }"
  }
},

Map functions only work on one item at a time, so we only add one to the count. Now, the output from this phase will look something like: [{"AL":1},{"CO":1},{"NY":1}...]. Our final reduce phase will merge those JSON objects into a single one, summing the counts.

{"reduce":{
   "language":"javascript",
   "source":"function(values, arg){ 
     return values.reduce(function(acc, item){
       for(state in item){
         if(acc[state])
          acc[state] += item[state];
         else
          acc[state] = item[state];
       }
       return acc;
     });
    }"
  }
}]}

Now if we submit a POST request to the map-reduce resource (/mapred by default), we can get the results of this query. Here’s the complete POST request body (with line-breaks maintained for clarity):


{ "input":"people",
  "query":[
    {"map":{"language":"javascript", "name":"Riak.mapValuesJson"}},
    {"map":{
       "language":"javascript",
       "source":"function(value, keyData, arg){ 
         if(value.age && value.age < 18) 
           return [value]; 
         else 
           return []; 
        }"
      }
    },
    {"map":{
       "language":"javascript",
       "source":"function(value, keyData, arg){ 
         if(value.address && value.address.state) 
           return [{value.address.state: 1}]; 
         else 
           return []; 
        }"
      }
    },
    {"reduce":{
       "language":"javascript",
       "source":"function(values, arg){ 
         return values.reduce(function(acc, item){
           for(state in item){
             if(acc[state])
              acc[state] += item[state];
             else
              acc[state] = item[state];
           }
           return acc;
         });
        }"
      }
    }
  ]
}

This is just a taste of the kinds of queries and data workflows you can create inside Riak. I hope it’s obvious that it will be easy to generalize the Javascript map and reduce functions we created above, and to build up a library of them that can tackle many problems in a composable, modular way.

Why Riak should power your next Rails app

Last fall, I heard about Riak and thought it sounded awesome, and it boasts a lot of neat features that I’ll go into detail about below. I was further impressed when I met the Basho team at nosqleast in Atlanta. They really seem to know what they’re doing.

In December, John Nunemaker wrote a post entitled Why I think MongoDB is to Databases what Rails was to Frameworks. I commented on twitter:

If as @jnunemaker says, #mongodb is like Rails for databases, then #riak is like Merb.

His reply was quite humorous (no, John, Riak won’t merge into MongoDB), but it got me thinking. So here are my…

7 Reasons why Riak is the Merb of databases (or better)

…and why it should be the database for your next Rails app.

Scales horizontally without pain

Riak runs just about in the same way on 1 node as it does 100. This is because, in short, it is an implementation of Amazon’s Dynamo. Need more storage capacity, throughput and availability? Add more nodes.

Now, a lot of people throw around the word “scalability” without really knowing what it means. It’s really a ratio of the desired properties of your system to the cost inherent in achieving those goals. High scalability means you have a low cost in proportion to your ability to grow. It has nothing to do with performance (although good performance can be a side-effect of good scalability). By this measure, actually, Rails scales well (shared nothing) — but that doesn’t mean it’s always performant (slow Ruby implementations, large memory consumption, etc). Despite the trolls, it’s pretty well known that Twitter’s scaling problems were not really caused by Rails.

So what makes Riak more capable of handling growth than MongoDB, CouchDB or MySQL? It’s built with master-less replication in mind from the beginning. When you grow a MongoDB or MySQL database, you typically add a slave server which receives and replays the transaction logs of the master server. Ideally, this slave can take over if the master goes down. There are also master-master setups and other ways of configuring replication, but in general it’s a thing you add on, not built-in.

Instead, Riak has no notion of a master database node. Every node participates equally in the cluster and spreads the load in a predictable way using consistent hashing (another feature of Dynamo). So yes, increasing capacity and throughput and all those other things you expect from your database is as easy as starting up new nodes and having them join the ring. Data will be replicated in the background. You can even run Riak nodes on less-powerful machines (think EC2) and get decent results. Large, monolithic DB servers are the last vestige of the mainframe era, so democratize your database layer!

HTTP-Compliant REST interface

Rails nerds like me love REST, but most don’t really understand what it means, or rather think that it is a one-to-one mapping to CRUD. Rails’ concept of REST is at best incomplete and potentially misleading. (See also Scott Raymond’s RailsConf 2007 talk)

The Riak developers spent a lot of time creating an awesome toolkit in Erlang for building RFC-compliant HTTP resources, and Riak has several of them. This means that when interacting with Riak, you use HTTP in a natural way and get predictable responses — even the hard stuff like content-negotiation, entity tags and conditional methods.

The side effect of composing Riak of well-behaved HTTP resources is that it plays very nicely with other HTTP-related infrastructure, including proxies, load balancers, caches, and clients of all stripes.

Robust failure recovery

One of Riak’s most compelling features is that it handles node failure robustly. If a node goes down in your cluster, its replicas will take over for it until it comes back, a feature known as hinted handoff. If it doesn’t come back — as happens all too often on EC2 — you can add a new node and the cluster will rebalance. The failure recovery story for MySQL, even in a replicated scenario, is much more difficult, time-consuming, and costly.

Riak has fine-grained robustness as well. It was built primarily in Erlang and with the OTP Design Principles in mind. One aspect of the OTP design is that the processes in the system are organized in a supervision tree, so that when one crashes, it will be restarted by its supervisor process (which is in turn supervised). Send some bad data to the web interface and get a 500 error? That internal error doesn’t crash the whole system, bringing your database node down.

Justin Sheehy, Basho’s CTO, tells a story of a customer whose cluster had two nodes go down late one night. The Basho team looked for less than 10 minutes at the cluster, verified that it was still going, and then decided to wait until the morning to fix the lost nodes. As Joe Armstrong, Father of Erlang, says (paraphrased), “In order to have fault tolerance, you need more than one computer.” Riak embraces that philosophy.

Links make graph-like structures possible

Most Dynamo implementations are simple key-value stores. While that’s still pretty useful — Dynomite is really awesome at storing large binary objects, for example — sometimes you need to find things without a priori knowledge of the key.

The first way that Riak deals with this is with link-walking. Every datum stored in Riak can have one-way relationships to other data via the Link HTTP header. In the canonical example, you know the key of a band that you have stored in the “artists” bucket (Riak buckets are like database tables or S3 buckets). If that artist is linked to its albums, which are in turn linked to the tracks on the albums, you can find all of the tracks produced in a single request. As I’ll describe in the next section, this is much less painful than a JOIN in SQL because each item is operated on independently, rather than a table at a time. Here’s what that query would look like:

GET /raw/artists/TheBeatles/albums,_,_/tracks,_,1

“/raw” is the top of the URL namespace, “artists” is the bucket, “TheBeatles” is the source object key. What follows are match specifications for which links to follow, in the form of bucket,tag,keep triples, where underscores match anything. The third parameter, “keep” says to return results from that step, meaning that you can retrieve results from any step you want, in any combination. I don’t know about you, but to me that feels more natural than this:

SELECT tracks.* FROM tracks
  INNER JOIN albums ON tracks.album_id =  albums.id
  INNER JOIN artists ON albums.artist_id = artists.id
  WHERE artists.name = "The Beatles"

The caveat of links is that they are inherently unidirectional, but this can be overcome with little difficulty in your application. Without referential integrity constraints in your SQL database (which ActiveRecord has made painful in the past), you have no solid guarantee that your DELETE or UPDATE won’t cause a row to become orphaned, anyway. We’re kind of spoiled because ActiveRecord handles the linkage of associations automatically.

The place where the link-walking feature really shines is in self-referential and deep transitive relationships (think has_many :through writ large). Since you don’t have to create a virtual table via a JOIN and alias different versions of the same table, you can easily do things like social network graphs (friends-of-friends-of-friends), and data structures like trees and lists.

Powerful Map-Reduce and soon, Lucene search

The second way to find data stored in Riak is using Map-Reduce. Unlike links, where you have to start with a known key, you can run map-reduce jobs over any number of keys or an entire bucket.

One thing that might not be immediately obvious when using an SQL database is that the declarative nature of the query language hides the imperative nature of performing the query. SQL databases have sophisticated query planners that decompose your query into several steps (see EXPLAIN command on MySQL) and attempt to optimize the order of those steps based on known table properties, like indices, keys, and row counts. Riak’s map-reduce, while still largely declarative, lets you decide which order to run steps in. You trade a little abstraction for a lot of power.

Riak’s map-reduce is also quite different from CouchDB’s views:

  1. You’re not trying to build an index which you’ll query later, you’re performing the query.
  2. You can have any number and combination of map, reduce and link phases (link is actually a special case of map).
  3. Since it’s not an index, your query need not be contiguous across the keyspace.
  4. There’s no concept of re-reduce because reduce phases are only run once.

Up until recently, you could only write map-reduce jobs in Erlang, but thanks to Kevin Smith’s awesome work, you can now write jobs in Javascript and submit them over the HTTP interface. This opens up a lot of doors to developers who aren’t familiar with Erlang but use Javascript every day.

Although not complete or released, Basho also has an awesome Lucene-compatible search system in the pipeline (already in use at Collecta, or so I hear). I saw it in action this past week while in Boston and was impressed. It’s fairly comparable in performance to Solr for large datasets, but scales across your Riak cluster.

Beyond schema-less: Content-type agnostic

One thing that may seem unintuitive at first is that Riak doesn’t care what type of content you put in it. CouchDB and MongoDB use JSON and BSON (Binary JSON), respectively, to store objects. Instead of forcing a format, Riak lets you store pretty much anything. There’s no concept of “attachments” or “GridFS” (unless you add it yourself), so just store the file directly in a bucket/key with a PUT or POST request. As long as you specify the “Content-Type” header, Riak will remember it and give you back the right thing when you access it the next time. No need to do base 64 encoding to store a picture, PDF, Word document, podcast, or whatever. This could enable you to replace a distributed filesystem like GFS, or even S3, with a Riak cluster. Bonus: you get replication and fail-over for no extra cost.

There are some minor kinks with large files (>100MB), but I’ve been assured by Basho that they are addressing the issue. In the meantime, you could chunk the file manually and follow links to get the pieces and put them back together.

Tunable levels of consistency, durability, and performance

The knee-jerk argument against Dynamo-like data stores is that they don’t have ACID properties and that “eventual consistency” is too eventual. In practice, especially in large deployments and high throughput scenarios, ACID breaks down because it requires actions to be “all or nothing”, effectively creating bottlenecks while clients wait for transaction handles. If you’ve done any distributed or concurrent computing, you know that contention for shared resources is a primary cause of failure, including problems like deadlock, starvation and race conditions. For the sake of reducing single points of failure (bottlenecks), Riak implements eventual consistency — where storage operations are accepted immediately, and then propagated across the cluster in an asynchronous fashion. This makes it nearly always available for writes and reads.

In order to deal with inconsistency, Riak tags each datum with a vector clock that internally reveals the datum’s lineage (who modified what version). When there are conflicts — that is, two parallel versions of the same datum — Riak returns you both versions so that your application can decide how to resolve it. In an SQL database with this kind of conflict, your transaction might fail and rollback, forcing you to resolve it and retry anyway.

Beyond just eventual consistency, Riak lets you tune the amount of consistency, durability, and availability you want, even somewhat at request-time. If you need more read availability and replication, you can increase the N value for a bucket (number of replicas, default 3). If you want to be sure the data you’re reading is consistent across the cluster, increase the R value at request time (R, read quorum, is always <= N). If you want high assurance that your data is stored, increase the W and DW values (W = write quorum, DW = durable-write quorum, both always <= N). If you want better performance, e.g. to use Riak as a persistent cache, keep the R, W, and DW values low. You also have the choice of which backend to use when storing your data on the replicas (anything from in-memory to on-disk, or combinations), including the recently released Innostore. Although in many cases the defaults are good enough, you have the flexibility to choose a model that fits your application.

(Note: currently, the N value must be set before you insert any data into the bucket).

Why Riak for Rails?

Ok, so those are the reasons why Riak is awesome in my mind. Why should you use it in your Rails app (or other Ruby application)? Here are some possibilities:

  1. Scale up (or down) cheaply: Riak can grow horizontally with your app, either in lock-step (have database nodes on your app servers!) or independently. A colleague described it as a “Christmas tree” — you could have a load-balanced cluster of Ruby processes and behind that a load-balanced cluster of Riak nodes.
  2. Ease of deployment: No longer do you need to break out or specialize nodes to be database-only, unless you want to. The homogeneity will make your sysadmins happy, and HTTP-friendliness means they already know how to grow the infrastructure.
  3. Flexibility: You can use Riak as a document database, a cache store, a session store, a log server, or a distributed filesystem, with custom settings for each scenario. Soon you’ll also be able to use it as a Solr search replacement.
  4. Powerful modeling: Use links to create relationships in a natural fashion, reducing the number of JOIN-like operations, and then discover data using arbitrary-depth link-walking, or create custom queries with map-reduce.
  5. Background processing and analytics, a.k.a. Big Data: With high write availability and the powerful map-reduce framework, you could defer complicated calculations and later run them in parallel across the Riak cluster. Think data warehouses or Hadoop, but with 1/10 the LoC and no Java.

Where Riak has room to grow

I love Riak’s capabilities, but let’s face it — no system is perfect. Here’s where it could be improved:

  1. Library support: Although it’s REST so you can theoretically use any HTTP client, the included client libraries are pretty basic and use the soon-to-be-deprecated “jiak” interface. I expect that this will improve real soon — we’ve already seen an open-source Java client released.
  2. Efficient ad-hoc lookup: If you only know the bucket of an object but not its key, it’s potentially expensive to find it, since Riak has no concept of indices other than the key (for obvious reasons – it doesn’t know the structure of your content). In the meantime you have two options – eagerly create and maintain them yourself by adding objects with meaningful keys that link back to your original object; or create a map-reduce job to achieve the same result.
  3. Reduce-phase bottleneck: Map and link phases happen with great data-locality, on the nodes where the data is stored, but reduce phases happen in a single node. This is logically simple, but introduces a potential bottleneck and point of failure. If reduce phases are also referentially transparent and idempotent, they could be federated as well, and only incur the single-node hit at the very end.
  4. Large files: Due to some YAGNI-related issues in Webmachine and Riak’s object storage, you can’t load big files into Riak. Break your “l33t xv1dz” into chunks or store them in the filesystem for now.

What to do next

I have lots more to say about Riak (hard to believe, right?), so watch for more blog posts in the future. If you’re ready to dive in, check out the Riak dev site, sign up for the mailing list, or join in the discussion in the IRC channel on Freenode.

smerl: for awesome!

Here's a quick hack with smerl to turn a fun() into a module function:

1> Fun = fun() -> ok end.
#Fun<erl_eval.20.67289768>
2> {env, [_,_,_,Forms]} = erlang:fun_info(Fun, env).
{env,[[],
      {value,#Fun<shell.7.115410035>},
      {eval,#Fun<shell.24.81953817>},
      [{clause,1,[],[],[{atom,1,ok}]}]]}
3> smerl:new(foo).
{meta_mod,foo,undefined,[],[],false}
4> Mod = smerl:new(foo).
{meta_mod,foo,undefined,[],[],false}
5> {ok, Mod2} = smerl:add_func(Mod, {function,1,bar,0,Forms}, true).
{ok,{meta_mod,foo,undefined,
              [{bar,0}],
              [{function,1,bar,0,[{clause,1,[],[],[{atom,1,ok}]}]}],
              false}}
6> smerl:compile(Mod2, [{outdir, "ebin"}]).
ok
7> l(foo).
{module,foo}
8> foo:bar().
ok

Generating Thousands of PDFs on EC2 with Ruby

Note: this is a repost of an entry I wrote on the RailsDog blog.

The Problem

For about two months, we've been working on a static website that exposes the results of complicated economics model to non-economists. We decided to make the site static because of the overhead involved in computing the results and the proprietary nature of the model. We would simply pre-generate the output for all valid permutations of the inputs. The visitor could then select her inputs from a questionnaire, click a button and immediately be shown the results.

The caveat of this decision is that in addition to the numerical outputs, three graphs and a summary (both in HTML and PDF) would need to be generated for each permutation. Since there were 3600 permutations, this would amount to 18000 files in total. Initial local runs of our generation process took about 30 seconds for each permutation, mostly due to embedding the graph images into the PDF. On a single machine, that would take 30 hours of uninterrupted processing! Clearly, this was a job for “the cloud”.

The Tools

Before we get into a discussion of the process of configuring and running the jobs, here’s overview of the tools we used to tackle the problem.

We initially considered using Amazon’s Elastic MapReduce to run the generation jobs, but it requires Java and Hadoop, we had already invested a lot of time in our Ruby tool chain. It is nigh impossible to automatically install Ruby and ImageMagick on an EMR node. Thus, we decided to use vanilla EC2 with the tools shown below.

Prawn

Prawn is the new kid in town for generating PDF in Ruby. Prawn is pretty well-written and easy to start using, and greatly improves on PDF::Writer.

Gruff

Gruff was not the most obvious choice for this project. We liked the flexibility and hackability of Scruffy, but translating its output to PDF was a nightmare and there were some strange inconsistencies in it. In the end, Gruff proved fast, reliable, and simple. The major caveat, as described above, is that embedding images in Prawn is orders of magnitude slower than simply drawing on the canvas.

Haml, Sass, Compass

Haml has been around for 3 years now. Many people cringe at the indentation-sensitive syntax, but it prevents so much frustration that it was a good fit for the project. Naturally, we also used its cousin Sass, and the new-ish CSS/Sass meta-framework Compass. The combination of the these three made it really quick to get started with the static site and make design changes as we iterated.

Chef

You may have already heard of the awesome configuration management tool, Chef. Chef allows you to ensure consistent configuration of your servers using a nice Ruby DSL and a huge library of community-developed “cookbooks” that covers many common use-cases. We were given the chance to try out an alpha of their “Chef Platform”, which is essentially a scalable, hosted, multi-tenant version of the server component of Chef and uses the pre-release version of Chef 0.8. With that, “knife”—the new CLI tool for interacting with the Chef server API—and the custom Opscode AMI, we were well-equipped to quickly deploy a bunch of EC2 nodes. We’ll talk more about the details of the Chef recipes below.

AMQP and RabbitMQ

What’s the best way to distribute a bunch of one-time jobs to a slew of independent machines? A message queue, of course! Despite the version packaged with Ubuntu 9.04 being pretty old, we chose RabbitMQ, having used it on another project. AMQP is also well supported in Ruby.

The Process

Preparing

The first step to start our processing job was to get the data up to S3. You could do this any number of ways, but we created a bucket solely for the data and uploaded all 3600 CSV files with a desktop client.

Next, we created the scripts for the workers and the job initiator. We would potentially need to run the process multiple times, so we chose Aman Gupta’s EventMachine-based AMQP client.

Here’s the worker script, which was set up as a daemon using runit:

#!/usr/bin/env ruby

$: << File.expand_path(File.join(File.dirname(__FILE__),'..','lib'))
require 'rubygems'
require 'eventmachine'
require 'mq'
require 'custom_libraries'

Signal.trap('INT') { AMQP.stop{ EM.stop } }
Signal.trap('TERM'){ AMQP.stop{ EM.stop } }

AMQP.start(:host => ARGV.shift) do
  MQ.prefetch(1)
  MQ.queue('jobs').bind(MQ.direct('jobs')).subscribe do |header, body|
    GenerationJob.new(body).generate
  end
end

Basically, it connects to the RabbitMQ host specified on the command line, subscribes to the job queue, and starts processing messages.

The job initiation script is almost as simple:

#!/usr/bin/env ruby

$: << File.expand_path(File.join(File.dirname(__FILE__),'..','lib'))
require 'rubygems'
require 'eventmachine'
require 'mq'

AWSID = (ENV['AMAZON_ACCESS_KEY_ID'] || 'XXXXXXXXXXXXXXXXXXXX')
AWSKEY = (ENV['AMAZON_SECRET_ACCESS_KEY'] || 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXX')

Signal.trap('INT') { AMQP.stop{ EM.stop } }
Signal.trap('TERM'){ AMQP.stop{ EM.stop } }

host = ARGV.shift
input_bucket = "custom-data"
output_bucket = "custom-output"
output_prefix = Time.now.strftime("/%Y%m%d%H%M%S")
count = 0

AMQP.start(:host => host) do
  exchange = MQ.direct('jobs')

  STDIN.each_line do |file|
    count += 1
    $stdout.print "."; $stdout.flush
    payload = {
      :input => [input_bucket, file.strip],
      :output => [output_bucket, output_prefix],
      :s3id => AWSID,
      :s3key => AWSKEY
    }
    exchange.publish(Marshal.dump(payload))
  end
  AMQP.stop { EM.stop }
end
puts "#{count} data enqueued for generation."

It reads from STDIN the names of files to add to the queue, which are stored in the S3 bucket. Before running the job, we created a text file that listed each of the 3600 files, one per line, which could then be piped to this script on the command line. Then it passes along all the information each worker needs to find the data, and where to put it when completed. We scoped the output by the time the job was enqueued, making it easier to discern older runs from newer ones.

Configuring the cloud

Now that the meat of the job was ready, we dived into configuring the servers with Chef. We created a Chef repository, added the Opscode cookbooks as a submodule, and uploaded these default cookbooks to the server:

  • apt
  • build-essential
  • erlang
  • imagemagick
  • runit
  • ruby

We created some additional cookbooks to fill out the generic setup:

  • rabbitmq – Installs and configures RabbitMQ
  • gemcutter – Upgrades Rubygems, installs Gemcutter and makes gemcutter.org the default gem source

Lastly we created our custom cookbook, which sets up all the libraries we need, downloads the code, and sets up the worker process as a runit service. Let’s walk through the default recipe in that cookbook:


%w{haml gruff fastercsv activesupport prawn prawn-core prawn-format prawn-layout eventmachine amqp aws-s3}.each do |g|
  gem_package g
end

This simply installs all of gems that we need to run the job.


# Find the node that has the job queue
q = search(:node, "run_list:role*job_queue*")[0].first

Here we use Chef’s search feature to find the node that has RabbitMQ installed and running so we can pass it to the worker script.


# Create directory to put the code in
directory "/srv"

# Unzip the code if necessary
execute "Unpack code" do
  command "tar xzf generationjobs.tar.gz"
  cwd "/srv"
  action :nothing
end

# Download the code
remote_file "/srv/generationjobs.tar.gz" do
  source "generationjobs.tar.gz"
  notifies :run, resources(:execute => "Unpack code"), :immediate
end

# Create the directory where output goes
directory "/srv/generationjobs/tmp" do
  recursive true
end

In these four resources, we set up the working directory for the worker process, download the project code (stored on the Chef server as a tarball), and unpack it. The interesting thing about this sequence is that we don’t automatically unpack the tarball. Since the Chef client runs periodically in the background, we don’t want to be unpacking the code every time, but only when it has changed. We use an immediate notification from the remote_file resource to tell the unpacking to run when the tarball is a new version; remote_file won’t download the tarball unless the file checksum has changed.


# Create runit service for worker
runit_service "generationworker" do
  options({:worker_bin => "/srv/generationjobs/bin/worker", :queue_host => q})
  only_if { q }
end

The last step is a pseudo-resource defined in the “runit” cookbook that creates all the pieces of a runit daemon for you; we only had to create the configuration templates for the daemon and put them in our cookbook. The additional options passed to the runit_service tell the templates the location of the worker code and the RabbitMQ host. We also take advantage of the “only_if” option so the service won’t be created if there’s no host with RabbitMQ on it yet.

The last step in the Chef configuration was to create two roles, one for the queue and one for the worker. Naturally, the node that has the queue can also act as a worker. Here’s what the role JSON documents look like:


// The queue role
{
  "name": "job_queue",
  "chef_type": "role",
  "json_class": "Chef::Role",
  "default_attributes": {

  },
  "description": "Provides a message queue for sending jobs out to the workers.",
  "recipes": [
    "erlang",
    "rabbitmq"
  ],
  "override_attributes": {

  }
}

// The worker role
{
  "name": "job_worker",
  "chef_type": "role",
  "json_class": "Chef::Role",
  "default_attributes": {

  },
  "description": "Processes the data from a queue into the PDF, PNG and HTML output.",
  "recipes": [
    "apt",
    "build-essential",
    "ruby",
    "gemcutter",
    "imagemagick::rmagick",
    "runit",
    "custom"
  ],
  "override_attributes": {

  }
}

Running the jobs on EC2

Now comes the fun (and easy) part! Armed with an AWS account, an EC2 certificate, and knife, we began firing up nodes to run the job. With Opscode’s preconfigured Chef AMI, you can pass a JSON node configuration in the EC2 initial data. First we generated the configuration for the job queue node:

$ knife instance_data --run-list="role[job_queue] role[job_worker]" | pbcopy

With the JSON configuration in the clipboard, we could paste it into ElasticFox (or the AWS Management console) and fire up the first EC2 node. Several minutes later, the node was ready to go. Now, we created a similar configuration, but with only the worker role:

$ knife instance_data --run-list="role[job_worker]" | pbcopy

Then we fired up nine of the nodes with that configuration and proceeded to initiate the job:

$ ssh -i ~/ec2-keys/my-ec2-cert.pem root@ec2-public-hostname
[root@ec2-public-hostname]$ cd /srv/generationworker
[root@ec2-public-hostname]$ bin/startjobs localhost < manifest.txt

After all the preparation, that’s all there was to it! A little over an hour later, we had generated PNG graphs, PDF, and HTML from all 3600 datasets.

Conclusion

It’s no mystery why “cloud computing” is so popular. The ability to quickly and cheaply access computational power, utilize it, and then dispose of it is really appealing, and tools like Chef and EC2 make it really easy to accomplish. What can you cook up?

Neotoma 1.3

I’m pleased to announce yet another update to Neotoma. The major features and changes in this release:

  • Transformation/semantic analysis code can be written inline with the PEG, enclosed in backticks (`) before the concluding semi-colon for each reduction. The variables available to your code are Node (the parse result, including transformed results of subtrees), and Idx (the current index into the input).
  • To do an identity transformation on a rule (i.e. pass along the result unchanged), use the tilde (~) instead of backtick-quoted code.
  • Extra functions for use inside your parser can be added to the end of the PEG, also enclosed in backticks (`).
  • All internal uses of the process dictionary have been expunged and instead use the memoization table.
  • All of the modules have been renamed and the parser re-bootstrapped. The most significant user-facing change is that peg_gen is now called neotoma. See the commit for more details.

What is Neotoma?

Neotoma is a packrat parser-generator for Erlang for Parsing Expression Grammars (PEGs). It consists of a parser-combinator library with memoization routines, a parser for PEGs, and a utility to generate parsers from PEGs. It is inspired by treetop, a Ruby library with similar aims, and parsec, the parser-combinator library for Haskell.

Upcoming Speaking Gigs - Late 2009 into 2010

I’ll be speaking at a few meetups in the near future. I encourage you to come to as many as you can! My apologies for the short notice on the first one.

Also, if you’d like me to speak at your meetup group or other event about any one of these topics, Radiant, or Erlang, please drop me a line at seancribbs AT gmail DOT com.