Webmachine in Elixir Tutorial, Part 2

by Sean Cribbs

Last time, we got our project set up and serving some simple dynamic content. In this installment, we’ll show how to serve static files via Webmachine, which gives us a chance to discuss many of its best features.

Serving static files

Most of the time you would let a web server like Apache or nginx serve your static files, but for our tutorial it’s nice to serve our content directly via Webmachine. By doing so, we can demonstrate several important features related to the dispatcher, content negotiation, and conditional requests. Basically, everything you’d expect out of a well-configured web server, but in a resource module! First, make a priv directory (where OTP apps store non-code files); we’ll put our design assets in there. For now, we’ll just copy files from the webmachine-tutorial repo.

$ mkdir -p priv/www/css priv/www/img
$ curl -o priv/www/css/master.css https://raw.githubusercontent.com/cmeiklejohn/webmachine-tutorial/bf86b8230259ed710bf1ab3f32a5c64bfb9f03bc/priv/www/css/master.css
$ curl -o priv/www/index.html https://raw.githubusercontent.com/cmeiklejohn/webmachine-tutorial/bf86b8230259ed710bf1ab3f32a5c64bfb9f03bc/priv/www/index.html
$ curl -o priv/www/img/noise.png https://raw.githubusercontent.com/cmeiklejohn/webmachine-tutorial/bf86b8230259ed710bf1ab3f32a5c64bfb9f03bc/priv/www/img/noise.png

Let’s make a new resource module called Tweeter.Resources.Assets and fill out the boilerplate. For our resource state, we’ll use a map this time, but we’ll probably change it to a struct later.

defmodule Tweeter.Resources.Assets do
  # Basic initialization
  def init(_), do: {:ok, %{}}

  # Boilerplate function, which we should inject later
  def ping(req_data, state), do: {:pong, req_data, state}
end

Now we need to think about a few things, namely, how to determine which file is being requested, what media type it is, and then how to read it from the filesystem out to the client. Let’s start from the end, assuming we’ve already determined the correct file to read. We’ll make a body-producing function that simply reads the file and sends it to the client. This is not the most efficient way – sendfile() or other streaming would be better – but we are serving small files so it won’t be too bad.

  # Body-producing function
  def produce_resource(req_data, %{filename: filename} = state) do
    {File.read!(filename), req_data, state}
  end
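
As an aside, if the files were larger, the body-producing function could hand Webmachine a streamed body instead of one big binary. The sketch below assumes Webmachine’s streamed-body shape of {:stream, {data, next}}, where next is either :done or a zero-arity function returning the next pair. The ChunkedBody module, its collect/1 helper, and the chunk size are made up for illustration and are not part of the tutorial’s code.

```elixir
defmodule ChunkedBody do
  # Hypothetical default chunk size; tune it for your own files
  @chunk_size 4096

  # Builds a lazily-evaluated body in the (assumed) Webmachine
  # streaming shape: {:stream, {data, next}}
  def stream_file(filename, chunk_size \\ @chunk_size) do
    {:ok, io} = :file.open(filename, [:read, :binary])
    {:stream, next_chunk(io, chunk_size)}
  end

  # Reads one chunk; the returned function reads the following one
  defp next_chunk(io, chunk_size) do
    case :file.read(io, chunk_size) do
      {:ok, data} ->
        {data, fn -> next_chunk(io, chunk_size) end}
      :eof ->
        :ok = :file.close(io)
        {"", :done}
    end
  end

  # Demonstration helper: walks the stream the way a server would
  # and glues the chunks back together
  def collect({:stream, body}), do: collect(body, [])

  defp collect({data, :done}, acc) do
    IO.iodata_to_binary(Enum.reverse([data | acc]))
  end
  defp collect({data, next}, acc), do: collect(next.(), [data | acc])
end
```

Under those assumptions, produce_resource would return {ChunkedBody.stream_file(filename), req_data, state} in place of the File.read! result. For the small files in this tutorial, reading the whole file is simpler and perfectly adequate.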

That was easy! Continuing backwards through our list, let’s determine the media type of the file and point it at our body-producing function using the content_types_provided Webmachine callback. This callback tells Webmachine what media types you provide, and what to call to produce each one. Since ours is just reading a file from the filesystem, we’ll call produce_resource, but vary the type it produces.

  # Content-negotiation callback
  def content_types_provided(req_data, state) do
    filename = case :wrq.disp_path(req_data) do
                 '' -> 'index.html'
                 f  -> f
               end
    media_type = :webmachine_util.guess_mime(filename)
    {[{media_type, :produce_resource}], req_data, state}
  end

This is the first time we’ve used a Webmachine library function in a resource. :wrq.disp_path gives us the part of the request path left over after the dispatcher has matched a route’s fixed segments. At the root URL this will be the empty string; otherwise, it’ll be a partial path to some file, like css/master.css. Then :webmachine_util.guess_mime guesses a suitable media type from the filename. For fun, let’s try that function from the shell via iex -S mix.

iex(1)> :webmachine_util.guess_mime('foo.png')
'image/png'
iex(2)> :webmachine_util.guess_mime('application.js')
'application/x-javascript'
iex(3)> :webmachine_util.guess_mime('home.html')
'text/html'
iex(4)> :webmachine_util.guess_mime('module.erl')
'text/plain'
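
To make the idea concrete, here is a minimal sketch of an extension-based lookup that reproduces the four results above. MimeSketch and its three-entry table are hypothetical; Webmachine’s real table is larger, and its exact matching rules (such as case handling) may differ from what is assumed here.

```elixir
defmodule MimeSketch do
  # A tiny, hypothetical extension table (~c"..." is a charlist,
  # the same value as a single-quoted string)
  @types %{
    ".png"  => ~c"image/png",
    ".js"   => ~c"application/x-javascript",
    ".html" => ~c"text/html"
  }

  # Looks up the media type by file extension, falling back to
  # text/plain for anything unknown. Lower-casing the extension
  # is an assumption of this sketch.
  def guess(filename) do
    ext = filename |> to_string() |> Path.extname() |> String.downcase()
    Map.get(@types, ext, ~c"text/plain")
  end
end
```

Like the real function, the sketch returns a character list, which is the form Webmachine expects in content_types_provided.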

Now that we have a body-producing function and the correct MIME type, let’s find the file on the filesystem via one of the most important callbacks, resource_exists. Obviously, if the file doesn’t exist in our static assets, we should return a 404 Not Found. This is also a perfect place to populate the state with an absolute path to the requested file.

  # Find the file!
  def resource_exists(req_data, state) do
    priv_dir = Path.join :code.priv_dir(:tweeter), "www"
    absolute_path = Path.join(priv_dir, :wrq.disp_path(req_data)) |> Path.expand
    {File.regular?(absolute_path),
     req_data,
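     # Map.put rather than %{state | ...}: the update syntax raises
     # KeyError because our initial state has no :filename key yet
     Map.put(state, :filename, absolute_path)}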
  end

Before we move on, resource_exists repeats some of the path handling in content_types_provided, and it has a minor bug too: at the root path it looks at the www directory itself, when we want to serve index.html. Let’s extract that shared functionality into a new function.

  # Find the file!
  def resource_exists(req_data, state) do
    # Find the root of our static files, add the identified path
    file_path = Path.join [:code.priv_dir(:tweeter), "www", identify_file(req_data)]
    # Compute the full path
    absolute_path = Path.expand file_path
    {File.regular?(absolute_path),
     req_data,
     Map.put(state, :filename, absolute_path)}
  end

  # Content-negotiation callback
  def content_types_provided(req_data, state) do
    # identify_file returns a binary, but guess_mime wants a charlist
    media_type =
      req_data
      |> identify_file()
      |> String.to_charlist()
      |> :webmachine_util.guess_mime()
    {[{media_type, :produce_resource}], req_data, state}
  end

  # Identifies the file we're trying to serve, normalizing path
  # segments
  defp identify_file(req_data) do
    # Getting the path tokens removes any duplicate slashes
    case :wrq.path_tokens(req_data) do
      # At the root path (no tokens), we want to serve index.html
      [] -> ["index.html"]
      # Otherwise serve the path they asked for
      toks -> toks
    end |> Path.join
  end
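
To see what identify_file produces, here is a small sketch of the same logic that takes the token list directly, so it can be tried without a request. IdentifySketch is a hypothetical name, and the tokens are written as the character lists that :wrq.path_tokens is assumed to return.

```elixir
defmodule IdentifySketch do
  # At the root path (no tokens), serve index.html
  def identify([]), do: "index.html"
  # Otherwise join the tokens back into a relative path
  def identify(tokens), do: Path.join(tokens)
end

IO.inspect(IdentifySketch.identify([]))
IO.inspect(IdentifySketch.identify([~c"css", ~c"master.css"]))
```

Path.join always returns a binary, even when given character lists, which is why content_types_provided has to convert the result back to a character list before calling :webmachine_util.guess_mime.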

To get this resource to actually serve content, we now need to hook it up to the dispatcher. Let’s edit tweeter.ex again, replacing our Hello resource with Assets.

    # Some configuration that Webmachine needs
    web_config = [ip: {127, 0, 0, 1},
                  port: 8080,
                  dispatch: [
                    {[], Tweeter.Resources.Assets, []},
                    {[:*], Tweeter.Resources.Assets, []}
                  ]]

Note the special path segment :*. This tells the Webmachine dispatcher to match any number of trailing path segments. Kill/restart your mix process and refresh the page!

Reducing waste

This strategy of reading a file from disk and sending it to the client is as old as the web itself, but there’s much more we can do! HTTP includes caching in the protocol, and it’d be pretty inefficient for a client to fetch unchanged design assets every time they refresh the page.

Let’s add some simple validation caching to our assets resource. We can start by using the last_modified callback.

  # Last-Modified date
  def last_modified(req_data, %{filename: filename} = state) do
    mtime = File.stat!(filename, time: :universal).mtime
    {mtime, req_data, state}
  end

This is pretty simple: we read the file statistics, pulling out the mtime field which represents when it was last modified. We can use the File.stat! function instead of its safe equivalent because of the flow of Webmachine’s decision graph. That is, we know that last_modified will not be called if resource_exists returns false.
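
With last_modified in place, Webmachine can answer conditional requests on our behalf. The sketch below shows roughly the comparison involved when a client sends If-Modified-Since. It is a simplification under stated assumptions: ConditionalSketch is a hypothetical module, both dates are already parsed into Erlang datetime tuples, and the other conditional headers and invalid-date handling in the real decision graph are ignored.

```elixir
defmodule ConditionalSketch do
  # Converts an Erlang datetime tuple to seconds for easy comparison
  defp seconds(datetime) do
    :calendar.datetime_to_gregorian_seconds(datetime)
  end

  # No If-Modified-Since header: always send the full response
  def status(_mtime, nil), do: 200

  # Header present: answer 304 Not Modified unless the file
  # changed after the date the client already has
  def status(mtime, if_modified_since) do
    if seconds(mtime) > seconds(if_modified_since), do: 200, else: 304
  end
end
```

For example, a file last modified on {{2015, 1, 1}, {0, 0, 0}} and requested with an If-Modified-Since of {{2015, 6, 1}, {0, 0, 0}} yields 304, so the body is never read or sent.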

We can go even further by using entity tags, or “ETag” for short. These are usually a hash string of various aspects of the file’s metadata. Since we might be doing that File.stat! call in multiple places, let’s put it in resource_exists while we’re at it and save the result.

  # Find the file!
  def resource_exists(req_data, state) do
    # Find the root of our static files, add the identified path
    file_path = Path.join [:code.priv_dir(:tweeter), "www", identify_file(req_data)]
    # Compute the full path
    absolute_path = Path.expand file_path
    state = Map.put(state, :filename, absolute_path)
    # Return true if it exists and read the file info into the state
    # for future callbacks
    if File.regular?(absolute_path) do
      state = Map.put(state, :fileinfo, File.stat!(absolute_path))
      {true, req_data, state}
    else
      {false, req_data, state}
    end
  end

  # Last-Modified date
  def last_modified(req_data, %{fileinfo: fileinfo} = state) do
    {fileinfo.mtime, req_data, state}
  end

  # ETag
  def generate_etag(req_data, %{fileinfo: fileinfo} = state) do
    hash = {fileinfo.inode, fileinfo.mtime} |>
      :erlang.phash2 |>
      :mochihex.to_hex
    {hash, req_data, state}
  end

We use the built-in :erlang.phash2 function to compute the ETag, but you should probably use a better hash in other resources.
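
As one possible “better hash”, here is a minimal sketch that feeds the same inode and mtime pair through SHA-256 using OTP’s :crypto module. EtagSketch is a hypothetical name, and the choice of SHA-256 and lower-case hex is an assumption for illustration, not something Webmachine requires.

```elixir
defmodule EtagSketch do
  # Hashes the inode and mtime with SHA-256 and hex-encodes the
  # digest, returning a charlist like :mochihex.to_hex does
  def etag(inode, mtime) do
    digest = :crypto.hash(:sha256, :erlang.term_to_binary({inode, mtime}))
    digest |> Base.encode16(case: :lower) |> String.to_charlist()
  end
end
```

In generate_etag this would be called as EtagSketch.etag(fileinfo.inode, fileinfo.mtime). The tag still changes whenever the file’s mtime changes, but collisions become far less likely than with phash2’s small integer range.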

Finally, I noticed that our CSS and HTML, although small, are still multiple kilobytes. We can reduce the transmission time significantly through compression, using the encodings_provided callback. Somewhat like content_types_provided, it returns a list of pairs, where the first element is the name of an encoding and the second is a function that performs the encoding.

  # Compression selection
  def encodings_provided(req_data, state) do
    {[{'identity', &(&1)}, # identity function!
      {'gzip', &:zlib.gzip/1},
      {'deflate', &:zlib.zip/1}],
     req_data, state}
  end

Note that this is again a case where Webmachine requires character lists (single-quoted strings) and not binaries. Now that our compression is in place, I see the index.html file going from ~1KB to 560B, and the CSS file from 6.7KB to ~1KB. Nice bandwidth savings!
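
To get a feel for the savings without a browser, the encoders can be tried directly from a script or iex. This is a small sketch: EncodingSketch and the sample body are hypothetical, and the sizes you see will depend on the input.

```elixir
defmodule EncodingSketch do
  # The same encoders as in encodings_provided, keyed by name
  def encoders do
    [{~c"identity", &(&1)},
     {~c"gzip", &:zlib.gzip/1},
     {~c"deflate", &:zlib.zip/1}]
  end

  # Applies every encoder and reports the resulting size in bytes
  def sizes(body) do
    for {name, encode} <- encoders(), do: {name, byte_size(encode.(body))}
  end
end

# Repetitive markup compresses well, much like real HTML and CSS
body = String.duplicate("<p>Hello, Tweeter!</p>\n", 100)
IO.inspect(EncodingSketch.sizes(body))
```

The client reverses the encoding on its side; from Elixir, :zlib.gunzip/1 and :zlib.unzip/1 undo the gzip and deflate encoders respectively.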

Up next

In the next installment, we’ll start serving some dynamically generated content from a resource.

