Webmachine in Elixir Tutorial, Part 2
by Sean Cribbs
Last time, we got our project set up and serving some simple dynamic content. In this installment, we’ll show how to serve static files via Webmachine so we can discuss lots of its best features.
Serving static files
Most times you would let a web-server like Apache or nginx serve your
static files, but for our tutorial it’s nice to serve our content
directly via Webmachine. By doing so, we can demonstrate several
important features related to the dispatcher, content-negotiation, and
conditional requests. Basically, everything you’d expect out of a
well-configured web-server but in a resource module! First make a
priv directory (where OTP apps store non-code files) and we’ll put
our design assets in there. For now, we’ll just copy files from the
webmachine-tutorial repo.
$ mkdir -p priv/www/css priv/www/img
$ curl -o priv/www/css/master.css https://raw.githubusercontent.com/cmeiklejohn/webmachine-tutorial/bf86b8230259ed710bf1ab3f32a5c64bfb9f03bc/priv/www/css/master.css
$ curl -o priv/www/index.html https://raw.githubusercontent.com/cmeiklejohn/webmachine-tutorial/bf86b8230259ed710bf1ab3f32a5c64bfb9f03bc/priv/www/index.html
$ curl -o priv/www/img/noise.png https://raw.githubusercontent.com/cmeiklejohn/webmachine-tutorial/bf86b8230259ed710bf1ab3f32a5c64bfb9f03bc/priv/www/img/noise.png
Let’s make a new resource module called Tweeter.Resources.Assets and
fill out the boilerplate. For our resource state, we’ll use a map this
time, but we’ll probably change it to a struct later.
defmodule Tweeter.Resources.Assets do
  # Basic initialization
  def init(_), do: {:ok, %{}}
  # Boilerplate function, which we should inject later
  def ping(req_data, state), do: {:pong, req_data, state}
end
Now we need to think about a few things, namely, how to determine
which file is being requested, what media type it is, and then how to
read it from the filesystem out to the client. Let’s start from the
end, assuming we’ve already determined the correct file to read. We’ll
make a body-producing function that simply reads the file and sends it
to the client. This is not the most efficient way – sendfile() or
other streaming would be better – but we are serving small files so
it won’t be too bad.
# Body-producing function
def produce_resource(req_data, %{filename: filename} = state) do
  {File.read!(filename), req_data, state}
end
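As a rough sketch of the streaming alternative mentioned above: as far as I know, a Webmachine body-producing function may return {:stream, {data, next}} instead of a whole binary, where next is either :done or a zero-arity function that yields the following chunk. The module name StreamSketch, the 4096-byte default chunk size, and the drain helper are all made up for illustration, and the code only builds and consumes that structure in plain Elixir without touching Webmachine.

```elixir
defmodule StreamSketch do
  # Hypothetical default chunk size; tune to taste.
  @default_chunk 4096

  # Builds a lazily-evaluated body: {:stream, {chunk, next}} where next is
  # :done or a zero-arity function returning the next {chunk, next} pair.
  def stream_file(filename, chunk_size \\ @default_chunk) do
    device = File.open!(filename, [:read, :binary])
    {:stream, next_chunk(device, chunk_size)}
  end

  defp next_chunk(device, chunk_size) do
    case IO.binread(device, chunk_size) do
      :eof ->
        # Nothing left to read: close the file and end the stream
        File.close(device)
        {"", :done}

      data when is_binary(data) ->
        {data, fn -> next_chunk(device, chunk_size) end}
    end
  end

  # Helper for experimenting in iex: walks the stream and returns the
  # concatenated contents.
  def drain({:stream, body}), do: drain(body, [])

  defp drain({data, :done}, acc), do: IO.iodata_to_binary(Enum.reverse([data | acc]))
  defp drain({data, next}, acc) when is_function(next, 0), do: drain(next.(), [data | acc])
end
```

Only one chunk is held in memory at a time, which matters far more for large downloads than for our small design assets.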
That was easy! Continuing backwards through our list, let’s
determine the media type of the file and point it at our
body-producing function using the content_types_provided
Webmachine callback. This callback tells Webmachine what media
types you provide, and what to call to produce each one. Since ours
is just reading a file from the filesystem, we’ll call
produce_resource, but vary the type it produces.
# Content-negotiation callback
def content_types_provided(req_data, state) do
  filename = case :wrq.disp_path(req_data) do
    '' -> 'index.html'
    f -> f
  end
  media_type = :webmachine_util.guess_mime(filename)
  {[{media_type, :produce_resource}], req_data, state}
end
This is the first time we’ve used a Webmachine library function in
a resource. :wrq.disp_path gives us the portion of the path that
the dispatcher matched against. So at the root URL, this will be
the empty string; otherwise, it’ll be a partial path to some file,
like css/master.css. Then :webmachine_util.guess_mime is used
to guess what a proper media type will be. For fun, let’s try that
function from the shell via iex -S mix.
iex(1)> :webmachine_util.guess_mime('foo.png')
'image/png'
iex(2)> :webmachine_util.guess_mime('application.js')
'application/x-javascript'
iex(3)> :webmachine_util.guess_mime('home.html')
'text/html'
iex(4)> :webmachine_util.guess_mime('module.erl')
'text/plain'
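To make the idea concrete, here is a toy version of extension-based guessing. It is not how :webmachine_util.guess_mime is implemented; the module name MimeSketch and its four-entry table are invented for this example, and the text/plain fallback simply mirrors what we saw for module.erl above.

```elixir
defmodule MimeSketch do
  # A deliberately tiny, hypothetical table. The ~c"..." sigil builds a
  # character list, the same thing as a single-quoted string.
  @types %{
    ".html" => ~c"text/html",
    ".css" => ~c"text/css",
    ".png" => ~c"image/png",
    ".js" => ~c"application/x-javascript"
  }

  # Accepts a binary or a character list and returns a character list
  def guess(filename) do
    ext = filename |> to_string() |> Path.extname()
    # Unknown extensions fall back to plain text
    Map.get(@types, ext, ~c"text/plain")
  end
end
```

Calling MimeSketch.guess with foo.png yields the image/png character list, and anything not in the table comes back as text/plain.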
Now that we have a body-producing function and the correct MIME
type, let’s find the file on the filesystem via one of the most
important callbacks, resource_exists. Obviously, if the file
doesn’t exist in our static assets, we should return a 404 Not
Found, and this is also a perfect place to populate the state with
an absolute path to the requested file.
# Find the file!
def resource_exists(req_data, state) do
  priv_dir = Path.join :code.priv_dir(:tweeter), "www"
  absolute_path = Path.join(priv_dir, :wrq.disp_path(req_data)) |> Path.expand
  # Map.put rather than %{state | filename: ...}: the update syntax
  # raises a KeyError when the key is not already in the map
  {File.regular?(absolute_path),
   req_data,
   Map.put(state, :filename, absolute_path)}
end
Before we move on, there’s some repeated functionality with
content_types_provided, and we have a minor bug too – at the
root path we want to serve index.html. Let’s extract that shared
functionality into a new function.
# Find the file!
def resource_exists(req_data, state) do
  # Find the root of our static files, add the identified path
  file_path = Path.join [:code.priv_dir(:tweeter), "www", identify_file(req_data)]
  # Compute the full path
  absolute_path = Path.expand file_path
  {File.regular?(absolute_path),
   req_data,
   Map.put(state, :filename, absolute_path)}
end
# Content-negotiation callback
def content_types_provided(req_data, state) do
  # String.to_charlist/1 was named String.to_char_list/1 in older
  # Elixir releases
  media_type =
    req_data
    |> identify_file()
    |> String.to_charlist()
    |> :webmachine_util.guess_mime()
  {[{media_type, :produce_resource}], req_data, state}
end
# Identifies the file we're trying to serve, normalizing path
# segments
defp identify_file(req_data) do
  # Getting the path tokens removes any duplicate slashes
  case :wrq.path_tokens(req_data) do
    # At the root path (no tokens), we want to serve index.html
    [] -> ["index.html"]
    # Otherwise serve the path they asked for
    toks -> toks
  end |> Path.join
end
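If you want to poke at the path handling without a running server, the same logic can be sketched in plain Elixir. PathSketch and its functions are hypothetical names, the /srv/www root in the usage note is just an example, and String.split with trim: true is only a stand-in for what :wrq.path_tokens gives us.

```elixir
defmodule PathSketch do
  # Stand-in for :wrq.path_tokens: trim: true drops the empty segments
  # that leading or duplicate slashes would otherwise produce.
  def tokens(raw_path), do: String.split(raw_path, "/", trim: true)

  # No tokens means the root path, so serve index.html
  def identify([]), do: "index.html"
  # Otherwise rebuild the relative path from its segments
  def identify(tokens), do: Path.join(tokens)

  # Joins the relative path onto the static root and expands it
  def absolute(root, raw_path) do
    Path.expand(Path.join(root, identify(tokens(raw_path))))
  end
end
```

On a Unix-like system, PathSketch.absolute("/srv/www", "/css//master.css") comes out as "/srv/www/css/master.css", and the bare root path maps to "/srv/www/index.html".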
To get this resource to actually serve content, we now need to hook
it up to the dispatcher. Let’s edit tweeter.ex again, replacing
our Hello resource with Assets.
# Some configuration that Webmachine needs
web_config = [ip: {127, 0, 0, 1},
              port: 8080,
              dispatch: [
                {[], Tweeter.Resources.Assets, []},
                {[:*], Tweeter.Resources.Assets, []}
              ]]
Note the special path segment :*. This tells the Webmachine
dispatcher to match any number of trailing path
segments. Kill/restart your mix process and refresh the page!
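For intuition only, a toy matcher along these lines might look like the following. This is not Webmachine’s dispatcher: DispatchSketch is a made-up module, it only understands literal segments and a trailing :*, and it assumes that :* may also match zero segments.

```elixir
defmodule DispatchSketch do
  # Returns {:ok, remaining_tokens} on a match, :nomatch otherwise.
  def match([], []), do: {:ok, []}
  # Assumption: the wildcard swallows whatever is left, even nothing
  def match([:*], tokens), do: {:ok, tokens}
  # A literal segment has to equal the next token
  def match([seg | pattern], [seg | tokens]), do: match(pattern, tokens)
  def match(_pattern, _tokens), do: :nomatch

  # Tries the routes in order; the first one that matches wins
  def dispatch(routes, tokens) do
    Enum.find_value(routes, :nomatch, fn {pattern, resource, args} ->
      case match(pattern, tokens) do
        {:ok, rest} -> {resource, args, rest}
        :nomatch -> nil
      end
    end)
  end
end
```

With our two routes, a request for css/master.css falls through the empty pattern and lands on the wildcard route, and the leftover tokens play the role of the path the resource later reads back.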
Reducing waste
This strategy of reading a file from disk and sending it to the client is as old as the web itself, but there’s much more we can do! HTTP includes caching in the protocol, and it’d be pretty inefficient for a client to fetch unchanged design assets every time they refresh the page.
Let’s add some simple validation caching to our assets resource. We
can start by using the last_modified callback.
# Last-Modified date
def last_modified(req_data, %{filename: filename} = state) do
  mtime = File.stat!(filename, time: :universal).mtime
  {mtime, req_data, state}
end
This is pretty simple: we read the file statistics, pulling out the
mtime field, which represents when it was last modified. We can
use the File.stat! function instead of its safe equivalent
because of the flow of Webmachine’s decision graph. That is, we
know that last_modified will not be called if resource_exists
returns false.
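Here is a simplified picture of what happens with that date once we return it. The real decision graph weighs several request headers; this sketch only models a lone If-Modified-Since check, with both dates already parsed into Erlang datetime tuples. ConditionalSketch and its function names are invented for the example.

```elixir
defmodule ConditionalSketch do
  # Both arguments are UTC datetime tuples: {{year, month, day}, {hour, min, sec}}
  def not_modified?(last_modified, if_modified_since) do
    to_seconds(last_modified) <= to_seconds(if_modified_since)
  end

  # No If-Modified-Since header at all: always send the full response
  def status(_last_modified, nil), do: 200

  # Otherwise answer 304 when the client's copy is still current
  def status(last_modified, if_modified_since) do
    if not_modified?(last_modified, if_modified_since), do: 304, else: 200
  end

  defp to_seconds(datetime), do: :calendar.datetime_to_gregorian_seconds(datetime)
end
```

A client whose cached copy is newer than (or as new as) the file’s mtime gets a bodiless 304, which is exactly the waste we are trying to cut.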
We can go even further by using entity tags, or “ETag” for
short. These are usually a hash string of various aspects of the
file’s metadata. Since we might be doing that File.stat! call in
multiple places, let’s put it in resource_exists while we’re at
it and save the result.
# Find the file!
def resource_exists(req_data, state) do
  # Find the root of our static files, add the identified path
  file_path = Path.join [:code.priv_dir(:tweeter), "www", identify_file(req_data)]
  # Compute the full path
  absolute_path = Path.expand file_path
  state = Map.put(state, :filename, absolute_path)
  # Return true if it exists and read the file info into the state
  # for future callbacks
  if File.regular?(absolute_path) do
    state = Map.put(state, :fileinfo, File.stat!(absolute_path))
    {true, req_data, state}
  else
    {false, req_data, state}
  end
end
# Last-Modified date
def last_modified(req_data, %{fileinfo: fileinfo} = state) do
  {fileinfo.mtime, req_data, state}
end
# ETag
def generate_etag(req_data, %{fileinfo: fileinfo} = state) do
  hash =
    {fileinfo.inode, fileinfo.mtime}
    |> :erlang.phash2()
    |> :mochihex.to_hex()
  {hash, req_data, state}
end
We use the built-in :erlang.phash2 function to compute the ETag,
but you should probably use a better hash in other resources.
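One possible “better hash”, sketched under a few assumptions: we swap phash2 for SHA-256 from Erlang’s :crypto module, we fold the file size into the hashed term (my addition, not something the resource above does), and we hand back a character list like the phash2 version did. EtagSketch is a hypothetical module name.

```elixir
defmodule EtagSketch do
  # Accepts a File.Stat struct or any map with these three keys
  def etag(%{inode: inode, size: size, mtime: mtime}) do
    # Serialize the metadata so it can be fed to the hash function
    binary = :erlang.term_to_binary({inode, size, mtime})

    :crypto.hash(:sha256, binary)
    |> Base.encode16(case: :lower)
    # Returned as a character list, like the hex string above
    |> String.to_charlist()
  end
end
```

Inside the resource, generate_etag could then return EtagSketch.etag(fileinfo) in place of the phash2 pipeline.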
Finally, I noticed that our CSS and HTML, although small, are still
multiple kilobytes. We can reduce the transmission time
significantly through compression, using the encodings_provided
callback. Somewhat similar to content_types_provided, it returns
a list of pairs, where the first is an encoding and the second is a
fn that performs the encoding.
# Compression selection
def encodings_provided(req_data, state) do
  {[{'identity', &(&1)}, # identity function!
    {'gzip', &:zlib.gzip/1},
    {'deflate', &:zlib.zip/1}],
   req_data, state}
end
Note that this is again a case where Webmachine requires character
lists (single-quoted strings) and not binaries. Now that our
compression is in place, I see the index.html file going from
~1KB to 560B, and the CSS file from 6.7KB to ~1KB. Nice bandwidth
savings!
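To see the effect of each encoder without a browser, we can run the same three functions over a body ourselves. CompressionSketch and its sizes function are invented for this experiment; the encoder list mirrors the one in encodings_provided, written with the ~c sigil, which produces the same character lists as single quotes.

```elixir
defmodule CompressionSketch do
  # The same encoding/function pairs as in encodings_provided
  def encoders do
    [
      {~c"identity", fn body -> body end},
      {~c"gzip", &:zlib.gzip/1},
      {~c"deflate", &:zlib.zip/1}
    ]
  end

  # Returns a list of {encoding_name, encoded_size_in_bytes}
  def sizes(body) do
    for {name, encode} <- encoders() do
      {name, byte_size(encode.(body))}
    end
  end
end
```

Feeding it something like File.read!("priv/www/css/master.css") in iex shows how many bytes each encoding would put on the wire; exact numbers depend on the file.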
Up next
In the next installment, we’ll start serving some dynamically-generated content from a resource.