In Search of the Software Ursatz: Part 1 - Introduction

This is the first in a series based on the talk I gave at Code PaLOUsa 2014 entitled “In Search of the Software Ursatz”.

It has been a long time since I wrote on this blog, but it has been an equally long time (or more) that I have been thinking about the topic of this series of posts.

What I want to introduce to you, dear reader, is a set of musical theories that have been influential in my thought process as a musician and a programmer, in the hopes that they bring about deeper insights to all of us. This has been a difficult thing to begin writing and talking about. The concepts are very abstract, and connecting them concretely to the work we do requires both a strong gut feeling and occasional leaps of faith.

The ideas I will try to construct are also in their infancy; indeed, I proposed the talk to Code PaLOUsa with very ambitious goals, but I haven’t yet connected all of the dots. Luckily, the title of the talk and these posts is “In Search of…”, not “I Found It!”. The musical theories I will discuss herein took the author over thirty years – essentially his entire career – to develop; I am just in the beginning stages of developing my theory.

That said, I apologize if it seemed through the title or the talk abstract that I have a Grand Unified Theory of Software Design. That is not the case. Instead, I will look at some important questions and point in some directions I think we could pursue in the future.

In Search of:

To begin this discussion, I’d like to outline what it is exactly that I am seeking.

First, there has been a trend in recent years to see code as craftsmanship. This seems to mean not just the creation of code to do a job, but a skilled work that also requires aesthetics beyond the measurable aspects of the product. What does it mean to be a software craftsperson? How does that quality reflect in the products of the crafters? I’d like to know the answers, or at least better understand the questions.

Second, I’d like an intuitive technique for analyzing the structure of software. By intuitive I mean that its interpretation is obvious to the experienced practitioner, and draws from a deep understanding of software construction by the analyst.

Third, I’d like a subjective means of critical comparison of software designs. That is, I’m not interested in performance comparisons or SLOC counts, but in what makes the software what it is. For example, what are the defining aspects of the functional program versus the object-oriented program, and why is one subjectively better in various circumstances? How do the surface features of a program convey or obscure its meaning?

Fourth, I’m looking for the why, not the how. We have many Turing-complete languages and rich tools that can express essentially the same computation. We have a wealth of information on the Internet (e.g. Stack Overflow) that can tell us how to accomplish things. Given so many choices, what are the designs that win out and what makes them better than others? What makes them tick?

Why care?

Those are great goals to achieve, but why do I care about them?

If we accept the idea that we should strive to be software craftspersons, we need a framework for critical thought about our craft that goes beyond surface details. We can only improve our craft if we deeply understand the things we create and turn a critical eye to our own work.

As Rich Hickey has so eloquently put it, there is a strong distinction between simplicity and ease. Simple things require deep understanding to wield; easy things are often canned solutions that are quickly outgrown.

Finally, I believe that the software systems we build reflect – and in some cases “leak” – the foundations on which they are built. Understanding those foundations and the interaction between different layers is essential to building successful, well-crafted software.

A Tale of Two Pieces

Below are two works from the Tonal period (an umbrella term encompassing works from approximately 1600 to 1850). The first is Prelude no. 1 in C from The Well-Tempered Clavier, Book 1 by Johann Sebastian Bach. Many of you will know this piece, even if you are not a musician, as it has become very popular in weddings and television spots in recent years.

The second will be less familiar to most, except the pianists. It is Etude in F Major, Op. 10 no. 8 by Frédéric Chopin.

Would you believe, aside from the fact that both of these pieces were intended for keyboard instruments, that they have the same fundamental structure? They sound very different on the surface, but use the same techniques, in different combinations, for expounding upon the deep structure of Tonal music.

That realization is the genius of the theories of Heinrich Schenker, whom I will discuss in more depth in the next post.

Property-Driven Grammar Development

Back when I first learned about parsing expression grammars (PEGs), I was impressed by the test-driven grammar development demo that the author of Treetop had created. TDD, BDD, and friends are a given in the Ruby community, but are not as popular in the Erlang world. On the other hand, QuickCheck is the most powerful tool for testing Erlang, given that it can generate random test cases and quickly reduce found errors to the minimal failing case (the most important part!).

A few weeks ago Rich Hickey released an informal specification of edn, a subset of Clojure syntax for expressing data, and the on-the-wire format for Datomic. Since I have a PEG/Packrat tool and QuickCheck, it seemed like a perfect weekend project to attempt property-driven development on. (With minimal modification, one could use PropEr or Triq to do this, too.) I’m not going to go into detail about how to use QuickCheck, but I’ll try to cover the relevant bits as I go.

Now, the interesting part about testing a parser with QuickCheck is that you have to do the work twice! That is, you must define a generator for a subset of the language at the same time that you develop the rule that parses it; the challenge will be avoiding the “ugly mirror” problem. With some more formal methods than I take here, one might be able to use the grammar as both generator and parser, an exercise I leave to you, kind reader.

Usually I try to attack developing a grammar by selecting the simplest construct – usually a terminal deep in the syntax tree – and implementing that, then building up the language as I go with more terminals and simple non-terminals until I reach the top level. Since the simplest and most prolific terminal in edn is whitespace, we’ll start there. In my first pass at this, I started by writing my properties in the grammar file, but that quickly became unmanageable, so my examples below will keep them separate. Whitespace in edn is defined as any tab character, carriage return, linefeed, horizontal space, or comma, so let’s create a QuickCheck generator for that.

%% edn_eqc.erl
-module(edn_eqc).
-ifdef(EQC).
-compile([export_all]).
-include_lib("eqc/include/eqc.hrl").

gen_ws() -> oneof([9, 10, 11, 12, 13, 32, $,]).

-endif.

I use the oneof generator because each of the whitespace types is independent and none are preferred over another, meaning that they don’t need to shrink to a specific value. Since we need binaries and not just bytes as parser input, and all streams in edn are UTF-8 encoded, let’s modify the generator a little bit and add a convenience macro for converting to UTF-8.

%% [snip]
-define(to_utf8(X), unicode:characters_to_binary(lists:flatten(X), utf8, utf8)).

gen_ws() ->
    ?LET(X,
         list(oneof([9, 10, 11, 12, 13, 32, $,])),
         ?to_utf8(X)).

The ?LET macro allows you to wrap a non-abstract operation around a generator so that you can modify the concrete value after it is generated, while still returning a generator that QuickCheck can understand. Now we can sample that generator and see if it makes sense. (Note that I’ve skipped over some setup stuff you’ll need to do with rebar to make it a proper app. I put edn_eqc.erl in test/.)

$ rebar get-deps compile eunit compile_only=true
$ erl -pa .eunit
 
1> eqc_gen:sample(edn_eqc:gen_ws()).
<<"\t \n">>
<<>>
<<"\f ">>
<<>>
<<"\f\r,\f,">>
<<>>
<<" \v\r\n">>
<<",\f\n">>
<<"\n, \n\v\n">>
<<>>
<<>>
ok
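
For readers without a QuickCheck licence, here is a rough plain-Erlang approximation of what gen_ws() produces. It is only a sketch: the module and function names (ws_sketch, random_ws/1, is_ws/1) are hypothetical, it uses the standard rand module instead of a QuickCheck generator, and it does no shrinking at all. The to_utf8/1 function is the same conversion as the ?to_utf8 macro, written as a function.

```erlang
%% ws_sketch.erl
%% Hypothetical stand-in for gen_ws/0 with no QuickCheck dependency.
-module(ws_sketch).
-export([random_ws/1, to_utf8/1, is_ws/1]).

%% Code points treated as whitespace, copied from the generator above.
-define(WS_CHARS, [9, 10, 11, 12, 13, 32, $,]).

%% Pick between 0 and MaxLen whitespace code points at random and
%% encode them as a UTF-8 binary. Unlike a real generator, the result
%% cannot be shrunk.
random_ws(MaxLen) when is_integer(MaxLen), MaxLen >= 0 ->
    Len = rand:uniform(MaxLen + 1) - 1,
    Chars = [lists:nth(rand:uniform(length(?WS_CHARS)), ?WS_CHARS)
             || _ <- lists:seq(1, Len)],
    to_utf8(Chars).

%% Same conversion as the ?to_utf8 macro: flatten nested lists, then
%% encode code points (and any embedded binaries) as UTF-8.
to_utf8(X) ->
    unicode:characters_to_binary(lists:flatten(X), utf8, utf8).

%% True when every byte of the binary is one of the whitespace code points.
is_ws(Bin) when is_binary(Bin) ->
    lists:all(fun(C) -> lists:member(C, ?WS_CHARS) end,
              binary_to_list(Bin)).
```

Calling ws_sketch:random_ws(10) a few times in the shell should give binaries much like the samples above, including the occasional empty one.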

Great, now we should define what the property of parsing whitespace should be, namely, that it is ignored. However, given that edn can be used to stream data, and has a native list type, returning an empty list when the stream has only whitespace would not make sense. Returning a tagged error tuple, which is Erlang’s convention, would also be presumptuous, given that edn has a tuple type. Therefore, I’m going to choose to return a sentinel value of '$space' for now, and I’ll later insert a throw at the top level so we can detect empty streams. Luckily, it will be simple to change this later.

%% Must be after the EQC include, since it will try to define similar
%% macros.
-include_lib("eunit/include/eunit.hrl"). 

%% [snip]

prop_whitespace() ->
    ?FORALL(Spaces, gen_ws(),
           '$space' == edn:parse(Spaces)).

Now let’s run it!

$ rebar qc
==> edn (qc)
NOTICE: Using experimental 'qc' command
Compiled test/edn_eqc.erl
prop_whitespace: Starting Quviq QuickCheck version 1.25.1
   (compiled at {{2011,10,1},{13,42,22}})
Licence for Basho reserved until {{2012,10,11},{11,19,8}}
Failed! Reason: 
{'EXIT',{undef,[{edn,parse,[<<>>],[]},
                {edn_eqc,'-prop_whitespace/0-fun-0-',1,
                         [{file,"test/edn_eqc.erl"},{line,16}]},
                {eqc,'-f777_0/2-fun-4-',3,[]},
                {eqc_gen,'-f321_0/2-fun-0-',5,[]},
                {eqc_gen,f186_0,2,[]},
                {eqc_gen,'-f321_0/2-fun-0-',5,[]},
                {eqc_gen,f186_0,2,[]},
                {eqc_gen,gen,3,[]}]}}
After 1 tests.
<<>>
ERROR: One or more QC properties didn't hold true:
[prop_whitespace]

Woops, we got undef because we didn’t define our grammar module yet! Let’s open up edn.peg and add the grammar rule.

whitespace <- [,\s\v\f\r\n\t]+ `'$space'`;

Briefly, we’ve defined the whitespace non-terminal as parsing from one-or-more characters in the class of whitespace characters plus the comma character, and returning the Erlang atom '$space'. Now let’s compile the grammar and try it again.

$ rebar compile qc skip_deps=true
==> edn (compile)
Compiled src/edn.peg
src/edn.erl:109: Warning: function p_all/4 is unused
Compiled src/edn.erl
==> edn (qc)
NOTICE: Using experimental 'qc' command
src/edn.erl:109: Warning: function p_all/4 is unused
Compiled src/edn.erl
Compiled test/edn_eqc.erl
prop_whitespace: Starting Quviq QuickCheck version 1.25.1
   (compiled at {{2011,10,1},{13,42,22}})
Licence for Basho reserved until {{2012,10,11},{11,19,8}}
Failed! After 1 tests.
<<>>
ERROR: One or more QC properties didn't hold true:
[prop_whitespace]
ERROR: qc failed while processing /Users/sean/Development/edn: rebar_abort

Hmm, empty content is valid input, but it shouldn’t be recognized as a space. Let’s require the generated whitespace to be non-empty in the property.

%% QuickCheck property for whitespace
prop_whitespace() ->
    ?FORALL(Spaces, non_empty(gen_ws()),
           '$space' == edn:parse(Spaces)).

$ rebar compile qc skip_deps=true
==> edn (qc)
NOTICE: Using experimental 'qc' command
src/edn.erl:109: Warning: function p_all/4 is unused
Compiled src/edn.erl
Compiled test/edn_eqc.erl
prop_whitespace: Starting Quviq QuickCheck version 1.25.1
   (compiled at {{2011,10,1},{13,42,22}})
Licence for Basho reserved until {{2012,10,11},{11,19,8}}
....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK, passed 500 tests
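
As an aside, the earlier failure on the empty binary follows from the + in the whitespace rule, which needs at least one matching character. A hand-written matcher makes that visible. This is only a sketch of the rule's behavior, not the code neotoma generates; the module name is hypothetical and the fail atom is a stand-in for whatever failure value the real parser returns.

```erlang
%% ws_match_sketch.erl
%% Hand-rolled imitation of: whitespace <- [,\s\v\f\r\n\t]+
-module(ws_match_sketch).
-export([whitespace/1]).

%% Consume one or more whitespace bytes. Returns the sentinel and the
%% unconsumed rest, or `fail` (a stand-in value) when nothing matched.
whitespace(Input) when is_binary(Input) ->
    case skip(Input) of
        Input -> fail;               % nothing consumed, so `+` fails
        Rest  -> {'$space', Rest}
    end.

%% Drop leading bytes that belong to the character class.
skip(<<C, Rest/binary>>) when C =:= $,; C =:= $\s; C =:= $\v; C =:= $\f;
                              C =:= $\r; C =:= $\n; C =:= $\t ->
    skip(Rest);
skip(Rest) ->
    Rest.
```

With this sketch, whitespace(<<>>) fails while whitespace(<<" ,\n">>) returns the sentinel, which is why the property needed a non-empty generator.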

Alright, we can parse whitespace! *facepalm* Let’s quickly add a few more simple language constructs, namely nil and booleans so we can see how to start building up the structure around these terminals. Again we start with the generators:

gen_nil() -> <<"nil">>.

gen_bool() -> oneof([<<"true">>, <<"false">>]).

Native Erlang values generate themselves in QuickCheck, so simply returning the <<"nil">> value means that value will always be generated by the gen_nil() function. We can sample those generators again if we like, but they will be unsurprising. Instead, let’s define a property for nil:

prop_nil() ->
    ?FORALL(Nil, ws_wrap(gen_nil()),
            nil == edn:parse(Nil)).

Notice I haven’t defined that ws_wrap function yet. Remember that our goal here was to treat whitespace simply as a separator, so the property we want to define is that a real terminal surrounded by whitespace parses into that terminal. Let’s teach QuickCheck how to wrap things in whitespace by making another generator, using our handy ?LET macro again:

%% Wrap another generator in whitespace
ws_wrap(Gen) ->
    ?LET({LWS, V, TWS}, 
         {gen_ws(), Gen, gen_ws()},
         ?to_utf8([LWS, V, TWS])).

Thanks to ?LET, ws_wrap defines a generator that will create some amount of leading whitespace (maybe none), evaluate the passed generator, and then some trailing whitespace (maybe none) and flatten it into a UTF-8 binary. Perfect, check that property!

$ rebar qc skip_deps=true
==> edn (qc)
NOTICE: Using experimental 'qc' command
Compiled test/edn_eqc.erl
prop_nil: Starting Quviq QuickCheck version 1.25.1
   (compiled at {{2011,10,1},{13,42,22}})
Licence for Basho reserved until {{2012,10,11},{11,19,8}}
Failed! After 1 tests.
<<"nil">>
prop_whitespace: ....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK, passed 500 tests
ERROR: One or more QC properties didn't hold true:
[prop_nil]
ERROR: qc failed while processing /Users/sean/Development/edn: rebar_abort

We’ve got a failing property again, and look how it shrunk! It’s easy to see what broke, namely that, DUH, we didn’t define how to parse nil. That’s easy to fix:

nil <- "nil" `nil`;
whitespace <- [,\s\v\f\r\n\t]+ `'$space'`;

Now, I could run the property again, but I’ll save you the pain; simply adding that rule isn’t going to cut it because nil must be surroundable by whitespace. Also, neotoma won’t compile that grammar because it contains nonterminals that are not referred to anywhere else – its convention is that the first rule is the entry point to the grammar. Let’s add some rules that allow us to describe the syntactic form of whitespace, and the semantic behavior of empty streams at the same time.

edn <- whitespace? (term:term whitespace?)* `
case Node of
  %% Nothing but whitespace
  [ _, []] ->
        throw({edn,empty});
  %% Just one datum
  [ _, [[{term,Term}, _]]] ->
       Term;
  %% Lots of terms
  [ _, Terms ] ->
        [ T || [{term, T}, _WS] <- Terms ]
end
`;
 
term <- nil ~;
nil <- "nil" `nil`;
whitespace <- [,\s\v\f\r\n\t]+ `'$space'`;

This is the first time we’ve seen significant code in the grammar, so I’ll try to describe what’s going on. In neotoma grammars, you can include inline code between backticks or comment-braces (%{, %}) that will be run when a rule is successfully parsed. Within that code block, the variable Node is the sequence of terms that was parsed, so you can manipulate it to build the data structures you want to result from the parse. In the previous two rules, we’ve been ignoring the parse result and simply returning static values. In our new term rule, we are using the special form ~ to skip doing any transformation, which is the equivalent of writing %{ Node %}, but much less noisy.

Now let’s focus our attention on the top-level rule, edn, which encapsulates our whitespace and stream behavior. It says that leading whitespace is optional, followed by zero-or-more terms separated by whitespace. We tag the terms as they are parsed so they are easier to pattern-match on and extract. Now in the code block, we can do something with the parse. If the parenthesized portion parses zero times, the result will be an empty list, so we handle that case by throwing a special term like I mentioned above. In the case of parsing only a single term, we want to return only that term, not wrapped in a list, so we special-case that parse as well. Finally, if there is a stream of terms, for now we will just extract them and return them in a list.
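
To see the three cases in isolation, the inline code can be lifted into an ordinary function and fed hand-built Node values. This is a sketch under assumptions: the module name is hypothetical, and the Node shapes used below (a two-element list whose second element is a list of [{term, T}, Whitespace] pairs, with [] standing in for an unmatched optional) are my reading of the rule rather than output captured from neotoma.

```erlang
%% edn_node_sketch.erl
%% The body of the edn rule's code block, as a standalone function.
-module(edn_node_sketch).
-export([transform/1]).

transform(Node) ->
    case Node of
        %% Nothing but whitespace: the repetition matched zero times.
        [_, []] ->
            throw({edn, empty});
        %% Just one datum: return it bare, not wrapped in a list.
        [_, [[{term, Term}, _]]] ->
            Term;
        %% Lots of terms: strip the tags and trailing whitespace.
        [_, Terms] ->
            [T || [{term, T}, _WS] <- Terms]
    end.
```

For example, transform([[], [[{term, nil}, '$space']]]) returns nil, while a Node whose second element is [] throws {edn, empty}.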

Let’s recompile the grammar and try our properties again.

$ rebar compile qc skip_deps=true
==> edn (compile)
Compiled src/edn.peg
Compiled src/edn.erl
==> edn (qc)
NOTICE: Using experimental 'qc' command
Compiled src/edn.erl
prop_nil: Starting Quviq QuickCheck version 1.25.1
   (compiled at {{2011,10,1},{13,42,22}})
Licence for Basho reserved until {{2012,10,11},{11,19,8}}
....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK, passed 500 tests
prop_whitespace: Failed! Reason: 
{'EXIT',{{case_clause,{edn,empty}},
         [{eqc,'-f777_0/2-fun-4-',3,[]},
          {eqc_gen,'-f321_0/2-fun-0-',5,[]},
          {eqc_gen,f186_0,2,[]},
          {eqc_gen,'-f321_0/2-fun-0-',5,[]},
          {eqc_gen,f186_0,2,[]},
          {eqc_gen,gen,3,[]},
          {eqc,'-f758_0/1-fun-2-',3,[]},
          {eqc_gen,'-f321_0/2-fun-1-',4,[]}]}}
After 1 tests.
ERROR: One or more QC properties didn't hold true:
[prop_whitespace]
ERROR: qc failed while processing /Users/sean/Development/edn: rebar_abort

Woops, we broke the whitespace property because we didn’t expect the throw! (One might call this letting your code get ahead of your tests.) Let’s change that to use an assertion provided by eunit.

%% You must put this AFTER the EQC header file.
-include_lib("eunit/include/eunit.hrl").

%% [snip]

prop_whitespace() ->
    ?FORALL(Spaces, gen_ws(),
            ok == ?assertThrow({edn, empty}, edn:parse(Spaces))).

Run that one more time.

$ rebar qc skip_deps=true
==> edn (qc)
NOTICE: Using experimental 'qc' command
Compiled test/edn_eqc.erl
prop_nil: Starting Quviq QuickCheck version 1.25.1
   (compiled at {{2011,10,1},{13,42,22}})
Licence for Basho reserved until {{2012,10,11},{11,19,8}}
....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK, passed 500 tests
prop_whitespace: ....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK, passed 500 tests

Cool, now we can integrate that boolean generator and write a property for it.

prop_bool() ->
    ?FORALL(Boolean, ws_wrap(gen_bool()),
            lists:member(edn:parse(Boolean), [true, false])).

I think you get the drill now. If you ran that, you would get the {edn, empty} throw, because the parser stops at the first valid tree before an unknown character. Let’s add the rule to the grammar:

edn <- whitespace? (term:term whitespace?)* `
case Node of
  %% Nothing but whitespace
  [ _, []] ->
        throw({edn,empty});
  %% Just one datum
  [ _, [[{term,Term}, _]]] ->
       Term;
  %% Lots of terms
  [ _, Terms ] ->
        [ T || [{term, T}, _WS] <- Terms ]
end
`;

term <- boolean / nil ~;
boolean <- "true" / "false" `binary_to_existing_atom(Node, utf8)`;
nil <- "nil" `nil`;
whitespace <- [,\s\v\f\r\n\t]+ `'$space'`;

On the term rule, we just added boolean as one of the possible terms, using ordered choice, and use the binary_to_existing_atom/2 BIF in the boolean rule to create the proper Erlang term. One last time, let’s compile the grammar and run the properties:

$ rebar compile qc skip_deps=true
==> edn (compile)
Compiled src/edn.peg
Compiled src/edn.erl
==> edn (qc)
NOTICE: Using experimental 'qc' command
Compiled src/edn.erl
prop_nil: Starting Quviq QuickCheck version 1.25.1
   (compiled at {{2011,10,1},{13,42,22}})
Licence for Basho reserved until {{2012,10,11},{11,19,8}}
....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK, passed 500 tests
prop_whitespace: ....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK, passed 500 tests
prop_bool: ....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK, passed 500 tests
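
A short aside on the binary_to_existing_atom/2 call in the boolean rule. Unlike binary_to_atom/2, it refuses to create new atoms and raises badarg when the atom does not already exist, which is one common reason to prefer it when converting parsed input, since atoms are never garbage collected. The wrapper below is a hypothetical helper to show that behavior; it is not part of the grammar.

```erlang
%% atom_sketch.erl
%% Hypothetical helper showing how binary_to_existing_atom/2 behaves.
-module(atom_sketch).
-export([safe_existing_atom/1]).

%% Convert a binary to an atom only if that atom already exists.
safe_existing_atom(Bin) when is_binary(Bin) ->
    try binary_to_existing_atom(Bin, utf8) of
        Atom -> {ok, Atom}
    catch
        %% Raised when no atom with this name is in the atom table.
        error:badarg -> {error, no_such_atom}
    end.
```

Since true and false always exist as atoms, the boolean rule can call the BIF directly without this kind of guard.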

Puzzler

So far I’ve led you through it by hand, including most of the missteps along the way. I’ve gone way past this point in the actual project, including doing more complicated-to-parse types like numbers. Given the grammar and properties in the project on GitHub, can you figure out why prop_symbol() fails? The answer is subtle.

$ rebar qc skip_deps=true
==> edn (qc)
NOTICE: Using experimental 'qc' command
Compiled test/edn_eqc.erl
Compiled src/edn.erl
prop_whitespace: Starting Quviq QuickCheck version 1.25.1
   (compiled at {{2011,10,1},{13,42,22}})
Licence for Basho reserved until {{2012,9,30},{14,35,1}}
....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK, passed 500 tests
prop_bool: ....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK, passed 500 tests
prop_nil: ....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK, passed 500 tests
prop_unescape: ....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK, passed 500 tests
prop_string: ....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK, passed 500 tests
prop_symbol: ............................................................................................................................................................................................................................................................................Failed! After 269 tests.
<<"\v\v\n\r  ,, ,\r\t-3fAloF0oZXp8 ,">>
Shrinking...(3 times)
<<"-3">>
prop_character: ....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK, passed 500 tests
prop_integer: ....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK, passed 500 tests
prop_float: ....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK, passed 500 tests
ERROR: One or more QC properties didn't hold true:
[prop_symbol]
ERROR: qc failed while processing /Users/sean/Development/edn: rebar_abort

Screencast: Riak Client Multi-node Connections

The Riak client for Ruby (riak-client) was released a few weeks ago and it includes some really useful features for working with Riak from your Ruby applications.

This second screencast demonstrates the multi-node or “cluster” connection feature in the client, and the effect that has on performance and reliability. Co-starring in this video is Riak Control, Riak 1.1’s new web-based administration tool.

Riak Ruby Client 1.0: Multi-node connections from Sean Cribbs on Vimeo.

Screencast: Riak Client Serializers

The Riak client for Ruby (riak-client) was released a few weeks ago and it includes some really useful features for working with Riak from your Ruby applications. Here’s my first screencast about those features, which describes how to use custom serializers. Enjoy! (Watch on Vimeo for the best experience.)

Riak Ruby Client 1.0: Serializers from Sean Cribbs on Vimeo.

Webmachine vs. Grape

Back in December, I gave my Resources, For Real This Time talk for the third time, this time at NYC.rb. After the talk, I got into a very emphatic discussion with Daniel Doubrovkine and John “JJB” Bachir about the differences between Webmachine’s approach and Grape’s approach and their relative strengths. Daniel followed it up with an interesting blog post titled Grape vs. Webmachine. I’ve had some time to think it all over and so I figured it was about time I wrote a response.

Daniel poses the question “Should you build your next RESTful API with Grape or Webmachine?” Before I address his question (and the inherent assumptions therein), I want to tell you a bit more about Webmachine and why it is fundamentally different from the prevailing approaches.

Protocols are contracts

If you Google “define: protocol”, two definitions appear:

  1. The official procedure governing affairs of state or diplomatic occasions.
  2. The established code of procedure or behavior in any group, organization, or situation.

Merriam-Webster gives some additionally detailed definitions:

  1. a code prescribing strict adherence to correct etiquette and precedence (as in diplomatic exchange and in the military services) <a breach of protocol>
  2. a set of conventions governing the treatment and especially the formatting of data in an electronic communications system <network protocols>

Another way of saying this is that protocols are contracts or conventional manners of speech and behavior. To violate that contract is to be misunderstood or, worse, to offend or to cause unintended actions. Granted, computer protocols may have lesser social consequences than social protocols, but if we don’t speak them properly, our programs won’t work.

Protocols are FSMs

The classical way to implement a protocol participant (that is, a client, server or peer) is a finite state machine (FSM). Why? Protocols are usually defined in terms of “in this situation, do that” or “react to this condition by doing that”. Many of those assertions are dependent on one another, meaning that they are not even relevant if other assertions have not been made previously. To illustrate this better, imagine the protocol of two heads of state meeting. Their meeting might go through these steps:

  1. Arrive at the same location.
  2. Shake hands and introduce other participants.
  3. Enter the meeting space.
  4. Negotiate an issue.
  5. Leave the meeting space.
  6. Arrive and speak at the press conference.
  7. Shake hands again.
  8. Depart the press conference.

First, this is a discrete set of steps that must be followed in the order given. It wouldn’t make much sense to negotiate the issue (which might have its own internal protocol) before you shake hands and enter the meeting space, or to discuss the negotiations at the press conference before you’ve done any negotiation. Second, if one part of the protocol fails, other steps in the protocol may never occur! Imagine that upon arrival, the other head of state refuses to shake your hand or even look at you; you might abort the meeting altogether.

Like protocols, finite state machines have discrete steps (states) and conditions that allow transition from one state to another. A transition may lead to another internal state, or to an end state in which processing is terminated. Finite state machines are the essential way to implement protocols.
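
To make the analogy concrete, here is a minimal sketch of the heads-of-state meeting as a finite state machine in plain Ruby. The class, state, and event names are all invented for illustration; the only point is that each state accepts a limited set of events, and anything else ends the protocol early.

```ruby
# Toy finite state machine for the heads-of-state meeting described above.
# All names here are illustrative; none of this comes from a real library.
class MeetingProtocol
  # Each state maps the events it accepts to the next state. Events not
  # listed for the current state are treated as protocol violations.
  TRANSITIONS = {
    arrived:          { shake_hands: :introduced },
    introduced:       { enter_room: :in_meeting },
    in_meeting:       { negotiate: :negotiated },
    negotiated:       { leave_room: :left_meeting },
    left_meeting:     { speak_to_press: :press_conference },
    press_conference: { shake_hands: :farewell },
    farewell:         { depart: :done }
  }.freeze

  attr_reader :state

  def initialize
    @state = :arrived
  end

  # Apply one event. An event that is not valid in the current state
  # moves the machine to the terminal :aborted state.
  def fire(event)
    return @state if finished?

    @state = TRANSITIONS.fetch(@state).fetch(event, :aborted)
  end

  # :done and :aborted are end states; no further transitions happen.
  def finished?
    %i[done aborted].include?(@state)
  end
end

meeting = MeetingProtocol.new
%i[shake_hands enter_room negotiate].each { |event| meeting.fire(event) }
meeting.state # => :negotiated

# Trying to negotiate before shaking hands breaks the protocol.
MeetingProtocol.new.fire(:negotiate) # => :aborted
```

Note that "shake hands" appears twice in the table but means different things depending on the current state, which is exactly the kind of dependency between assertions that makes an FSM a natural fit.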

An interesting side effect of this coherence between protocol and FSM is that they are duals of each other. The FSM is an implementation of the protocol, and the protocol’s states and assertions can be derived from the FSM. It’s the kind of thing that researchers interested in provability and mathematical formulations of software get really excited about.

So what does this have to do with Webmachine and Grape?

HTTP happens to be a protocol with a simple syntax but very rich semantic possibilities. If your application “misspeaks” HTTP, it might still be partially understood (the syntax may still be grasped), but the other party might miss out on some crucial subtlety your application wants to convey or might take an unexpected or undesirable action as a result.

Despite HTTP’s flexibility (laxness?), it’s still important to speak the protocol as fluently as possible. Building a better Web is just as much about the brick and mortar (the HTTP protocol) as the paint and trim (“Web Standards” in the browser).

Webmachine tries to do just that. Its core is an FSM of the server side of HTTP. The end states are response status codes (e.g. 200 OK or 404 Not Found). The transition conditions come from the “MAY”, “MUST”, “SHOULD” language in the HTTP/1.1 RFC 2616 as well as the less formal aspects of the specification. The FSM determines which transitions to take based on facts about the request and facts about the resource being requested. Because the FSM is a dual of the HTTP protocol, we at Basho have taken to calling Webmachine “an executable model of HTTP.”

This is where Webmachine fundamentally differs from Grape and other existing frameworks:

  • It implements an FSM that is a dual of the protocol, not an ever-varying stack of middleware.
  • It focuses on determining facts about the resource, not performing actions.

This is what I mean when I say that Webmachine is declarative (functional?) rather than imperative. By being declarative and focusing on the facts about your resource rather than “what do I do when I get a request”, a whole lot of complex and error-prone aspects of the protocol are hidden from the developer, and more importantly, done in a deterministic way every time.

In contrast, Grape and most other Rack-based frameworks encourage you to (perhaps unwittingly) redefine HTTP semantics for every application. In my opinion, this is not just error-prone, it is wasteful. Why should you have to define what GET means every time? You want to focus on the resources your application exposes, not on implementing the protocol all over again. This is why Webmachine encapsulates those decisions (FSM!) and includes sensible defaults so that you only have to focus on the decisions and behaviors (transitions!) that your resources need to modify. You focus on what your resources are, rather than what they do.

REST, For Real This Time

Daniel is by no means the only or greatest offender, but I strongly object to his use of “REST”. He says,

Grape is a DSL for RESTful APIs.

Simply exposing your service over HTTP and not treating it like RPC is not sufficient to be called “RESTful”; you must also satisfy the “Hypermedia Constraint”. Daniel admits

…you have to be disciplined about those API methods - they should represent resources, not RPC service endpoints.

…but does not address Hypermedia. I could go into great detail about why the typical HTTP-based API is not REST, but that has been done by some really great people who have said it much better: Roy Fielding, Jon Moore and Nick Sutterer. Do check out their presentations and blogs.

A note on “DSLs”

Rubyists, we have a fetish for so-called “DSLs”. It’s time for an intervention.

In reality, what we call DSLs in Ruby tend to be thin wrappers around the fluent-builder pattern with a dash of instance_eval and class_eval to remove block arguments and otherwise-necessary uses of self. (One lightning talk at RubyConf humorously called gratuitous use of the pattern “Playskool MyFirstDSL”.) Grape, and its elder cousin Sinatra, follow this pattern. On the surface, it seems to promote clean, concise, readable code. But at what cost? What complexity is hidden? Does it actually help you write better code, faster and more reliably, or are you in the end working around the DSL to do what you want?

So this is where I take big issue with Daniel’s argument:

I would grant Grape an advantage over favoring the API consumer, since it focuses on the expressiveness of the API.

That warm fuzzy the developer gets when writing an application with Grape is not correlated to the experience of the consumer of the API. It is indeed a strength that Grape can generate API consumer documentation from the code, but as Moore and Sutterer demonstrate, a truly RESTful service is mostly self-documenting.

Maybe it’s the fact that Webmachine(-Ruby) is a fairly faithful port of the original Erlang version, but when authoring it I felt disillusioned with metaprogramming magic. Instead of including a module and executing some class methods to decorate your Resource class, you use simple inheritance and override methods. Internally, modules only exist as namespaces and to separate functional concerns of the same class (see Webmachine::Decision::Conneg or Webmachine::Resource::Callbacks); they are never used to decorate or modify the behavior of the class they are included in. Webmachine::Decision::FSM uses a loop to walk the decision graph, where individual state methods either return a Symbol for the next state or a Fixnum that is the response status code.
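
The shape of that loop is easy to sketch. What follows is not Webmachine's code: ToyResource, ToyFSM, MissingPage, and the three-state graph are hypothetical, and the real decision graph is far larger. It only illustrates the two ideas above, assuming plain inheritance for resources and state methods that return either a Symbol (the next state) or an integer status code (an end state). On current Ruby versions the integer class is Integer rather than Fixnum.

```ruby
# Toy decision loop in the style described above. This is NOT Webmachine's
# implementation; every class and state name here is invented.
class ToyResource
  # Facts about the resource, with defaults. Subclasses override only
  # the facts that differ.
  def service_available?
    true
  end

  def allowed_methods
    %w[GET HEAD]
  end

  def resource_exists?
    true
  end
end

class ToyFSM
  def initialize(resource, request_method)
    @resource = resource
    @request_method = request_method
  end

  # Walk the graph: a Symbol names the next state method to call,
  # an Integer is an end state (the response status code).
  def run
    state = :check_available
    state = send(state) while state.is_a?(Symbol)
    state
  end

  private

  def check_available
    @resource.service_available? ? :check_method : 503
  end

  def check_method
    @resource.allowed_methods.include?(@request_method) ? :check_exists : 405
  end

  def check_exists
    @resource.resource_exists? ? 200 : 404
  end
end

# A resource states one fact about itself; it never picks a status code.
class MissingPage < ToyResource
  def resource_exists?
    false
  end
end

ToyFSM.new(ToyResource.new, "GET").run    # => 200
ToyFSM.new(MissingPage.new, "GET").run    # => 404
ToyFSM.new(ToyResource.new, "DELETE").run # => 405
```

The resource subclass contains no control flow at all. The meaning of each fact, and the order in which facts are consulted, lives in one place and behaves the same way for every resource.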

That said, others have been working on higher level abstractions on top of Webmachine, ones that include “DSLs”. Whether they will provide more value or simplicity over the existing abstractions Webmachine provides has yet to shake out.

So which should you use?

I think if I were still doing web APIs via Rails or Sinatra, Grape would be an extremely attractive alternative to those, having a lower barrier to entry than Webmachine. It’s a great library and very well written. For an application that exposes very simple semantics, the amount of code you need to write in Grape is small, you don’t need any awareness or understanding of Webmachine’s decision flow, and you can get consumer documentation nearly for free.

On the other hand, I have been just as productive in Webmachine (both Ruby and Erlang) and now that I think more in terms of resources instead of actions, it feels more natural. I want to be able to add those extra semantics just by declaring a few methods, without worrying as much about whether I did it right. I want to avoid the cross-cutting, double-blind mentality of the middleware pattern promoted by Rack.

What next?

What Webmachine has done for the server side, I think we can also do for the client side and for intermediaries (which act as both clients and servers). We can encapsulate the client side of HTTP into an FSM and expose its decisions in a clean way to applications. We can build client and server-side libraries that make working with Hypermedia APIs simpler (Nick’s Roar project is a good start).