Webmachine vs. Grape
by Sean Cribbs
Back in December, I gave my Resources, For Real This Time talk for the third time, this time at NYC.rb. After the talk, I got into a very emphatic discussion with Daniel Doubrovkine and John “JJB” Bachir about the differences between Webmachine’s approach and Grape’s approach and their relative strengths. Daniel followed it up with an interesting blog post titled Grape vs. Webmachine. I’ve had some time to think it all over and so I figured it was about time I wrote a response.
Daniel poses the question “Should you build your next RESTful API with Grape or Webmachine?” Before I address his question (and the inherent assumptions therein), I want to tell you a bit more about Webmachine and why it is fundamentally different from the prevailing approaches.
Protocols are contracts
If you Google “define: protocol”, two definitions appear:
- The official procedure governing affairs of state or diplomatic occasions.
- The established code of procedure or behavior in any group, organization, or situation.
Merriam-Webster gives some more detailed definitions:
- a code prescribing strict adherence to correct etiquette and precedence (as in diplomatic exchange and in the military services) <a breach of protocol>
- a set of conventions governing the treatment and especially the formatting of data in an electronic communications system <network protocols>
Another way of saying this is that protocols are contracts, or conventional manners of speech and behavior. To violate that contract is to be misunderstood or, worse, to offend or to cause unintended actions. Granted, computer protocols may have lesser social consequences than social protocols, but if we don’t speak them properly, our programs won’t work.
Protocols are FSMs
The classical way to implement a protocol participant (that is, a client, server or peer) is a finite state machine (FSM). Why? Protocols are usually defined in terms of “in this situation, do that” or “react to this condition by doing that”. Many of those assertions are dependent on one another, meaning that they are not even relevant if other assertions have not been made previously. To illustrate this better, imagine the protocol of two heads of state meeting. Their meeting might go through these steps:
- Arrive at the same location.
- Shake hands and introduce other participants.
- Enter the meeting space.
- Negotiate an issue.
- Leave the meeting space.
- Arrive and speak at the press conference.
- Shake hands again.
- Depart the press conference.
First, this is a discrete set of steps that must be followed in the order given. It wouldn’t make much sense to negotiate the issue (which might have its own internal protocol) before you shake hands and enter the meeting space, or to discuss the negotiations at the press conference before you’ve done any negotiation. Second, if one part of the protocol fails, other steps in the protocol may never occur! Imagine that upon arrival, the other head of state refuses to shake your hand or even look at you; you might abort the meeting altogether.
Like protocols, finite state machines have discrete steps (states) and conditions that allow transition from one state to another. A transition may lead to another internal state, or to an end state in which processing is terminated. Finite state machines are the essential way to implement protocols.
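To make the analogy concrete, here is a minimal sketch of the heads-of-state meeting as a finite state machine in Ruby. The class, state and event names are all invented for illustration, and several steps from the list are collapsed to keep it short.

```ruby
# Toy finite state machine for the heads-of-state meeting described above.
# State and event names are illustrative only.
class MeetingProtocol
  # Each state maps the events it accepts to the state that follows.
  TRANSITIONS = {
    arrived:          { handshake: :introduced, snub: :aborted },
    introduced:       { enter_room: :in_meeting },
    in_meeting:       { negotiate: :negotiated },
    negotiated:       { leave_room: :press_conference },
    press_conference: { handshake: :concluded }
  }.freeze

  # End states: no further transitions are possible from here.
  END_STATES = %i[concluded aborted].freeze

  attr_reader :state

  def initialize
    @state = :arrived
  end

  def finished?
    END_STATES.include?(@state)
  end

  # Apply an event. Anything the current state does not allow
  # is a breach of protocol and raises an error.
  def fire(event)
    next_state = TRANSITIONS.fetch(@state, {})[event]
    raise ArgumentError, "#{event} is not allowed in state #{@state}" unless next_state
    @state = next_state
  end
end
```

Firing :negotiate on a fresh MeetingProtocol raises an error, because negotiation is not a valid step before the handshake. Firing :snub on arrival moves straight to the :aborted end state, so the later steps never occur.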
An interesting side effect of this coherence between protocol and FSM is that they are duals of each other. The FSM is an implementation of the protocol, and the protocol’s states and assertions can be derived from the FSM. It’s the kind of thing that researchers interested in provability and mathematical formulations of software get really excited about.
So what does this have to do with Webmachine and Grape?
HTTP happens to be a protocol with a simple syntax but very rich semantic possibilities. If your application “misspeaks” HTTP, it might still be partially understood (the syntax may still be grasped), but the other party might miss out on some crucial subtlety your application wants to convey or might take an unexpected or undesirable action as a result.
Despite HTTP’s flexibility (laxness?), it’s still important to speak the protocol as fluently as possible. Building a better Web is just as much about the brick and mortar (the HTTP protocol) as the paint and trim (“Web Standards” in the browser).
Webmachine tries to do just that. Its core is an FSM of the server side of HTTP. The end states are response status codes (e.g. 200 OK or 404 Not Found). The transition conditions come from the “MAY”, “MUST”, “SHOULD” language in the HTTP/1.1 RFC 2616 as well as the less formal aspects of the specification. The FSM determines which transitions to take based on facts about the request and facts about the resource being requested. Because the FSM is a dual of the HTTP protocol, we at Basho have taken to calling Webmachine “an executable model of HTTP.”
This is where Webmachine fundamentally differs from Grape and other existing frameworks:
- It implements an FSM that is a dual of the protocol, not an ever-varying stack of middleware.
- It focuses on determining facts about the resource, not performing actions.
This is what I mean when I say that Webmachine is declarative (functional?) rather than imperative. By being declarative and focusing on the facts about your resource rather than “what do I do when I get a request”, a whole lot of complex and error-prone aspects of the protocol are hidden from the developer, and more importantly, done in a deterministic way every time.
In contrast, Grape and most other Rack-based frameworks encourage you to (perhaps unwittingly) redefine HTTP semantics for every application. In my opinion, this is not just error-prone, it is wasteful. Why should you have to define what GET means every time? You want to focus on the resources your application exposes, not on implementing the protocol all over again. This is why Webmachine encapsulates those decisions (FSM!) and includes sensible defaults so that you only have to focus on the decisions and behaviors (transitions!) that your resources need to modify. You focus on what your resources are, rather than what they do.
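A rough sketch of the "facts, not actions" idea follows. This is a toy in plain Ruby, not Webmachine's implementation: ToyResource, status_for and MissingReport are hypothetical names, and the callbacks are only loosely modeled on the kind of questions Webmachine asks a resource.

```ruby
# Toy illustration only: NOT Webmachine's implementation.
# All class and method names here are assumptions made for the example.
class ToyResource
  # Sensible defaults: a resource overrides only the facts that differ.
  def service_available?; true; end
  def allowed_methods; %w[GET HEAD]; end
  def authorized?(_credentials); true; end
  def resource_exists?; true; end
end

# The protocol decisions live in one place and are checked
# in the same order on every request.
def status_for(resource, http_method, credentials = nil)
  return 503 unless resource.service_available?
  return 405 unless resource.allowed_methods.include?(http_method)
  return 401 unless resource.authorized?(credentials)
  return 404 unless resource.resource_exists?
  200
end

# A resource declares facts about itself; it never picks a status code.
class MissingReport < ToyResource
  def resource_exists?; false; end
end
```

With these definitions, status_for(MissingReport.new, "GET") yields 404 and status_for(ToyResource.new, "POST") yields 405, without either resource mentioning a status code.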
REST, For Real This Time
Daniel is by no means the only or greatest offender, but I take strong objection to his use of “REST”. He says,
Grape is a DSL for RESTful APIs.
Simply exposing your service over HTTP and not treating it like RPC is not sufficient to be called “RESTful”; you must also satisfy the “Hypermedia Constraint”. Daniel admits
…you have to be disciplined about those API methods - they should represent resources, not RPC service endpoints.
…but does not address Hypermedia. I could go into great detail about why the typical HTTP-based API is not REST, but that has been done by some really great people who have said it much better: Roy Fielding, Jon Moore and Nick Sutterer. Do check out their presentations and blogs.
A note on “DSLs”
Rubyists, we have a fetish for so-called “DSLs”. It’s time for an intervention.
In reality, what we call DSLs in Ruby tend to be thin wrappers around the fluent-builder pattern with a dash of instance_eval and class_eval to remove block arguments and necessary uses of self. (One lightning talk at RubyConf humorously called gratuitous use of the pattern “Playskool MyFirstDSL”.) Grape, and its elder cousin Sinatra, follow this pattern. On the surface, it seems to promote clean, concise, readable code. But at what cost? What complexity is hidden? Does it actually help you write better code, faster and more reliably, or are you in the end working around the DSL to do what you want?
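For readers who have not seen the pattern spelled out, here is a minimal sketch of it. RouteBuilder is a hypothetical class written for this example; it is not how Grape or Sinatra are actually implemented.

```ruby
# Hypothetical route builder, not Grape's or Sinatra's real code.
class RouteBuilder
  attr_reader :routes

  def initialize
    @routes = []
  end

  def get(path)
    @routes << [:get, path]
    self # returning self is what makes the builder "fluent"
  end

  # Explicit style: the builder is passed to the block as an argument.
  def self.build
    builder = new
    yield builder
    builder.routes
  end

  # "DSL" style: instance_eval changes self inside the block,
  # so the block argument and the explicit receiver both disappear.
  def self.dsl(&block)
    builder = new
    builder.instance_eval(&block)
    builder.routes
  end
end

explicit = RouteBuilder.build { |r| r.get("/status").get("/users") }
implicit = RouteBuilder.dsl { get "/status"; get "/users" }
```

Both calls produce the same list of routes. The hidden cost is in the second form: inside the block, self is the builder, so instance variables and methods of the calling object are no longer reachable the way the code on the page suggests.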
So this is where I take big issue with Daniel’s argument:
I would grant Grape an advantage over favoring the API consumer, since it focuses on the expressiveness of the API.
That warm fuzzy the developer gets when writing an application with Grape is not correlated to the experience of the consumer of the API. It is indeed a strength that Grape can generate API consumer documentation from the code, but as Moore and Sutterer demonstrate, a truly RESTful service is mostly self-documenting.
Maybe it’s the fact that Webmachine(-Ruby) is a fairly faithful port of the original Erlang version, but when authoring it I felt disillusioned with metaprogramming magic. Instead of including a module and executing some class methods to decorate your Resource class, you use simple inheritance and override methods. Internally, modules only exist as namespaces and to separate functional concerns of the same class (see Webmachine::Decision::Conneg or Webmachine::Resource::Callbacks); they are never used to decorate or modify the behavior of the class they are included in. Webmachine::Decision::FSM uses a loop to walk the decision graph, where individual state methods either return a Symbol for the next state or a Fixnum that is the response status code.
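The loop described above can be sketched in a few lines. This is a toy walker, not the code in Webmachine::Decision::FSM: ToyFlow and its state names are hypothetical, the request is a plain Hash, and it checks for Integer because current Ruby versions no longer have a separate Fixnum class.

```ruby
# Toy decision walker: NOT Webmachine::Decision::FSM itself.
# The class and its state names (:check_method, :check_exists) are hypothetical.
class ToyFlow
  def initialize(request)
    @request = request # a plain Hash, e.g. { method: "GET", found: true }
  end

  # Walk the graph until a state method returns an Integer status code.
  def run
    state = :check_method
    loop do
      result = send(state)
      case result
      when Symbol  then state = result # move on to the next decision
      when Integer then return result  # end state: the response status
      else raise "state #{state} returned #{result.inspect}"
      end
    end
  end

  private

  # Each state method answers one question and names what comes next.
  def check_method
    %w[GET HEAD].include?(@request[:method]) ? :check_exists : 405
  end

  def check_exists
    @request[:found] ? 200 : 404
  end
end
```

For example, ToyFlow.new({ method: "GET", found: false }).run walks from :check_method to :check_exists and ends at 404. There is no metaprogramming involved beyond send: each state is an ordinary method that can be read and tested on its own.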
That said, others have been working on higher level abstractions on top of Webmachine, ones that include “DSLs”. Whether they will provide more value or simplicity over the existing abstractions Webmachine provides has yet to shake out.
So which should you use?
I think if I were still doing web APIs via Rails or Sinatra, Grape would be an extremely attractive alternative to those, having a lower barrier to entry than Webmachine. It’s a great library and very well written. For an application that exposes very simple semantics, the amount of code you need to write in Grape is small, and you don’t need to have any awareness or understanding of Webmachine’s decision flow, and you can get consumer documentation nearly for free.
On the other hand, I have been just as productive in Webmachine (both Ruby and Erlang) and now that I think more in terms of resources instead of actions, it feels more natural. I want to be able to add those extra semantics just by declaring a few methods, without worrying as much about whether I did it right. I want to avoid the cross-cutting, double-blind mentality of the middleware pattern promoted by Rack.
What next?
What Webmachine has done for the server side, I think we can also do for the client side and for intermediaries (which act as both clients and servers). We can encapsulate the client side of HTTP into an FSM and expose its decisions in a clean way to applications. We can build client and server-side libraries that make working with Hypermedia APIs simpler (Nick’s Roar project is a good start).