REST misunderstandings

My apologies if you’ve been looking for this and found it 404; my hosting provider had an uncharacteristic data loss, and the backups were incomplete. -jsled, 2005-10-06

It’s really unfortunate that the REST community has mis-represented REST so thoroughly.

Mark Baker points to an excellent example, and I offer another: Amazon Web Services talking about SOAP vs. REST.

It’s quite unfortunate when one of the largest “REST-interface” providers doesn’t really understand what it means…

The key issues I have here are the following:

  1. ignorance of the de facto REST-architecture technology.
  2. ignorance of the de facto SOAP-technology architecture.
  3. ignorance of REST’s extensibility model.
  4. ignorance that REST has an even-better-defined-processing-model than SOAP, by definition.

But the real issue here is that the REST camp has seemingly embraced RPC-invocation-over-HTTP as “REST”.

Ignorance of the de facto REST-architecture technology.

The AWS piece seems to imply that HTTP is not the standard technological implementation of REST. C’mon.

Ignorance of the de facto SOAP-technology architecture.

Most SOAP interfaces published on the web are there to support an RPC architecture. Both the “REST” and SOAP AWS interfaces are thoroughly RPC interfaces.

Ignorance of REST’s extensibility model.

Certainly the ability for a transport protocol to negotiate a representation, as well as the architecture’s explicit enumeration of a representation’s media-type and encoding lead to a much more interesting extensibility model than SOAP, but that may be a topic for another article.

Ignorance that REST has an even-better-defined-processing-model than SOAP, by definition.

This is the one that really makes me believe that the AWS team — and, in fact, the majority of people out there — mis-understand REST.

“The central feature that distinguishes the REST architectural style from other network-based styles is its emphasis on a uniform interface between components […]” — REST, emphasis mine

SOAP is primarily a mechanism by which to invoke arbitrary functions on a remote host. WSDL — perhaps the most closely-related SOAP technology — serves to describe these functions.

In principle, there is a strict subset of the use of SOAP whereby the functions are not arbitrary, and are in fact agreed to beforehand. For instance, we might agree to the function-set “GET”, “QUERY”, “SUBSCRIBE”, and “SUBMIT-FORM”. Or maybe “GET”, “POST” or even “GET”, “PUT”, “POST”, “DELETE”. Perhaps “SEND”, “RECEIVE”. But the central feature here is that we have a uniform interface between components.

In any case, it’s incorrect to say that SOAP and REST are basically the same thing. It’s also misleading to say that REST is basically an “introductory” SOAP. They’re really different things, and to indicate otherwise is to not understand the difference.

The REST community, however, should have been making that difference quite clear for a while, now.

SHAREd Services

A brief history, if you will… the last few years have seen a very strong push towards “Web Services” … using XML and HTTP in order to provide application-to-application services. This push has spawned SOAP and XML-RPC, primarily, as ways to do RPC using HTTP as a transport and XML to encode the data. As well, we’ve seen the identification of REST as a description of the architecture of the Web. In the course of the SOAP and general Web Services discussions, a tangible and often contentious “REST vs. RPC” thread emerged.

The WS-* efforts have been well-intentioned, seeking to build upon the basic specs and each other to provide agreement about increasing levels of complexity. Many of them seem to want to solve interesting problems, while many seem like magic pixie dust. But in any case they are complexity. Not necessarily needless complexity, but complexity none the less.

Thus, more recently, there’s been a very strong and understandable backlash against all of this, especially from a set of people looking to provide simple procedural APIs to developers. They don’t need WS-ReliableMessaging, since TCP is reliable enough. They don’t care about WS-Security because HTTPS is there if necessary. WS-Attachments are silly to them, because they already know how to transmit data…

So they reject it all, and use what they know, the tried, true and simple HTTP and XML. But fundamentally, they’re still representing an RPC interface, so it comes out like this:

To create a foo item, call the “createFoo” procedure by doing an HTTP GET of http://api.example.com/services?operation=createFoo&name=My+first+foo&developerId=12345&applicationId=54321 You will get back the XML document “<created />”. Then you can execute the “getFoo” operation with the same name token as the item you created via http://api.example.com/services?operation=getFoo&name=My+first+foo&developerId=12345&applicationId=54321
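To make the shape of such an API concrete, here is a minimal sketch of a client for it. The helper name `share_call_url` is my own invention for illustration; the endpoint, operation names and parameters are the ones from the example above. The thing to notice is that the operation name is knowledge baked into the client and passed along as just another query parameter.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = "http://api.example.com/services"  # endpoint taken from the example above


def share_call_url(operation, **params):
    """Build the URL for one RPC-style call (hypothetical helper).

    The operation name travels as an ordinary query parameter, so every
    procedure is addressed through the same resource.
    """
    query = {"operation": operation, **params}
    return BASE_URL + "?" + urlencode(query)


create_url = share_call_url(
    "createFoo", name="My first foo", developerId=12345, applicationId=54321
)
get_url = share_call_url(
    "getFoo", name="My first foo", developerId=12345, applicationId=54321
)

# Both calls address the same path; only the query string tells them apart.
same_resource = urlsplit(create_url).path == urlsplit(get_url).path
```

Here `same_resource` comes out true: "createFoo" and "getFoo" are not resources at all, just procedure names the client had to know in advance.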

These APIs are nice and simple, and quite welcome; as such, developers eat them up. Everyone everywhere has a simple HTTP client library, and it’s often part of the basic development platform. There are hundreds of XML APIs that are all different from each other, and thus all work and suit individual developers’ tastes. It’s not SOAP and it’s not XML-RPC … it uses HTTP and XML … it must be REST. We now have a RESTful API!

Wrong.

REST does not mean “do RPC over HTTP simply”. There are many other factors involved in its description and execution. Using hypermedia as the engine of application state is an important one. This means both identifying “hard” resources within the application such as a news article at /articles/12345 and application states such as /articles/createNew.cgi. REST is heavily intertwingled with well-known representation formats like HTML and its forms; it is described well by web browsers that understand what to do when you click on a <input type="submit"...>.

Another important property of REST is the uniform interface — that we only need to agree to a small and uniform set of operations a-priori … we don’t need an IDL, per se, since there isn’t a novel interface to describe. As the client application, I know that you support GET, and I can use it to safely retrieve a representation of the current state. If you, the server, want me to POST something there’ll be a form that describes what I the client need to do – what fields are necessary and where to POST or GET and in what encoding and … – in order to accomplish some operation. Everything necessary for the application to submit that processing request back to the server and have it be understood will be present in said form.

It is important, for instance, that the server is branching its processing based on a token that the server just provided along with the form, rather than an operation name that the client knows to provide to the server to invoke the operation. To a large extent and in terms of what has made the web so successful, it is vitally important that the client not understand the semantics of the form at all.
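A rough sketch of what "the form tells the client what to do" might look like in code follows. Everything in it (the `FormReader` class, the `build_submission` helper, the sample page and its `token` field) is hypothetical and only meant to illustrate the idea: the client echoes back whatever the server put in the form, fills in the user's values, and never needs to know an operation name.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class FormReader(HTMLParser):
    """Collect the action, method and named inputs of a page assumed to hold one form."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = "get"
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attrs.get("action", "")
            self.method = attrs.get("method", "get").lower()
        elif tag == "input" and "name" in attrs:
            # Keep server-supplied defaults, including hidden tokens.
            self.fields[attrs["name"]] = attrs.get("value", "")


def build_submission(page_url, html, user_values):
    """Return (method, target URL, encoded body) exactly as the form dictates."""
    reader = FormReader()
    reader.feed(html)
    data = {**reader.fields, **user_values}
    return reader.method, urljoin(page_url, reader.action), urlencode(data)


# A made-up page; the hidden token is something the server just handed out.
PAGE = """
<form action="/articles/createNew.cgi" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="title">
  <input type="submit">
</form>
"""

method, target, body = build_submission(
    "http://example.com/articles/", PAGE, {"title": "Hello"}
)
```

The client code above contains no mention of "create an article"; where to submit, with which verb and with which token all come from the representation the server sent.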

REST describes all of that, and more. Calling these APIs RESTful is unfair to REST and to these APIs. It makes it hard[er] to discuss REST, since now REST has its original sense plus a “simple RPC over HTTP” sense. As well, it makes it harder to comprehend these “RESTful” APIs, as they should be expected to exhibit all of the constraints that REST sets forth, yet they don’t necessarily.

As such, we need an alternate name for these APIs. I believe a fair label is SHARE: Simple HTTP API of RPC and Encoded data.

I think this name captures the most important aspects of these APIs: their simplicity, use of HTTP, use of the RPC paradigm and trafficking in encoded data. (Okay, the “encoded data” part is silly; what is unencoded data, after all? But it does make the acronym work, so c’mon.) More importantly, it captures that they exist in order to share functionality with other applications or systems.

It is my hope that distinguishing between these two concepts – REST and SHARE – helps allow people to develop both. We have a lot of experience creating procedural interfaces and systems, but the web is demonstrably better at scaling, is more generally useful, &c. I hope that we can find ways to build services and interfaces that are as simple as the procedural ones we’ve come to know and love, but as powerful, flexible and scalable as the web. But we’re not going to do that without identifying which is which appropriately.

RE: REST Design Question #5

Wherein I respond to Dave Megginson’s last REST design question.

Dave’s strawman notwithstanding, there’s nothing necessarily unRESTful with using XML-RPC-encoding, SOAP’s rpc/encoded, RDF, TM, “doc/literal” domain-specific XML as the representation encoding. I’d argue that … in terms of RESTfulness … the most transparent, visible, unambiguous and resilient formats are the best, but maybe that’s just me … a quick re-reading of Dr. Fielding’s dissertation about representations doesn’t constrain the formats.

As well, I would agree that a bottom-up standardization of useful literal-XML patterns is a good thing; anything to save us from more data- and content-type XML encoding; as Tim Bray said recently, “[…] it would appear that it’s more valuable to know what something is called than to know what data type it is. That’s an interesting lesson.” Don’t get me wrong, being able to trivially [un]serialize data-structures on the wire is useful. It just may not be the best way to go about building internet-scale services. What if your language doesn’t have unsigned types, for instance? Or you try to serialize to me a super-complex data-structure that my platform simply has no implementation of? I believe it’s better to let [the client] handle how I’m going to store data, and let us exchange literal- rather than encoded- content.

In any case, I would argue (and I have been increasingly, recently) that “REST” is the wrong marketing word for what you’re really talking about. I believe what you’re calling “Amazon, EBay’s and Flickr’s ‘RESTful’ Web Service APIs” are in fact not [necessarily] RESTful … I think it’s better to characterize them as SHARE — Simple HTTP API via RPC and Encoded data.

So, yeah, REST continues to be an architectural style. For a variety of reasons, the word REST has been co-opted to describe not-necessarily-RESTful things … SHAREd and SHARE-friendly APIs, specifically. Please join me in tilting against windmills^W^W^Wtrying to call things by reasonable names.


So, now, maybe Dave will entertain a question I have.

RESTafarians boast that there are RESTful web applications already online for Amazon, eBay, Flickr, and many others, but developers quickly figure out that they don’t get any benefit: each REST application requires its own separate stovepipe of code support right from the ground up, because they all use different content formats. If these all used XML-RPC or SOAP, there would be many standard libraries to simplify the developers’ work, and a lot of shared code that could work with all these sites.

Is the bit that’s shared across all these sites the XML <-> object mapping only, or something more? What else are you thinking of? For bonus points, why are XML-RPC or SOAP required for those things to be librarized?

RE: REST Design Question #4, part 2

In response to Dave’s update…

Responding to the RDF bits, I guess RDF/XML makes what you’re trying to express a bit more verbose, but it’s hard to call it difficult. It’d end up being…

<director rdf:about="12345.html"><name>John Hughes</name></director>

…with ‘director’ as an rdf:type, and ‘name’ being the property/relation to the subject “12345.html”; or, in Turtle/N3…

<12345.html> a :director; :name "John Hughes".

I’d recently worked out a different XML serialization — http://asynchronous.org/rx/ — for RDF that is a little more concise for data that’s mostly properties, and not so much about rdf:types. It also does not allow mixed-content, but it doesn’t require node-typing and thus element-striping; it would allow you to say…

John Hughes

… though that would result in the Turtle…

<12345.html> :director "John Hughes".

Frankly, I like the version with “name” involved better, but it’s your data-model. :)


In any case, I see a bit more how this is specifically about REST, but I’d still argue that it’s more about change-handling in coarse-grained API design. If in a [local] procedural approach you may have…

Film.setName( String newName ); Film.setYear( int newYear );

… the argument goes that in a remote-procedural environment you should really, and only, just have …

updateFilm( Film newFilmData );

… to cut down on network round-trips, update overhead, synchronization/coherency concerns, &c.
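The round-trip argument can be sketched with a toy stand-in for the remote service. The `CountingFilmService` class and the film values below are made up for illustration; each method call is simply counted as one trip over the network.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Film:
    name: str
    year: int


class CountingFilmService:
    """Stand-in for a remote service; every method call counts as one round trip."""

    def __init__(self, film):
        self.film = film
        self.round_trips = 0

    # Fine-grained, setter-style interface.
    def set_name(self, new_name):
        self.round_trips += 1
        self.film = replace(self.film, name=new_name)

    def set_year(self, new_year):
        self.round_trips += 1
        self.film = replace(self.film, year=new_year)

    # Coarse-grained interface: one call carries the whole new state.
    def update_film(self, new_film):
        self.round_trips += 1
        self.film = new_film


fine = CountingFilmService(Film("Draft Title", 1984))
fine.set_name("Final Title")
fine.set_year(1985)

coarse = CountingFilmService(Film("Draft Title", 1984))
coarse.update_film(Film("Final Title", 1985))
```

Both services end up holding the same film, but the setter version took two trips and passed through an intermediate state (new name, old year) that the coarse version never exposes.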

So there is an interesting issue about how much of a representation (either the whole representation or just the changed aspects) needs to be given back to the server. I guess it’s really a function of the server’s capabilities, as well as an assessment of the client’s capabilities, as well as what’s reasonable for the application/API.

Back to a “RESTful” worldview, one way to go is simply say “PUT back the whole representation”, which as you’re saying can be weird. Reasonably, the server may ignore [either silently or loudly] changes to some fields. An optimization of this may be to use something like RFC 3229 to be more … um … optimal.

Another way to go would be to put something inline in the representation (either a form or an indication that a certain convention is supported) that effectively says to the client “hey, you only need to send changed fields on update” or “here are the change-allowed fields” or something.
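As a sketch of the second approach, suppose the representation came with a list of change-allowed fields. The function name `changed_fields` and the sample data are assumptions of mine; the point is only that the client can compute what to send from what the server declared.

```python
def changed_fields(original, edited, allowed):
    """Return only the fields that differ and that the server said may change."""
    return {
        key: value
        for key, value in edited.items()
        if key in allowed and original.get(key) != value
    }


# What the client originally retrieved, and the user's edited copy.
original = {"id": "12345", "name": "Old Name", "year": 1985}
edited = {"id": "99999", "name": "New Name", "year": 1985}

# Hypothetical hint carried inline in the representation.
allowed = {"name", "year"}

update = changed_fields(original, edited, allowed)
```

Here `update` holds just the new name: the year is unchanged, and the edit to `id` is dropped because the server never offered it as changeable.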

I’d consider these both RESTful, since both are talking about a constrained interface, placing the semantics in the media-type and document content, and using hypermedia as the engine of application state. The former certainly implies more server- and client-side smarts, and the latter is certainly what the web does. I guess if I were trying to be pedantic, I would call the latter more RESTful, since the server is directing the client about how to proceed …

… but this is also the point I start to get into trouble about RESTful API design. In traditional RPC/RMI environments, the client is built with a specific understanding and knowledge of the API space. In the web, a human sits just past the client-side UserAgent, and interprets the generic HTML and forms that come back. I believe there is an intersection of those two, but I’m having trouble finding it.

RE: REST Design Question #4

Dave Megginson’s 4th REST Design Question

  • I don’t see how this is a REST design question in particular.
  • I think questions and concerns like this are why a number of RESTafarians are also RDF-heads.

I think the ultimate response to this question is: Do The Right Thing for your clients. If you anticipate them needing to see “John Hughes” in order to make a determination about whether to traverse the associated subject-URI^W^W^Wxlink:href, then do so. But it’s just generic information- and API-design questions at this point.

REST Design questions 2 and 3

Dave Megginson continues with his REST design questions. Herein, I respond:

Question #2

Yup, I’ve used query-parameters in the way you describe to parameterize a large resource space. I don’t consider it particularly inelegant or ugly, but maybe that’s a matter of perspective…

I believe that retrieving the “container” resource should respond with a form, indicating something like … “the resource is large, and supports paging, and the following values can be used to parameterize the resource-space:

  • offset: non-negative integer
  • count: bounded integer: 0..50
  • sortCode: enumeration {
      ‘name’: ‘sort on object’s name’
      ‘size’: ‘sort on object’s size’
      …}
  • sortAscending: boolean”
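On the server side, the constraints such a form advertises translate fairly directly into validation. This is only a sketch: the `parse_paging` name and the default values are my assumptions, and the sort codes are limited to the two spelled out above even though the enumeration is open-ended.

```python
SORT_CODES = {"name", "size"}  # only the two codes named above; the real list is longer
MAX_COUNT = 50


def parse_paging(query):
    """Validate raw query-string values against the advertised constraints.

    `query` maps parameter names to strings; the defaults are assumptions.
    """
    offset = int(query.get("offset", "0"))
    count = int(query.get("count", str(MAX_COUNT)))
    sort_code = query.get("sortCode", "name")
    ascending = query.get("sortAscending", "true")

    if offset < 0:
        raise ValueError("offset must be a non-negative integer")
    if not 0 <= count <= MAX_COUNT:
        raise ValueError("count must be between 0 and 50")
    if sort_code not in SORT_CODES:
        raise ValueError("unknown sortCode")
    if ascending not in ("true", "false"):
        raise ValueError("sortAscending must be 'true' or 'false'")

    return {
        "offset": offset,
        "count": count,
        "sortCode": sort_code,
        "sortAscending": ascending == "true",
    }


page = parse_paging({"offset": "100", "count": "25", "sortCode": "size"})
```

A request such as `count=51` or `sortCode=colour` is rejected with a `ValueError`, which is the "loud" way of telling the client it stepped outside what the form offered.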

As HTML has its forms, we probably want media-types tailored to machines … either a new level of user-agent mediating for the user, or an autonomous one.

One of the more annoying issues I’ve run into with this is documentation of the cross-parameter interaction of complex query-spaces. Once things become a flattened name-value-pair list, I don’t believe you actually lose representational complexity (vs. submitting, say, a QueryParam struct on a more traditional procedural API), but you definitely lose representational ease. You may have to do things like lexically-structure your parameter-names, or disambiguate them. That bit is pretty inelegant, true.

Question #3

A link means whatever is specified by the format defining the context in which the link appears. For instance, html:a@href roughly means “this is the location that should be traversed to in the user agent when the contained object is clicked on”, or something. In the @xmlns case, yes, it does not mean what html:a@href means, but why should it?

And, yes, while the URIs which are used for most xmlnses don’t resolve, this is a bad thing, specifically for the reasons you’re getting at: HTTP URLs should resolve. This is especially true for namespaces, IMHO, because more often than not the semantics are implicit and unknown, unlike HTML where they’re explicit and well-known.

Yes, there’s some nice ideal world where a spider can simply GET deeper and deeper into information resources. And in a RESTful web, that may even be very true. But it’s a bit naive to believe that at some depth the crawler doesn’t need to understand at least a bit of the document semantics; in fact, an important part of Fielding’s dissertation is that the HTML semantics are wide-spread and well-known.

Re: REST Design Questions

Dave Megginson has posted some questions about his understanding of REST.

First off, please do look beyond the “there are only 4 [HTTP] verbs” aspect of REST. “Hypermedia as the engine of application state” is hand-in-hand with Constrained Interface as being fundamental to the RESTful web. As well, the GET/POST distinction is critically more important than any other aspect of constrained interface, I believe.

Back to question 1 … yes, the identifier does need to be recorded. This is the client’s responsibility, though a good server would help. That is: if the usage of the protocol is embracing REST, then each message would already come back with the identifier of the resource baked in. In any case, I think there’s a fallacy that the resource representation alone is or should be solely sufficient to describe the resource … if that was the case, why does HTTP need headers at all!? Certainly, there is meta-data that the client must be responsible for.

On the client side, I think your options are: keep the data external to the representation, or keep it within the representation. Since you’re assuming XML documents, I recommend that you use its extensibility, maybe via a namespaced attribute on the root level. How about thisDocumentWasHttpRetreived:from="http://.../", or maybe rdf:about?. I’ve also used [eg:]href, which works well; you may like xlink:href, though.
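Here is a small sketch of the "keep it within the representation" option, using xlink:href on the root element. The helper names `tag_with_source` and `source_of`, the sample document and the example URL are all made up; only the XLink namespace URI is the real one.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK_NS)  # serialize with the usual "xlink" prefix


def tag_with_source(xml_text, source_url):
    """Record where a document was retrieved from as xlink:href on its root element."""
    root = ET.fromstring(xml_text)
    root.set(f"{{{XLINK_NS}}}href", source_url)
    return ET.tostring(root, encoding="unicode")


def source_of(xml_text):
    """Read the recorded identifier back, or None if the document was never tagged."""
    return ET.fromstring(xml_text).get(f"{{{XLINK_NS}}}href")


# Hypothetical document and the URL the client fetched it from.
fetched = "<film><name>Example</name></film>"
tagged = tag_with_source(fetched, "http://example.com/films/12345")
```

Because the attribute lives in its own namespace, a consumer that doesn't care about it can ignore it, and the client can later PUT the document back to wherever `source_of(tagged)` says it came from.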

Regarding identifiers and locations … it’s nice to know the identity of things; it’s also nice to have locations of things. But it’s really nice when your Uniform Resource {Identification,Location} scheme supports redirection, such as HTTP[s] does.

Amazon’s RESTless WS API

Steve Maine’s Isomorphism post is really good until the end, where he tries to make a point about Amazon.com’s “touted” RESTful web service API.

The REST crowd touts Amazon’s human-focused service as a REST example, not the WS API.

Only Amazon themselves claim that their non-SOAP API is RESTful, and I believe it’s because they’re confused. As well, there’s not a marketing term for “simply using HTTP query strings to invoke RPC which returns XML without doing SOAP or XML-RPC”. Thus, everyone calls it REST, even though it isn’t, necessarily.

So, the reason that <wsa:Action>urn:GetCart</wsa:Action> and http://webservices.amazon.com/onca/xml?Operation=CartGet[...] are isomorphic is because they trivially are. They’re both RPC.

Re-creating the same thing in a purely RESTful way would not look isomorphic, depending on how much of HTTP you want to exploit. I’ll leave that as an exercise for the reader, for the moment. As well, Joe Gregorio has been developing that idea out.

SOA and web services

The following is a message to the service-orientated-architecture yahoo group. It’s in response to a thread about standardized interfaces and protocol semantics. I don’t think it brings anything new to the table except my own understanding of some issues, but – especially since Yahoo Group’s search interface is so bad – I’m going to post it here in case other people find it useful.

On Mon, 2005-01-10 at 19:14, Spork, Murray wrote:

Should we add new verbs that incorporate (for example) complex transactional semantics – or should we instead refine the current verbs through other mechanisms. For example – by layering other protocols on top of HTTP. If we do the later then I strongly agree with Mark – the semantics of the verbs in the higher level protocol should not contradict the verbs in HTTP.

Yes.

I think what is happening here is that the HTTP verbs are being implicitly refined by attaching some semantic behavior to the resource – in this case the verb PUT is refined by stating that a particular resource is of type “order-acceptor” – in this way a client (that understands what an order-acceptor is) can infer that PUTing an order to this resource will cause a order to be placed.

IMO verbs are all about defining side-effects.

Yes. The hard assurances, side-effects and basic semantics are defined by the verbs, and refined, as well as contextualized, by the type of the data being operated over.

It seems like all the effort that’s gone into WS-* in the last few years could have instead been directed at determining new verbs, collections of verb-profiles and richer media-types. Somehow, a lot of MEPs and the yet-another-RPC mechanism don’t seem as useful… I’m not exactly sure why not, though.

nor that HTTP is sufficiently flexible for allowing the incorporation of new verbs.

Hmm… is it even possible to have a protocol that both specifies operational semantics and does not do so … and still be efficient?

I’m not sure I understand your question – could you elaborate?

Not well enough, as it’s a poorly formed question: “efficient” is too loose and relative of a word. But that won’t stop me from trying :) …

Especially for the “does not do so” part, I guess I’m thinking of a protocol whereby the operation isn’t specified, so much as the semantics of the unspecified operation being performed. I guess somewhere between HTTP and RPC; one would make an request^Winvocation of the form:

[ a :Invocation ;
  :informativeOperationName "get" ;
  :idempotent "true" ;
  :safe "true" ;
  :cacheControl [ :cacheable "true" ;
                  :cacheStateTag "12345" ;
                  :expirationDate "2005.01.12T10:19:00Z" ] ].

[For the “specifies” part I was thinking of something like HTTP, where the above properties for a handful of operations are specified in the … uh … specification. But that’s not important once you have the above.]
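To contrast the two placements of semantics, here is a rough sketch. The table records the safe/idempotent properties that the HTTP specification assigns to a handful of methods; the `Invocation` class and the two intermediary helpers are hypothetical and stand in for a message that declares those properties itself.

```python
from dataclasses import dataclass

# "Specifies": (safe, idempotent) per method, as fixed by the HTTP specification.
HTTP_METHOD_SEMANTICS = {
    "GET": (True, True),
    "HEAD": (True, True),
    "PUT": (False, True),
    "DELETE": (False, True),
    "POST": (False, False),
}


# "Does not do so": the operation name is merely informative, and the
# message itself declares the properties (hypothetical structure).
@dataclass
class Invocation:
    informative_operation_name: str
    safe: bool
    idempotent: bool
    cacheable: bool


def intermediary_may_retry(invocation):
    """An idempotent invocation can be repeated without changing the outcome."""
    return invocation.idempotent


def intermediary_may_cache(invocation):
    """Only cache results of invocations declared both safe and cacheable."""
    return invocation.safe and invocation.cacheable


get_like = Invocation("get", safe=True, idempotent=True, cacheable=True)
order_like = Invocation("placeOrder", safe=False, idempotent=False, cacheable=False)
```

With the table, an intermediary needs only the method token to know how to behave; with the self-describing form it has to read these properties out of every message.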

But what I’ve specified is basically what’s become of SOAP and WS-*, except that, for the most part, the transfer semantics aren’t explicitly specified in the content of the messages, but are implicit in the namespaces and qualified-names present in the data being exchanged.

I guess that it is the obvious and direct consequence of the implementation of SOAP’s design… the quest for “protocol independence” leads to trivial use of a transport protocol that is basically: open connection and execute “Process Message”. Since all the semantics are defined in the data, there is no need for protocol-level verbs. As well, it actively seeks to ignore the semantics defined by any protocol it “rides”. Thus, an application protocol like HTTP is converted into transport.

As well, answers about “efficiency” are now more clear…

  1. no, it can’t be as efficient (as HTTP), since intermediaries need to do message inspection. A lot of message inspection.

  2. as well, it’s hard to see how one even describes the full semantics of these messages in a machine-processable way…

  3. but, it can be quite flexible, which further reduces intermediaries’ capabilities.

  4. the underlying protocol must be completely ignored to prevent conflict with higher-level semantics.

It continues to sound nothing like the web. I guess it could be used to implement services.

“How to create a REST Protocol”

xml.com just posted a nifty article by Joe Gregorio titled How to Create a REST Protocol.

Right alongside it are articles from Bob DuCharme on using XSLT to access [RESTful] web services and Mike Dierken on a RESTful analysis of various web services.

All quite interesting reading.