WebDAV vs HTTP method semantics

Roy T. Fielding fielding@ebuilt.com
Mon, 27 Aug 2001 01:40:33 -0700


On Sat, Aug 25, 2001 at 11:26:57AM -0400, Mark Baker wrote:
> 
> My hypothesis is that, at least in theory, all operations with side
> effects on resources can be represented with only PUT or POST.  Here's
> an attempt to recast LOCK, COPY, and MOVE in that way.

Yes, they can, but it isn't a good idea to do so.

The REST style doesn't suggest that limiting the set of methods is a
desirable goal.  What it does suggest is that an architecture optimized
for the common case produces greater results when that common case is
very general, because applications that conform to the general case
then benefit from the optimizations and from deployment via the
network effect.

That is where tuple-spaces and REST are similar -- they both define an
interaction style that places genericity in the forefront in order to
obtain better optimization.  However, for tuple-spaces (like Linda), the
target architectural domain is that of blackboard styles, so it makes
sense that the read and write actions were relatively equivalent operations
with a very limited set of provable semantics.

That is where the similarity between Linda's tuple-space and REST ends.
REST does not restrict the system to a limited set of methods -- what it
does is restrict the interface such that the methods are easily identifiable,
and then only attempts to optimize those that are known to be the common
case -- GET, and to a lesser extent the data-provider form of POST.
Applications that need the benefits of the common case will gravitate
toward the generic methods of their own accord.

In particular, REST encourages the creation of new methods for obscure
operations, specifically because we don't want to burden common methods
with all of the logic of trying to figure out whether or not a particular
operation fits the 99.9% case or one of the others that make up 0.1%.

Doing everything with a common method means that intermediaries must
look to other parts of the message to differentiate semantics, which
causes intermediaries to go through an excruciating amount of effort
in the common case just to find out what operation to perform if a
message happens to be the uncommon case.
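
A rough sketch of that difference, from an intermediary's point of view.
The function names and policy labels are invented for illustration, and
the XML body format is the one proposed in the quoted message, not
anything HTTP or WebDAV defines.

```python
import xml.etree.ElementTree as ET

SAFE_METHODS = {"GET", "HEAD"}

def classify_by_method(method: str) -> str:
    """With distinct methods, the request line alone selects the policy."""
    return "cacheable" if method in SAFE_METHODS else "forward"

def classify_overloaded_post(method: str, body: bytes) -> str:
    """If everything is tunnelled through POST, the body must be parsed
    on every POST just to learn which operation was meant."""
    if method != "POST":
        return classify_by_method(method)
    root = ET.fromstring(body)              # full XML parse of the entity
    tag = root.tag.rsplit("}", 1)[-1]       # drop the "{namespace}" prefix
    known = {"lock": "lock", "copy": "copy", "move": "move"}
    return known.get(tag, "forward")
```

The first function never touches the entity body; the second has to
buffer and parse it for every POST, including the ordinary ones.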

> LOCK's side effect is to change the state of the resource identified by
> the Request-URI to one of locked (per WebDAV lock semantics).  It does
> this relative to the current state of the resource, not to the exclusion
> of it (duh), so this should use POST not PUT.  But how do we use POST?
> We POST a lock entity, describing the type of lock.
> 
> e.g.
> 
> <?xml version="1.0"?>
> <lock xmlns="http://www.ietf.org/rfc/rfc2518.txt" />
> 
> The use of an XML namespace being the extensibility mechanism I mentioned
> above.  Content-Type on the POST wouldn't have been appropriate because it
> is an attribute of the representation, not the resource itself.  The
> namespace describes the resource.

Nope, in that case you might as well just use XML-RPC.  There is another
way to tackle the same problem that is closer to REST's type of genericity.

Instead, consider a lock to be another resource with a special relationship
between itself and the resource to which you wish to deny others access.
What is the state of that lock resource?  How may different values of
that state influence temporary access control to the other resource(s)?
There already exists a large amount of research that defines how management
of one variable can determine the coordination of access for some other
variable(s).  And it is very simple to translate that into HTTP, once you
separate actions on the lock resource from actions on the resource to be
locked.
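
As a minimal sketch of that separation, assuming an in-memory toy
server: the class, the link() helper, and the URIs are all hypothetical,
and the rule "the lock's state names its holder" is just one simple
choice among the many the research offers.

```python
class Store:
    """Toy origin server: every URI, including a lock, is just a resource."""

    def __init__(self):
        self.state = {}      # URI -> current state of that resource
        self.lock_of = {}    # target URI -> URI of its lock resource

    def link(self, target, lock_uri):
        """Establish the lock relationship (hypothetical stand-in for LINK)."""
        self.lock_of[target] = lock_uri

    def put(self, uri, value, user):
        """Store value at uri unless a related lock names another holder."""
        lock_uri = self.lock_of.get(uri)
        holder = self.state.get(lock_uri) if lock_uri else None
        if holder is not None and holder != user:
            return 423  # Locked, the status code WebDAV defines
        created = uri not in self.state
        self.state[uri] = value
        return 201 if created else 204
```

Taking the lock is then a PUT of the holder's name to the lock resource,
and releasing it is a PUT of an empty state; the locked resource itself
needs no special method.  Access control on the lock resource is left
out of this sketch.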

The actions for establishing and discovering the lock relationship, which
are very generic operations themselves, were at one point called LINK and
HEAD.  Nobody implemented LINK, and it turned out that too much metadata
in the headers was bad for latency, so now we have properties.

> COPY is a means of taking the state of one resource, and creating a new
> resource with that state.  This can be done with PUT.  2616, 9.6 describes
> exactly this scenario;
> 
>    The PUT method requests that the enclosed entity be stored under the
>    supplied Request-URI. If the Request-URI refers to an already
>    existing resource, the enclosed entity SHOULD be considered as a
>    modified version of the one residing on the origin server. If the
>    Request-URI does not point to an existing resource, and that URI is
>    capable of being defined as a new resource by the requesting user
>    agent, the origin server can create the resource with that URI. If a
>    new resource is created, the origin server MUST inform the user agent
>    via the 201 (Created) response.
> 
> So COPY would be GET+PUT.  The downside there being the extra hop, but
> I think that is more than compensated for by the simplicity.  Plus,

No, unfortunately, it isn't compensated by the simplicity.  For example,
let's say the owner of the resource pays for Internet access at $1/MB
transferred.  That owner happens to be distributing a very large collection
of images that rotate through a relatively small number of names.  For
security reasons, the owner is only given HTTP+DAV access to the site.
Which style of operation would that owner prefer?
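
To put rough numbers on it: only the $1/MB price comes from the example
above.  The 500 MB collection size, the 0.001 MB request overhead, and
the assumption that both the GET response and the PUT request are
billed are illustrative guesses.

```python
PRICE_PER_MB = 1.00  # dollars per MB transferred, from the example

def get_put_cost(size_mb: float) -> float:
    """GET+PUT moves every byte twice: once down, once back up."""
    return 2 * size_mb * PRICE_PER_MB

def copy_cost(request_overhead_mb: float = 0.001) -> float:
    """A server-side COPY sends only a small request; the data stays put."""
    return request_overhead_mb * PRICE_PER_MB

collection_mb = 500.0  # assumed size of the image collection
print(f"GET+PUT: ${get_put_cost(collection_mb):.2f}")
print(f"COPY:    ${copy_cost():.3f}")
```

Under those assumptions each rotation of the collection costs about
$1000 by GET+PUT and a fraction of a cent by COPY.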

> as I'm thinking about it now, making an atomic operation like COPY or
> MOVE between two distributed resources seems like it's breaking REST's
> concept of moving state to the user agent, plus it seems to be making
> distribution a lot more transparent than HTTP has.  Hmm..

Huh?  REST talks about a user's application state as being best left to
the user agent for a variety of reasons, where "application" refers to
a fairly vague concept from the middleware research, but not so vague
in practice (i.e., the thing the user wants to do, the purpose for which
the system is being operated, the point within a given workflow at
which the user currently resides, etc.).  But the "state of the world"
still needs to be kept out in the world.  The origin server must know
its own state, and namespace operations are a way for a remote user to
manipulate that state.

> I believe MOVE would simply be COPY+DELETE, aka GET+PUT+DELETE.

Think of it another way.  The URI namespace consists of a hierarchy
of names (collections).  COPY and MOVE semantics are not really operations
on the target of the COPY and MOVE -- in fact, they are operations on the
parent collections that "own" the origin and destination namespaces.
The set of names within a collection is the state of that collection
as a resource.  If you want to make these operations more REST-like,
then define a suitable representation of a collection such that each
namespace can be GET-retrieved, manipulated at the user agent, and then
have the result of those manipulations communicated to the two namespaces
in such a way that they can achieve the new state, barring conflicts,
without unnecessary transfers across the network.  This one isn't easy.
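
One very simplified way to picture it, assuming a collection can be
represented as a mapping from names to opaque content identifiers.  That
representation, the identifiers, and plan_move() are all hypothetical;
how the two new states would actually be communicated and reconciled is
the hard part and is not shown.

```python
def plan_move(src: dict, dst: dict, name: str, new_name: str):
    """Compute new states for two collections at the user agent.

    Each collection maps a name to an opaque content identifier, so the
    plan refers to existing content without carrying its bytes.
    """
    if name not in src:
        raise KeyError(f"{name!r} is not a member of the source collection")
    if new_name in dst:
        raise ValueError(f"{new_name!r} already exists in the destination")
    new_src = {k: v for k, v in src.items() if k != name}  # name removed
    new_dst = dict(dst)
    new_dst[new_name] = src[name]                          # same content id
    return new_src, new_dst

# Representations as they might look after a GET on each collection.
images = {"today.gif": "content-17", "logo.gif": "content-02"}
archive = {}
new_images, new_archive = plan_move(images, archive, "today.gif", "mon.gif")
```

A COPY would be the same plan without removing the name from the source.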

WebDAV did not go with that design for a number of reasons, which I am
sure Jim could describe better than I could.

....Roy