[Fwd: RE: XML vs HTTP]

From: Stephen D. Williams (sdw@lig.net)
Date: Tue Jan 02 2001 - 16:11:12 PST


Arguments related to this are breaking out all over! ;-)

sdw
-------- Original Message --------
Subject: RE: XML vs HTTP
Date: Tue, 02 Jan 2001 12:13:03 -0700
From: Mike Brown <mbrown@corp.webb.net>
To: "'xml-dev@lists.xml.org'" <xml-dev@lists.xml.org>

Mark Baker wrote:
> I would take issue with your conclusion that POST makes the
> best "request method". In HTTP, GET is the *only*
> "request method"; i.e. the only method with any semantics
> that relate to requesting information.

I disagree. Request != Retrieve.

I am using mostly the same terminology as the HTTP/1.1 specification. See
http://www.faqs.org/rfcs/rfc2616.html section 5. The idea is that you are
requesting that the server do something with respect to a resource. What to
do is implied by the request "method" (not exactly the best term, but we're
stuck with it), which is one of: OPTIONS, GET, HEAD, POST, PUT, DELETE,
TRACE, CONNECT, or a server-specific extension method. POST in particular
means to "apply the supplied content as a subordinate of the identified
resource".

> For example;
>
> - content negotiation, a very request-specific feature, is
> only defined over GET, not POST

I hadn't considered content negotiation w.r.t. transmitting XML. I was
thinking of the more common case where the receiver would only be accepting
whatever comes in, fishing the XML out of it, and disposing of the
entity(-ies) appropriately.

I still disagree with your position. Content negotiation involves the client
telling the server its preferences for how the response should be
represented. While it is traditionally used with GET so that the server can
identify which language version of the identified resource to return, the
HTTP spec actually does not define it over GET exclusively. Section 9.2
suggests a way it could be used with OPTIONS, and section 12 defines content
negotiation very generically: "Any response containing an entity-body MAY be
subject to negotiation, including error responses."
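
As a rough illustration of how generic this is, here is a minimal Python
sketch of server-side negotiation driven only by the Accept header. Nothing
in it looks at the request method, so it applies equally to the response to
a GET or a POST. The function names and the simplified matching rules are
assumptions made for the example, not anything taken from the spec text.

```python
def parse_accept(header):
    """Split an Accept header into (media_range, q) pairs."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0  # q defaults to 1 when absent
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        prefs.append((media_range, q))
    return prefs


def specificity(media_range):
    """Rank ranges: */* < type/* < type/subtype."""
    if media_range == "*/*":
        return 0
    if media_range.endswith("/*"):
        return 1
    return 2


def matches(media_range, media_type):
    """True if media_type falls within media_range."""
    if media_range == "*/*":
        return True
    r_type, _, r_sub = media_range.partition("/")
    m_type, _, m_sub = media_type.partition("/")
    return r_type == m_type and r_sub in ("*", m_sub)


def negotiate(accept_header, available):
    """Pick the available media type the client prefers, or None."""
    prefs = parse_accept(accept_header)
    best, best_q = None, 0.0
    for media_type in available:
        matching = [(specificity(r), q) for r, q in prefs
                    if matches(r, media_type)]
        if not matching:
            continue
        q = max(matching)[1]  # q of the most specific matching range
        if q > best_q:
            best, best_q = media_type, q
    return best


# The server can produce either representation of the response entity.
choice = negotiate("text/html;q=0.5, text/xml", ["text/html", "text/xml"])
```

The same negotiate() call could be used to choose the format of an error
response, which is the case section 12 mentions explicitly.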

> - GET is side-effect free, as requests should be. POST isn't.

I can think of a million HTML forms that use GET to send data, URL-encoded,
embedded in the URI. This is certainly not side-effect free. The only
differences are that GET does not involve sending an entity in the body of
the request, and that GET means retrieve the identified resource while POST
means apply the entity in the body of the request as a subordinate of the
identified resource.
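
To make the mechanical difference concrete, here is a small Python sketch
that builds the same form submission both ways without sending anything.
The host, path, and field names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form fields, as an HTML form might submit them.
fields = {"item": "widget", "qty": "2"}
encoded = urlencode(fields)

# method="get" form: the data rides in the query part of the URI.
get_req = Request("http://example.com/order?" + encoded)

# method="post" form: the same bytes travel as the request entity.
post_req = Request(
    "http://example.com/order",
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.full_url, post_req.data)
```

Whether either request has side effects is decided by what the server does
with the data, not by which of the two shapes the client used.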

> - GET is safe. Nobody can claim that any GET request of mine
> meant that I wanted to order their product.

Again, there are many examples of people using GET to send form data, for
all kinds of purposes. For most people the only difference between GET and
POST is whether the form data appears in the web browser's URL box.

What my paper discusses is that it is the *HTML* spec that says what form
data is and how it is encoded in a GET or POST. Yet "generic" HTTP
applications go around offering interfaces to this data, calling it
"parameters" and allowing it to be exposed and processed in ways that are
unsafe for anything other than pure ASCII form data.

When you try to send entire documents as form data, embedding what is
essentially arbitrary binary data into a form field's value, you risk unsafe
transport. This is not a new issue at all; it is precisely why Netscrape
came up with the multipart/form-data version of POST, so that people could
safely do "file uploads" through HTML forms.
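
One way the risk can show up, sketched in Python: URL-encoded form data
carries no record of which character encoding produced the escaped bytes, so
a receiver that guesses differently from the sender corrupts the document.
The sample document and the choice of encodings are assumptions for the
illustration.

```python
from urllib.parse import quote, unquote

# A small XML document containing one non-ASCII character.
doc = '<?xml version="1.0" encoding="ISO-8859-1"?><name>José</name>'

# The same text escapes to different byte sequences depending on the
# encoding the sender happened to use.
as_latin1 = quote(doc, safe="", encoding="iso-8859-1")  # é becomes %E9
as_utf8 = quote(doc, safe="", encoding="utf-8")         # é becomes %C3%A9

# A receiver that assumes the same encoding recovers the document.
round_trip = unquote(as_latin1, encoding="iso-8859-1")

# A receiver that assumes ISO-8859-1 for UTF-8 escapes does not.
garbled = unquote(as_utf8, encoding="iso-8859-1")

print(round_trip == doc)
print(garbled == doc)
```

With multipart/form-data, each part can carry its own Content-Type header,
so the encoding does not have to be guessed.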

> With a request structure sent over POST, it could be argued
> that I didn't understand that *this* particular request
> structure means that I *did* want to order their product.

Given the well-documented semantics of GET and POST, it would seem that any
well-designed system would use the request URI to imply the ordering action,
not the request method. The "what to order" info should be in the body of
the POST.
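
A minimal Python sketch of that design, with a hypothetical order URI and a
made-up order vocabulary: the URI says "this is where orders go", and the
XML entity in the body says what is being ordered. The request is only
constructed here, not sent.

```python
from urllib.request import Request


def build_order_request(order_xml):
    """Build (but do not send) a POST carrying an XML order document."""
    return Request(
        "http://example.com/orders",  # hypothetical URI implying "place an order"
        data=order_xml,               # the "what to order" info is the entity
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Hypothetical order document; the element names are illustrative only.
order = (b'<?xml version="1.0" encoding="UTF-8"?>'
         b'<order><item sku="A1" qty="2"/></order>')
req = build_order_request(order)

print(req.get_method(), req.full_url)
```

Because the XML travels as the entity itself rather than as a form field
value, it is not subject to form-data encoding at all.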

> For better or worse, requesting (GETting) information over
> HTTP means not using the XML syntax.

I should probably clarify in my paper that a client can make a request for
an XML document without actually sending XML in its request. A response from
the server can contain an XML document in the body of the response, and in
fact this is usually what happens when an XML document is requested. There's
nothing unsafe about this, although one might argue that the Content-Type is
usually not as specific as it should be.

The problems arise when the request (GET or POST or any other) needs to
contain XML.

   - Mike
____________________________________________________________________
Mike J. Brown, software engineer at webb.net in Denver, Colorado, USA
My XML/XSL resources: http://skew.org/xml/


