Glen Ropel quoth:
> Both data and code are fossilized behavior.  One cannot 
> specify behavior in a static medium any more than one can 
> identify data in a dynamic medium.
>
> Whether one is distinguishing between data and code by saying 
> that "Code is expensive and data is ephemeral" or "... as platform
> half-lives collapse, externalized data lasts longer and longer by
> comparison", it's still a distinction between code and data.
> And this distinction is false.
and Patrick Logan, in chorus:
> Ideally, I agree. The lack of a distinction is the asymptote. 
I think we'd all agree that today, in practice, there is a vast gulf between 
data and behavior representations. The difference in philosophy is that we 
think we think the orthogonalization is essential, not accidental (reread that 
sentence: that's not a duplicate imperative).
Many algorithms can trade off space for time, but we'll put aside the issues 
of working store as 'data' and concentrate on the 'externalizable' state of an 
application object. An auto part record has many negotiable particulars in it, 
but the choice of what data to include in or exclude from its state description is 
more restricted than the choice of how many factors to leave on the heap in 
differential cryptanalysis. 
Application-domain modeling inherently has to separate the state of an object 
and the actions upon it. The actions might be purely functional, allowing the 
behavior to be enumerated consequent to the data; the data might limit the 
range of actions -- there are many interactions between the two that vitalize 
the object in the context of a given application. But there is something 
essential, historic, about the continuous record of the "state" of said 
object. 
The crux of the matter is that we think that those descriptions are often more 
stable than the methods impinging on that state. The auto part record changes 
more slowly than the inventory, accounting, security, tax, and hazardous-waste 
tracking subsystems do. The more methods, the lower the mean time to failure 
(new version) for the 'code'; and yet, if we can easily extend the 
externalizable state (a new haz-mat bit, say), then the state representation 
does *not* fail (get re-versioned) as often. The auto-part's web-page gets 
richer and richer as more meanings are aggregated together (especially easy in 
XML; especially hard with CORBA/DCOM serialization).
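
To make the haz-mat example concrete, here is a minimal sketch in Python. 
The element names (PART, NUMBER, NAME, HAZMAT) and the read_part function are 
hypothetical, invented for illustration; nothing here is a real auto-parts 
schema. The only point is that a reader written against the old record keeps 
working, unchanged, after another subsystem adds its own element.

```python
import xml.etree.ElementTree as ET

# Hypothetical auto-part records; the element names are invented.
OLD_RECORD = "<PART><NUMBER>4711</NUMBER><NAME>brake pad</NAME></PART>"
# The same record after the haz-mat subsystem added its own element.
NEW_RECORD = (
    "<PART><NUMBER>4711</NUMBER><NAME>brake pad</NAME>"
    "<HAZMAT>true</HAZMAT></PART>"
)

def read_part(xml_text):
    """Inventory-side reader: looks up only the elements it knows about."""
    root = ET.fromstring(xml_text)
    return {
        "number": root.findtext("NUMBER"),
        "name": root.findtext("NAME"),
    }

# The old reader gives the same answer for both versions of the record.
print(read_part(OLD_RECORD) == read_part(NEW_RECORD))  # prints True
```

The inventory code never mentions HAZMAT, so it did not need a new version 
when the state description grew.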
The only essential (in the Brooks sense) difference we can cleave to is the 
interorganizational intent. The state description of an artifact is intended 
to be understandable by any observer. The behavior is intra-organizational by 
default: behavior is only standardized by exceptional law. 
Attack it as circular logic, if you please, but we see Data as precisely the 
description which any two observers can agree on; that's why you're 
externalizing it in the first place. 
===============================================================================
That said, we can return to our advocacy of XML already in progress...
[Adam's section]
Doug Lea quoth:
> > We would like to see more use of XML to capture the thing itself,
> > not just the interface to the program which manipulates the thing.
> In the general case, `capturing the thing itself' requires behavior
> description. Right?
Our philosophy is that behavior cannot be totally known to the outside
observer -- that it is too difficult to completely describe a complex
object's behavior outside of some constraints on inputs and outputs.
What we think will happen is that developers will take a scientific
approach to "learning" behavior: since all you can do is observe the
results of the behavior -- the artifacts it leaves behind as snapshots
-- then you'll have to draw conclusions as to what kind of behavior
would generate such results based on those snapshots.
The only way to completely know an object's behavior is either to prove
that it conforms to a rigorous specification [very hard for complex
objects] or to look at the source code and step through it operationally
for all possible sets of inputs and outputs.
If you don't have a rigorous proof, and you cannot look at the code
behind an object's interface curtain, then you have to look at the
outputs produced by that object, and make conclusions from there.
Doug Lea, further:
> I remain clueless about how this is supposed to
> work.  Purely declarative approaches to behavior description are
> challenging at best. (As far as I can tell, the remarks that Dennis
> deChampeaux (mostly) and I wrote about this 6+ years ago --
> http://gee.cs.oswego.edu/dl/oosdw3/ch5.html -- still basically hold.)
> More concretely, suppose I want a description of: 
>   A Water tank
>   A Car alarm system
>   A Web server
>   A Bank
>   A Telecom switch
>   ...
> How would I go about it in an XML/.../... -based object system? 
You would describe the artifacts left by them -- in the form of data
stored or exchanged (usually attribute-value pairs):
    A Water tank -- temperature, volume, contents of tank
    A Car alarm system -- events dealt with and responses
    A Web server -- log of requests and server responses
    A Bank -- debits and credits for transactions on accounts
    A Telecom switch -- list of switching decisions made and bits moved
Of course, this doesn't address the conveying of the behavior of these
systems.  If we take a more data-centric approach, we can instead try to
ascertain behavior from our observations of the system -- the way
physicists, chemists, etc. do.  This seems a much more promising approach.
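
As a rough sketch of what "ascertaining behavior from observations" might
look like for the web server case, here is a small Python example.  The
log format (LOG, REQUEST, PATH, STATUS) and the observe function are
assumptions made up for this illustration, not any real server's format.
The observer sees only the artifacts and tallies what the server did.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical log artifact; element and attribute names are invented.
LOG = """<LOG>
  <REQUEST PATH="/index.html" STATUS="200"/>
  <REQUEST PATH="/missing" STATUS="404"/>
  <REQUEST PATH="/index.html" STATUS="200"/>
</LOG>"""

def observe(log_text):
    """Tally the (path, status) pairs recorded in the log."""
    root = ET.fromstring(log_text)
    return Counter(
        (req.get("PATH"), req.get("STATUS")) for req in root.iter("REQUEST")
    )

observations = observe(LOG)
for (path, status), count in sorted(observations.items()):
    print(path, status, count)
```

From tallies like these an outside observer can form a hypothesis (say,
"/missing is never served") without ever seeing the server's code.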
[Rohit speaking]
I have to offer my kudos to Doug for introducing a well-written resource to 
ground this discussion by citing the book chapter. The truth is that in many 
ways, there *isn't* anything whiz-bang about an XML/.../... object system. I 
wouldn't quibble over the book's description of a door's state. I'm just 
saying I'd prefer to see <FRAME><DOOR ANGLE="37"/></FRAME> because that snippet 
shows:
	* building a data schema, a comparable task to designing a class hierarchy. 
The data architect has to invest additional time and thought to say that his 
House DTD needs to have FRAME with possible subelements DOOR and WINDOW, much 
as a CAD designer might have to for the behaviors thereof. 
	* allowing well-formedness to add value to a common artifact. Later, a house 
visualization on top of the door-actuator can add <COLOR> elements to any part 
of the house for its own purpose -- without interfering with the 
door-actuator's ability to parse the door angle. 
	* using a human-editable format. It's more conceivable that if my home 
automation system were described in such files, I could debug it in EMACS 
xml-mode, write scripts to shut all the doors, and so on. 
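
The points above can be sketched in a few lines of Python. This is only an 
illustration under stated assumptions: the HOUSE wrapper, the placement of 
COLOR inside DOOR, and the function names door_angles and shut_all_doors are 
all hypothetical. The attribute is quoted and the empty elements are closed 
so that a stock XML parser will accept the document.

```python
import xml.etree.ElementTree as ET

# Hypothetical house document; COLOR was added later by a visualization tool.
HOUSE = (
    '<HOUSE><FRAME>'
    '<DOOR ANGLE="37"><COLOR>red</COLOR></DOOR>'
    '<DOOR ANGLE="0"/>'
    '<WINDOW/>'
    '</FRAME></HOUSE>'
)

def door_angles(root):
    """Door-actuator view: reads ANGLE and ignores everything else."""
    return [int(door.get("ANGLE")) for door in root.iter("DOOR")]

def shut_all_doors(root):
    """The kind of script one might write by hand: set every angle to 0."""
    for door in root.iter("DOOR"):
        door.set("ANGLE", "0")

root = ET.fromstring(HOUSE)
print(door_angles(root))   # [37, 0]
shut_all_doors(root)
print(door_angles(root))   # [0, 0]
```

The COLOR element rides along untouched: the actuator code never has to know 
that the visualization exists.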
These are engineering tradeoffs of marshalling formats. That's why we argue 
for XML over XDR or Q in our paper:
   http://www.cs.caltech.edu/~adam/papers/xml/xml-for-archiving.html
Our opinion is that data can be observable without the code, left as a
document representing a checkpoint in the computation.
Think of a mailing list.  Each post is an observable checkpoint in that
mailing list, which is both stored and transferred as a document.  It
represents a statically viewable slice of an otherwise fluid, dynamic
discussion -- the mailing list "object."
> > Finally, a word or two on the hype behind XML.  There's a lot of it.  We
> > contribute to it as often as we can.  We think it's important to get the
> > word out to the 97%.
> Whoa.  Let's look at that statement a little; it's a wee bit pretentious. 
> "Get the word out to the 97%".  I'm sorry, but that is INSULTING.
Why is it insulting to get the XML message out to the 97% who have not
heard of XML?
> I'm finishing a Masters degree in this stuff and I've been working
> professionally as a programmer for almost 7 years now and I'm lumped in
> the unwashed 97%.  My opinion is just as relevant as the guy who just
> got his first AOL disk just because I'm not in the object church.
When did I say your opinion wasn't relevant?  All I meant was, XML can
be useful to some people, so we should get the word out about it.
Remember, I believe in the right tool for the job.  The more tools you
know about, the more tools you have available when it comes time to pick
the right tool for the job.
> I have news for you, you have got to work on your evangelizing skills.
> You aren't informing.  You are ANNOYING.  Professionals in the field
> are not going to LISTEN to you with this kind of attitude.
What attitude?  We like XML, we want other people to like XML.
What's wrong with that?
Remember, too, Dave, that this is a mailing list debate, not a refereed paper. 
And if you understand the Zen of the 97%, you'll recall that no one "is" or 
"isn't" in the 3%. It's a point of view on an issue, not a call for the Elect. 
The truth is 97% of the world hasn't even heard of SGML or XML; 97% of those 
who have heard haven't studied it; 97% of those who have studied it haven't 
worried about how to apply it to systems design (not just traditional 
"documents"); and 97% of those who have considered its role in distributed 
computing state capture haven't worried about the fat of its concrete syntax; 
and 97% of the proposals I've seen to slim down the syntax and compress XML 
streams don't worry about temporal layout of streamed XML parse trees 
(breadth-first for page layout; depth-first for dictionary delivery...) 
And I have a hint: there's no Swami Connolly or Guru Bosak or Maharishi 
Berners-Lee sitting up on the peak offering enlightenment. Some of us are 
gonna have to knock stones together until something catches fire. Or gets 
flamed to death :-)
Rohit
(Adam)