Re: XML Modeling

Chris Olds (colds@nwlink.com)
Fri, 10 Apr 1998 17:04:06 -0700


At 02:47 PM 4/10/98 PDT, Patrick Logan wrote:
>
>> The reason it is bad hinges on the horrible truth that XML is a
>> very nice syntax for marking up a document (file) according to its
>> semantic content, but does not provide a way to convey the
>> semantics of the document in a standard way.
>
>Please define what you mean by "standard way"...

Note that I said "a standard way", not "the standard way". I think that
there will probably be a number of viable and useful means to do this.

> While this is obviously a hard problem...
> ...shared ontologies...
>
>Are you talking about the symbol grounding problem or are you talking
>about the socio-political problem?
>
>I subscribe to the Winograd/Flores(1) approach to systems. I don't
>think you're going to solve the symbol grounding problem. So maybe XML
>doesn't need anything more than for some disparate socio-political
>factions to resolve their differences in any given domain.

As Winograd & Flores point out so clearly, there is no solution to the
symbol grounding problem, but there is a lot of hope for what you call the
"socio-political problem". I happen to think that there is enough shared
context among these so-called 'factions' that it is useful to consider
collecting them into an explicit context, which W&F might call a "systematic
domain" (sec. 12.3, pp. 174-177). Since I think they frame the point I am
trying to make very well, I'm going to use some quotes:

On Page 174, they say: "The use of a distinction is very different from its
explicit formal articulation. The fact that we commonly use a word does not
mean that there is an unambiguous formal way to identify the things it
denotes or to determine their properties. But whenever there is a recurrent
pattern of breakdown, we can choose to explicitly specify a _systematic
domain_, for which definitions and rules are articulated."

My suggestion of "a standard way" was intended to start people thinking
about the creation of systematic domains to describe the commitments and
conversations that exist in their subject areas. Since my (current)
subject area is model-based application and database development, I have a
vested interest in the creation of systematic domains that cover the basic
conversational elements of business and workgroup applications. The
prospect of seeing most of the data on the web marked up with semantics is
exciting. The idea that I will have to rediscover the semantics of every
part of each of the systematic domains used to mark up the data is
horrifying. What I want to be able to do is read an XML representation of a
model and easily determine the potential commonalities between that model
and a model my user is currently working with. With guidance from my user,
it then becomes possible to define conversations between the two models with
a high degree of assurance that they are correct. This does not mean that
there will be any fewer unique systematic domains, but rather that their
users can avoid an important source of breakdown when they try to connect
models of such domains.
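
To make the idea concrete, here is a minimal sketch in Python of what
"determine the potential commonalities" could look like. Everything about the
markup is an assumption made for illustration: the <model>, <entity> and
<attribute> elements and the "domain" attribute are hypothetical, not part of
any existing proposal. The sketch only nominates candidates; confirming them
is still the user's job.

```python
import xml.etree.ElementTree as ET

# Two hypothetical model descriptions. The element and attribute names
# are invented for this sketch.
MODEL_A = """
<model name="orders">
  <entity name="Customer">
    <attribute name="cust_name" domain="name"/>
    <attribute name="ship_to" domain="address"/>
    <attribute name="credit_limit" domain="quantity"/>
  </entity>
</model>
"""

MODEL_B = """
<model name="contacts">
  <entity name="Person">
    <attribute name="full_name" domain="name"/>
    <attribute name="home_phone" domain="phone-number"/>
    <attribute name="mailing_address" domain="address"/>
  </entity>
</model>
"""


def attributes_by_domain(xml_text):
    """Map each declared domain to the (entity, attribute) pairs using it."""
    index = {}
    root = ET.fromstring(xml_text)
    for entity in root.iter("entity"):
        for attr in entity.iter("attribute"):
            domain = attr.get("domain")
            if domain is None:
                continue  # no declared semantics, nothing to match on
            pair = (entity.get("name"), attr.get("name"))
            index.setdefault(domain, []).append(pair)
    return index


def candidate_matches(xml_a, xml_b):
    """Return the domains both models use, with the attributes on each side."""
    a = attributes_by_domain(xml_a)
    b = attributes_by_domain(xml_b)
    return {d: (a[d], b[d]) for d in sorted(a.keys() & b.keys())}


matches = candidate_matches(MODEL_A, MODEL_B)
for domain, (left, right) in matches.items():
    print(domain, left, right)
```

The matching is done on the declared domain rather than on attribute names,
since "cust_name" and "full_name" share nothing a program could rely on except
the semantics someone wrote down.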

There is more I could say (yadda yadda), but I think Winograd & Flores said
it better over a decade ago.
[p. 176] "To some extent, the content of each profession-oriented domain
[i.e., spreadsheets, word processors, expert systems /cco] will be unique.
But there are common elements that cross boundaries. [...] The computer is
ultimately a _structured dynamic communication medium_ that is qualitatively
different from earlier media [...]. Communication is not a process of
transmitting information or symbols, but one of commitment and
interpretation. [...] There is a systematic domain relevant to the structure
of this network of commitments, a domain of 'conversation for action' that
can be represented and manipulated in a computer."

If all I have done is raise a problem that everyone already understood, then
I apologize. I have not seen any proposals from the data/modeling side that
address the difficulties in representing systematic domains using XML; the
SGML folks have SGML DTDs and HyTime (ISO 10744:1997) Architectural Forms,
but XML DTDs are not as powerful as SGML DTDs, and AFs are apparently seen as too
complex (I don't think they really are too complex, but the documentation is
quite obscure).

What are the problems I'd like to see solved?
1) Model representation in XML - I need to represent an object model in XML
in a way that can be extended without my having to revise my program. This
should work for Dirk (and anyone else that has a modeling tool) as well as
it does for me.
2) Definition of a systematic domain that covers the most commonly used
semantic domains in databases. This is useful even if the number of terms
defined is fairly small. Name, address, phone number, quantity, units, etc.
Any abstraction above number and string is useful information.
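
A rough sketch of both items, in Python, under stated assumptions: the markup
(<model>, <entity>, <attribute>, the "domain" attribute, and the
<audit-trail> extension element) is hypothetical, and the vocabulary is just
the handful of terms listed in (2). The loader keeps every element and
attribute whether it recognizes them or not, so a model can be extended
without revising the program; the vocabulary check then reports which
attributes carry an abstraction above number and string.

```python
import xml.etree.ElementTree as ET

# Hypothetical shared vocabulary for item (2), taken from the terms above.
KNOWN_DOMAINS = {"name", "address", "phone-number", "quantity", "units"}


def load_node(element):
    """Generic load: keep every tag and attribute, recognized or not."""
    return {
        "tag": element.tag,
        "attrs": dict(element.attrib),
        "children": [load_node(child) for child in element],
    }


def describe_attributes(node):
    """List (attribute name, domain, recognized?) for every <attribute>."""
    rows = []
    if node["tag"] == "attribute":
        domain = node["attrs"].get("domain")
        rows.append((node["attrs"].get("name"), domain, domain in KNOWN_DOMAINS))
    for child in node["children"]:
        rows.extend(describe_attributes(child))
    return rows


# A model using one shared domain, one private domain, and an extension
# element this program has never heard of.
EXTENDED_MODEL = """
<model name="inventory">
  <entity name="Item">
    <attribute name="on_hand" domain="quantity"/>
    <attribute name="sku" domain="stock-code"/>
    <audit-trail retained="yes"/>
  </entity>
</model>
"""

tree = load_node(ET.fromstring(EXTENDED_MODEL))
rows = describe_attributes(tree)
for row in rows:
    print(row)
```

The unknown <audit-trail> element survives the load untouched, and the
private "stock-code" domain is reported as unrecognized rather than rejected,
which is about as much as a small shared vocabulary can promise.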

/cco