wisn99 (8/19-20)-- very early draft -- comments?

Rohit Khare (rohit@uci.edu)
Fri, 11 Jun 1999 02:59:21 -0700


Namespaces are all around us on the Internet. We notice the big ones,
like domain names, even to the point of international political and
financial wrangling, but rarely the profusion of smaller ones that
make the Net work: character sets, color spaces, Document Type
Definitions (DTDs), class hierarchies, and more. Computer science has
a long tradition of advocating indirection -- after all, namespaces
are just mappings (typically, surjections) from one set of symbols
called names to another set called addresses. From a software
engineering perspective, then, we have isolated the central virtue of
namespace management: create names when they make design easier by
abstracting up a level of discourse. That's why we might see the
following:

WISN'99 is a name to the human, which he/she resolves to the address:
http://www.ics.uci.edu/IRUS/WISN.jpg which
is used as a name by the browser, in four parts:
1) http, from the namespace of schemes (access protocols)
2) www.ics.uci.edu from the domain namespace
3) IRUS/ from the file namespace of the ICS web server
4) jpg from the Internet Media Type namespace for further interpretation
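
As a rough sketch of that four-way split, here is how Python's
standard urllib.parse could pull the parts apart. The function name
split_namespaces and the dictionary keys are my own labels for
illustration, not anything a browser actually exposes.

```python
import posixpath
from urllib.parse import urlsplit


def split_namespaces(url):
    """Split a URL into the four namespace-governed parts listed above."""
    parts = urlsplit(url)
    # splitext returns (root, ".ext"); keep only the extension, minus the dot.
    extension = posixpath.splitext(parts.path)[1].lstrip(".")
    return {
        "scheme": parts.scheme,      # 1) namespace of schemes
        "host": parts.hostname,      # 2) domain namespace
        "path": parts.path,          # 3) the web server's file namespace
        "extension": extension,      # 4) hint toward a media type
    }


parts = split_namespaces("http://www.ics.uci.edu/IRUS/WISN.jpg")
print(parts)
```

Each value is a name in a different namespace, with its own registry
and its own resolver, which is the point of the dissection below.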

1) is controlled centrally by IANA for a few key types; thus we have
RFCs documenting ftp and the like.
But it can also be unilaterally extended to, say, iiop:, which is only
meaningful to the community of people who trust OMG. Or, better yet,
the phone-service-signalling-event notification scheme of WAP, which
has semantics limited to WAP Forum members.

Within the browser, the name "http" is resolved to the address of a
protocol-handling module.

2) is hierarchically delegated from 13 root name servers (eventually,
theoretically, reporting to ICANN).
The goal is to successively resolve the name as pointers to the Start
of Authority (SOA) of .edu, .uci.edu, and ics.uci.edu. Then, it is
resolved to a single IP address -- or multiple -- for the hostname
"www". Here, the namespace control changed hands four times. Here,
too, we see a recursive resolver; typically, we ask the local DNS
server to resolve the whole name, and it handles whatever nested
transactions may be necessary.
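
A toy model of that right-to-left walk, assuming a made-up delegation
table: the ZONES dictionary, the "zone:" marker, and the 192.0.2.x
documentation addresses are all invented for illustration and bear no
relation to the real DNS wire protocol or to UCI's actual records.

```python
# Hypothetical delegation data: each zone knows only its direct children.
ZONES = {
    ".": {"edu": "zone:edu"},
    "edu": {"uci": "zone:uci.edu"},
    "uci.edu": {"ics": "zone:ics.uci.edu"},
    "ics.uci.edu": {"www": ["192.0.2.10", "192.0.2.11"]},
}


def resolve(hostname):
    """Walk the labels right to left, following one delegation per label."""
    labels = hostname.rstrip(".").split(".")
    zone = "."
    trail = [zone]          # every authority consulted along the way
    answer = None
    for label in reversed(labels):
        answer = ZONES[zone][label]              # ask the current authority
        if isinstance(answer, str) and answer.startswith("zone:"):
            zone = answer[len("zone:"):]         # control changes hands
            trail.append(zone)
    return answer, trail


addresses, authorities = resolve("www.ics.uci.edu")
print(addresses)
print(authorities)
```

The caller sees one call and one answer; the nested hand-offs stay
inside resolve(), much as they stay inside the local DNS server.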

2a) the IP address is itself a name, now resolved left-to-right
(instead of DNS's right-to-left, or a URL's mixed direction). Such
addresses are allocated by ARIN in the USA, and must conform to the
net/subnet/host addressing hierarchy. A routing table, then, is also
a nameservice, one that resolves that number to a specific choice of
gateway network for onward routing.
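
To make the "routing table as nameservice" reading concrete, here is
a minimal longest-prefix-match sketch using Python's ipaddress
module. The ROUTES table and the gateway labels are hypothetical; a
real router holds far more state than this.

```python
import ipaddress

# Hypothetical routing table: network prefix -> next-hop gateway label.
ROUTES = {
    ipaddress.ip_network("0.0.0.0/0"): "default-gateway",
    ipaddress.ip_network("192.0.2.0/24"): "campus-gateway",
    ipaddress.ip_network("192.0.2.128/25"): "department-gateway",
}


def next_hop(address):
    """Resolve an IP address to a gateway by longest matching prefix."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in ROUTES if addr in net]
    # The most specific (longest) prefix wins; 0.0.0.0/0 always matches.
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]


print(next_hop("192.0.2.200"))
print(next_hop("192.0.2.10"))
print(next_hop("198.51.100.7"))
```

The address is read from its leading bits outward, which is the
left-to-right resolution order noted above.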

Once a packet addressed to that IP number arrives on the web server's
subnet, the IP address is again treated as a name by ARP, the Address
Resolution Protocol, which resolves the address to an Ethernet ID
(MAC address). Those IDs are centrally allocated by IEEE for a modest
fee per manufacturer.

When the packet arrives at the adapter, it is demultiplexed by port
number. IANA's registry resolves the service type "http" to port 80.

3) is resolved by an http server's directives onto a filesystem name,
say "/usr/www/IRUS/WISN.jpg". That's a namespace controlled by the
webmaster.

The filenamespace is controlled by the sysadmin, and then the user.
It is hierarchically resolved to an inode address, which UNIX uses to
indirect the "human name" of a file in the face of moves, renames,
links, and so on.
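
A small demonstration of that indirection, assuming a POSIX
filesystem that supports hard links. The file names are made up and
everything happens in a throwaway temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    original = os.path.join(root, "WISN.jpg")
    with open(original, "wb") as handle:
        handle.write(b"logo bytes")
    inode_before = os.stat(original).st_ino

    # Renaming changes the human name but not the inode it resolves to.
    renamed = os.path.join(root, "logo.jpg")
    os.rename(original, renamed)
    inode_after_rename = os.stat(renamed).st_ino

    # A hard link is a second name bound to the same inode.
    alias = os.path.join(root, "alias.jpg")
    os.link(renamed, alias)
    inode_of_link = os.stat(alias).st_ino

print(inode_before == inode_after_rename == inode_of_link)
```

Two names, one inode: the directory entry is the name, and the inode
number is the address that survives the rename.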

The inode is an OS-maintained name which is resolved into actual
disk-blocks (which are in turn names for the hard-drive controller,
until you finally arrive at the absolute address of the WISN logo,
the physical location of individual magnetized particles).

4) .jpg is a file extension, which the browser/.mailcap file resolves
to a MIME type. The MIME content-type namespace is IANA-controlled,
with a two-level hierarchy and multiple levels of administrative
admission control (vnd. vendor names, application/ unbound semantics,
etc). That MIME type is in turn used to resolve the address of a
loadable library which can handle JPEGs.
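
That two-step resolution (extension to media type, media type to
handler) can be sketched with Python's standard mimetypes module. The
HANDLERS table and its module names are hypothetical stand-ins for
whatever loadable libraries a real browser would consult.

```python
import mimetypes

# Hypothetical table: media type -> name of a module that can render it.
HANDLERS = {
    "image/jpeg": "jpeg_decoder",
    "text/html": "html_renderer",
}


def handler_for(filename):
    """Resolve a file name to a media type, then to a handler name."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type, HANDLERS.get(mime_type)   # None if nothing is bound


print(handler_for("WISN.jpg"))
print(handler_for("WISN.unknownext"))
```

An unrecognized extension resolves to nothing at all, a small case of
a name with no address behind it.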

Next, you might look inside the JPG for a 'creator' comment, and then
you're in the namespace of email-addresses; but just try to verify
it, and you're in the treacherous public-key management namespace.
Or, if we had downloaded an XML document, the many XML tag namespaces.

[You can imagine there's been a cool powerpoint animation of this
recursive dissection process all along.]

******

From this walkthrough, I can distill my taxonomy of what's going on here:

There are addresses, which are how any given software module/layer
actually gets its hands on information. A memory address ("pointer")
is the most basic example of such. In fact, addresses are the "bottom
turtle" of the system -- the bedrock.

There is a construct called a namespace which maps a symbolic "name"
onto none, one, or several addresses by way of a "directory". The
names may have internal structure/hierarchy.

"directories" are further decomposed into two aspects: "registry",
the actors which control the process of name (or address) allocation,
and "resolvers", which are the technological artifacts which attempt
to compute that mapping.

A namespace mapping is not necessarily complete, nor is it
necessarily a surjection.
Aliases, hierarchies, and "dangling links" all compromise the
integrity of a namespace.

Moral: Every layer's address is the next lower layer's name.
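
One way to put the taxonomy and its moral into code, as a sketch
only: the Namespace class, its method names, and the sample bindings
(documentation-range IP and MAC values) are all my own invention for
illustration.

```python
class Namespace:
    """A directory: a registry of bindings plus a resolver over them."""

    def __init__(self, label):
        self.label = label
        self._bindings = {}

    def register(self, name, *addresses):
        """Registry side: bind a name to zero or more addresses."""
        self._bindings.setdefault(name, []).extend(addresses)

    def resolve(self, name):
        """Resolver side: return every address bound to the name."""
        return list(self._bindings.get(name, []))


def resolve_through(layers, name):
    """Feed each layer's addresses to the next layer as names."""
    names = [name]
    for layer in layers:
        names = [addr for n in names for addr in layer.resolve(n)]
    return names


dns = Namespace("dns")
dns.register("www.ics.uci.edu", "192.0.2.10")
arp = Namespace("arp")
arp.register("192.0.2.10", "00:00:5e:00:53:01")

print(resolve_through([dns, arp], "www.ics.uci.edu"))
print(resolve_through([dns, arp], "nowhere.example"))
```

A name can come back with none, one, or several addresses, and a
dangling name simply falls out as an empty list partway down the
stack.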

[after all, we wouldn't bother 'lifting' the signifier up a level of
abstraction -- creating the namespace in the first place -- if it
didn't afford greater traction in the face of ambiguity]

******

Now, so far, I've just systematized a sense of namespace management.
It does not seem to differ from the sense that's been known to system
implementors for decades.
What this workshop today is intended to do is elucidate a sense of
*Internet-Scale* namespace management issues.

"Internet-scale" encompasses more than sheer numerical heft (mass,
mobility, and multilaterality?). True, we do need to understand the
engineering tradeoffs of building planetary-scale resolution systems.
Hence, our case studies today on the track record of Novell Directory
Services, MS Active Directory, LDAP, and the venerable Domain Name
System.

Looking beyond numbers, though, we see issues of trust: who controls
the registry? How can we operate truly decentralized namespaces? What
does "integrity" mean if each network node has a different,
potentially conflicting, subset of the namespace at hand? Indeed,
what are the philosophical ground truths of a socially constructed
"name"? We are deep in the domain of semiotics here...

We see issues of mobility: what happens when the directory varies in
time, as entries resolve to new addresses? Even further, what happens
when the directory varies with the observer's state: that
"www.tollroad.com" should bind to the nearest tollroad; and that only
73.caltrans.gov.ca.us can be a stable world-wide identifier? What
happens when the directory gains additional dimensions, like a Common
Name Resolution Protocol hypothesis?

******

These issues arise in the context of several current-day IScale
namespace management initiatives, and form a long-term research
agenda for the future.

Why has DNS been so hard to govern? To apply Conway's law in reverse,
what does the structure of the protocol tell us about the expected
governance structure of the Internet?

Why have URNs been such a rathole? Why might CNRP succeed -- or fail?

Are search engines viable, de facto resolvers for a great many IScale
namespaces, such as the "set of CS Departments", public key
locations, maps and so on?

Why have the closed-world models of filesystem-scale namespace
management failed for DAV Advanced Collections' hypertext naming
model (hint: the Web doesn't have inodes as persistent, intermediate
addresses for documents in the face of moves, renames, and deletes)?

Why is there such fear that unrestrained deployment of XML namespaces
-- which essentially map tag-names to URLs -- could lead to a
gridlock of hypertext versioning roadblocks, in which no one document
can be read from the network without an indeterminate number of
recursive lookups succeeding as well?

For the future, smart-environments demand trusted, *ad-hoc*,
*self-assembling* namespaces -- how might we solve this? Will the
Jini or Bluetooth tiny-net leader election approaches scale?

How could you maintain a namespace for routing across highway traffic
flows? How about if a helicopter passes by with wider-horizon updates
to share? How about if you want to name mines in a minefield (yet
update the namespace as they blow up)?

In the very long term, when computing and communications are not just
part of every built object, but even approach biological scale, how
will we address an individual diseased liver cell? Can we send a
message to a tumor, wherever it extends?

What are the ultimate limits to namespace management in human
communities -- what can the truth or falsity of the six-degrees of
separation hypothesis imply for the eventual form of public-key
lookup, or e-mail routing? For bringing electronic communications to
the other five billion people on the planet, who don't even have
bedrock addresses like SSNs, or even unique family names?

*******

If you participate in this workshop, I _promise_ you will walk away
with a broader concept of "namespace", with increased familiarity in
several case-studies of IScale namespaces.

I _hope_ we will converge on an ontology for characterizing
namespaces; there are many additional axes to the ones I identified
earlier: localized (natural-language) variants, the rate of change of
bindings, the rate of name creation and deletion, the density of
namespace utilized (related to name length, too), the degree,
direction, and syntax of hierarchy, and so on.

I _wish_ that we would converge on a vision for future research in this
area, by identifying novel approaches for creating self-organizing,
self-governing, (self-centered?) namespaces.
For software engineers in particular, I _expect_ this dialogue to
emphasize the importance of identifying your namespaces as a step in
systems analysis. This will clearly identify naming as a crucial enabler
of dynamism, an indirection that permits post-deployment evolution of
the namespace.

[in other words, don't treat "iso-8859-1" as a character string
constant; model it as a (replaceable) name for a character mapping,
and you make it possible for individual users to define new ones,
say, for Klingon keyboards. ]
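
Python's codec registry happens to work this way, so here is a
minimal sketch of the idea. The codec name "toyscript" and its
three-glyph table (private-use code points) are invented for
illustration; codecs.register, codecs.lookup, and codecs.CodecInfo
are the real standard-library calls.

```python
import codecs

# Hypothetical toy mapping: bytes 0x01-0x03 stand for three glyphs of an
# invented script, parked on Unicode private-use code points.
_DECODE = {0x01: "\uf8d0", 0x02: "\uf8d1", 0x03: "\uf8d2"}
_ENCODE = {ch: byte for byte, ch in _DECODE.items()}


def _encode(text, errors="strict"):
    # Codec functions return (result, number of input items consumed).
    return bytes(_ENCODE[ch] for ch in text), len(text)


def _decode(data, errors="strict"):
    raw = bytes(data)
    return "".join(_DECODE[b] for b in raw), len(raw)


def _search(name):
    """Resolver hook: map a codec name to its implementation, or None."""
    if name == "toyscript":
        return codecs.CodecInfo(encode=_encode, decode=_decode,
                                name="toyscript")
    return None


codecs.register(_search)

# Built-in and user-defined names now resolve through the same lookup.
latin1 = codecs.lookup("iso-8859-1")
encoded = "\uf8d0\uf8d2".encode("toyscript")
decoded = b"\x02".decode("toyscript")
print(latin1.name, encoded, decoded == "\uf8d1")
```

Because callers only ever hold the name, a new mapping can be bound
after deployment without touching them.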