AGENDA: WebStep

To: WebStep (khare)
Thu, 26 Jan 95 06:50:57 PST


Well, let's get the ball rolling. Frankly, I wish I were further along, but due to a series of interrupts, I haven't had time to put together my suite of proposed specs (maybe Friday night). What follows is my attempt to jump-start a thread on what WebStep could accomplish. It's stream-of-consciousness, and meant to provoke debate. I suggest that replies be cut and focussed on single points, rather than quoting this mess.

*****************************
* URL Exchange *
*****************************

Two levels: first, how can we exchange URLs between apps as strings, and further, a possible W3Link object to encapsulate URI structure.

First, we need a pasteboard type for exchanging these bits:

NXAtom W3URIPasteboardType "WebStep Universal Resource Identifier 1.0"

Question: should this have accompanying data, or should it simply be an added type for -declareTypes, with the data string left in NXAsciiPboardType?
Pros: ASCII interchange still works for non-W3 apps (see below)
Cons: We may want to put the "title" in NXAsciiPboardType and the URL in this type

Question: What are allowable formats for the associated string?
<URL:scheme://host/reference#extension>
scheme://host/reference#extension
scheme://host/reference
host
or some garbage, and we have to "recognize" URLs
(this last is possible if we have a service, "Copy as URL")
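
To make the format question concrete, here is a rough sketch (in Python, purely as language-neutral illustration) of a recognizer that accepts the first four forms listed above and rejects anything else. The function name, the returned keys, the "http" default for a bare host, and the rule that a bare host must contain a dot are all my assumptions, not part of any proposal.

```python
import re

# Hypothetical recognizer for the candidate pasteboard string forms.
_WRAPPED = re.compile(r"^<URL:(.*)>$", re.IGNORECASE)
_FULL = re.compile(
    r"^([A-Za-z][A-Za-z0-9+.-]*)://([^/#\s]+)(/[^#\s]*)?(?:#(\S*))?$"
)
_HOST = re.compile(r"^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$")


def parse_url_string(text, default_scheme="http"):
    """Return a dict of URL parts, or None if the text is not recognized."""
    s = text.strip()

    # Form 1: <URL:scheme://host/reference#extension>
    wrapped = _WRAPPED.match(s)
    if wrapped:
        s = wrapped.group(1).strip()

    # Forms 2 and 3: scheme://host/reference, with optional #extension
    full = _FULL.match(s)
    if full:
        scheme, host, reference, extension = full.groups()
        return {
            "scheme": scheme.lower(),
            "host": host,
            "reference": reference or "/",
            "extension": extension,
        }

    # Form 4: a bare host name (assumed to need at least one dot)
    if _HOST.match(s):
        return {
            "scheme": default_scheme,  # assumption: bare hosts mean http
            "host": s,
            "reference": "/",
            "extension": None,
        }

    # Anything else is "garbage" and would need a smarter recognizer.
    return None
```

A "Copy as URL" service could run something like this over the selection and decline to offer the W3 type when the result is None.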

The answer to the above question might be in level 2, a W3Link object.
Such an object, like NXColor, would wrap around URLs as the sole programmatic interface. Default values are listed first; lists are regulated by IANA.
- type; {URL,URI,URN}
- scheme; {http,gopher,wais,telnet, &c}
- host;
- reference; // Can we know enough to parse this out?
Such an object would also have to understand the encoding and escaping scheme of the standard: '~'=%7e ' '=%20, and other scheme-specific encodings, such as those for mailto:
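
As an illustration of the escaping step, here is a minimal sketch of percent-encoding and decoding. The safe-character set follows my reading of the URL standard's unreserved and "special" characters plus "/", and treating the text as ISO Latin-1 bytes is an assumption; scheme-specific rules (mailto: and friends) are not handled.

```python
# Characters passed through unescaped (an assumed set; "/" kept for paths).
SAFE = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "$-_.+!*'(),/"
)
HEX_DIGITS = "0123456789abcdefABCDEF"


def escape(text, safe=SAFE):
    """Percent-encode every byte whose character is not in the safe set."""
    out = []
    for byte in text.encode("latin-1"):  # assumption: Latin-1 input
        ch = chr(byte)
        out.append(ch if ch in safe else "%%%02x" % byte)
    return "".join(out)


def unescape(text):
    """Decode %xx sequences; malformed escapes are copied through as-is."""
    raw = text.encode("latin-1")
    out = bytearray()
    i = 0
    while i < len(raw):
        if (
            raw[i] == ord("%")
            and i + 2 < len(raw)
            and chr(raw[i + 1]) in HEX_DIGITS
            and chr(raw[i + 2]) in HEX_DIGITS
        ):
            out.append(int(raw[i + 1:i + 3].decode("ascii"), 16))
            i += 3
        else:
            out.append(raw[i])
            i += 1
    return out.decode("latin-1")
```

With this set, escape("~khare/my file") gives "%7ekhare/my%20file", matching the two examples above, and unescape reverses it.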

- {read,write}Pasteboard:pb
- {read,write}Stream
// so we can define standard, draggable .url files

This object probably fits as an NSDictionary, since we probably want it to be extensible to handle vendor-specific information gracefully. Other applications of W3Link are mentioned below.
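
To show what "fits as a dictionary" might mean in practice, here is a sketch of a dictionary-backed link object with a stream round-trip. The class shape, the default values, the "key: value" line format for .url files, and the vendor key in the usage note are all hypothetical; nothing here is a proposed file format.

```python
import io


class W3Link:
    """Hypothetical dictionary-backed link; unknown keys are preserved."""

    DEFAULTS = {"type": "URL", "scheme": "http", "host": "", "reference": ""}

    def __init__(self, fields=None):
        self.fields = dict(self.DEFAULTS)
        if fields:
            self.fields.update(fields)

    def write_stream(self, stream):
        # One "key: value" line per field, sorted for stable output.
        # Assumption: values never contain newlines.
        for key in sorted(self.fields):
            stream.write("%s: %s\n" % (key, self.fields[key]))

    @classmethod
    def read_stream(cls, stream):
        fields = {}
        for line in stream:
            line = line.rstrip("\n")
            if not line:
                continue
            key, _, value = line.partition(": ")
            fields[key] = value  # vendor-specific keys are kept untouched
        return cls(fields)


def round_trip(link):
    """Write a link to an in-memory stream and read it back."""
    buf = io.StringIO()
    link.write_stream(buf)
    buf.seek(0)
    return W3Link.read_stream(buf)
```

For example, a link built with an extra key such as "x-vendor-title" survives round_trip unchanged, which is the graceful-extension property we are after.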

*****************************
* HTML Exchange *
*****************************

As files or intact streams, I think we can leave it alone by requiring fully-conformant HTML level 2. The question of encodings is raised below.

NXAtom W3HTMPasteboardType "WebStep Hypertext Markup 1.0"

As a pasteboard type, we have two scenarios:
1) we are working with ASCII, and merely wish to denote its origin
2) we have a multiformat selection, like we handle RTF/ASCII

I suppose this decision hinges on what plans we all have for working with HTML input, and possible filter-service applications. Again, these correspond to the "does our type have a payload?" question above.

At the other end of the spectrum is the possibility of writing a (very) simple HTML container that can be used to extract structural information from the <HEAD> areas. I think we should NOT be involved in anything approaching the definition of an object-library for manipulating HTML.

*****************************
* .htmd File Format *
*****************************

This one is a raging open question, but probably the easiest to settle.
Generally, we want such a format to be self-contained, so the intention is that the wrapper can include other files, directories, symlinks; and that such structure should be preserved by cooperating applications (eText, for example, garbage collects). Here are a few of the design choices:

.htmd (HyperText Markup Document) vs .htmld
index.html vs. _____ (TXT.rtf, multiple topics, a la sanguish)
multi-format representations (index-ascii, index-mono, index-color)

"Best Practice" guidelines:
relative linking: should be able to move wrapper & trees around
quantized colors
no NetScape-hacked HTML!
keep "original" material around (.eps, 44khz audio)

*****************************
* HTML Entity Encoding *
*****************************

Obviously, there are a lot of holes in mapping the NeXTSTEP encoding to HTML entities, ISO-Latin1, or to Asian scripts. Furthermore, we have no consistency for Symbol font. Take a look at the eText .encoding files in the appwrapper, and you'll see

*****************************
* Scheme Service Registry *
*****************************

This is NeXTSTEP; we have NeXTGopher, YFTP, a plethora of apps that are designed to work with specific Internet resources. What if we could register apps to W3Link like we do file-extensions?
- bestApp;
- defaultApp;
so we can route URLs to be opened in appropriate apps (and make the appropriate input transformations from URL strings to app-strings)
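
A rough sketch of how such a registry might route a URL string. My reading of the two accessors is an assumption: "best" means an app registered for that specific scheme, and "default" means a general fallback app. The class, method names, and the scheme-to-app pairing in the usage note are illustrative only.

```python
class SchemeRegistry:
    """Hypothetical registry mapping URL schemes to application names."""

    def __init__(self, default_app=None):
        self._apps = {}               # scheme -> apps, oldest first
        self._default_app = default_app

    def register(self, scheme, app):
        self._apps.setdefault(scheme.lower(), []).append(app)

    def best_app(self, scheme):
        # Assumption: the most recently registered app wins.
        apps = self._apps.get(scheme.lower())
        return apps[-1] if apps else None

    def default_app(self):
        return self._default_app

    def route(self, url):
        """Pick the app that should open this URL string, or None."""
        scheme = url.split(":", 1)[0] if ":" in url else ""
        return self.best_app(scheme) or self.default_app()
```

So a registry created with default_app="OmniWeb", after register("gopher", "NeXTGopher"), would send gopher: URLs to NeXTGopher and everything else to OmniWeb. The URL-to-app-string input transformation is left out here.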

Counterexample: eText is forced to compile code attempting to NXPerformService("Open URL") for OmniWeb.

*****************************
* More More More! *
*****************************

There are a few more ideas I don't have time for right now. The most important is the definition of a standard docInfo, which allows us to track the meta-information of a document (author, title, parent, eventual URL when converted, date, links, ...). This would allow much more sophisticated interapplication document reference & management (e.g. having the eText navigator link to Pages or Diagram documents). Similar to, but much more open than, NeXT's .linkdb.

************************************************
Finally, a note on the timeframe. I hope we can come to a consensus on some of the basic, most essential points fairly rapidly; in the best of all possible worlds, we'll fold these specs into the imminent releases of Pages 1.7, SpiderWoman, OmniWeb .8 and eText5 .95 -- i.e., in the next two or three weeks.

Looking forward to the deluge,
Rohit

PS. The address for replies is WebStep@mail.xent.caltech.edu ; you need the "mail"