Rohit
Begin forwarded message:
Date: Sat, 28 Jan 95 01:18:03 PST
From: eps@toaster.SFSU.EDU (Eric P. Scott)
To: khare@xent.caltech.edu (Rohit Khare)
Subject: Re: MAILING LIST: WebStep - a standards effort for W3-aware document management
In-Reply-To: <3g20ke$jqt@digifix.digifix.com>
Newsgroups: comp.sys.next.announce
Organization: San Francisco State University
Reply-To: eps@cs.SFSU.EDU
Sigh, whimper.
In article <3g20ke$jqt@digifix.digifix.com> you write:
> * Define interchangeable file & pasteboard formats for
> - W3 URLs, URIs, URNs
pasteboard formats are trivial. These are ASCII text, right?
(file formats? huh?)
> - HTML Pasteboard type
messy. I can envision at least three text representations:
"Canonical HyperText Markup Language v2.0 pasteboard type"
All text is ISO Latin 1; CRLF separates lines.
"Portable HyperText Markup Language v2.0 pasteboard type"
All text is ISO Latin 1; \n separates lines.
"NeXT HyperText Markup Language v2.0 pasteboard type"
Literal 8-bit characters use NextStepEncoding;
&#nnn; still interpreted as ISO Latin 1; \n separates lines.
[This is most similar to "NeXT plain ascii pasteboard type"]
Of course, if you want to turn HTML into typed streams, things
get a bit more complicated!
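To illustrate how close the first two representations are, here is a minimal sketch in Python of converting between them. It assumes the data is already ISO Latin 1 bytes, so only the line separator changes. The function names are hypothetical and not part of any proposed pasteboard API; the NeXT variant is left out because it would also need a NextStepEncoding translation step.

```python
# Sketch only: hypothetical helper names, not part of any pasteboard API.
# Assumes the HTML is already ISO Latin 1 bytes; only line endings differ.

def to_canonical(data: bytes) -> bytes:
    # Normalize to bare LF first so an existing CRLF is not doubled,
    # then expand every LF to CRLF.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def to_portable(data: bytes) -> bytes:
    # Collapse CRLF pairs to a bare LF.
    return data.replace(b"\r\n", b"\n")


html = "<P>caf\u00e9\n</P>\n".encode("latin-1")
canonical = to_canonical(html)
assert to_portable(canonical) == html
```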
> * Specify the .htmd/.htmld document types
bogus. :-)
> * References to selections within NS documents
As in NXSelection? Good luck.
> * Opening URLs in compatible applications
yawn.
> * Explore encodings from NeXTSTEP & Symbol to HTML
obvious. HTML truly believes in ISO Latin 1. This means
that most of NextStepEncoding maps over, and very little of
Symbol does. However, many of the "missing" characters have
agreed-upon named entities in the SGML world, and it would seem
logical to use those. For the remainder, I'd simply extend
&#nnn; to &#nnnnn; where nnnnn is a Unicode code point. This
should take care of NEXTSTEP, Symbol, Zapf Dingbats, etc.
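The mapping described above can be sketched roughly as follows, in Python. Several things here are assumptions rather than anything stated in the message: the input is taken to be text already decoded to Unicode (a NextStepEncoding or Symbol decoding table is not shown), the function name `to_html` is hypothetical, and Python's `html.entities.codepoint2name` table (the HTML 4.01 entity names) stands in for the "agreed-upon" SGML entity names.

```python
from html.entities import codepoint2name


def to_html(text: str) -> str:
    """Return text using only ISO Latin 1 characters plus entities.

    Sketch only: assumes `text` is already decoded to Unicode.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in "&<>":
            # Markup-significant characters always need escaping.
            out.append(f"&{codepoint2name[cp]};")
        elif cp <= 0xFF:
            # Anything in ISO Latin 1 maps straight over.
            out.append(ch)
        elif cp in codepoint2name:
            # A "missing" character that has a named entity.
            out.append(f"&{codepoint2name[cp]};")
        else:
            # The remainder: numeric reference to the Unicode code point.
            out.append(f"&#{cp};")
    return "".join(out)


# Greek alpha has a named entity; the check mark (a Zapf Dingbats
# glyph, U+2713) falls through to a numeric reference.
print(to_html("caf\u00e9 \u03b1 \u2713"))
```

Because every character in the result is at or below 0xFF, the output can be encoded as ISO Latin 1 without loss.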
-=EPS=-
"don't thank me, I'll bill you later"