RFC#3: HTML Pasteboard Type

Rohit Khare (khare)
Fri, 10 Feb 1995 14:01:43 -0800


_____________________________________________________________________________
WebStep RFC #3: HTML Pasteboard Format February 10, 1995 / Rohit Khare
_____________________________________________________________________________
DESCRIPTION

RFC #3 specifies the standard content and form for exchanging HTML formatted
documents over OpenStep pasteboards.
_____________________________________________________________________________
RATIONALE

OpenStep relies extensively on pasteboards for interapplication data
exchange (services, filters, cut/paste, drag-and-drop). While OpenStep
includes several common document formats (RTF, PostScript, TIFF), it has not
defined a standard HTML exchange type.
_____________________________________________________________________________
SPECIFICATION

The specification consists of two interrelated pasteboard types. Per
OpenStep conventions, the "richest" form is presented first. Furthermore,
HTML pasteboard types may preempt other representation formats, such as
RTF, RTFD, and ASCII. If a URIPboardType (RFC#2) is included, it must be
a reference to the original provenance of the included HTML code.

The HTML code embodied within the pasteboard data can correspond to a single
selection range; HTML fragments must be well-formed, but may exclude <HEAD>
sections. Of course, an entire document must have exactly one <HEAD> section.

extern char * HTML3PboardType "WebStep HyperText Markup Language 3.0"

This is an optional HTML level 3 conformant data stream. It can thus leverage
HTML+ specific features such as tables, advanced forms, stylesheets, etc.
This RFC does not define a pasteboard type for stylesheets themselves,
another HTML3-centric datatype.
HTML3 is based on: http://www.w3.org/hypertext/WWW/MarkUp/html3-dtd.txt

extern char * HTMLPboardType "WebStep HyperText Markup Language 2.0"

This is the "normal" HTML level 2 data stream. It should be conformant, and
authoring tools must not generate deprecated Level 1 tags (<XMP>, etc.). HTML
Level 2 is at: http://www.hal.com/users/connolly/html-spec/HTML_TOC.html

extern char * DTDPboardType "WebStep Document Type Definition 1.0"

If the HTMLPboardType payload is using an experimental or non-standard DTD
(as documented in its <HEAD> DTD attribute), it may use the DTDPboardType
to reference an exact SGML grammar for the DTD.
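
The three declarations above are written in shorthand, pairing each name
with its string value. A minimal sketch of how they might look as compilable
C follows. The split into declarations and definitions and the const
qualifier are assumptions for illustration; only the names and string values
come from this RFC.

```c
/* Declarations, as they might appear in a shared header.
   The const qualifier is an assumption; the RFC writes plain char *. */
extern const char *HTML3PboardType;
extern const char *HTMLPboardType;
extern const char *DTDPboardType;

/* Definitions, placed in exactly one source file.
   The string values are copied verbatim from the RFC text. */
const char *HTML3PboardType = "WebStep HyperText Markup Language 3.0";
const char *HTMLPboardType  = "WebStep HyperText Markup Language 2.0";
const char *DTDPboardType   = "WebStep Document Type Definition 1.0";
```

Applications that compare type names by string value would interoperate
only if they use these exact strings.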

It is understood that any accompanying ASCII, RTF, RTFD, or other types
must be a best-effort rendering of the HTML into those types.
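
Because the richest form is presented first and the other types are
best-effort renderings, a consumer can simply take the first offered type it
understands. The sketch below illustrates that selection in plain C. The
function name richest_common_type and its array-based interface are
hypothetical; a real application would query the OpenStep pasteboard object
rather than pass string arrays.

```c
#include <stddef.h>
#include <string.h>

/* Return the first entry of `offered` (ordered richest first by the
   provider) that also appears in `accepted`, or NULL if none match.
   Hypothetical helper, not part of OpenStep or this RFC. */
const char *richest_common_type(const char *const offered[], size_t n_offered,
                                const char *const accepted[], size_t n_accepted)
{
    for (size_t i = 0; i < n_offered; i++) {
        for (size_t j = 0; j < n_accepted; j++) {
            if (strcmp(offered[i], accepted[j]) == 0) {
                return offered[i];
            }
        }
    }
    return NULL;
}
```

For example, if a provider offers the HTML 3.0, HTML 2.0, and RTF types in
that order and the consumer accepts only HTML 2.0 and RTF, the function
returns the HTML 2.0 type string.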

This information can also be manipulated as UNIX files. WebStep suggests
registering the extensions .html, .html3, and .dtd for HTMLPboardType,
HTML3PboardType, and DTDPboardType respectively.
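
As an illustration of the suggested extensions, the sketch below maps a file
name to a pasteboard type string. The function name
pboard_type_for_filename is hypothetical, and the case-sensitive match and
the pairing of each extension with the like-named type are assumptions drawn
from the names themselves.

```c
#include <stddef.h>
#include <string.h>

/* Map a file name to the pasteboard type string suggested by its
   extension, or NULL if the extension is not one of the three.
   Hypothetical helper; the comparison is case-sensitive by assumption. */
const char *pboard_type_for_filename(const char *filename)
{
    const char *dot = strrchr(filename, '.');   /* last '.' in the name */
    if (dot == NULL) {
        return NULL;
    }
    if (strcmp(dot, ".html3") == 0) {
        return "WebStep HyperText Markup Language 3.0";
    }
    if (strcmp(dot, ".html") == 0) {
        return "WebStep HyperText Markup Language 2.0";
    }
    if (strcmp(dot, ".dtd") == 0) {
        return "WebStep Document Type Definition 1.0";
    }
    return NULL;
}
```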

DISCUSSION ITEMS:
[Discussion point: is it appropriate for WebStep to define HyperTeX
and TeX exchange types for the nascent HyperDVI development under NS/OS:
extern char * HyperTeXPboardType "WebStep HyperTeX Source 1.0"

This would also imply the best-effort filtering]
Should WebStep define filter services for HTML-2-RTF or RTF-2-HTML?
_____________________________________________________________________________
CONFORMANCE TESTING

User-level testing includes:
* cut/paste of HTML selections
* working with .html files
_____________________________________________________________________________
EXAMPLE IMPLEMENTATIONS

eText will be conformant in its .92 release
OmniWeb uses different Pasteboard naming strings
SpiderWoman is unknown
Pages is unknown