One of the requirements for Cyc (or any comparable project) is a
language that can be used by both humans and software systems to share a
conversation about the real world -- the "ontology" in Cyc terms. As you
suggest, a human infant has an analogous need, but the infant has an
important advantage: the environment she's entering already contains a
lingua franca which, by virtue of a co-evolutionary history, she can
adopt with incredible rapidity and efficiency.
Lenat and company at Cycorp, wanting to implement common sense (their
terminology), have had no choice but to construct their ontology in
authoritarian mode. This involves vast, unsustainable amounts of effort,
time, and money, and an inevitable "impedance mismatch" with the
inherently chaotic real-world phenomena they want to come to grips with.
Similar remarks apply to any project with similar ambitions.
This will change in the very near future, because the evolution of a
convergent lingua franca is underway and accelerating -- by
"convergent", I mean that this lingua franca is reasonably learnable by
both humans and infant software systems such as those Doron proposes.
The evolution of a shared convergent language is one aspect of the
convergence of human discourse (publishing) and computer-based
information systems into the Web, and is driven primarily by the
economic opportunity in e-commerce and other Web-based global systems
integration paradigms.
An important secondary driver/enabler/accelerator is the availability
and near-universal acceptance of XML as the appropriate substrate for
language building. (Most on this list are probably already tracking XML
dialect formation at pages such as
http://www.oasis-open.org/cover/xml.html#applications.)
By justifying the creation and explosive growth of the Web (as opposed
to the Internet), the predominantly chaotic process of human
communication has created an environment in which one technology after
another is being pulled out of an economic "potential well", from
small-scale, authoritarian-mode deployment to chaotic, explosive-growth
deployment. In many cases there's an associated release of pent-up
technical capability.
Automated reasoning approaches of all kinds are going to get their first
real chance and first real tests over the next ten years. I believe this
will initiate a genuine co-evolution -- software intelligence : shared
markup framework :: infant : natural language -- whose ultimate
consequences are profoundly unpredictable and profoundly exciting.
Mike