NTLK The complete(!) scoop on Newton handwriting recognition

From: Mike O'Brien (obrien@leonardo.net)
Date: Sun Sep 03 2000 - 04:36:59 CDT

        There's been a bit of action on the list lately talking about
Calligrapher, Rosetta, and Newton HWR in general. Here's the complete story.

        Some years ago, not long after the breakup of the Soviet Union, some
scientists at the Soviet Academy of Sciences noticed that their paychecks
were not outstandingly large or regular any more. They decided that they
wanted to break out of the academic mold and go commercial. They wanted
to found a Russian equivalent of Bell Labs. The result was Paragraph
International. They had under their belts some hot mathematics (the Russians
are, in general, much better at math than Americans) allowing for extreme
compression of higher-order curves.

        One of their early contracts was with Apple Computer. If I remember
correctly from my conversations with them at a conference, they contributed
two things: the original Newton handwriting recognition, and the compression
algorithm for storing line drawings on the Newton. They could take a full-
color sketch by Picasso, compress it to 17K, and re-expand it into an image
you couldn't tell from the original.

        It was at this conference that I got my hands on a MessagePad 130 for the first
time, and found that cursive recognition now worked. When I asked them what
the problem had been with the earlier models, they told me that the problem
was (from their perspective at least) that Apple had kept them at such an
arm's length that they didn't even know what they were designing handwriting
recognition for. Once they got their hands on an actual Newton, they tuned
the parameters in their algorithm, and that's what shipped with OS 2.0. It was
further tuned for OS 2.1, allowing breaks between some characters in a word
for the first time. I find, for myself, OS 2.1 cursive HWR works rather
well, at least for text not full of capitalized acronyms or proper names.

        Apple, of course, hedged their bets and came up with Rosetta for
OS 2.0. This wasn't exactly as home-grown inside Apple as some folks might
think. I believe it was done with a three-man team, one of whom was an Apple
employee and the other two of whom were contractors. I could be wrong about
the mix, but that's what I remember. Later, the implementors described the
algorithm in the following paper:

Combining Neural Networks and Context-Driven Search for On-Line,
     Printed Handwriting Recognition in the Newton, Yaeger, L., Webb,
     B., Lyon, R., AI Magazine, Volume 19, No. 1, p. 73-89, AAAI (1998)

        This makes fascinating reading. In it, they debunk some common
beliefs about Rosetta, for example, the belief that it does not use
dictionaries (it does). Rosetta was designed to be trainable, as the
cursive recognizer is, but they ran out of implementation time and trained
it up once with fixed tables.

        My educated guess, and it is no more than that, is that Rosetta forms
the basis for Apple's currently reported plans for introducing HWR into the
PowerBook line, perhaps extending it to include the trainability that it was
originally intended to have. Meanwhile, Paragraph International, which has
been acquired and re-acquired a couple of times in the intervening years,
continues to improve and port Calligrapher to other hand-held devices,
including not only WinCE, but Psion as well.

        BTW, off the topic, aside from being outrageously overpriced, the
Psion netBook pretty much rocks as a latter-day eMate replacement, except
for the woefully WinCE-like UI. It's even got an Opera Web browser and a
full Java Virtual Machine. And for the sake of old-time memories, Calligrapher
runs on it.

Mike O'Brien
