Re: [NTLK] Newton Spanish text recognition.

From: Eckhart Köppen (eck1001_at_gmx.net)
Date: Wed Sep 22 2004 - 08:45:31 PDT


On Tue, 21 Sep 2004 17:55:48 -0500, Larry Yaeger wrote:
> I was assuming the word list would be compressed
> into the standard dictionary format, which would reduce its
> footprint, but heap space might still be an issue. And I'm now
> wondering how/if the word list actually gets converted to the
> standard dictionary format; I don't remember that tool being built
> into the system (though that doesn't mean it wasn't).

Funny that this comes up now ;) For some time I have been planning to
do something about the dictionary on my German Newton. I mostly write
in English, so having an English dictionary would be quite neat. I also
own two US 2100s, so why not take their dictionaries? I managed to
extract the raw dictionary in the correct format and used the DictFr
code from Paul to make it usable on my Newton. The biggest problem is
that there are multiple dictionaries, and ideally all of them should
be extracted. But for now, I'm satisfied.

An alternative approach (and I guess this is what Paul is doing?) could
be to feed a word list programmatically into a clean user dictionary
and dump the data once finished. Adding the words should take care of
creating the correct compressed dictionary format. The dumped data
should be usable in the same way, i.e. stick it in a resource file and
build a Newton package. But there is probably an upper limit on the
number of words you can process like this.
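
To make the idea a bit more concrete, here is a rough sketch in Python
rather than NewtonScript. It only illustrates the workflow (add words
one by one, let the container handle the compact representation, dump
the result, and notice when a limit is hit). The trie is a stand-in and
is NOT the Newton dictionary format; TrieDictionary, build_dictionary
and max_words are hypothetical names made up for this example, and the
real limit, if there is one, is unknown.

```python
import json

END = ""  # end-of-word marker; never collides with a one-character key


class TrieDictionary:
    """Hypothetical stand-in for a user dictionary (not the Newton format)."""

    def __init__(self, max_words=None):
        self.root = {}
        self.count = 0
        self.max_words = max_words  # models a possible upper limit (assumed)

    def __contains__(self, word):
        node = self.root
        for ch in word:
            node = node.get(ch)
            if node is None:
                return False
        return END in node

    def add_word(self, word):
        """Add a word; return False if the dictionary is full."""
        if word in self:
            return True  # already stored, nothing to do
        if self.max_words is not None and self.count >= self.max_words:
            return False
        node = self.root
        for ch in word:
            # Words with a common prefix share the same nodes.
            node = node.setdefault(ch, {})
        node[END] = True
        self.count += 1
        return True

    def dump(self):
        """Serialize the whole structure to bytes once all words are in."""
        text = json.dumps(self.root, sort_keys=True, separators=(",", ":"))
        return text.encode("utf-8")


def build_dictionary(words, max_words=None):
    """Feed a word list into a clean dictionary; collect rejected words."""
    dictionary = TrieDictionary(max_words)
    rejected = []
    for word in words:
        word = word.strip()
        if not word:
            continue  # skip blank lines in the word list
        if not dictionary.add_word(word):
            rejected.append(word)
    return dictionary, rejected
```

Calling build_dictionary(open("words.txt")) with a one-word-per-line
file (the file name is just an example) and writing dictionary.dump()
to disk would correspond to the "dump the data once finished" step;
anything in the rejected list did not fit under the assumed limit.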

Eckhart


