From: Roman Pixell (roman_at_pixell.net)
Date: Tue Sep 21 2004 - 16:22:47 PDT
On Sep 22, 2004, at 12:55 AM, Larry Yaeger wrote:
> The English and alternate
> language hypotheses will compete for space in the recognizer's search
> engine, so that could conceivably affect accuracy, but the maximum
> amount of memory used for that purpose is fixed, so it probably will
> not affect heap usage.
yes, i recognise this - sometimes english interpretations are suggested
when i write in swedish.
>> what would the optimal size be?
>
> Sorry, I really don't know. More words -> greater coverage ->
> greater overall accuracy. But if you're running out of heap space,
> you clearly want to limit what you add to the system.
i will try to start with a word list of 30-40k (including some 25k
which i intend to steal somewhere) and see how it goes when i have the
time. the best would be, like you say, to analyse text from online
newspapers etc, in order to cover things that are not in a regular
dictionary.
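
a rough sketch of how such a list could be put together, in python. everything here is an assumption for illustration: the function name build_word_list, the 40k cap, and the min_count cutoff are all made up, and getting the result onto the newton is not covered at all. it just counts words in some corpus text and tops up a base list with the most frequent new ones:

```python
import re
from collections import Counter

# Letters only (Unicode-aware, so Swedish å/ä/ö are kept); digits and
# underscores are excluded.
WORD_RE = re.compile(r"[^\W\d_]+")


def build_word_list(base_words, corpus_texts, max_size=40000, min_count=2):
    """Return the base words plus the most frequent new corpus words.

    Hypothetical helper: max_size and min_count are illustrative defaults,
    not values recommended anywhere in this thread.
    """
    base = {w.lower() for w in base_words}

    # Count every word seen in the corpus, case-folded to lowercase.
    counts = Counter()
    for text in corpus_texts:
        counts.update(w.lower() for w in WORD_RE.findall(text))

    # Keep words that are new and appear often enough, most frequent first.
    extra = [w for w, n in counts.most_common()
             if n >= min_count and w not in base]

    # Never exceed the cap; if the base list alone fills it, add nothing.
    room = max(0, max_size - len(base))
    return sorted(base) + extra[:room]
```

the cap is there because of the heap concern above: the base dictionary is kept whole and only the remaining room is filled from the corpus, so rare words get dropped first.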
thanks for all the answers!
/ ®
--
This is the NewtonTalk list - http://www.newtontalk.net/ for all inquiries
Official Newton FAQ: http://www.chuma.org/newton/faq/
WikiWikiNewt for all kinds of articles: http://tools.unna.org/wikiwikinewt/