Re: [NTLK] [ANN] Courier 0.1 (Web browser)

From: Eckhart Köppen (eck1001_at_gmx.net)
Date: Fri Aug 08 2003 - 08:28:56 PDT


On Fri, 08 Aug 2003 10:47:24 -0400, Victor Rehorst wrote:
> Would this be the place to implement converting HTML entities to Unicode
> characters? I'm trying to decipher how things work in there in order to
> add this in.

It is handled a bit lower, inside tox.c. The state machine currently
treats entities separately from whitespace and words, i.e. it signals
the end of a word or of a whitespace run when an entity reference starts
(look at the definition of the state STATE_WORD_CONTENT for an example).

The proper way to handle entities would be to give them their own
callback and save the entity reference in the parser buffer. It can
then be handled at the application level, e.g. in ToxAdapter. I'm just
about to add this callback... I'll let you know when it is done.
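
As a rough illustration of that idea (not the actual tox.c code), here is a
minimal sketch of a state machine that ends a word when an entity reference
starts, collects the reference name in a buffer, and passes it to its own
callback. Every name in it (Callbacks, parse, entity_to_codepoint, EventLog,
the ST_* states) is hypothetical and made up for this example, and the entity
table is only a tiny sample.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is not the real tox.c interface. */
#define BUF_MAX 64

typedef struct {
    void (*on_word)(void *ctx, const char *text);
    void (*on_entity)(void *ctx, const char *name); /* separate entity callback */
} Callbacks;

typedef enum { ST_IDLE, ST_WORD, ST_ENTITY } State;

/* Tokenizer: a word ends at whitespace, at end of input, or at '&'. */
static void parse(const char *input, const Callbacks *cb, void *ctx)
{
    char buf[BUF_MAX];            /* parser buffer for the current token */
    size_t len = 0;
    State state = ST_IDLE;

    assert(input != NULL && cb != NULL);
    for (const char *p = input; ; p++) {
        char c = *p;
        switch (state) {
        case ST_ENTITY:
            if (c == ';') {       /* reference complete: hand the name over */
                buf[len] = '\0';
                cb->on_entity(ctx, buf);
                len = 0;
                state = ST_IDLE;
            } else if (c != '\0' && len < BUF_MAX - 1) {
                buf[len++] = c;   /* an unterminated reference is dropped */
            }
            break;
        case ST_WORD:
            if (c == '\0' || c == '&' || isspace((unsigned char)c)) {
                buf[len] = '\0';
                cb->on_word(ctx, buf);
                len = 0;
                state = (c == '&') ? ST_ENTITY : ST_IDLE;
            } else if (len < BUF_MAX - 1) {
                buf[len++] = c;
            }
            break;
        case ST_IDLE:
            if (c == '&') {
                state = ST_ENTITY;
            } else if (c != '\0' && !isspace((unsigned char)c)) {
                buf[0] = c;
                len = 1;
                state = ST_WORD;
            }
            break;
        }
        if (c == '\0')
            break;
    }
}

/* Application level: map a reference name to a Unicode code point.
 * Returns 0 for names missing from this (deliberately tiny) sample table. */
static unsigned long entity_to_codepoint(const char *name)
{
    static const struct { const char *name; unsigned long cp; } table[] = {
        { "amp", 0x26 }, { "lt", 0x3C }, { "gt", 0x3E },
        { "quot", 0x22 }, { "nbsp", 0xA0 }, { "eacute", 0xE9 },
    };
    if (name[0] == '#') {         /* numeric reference: #233 or #xE9 */
        int hex = (name[1] == 'x' || name[1] == 'X');
        return strtoul(name + (hex ? 2 : 1), NULL, hex ? 16 : 10);
    }
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(table[i].name, name) == 0)
            return table[i].cp;
    return 0;
}

/* Demo callbacks that record what the parser reported. */
typedef struct { char text[256]; } EventLog;

static void log_word(void *ctx, const char *text)
{
    EventLog *log = ctx;
    size_t used = strlen(log->text);
    snprintf(log->text + used, sizeof log->text - used, "<%s>", text);
}

static void log_entity(void *ctx, const char *name)
{
    EventLog *log = ctx;
    size_t used = strlen(log->text);
    snprintf(log->text + used, sizeof log->text - used,
             "[U+%04lX]", entity_to_codepoint(name));
}
```

To try it, fill a Callbacks struct with log_word and log_entity, pass it
to parse() together with a zeroed EventLog, and inspect the log text
afterwards. Keeping the name-to-code-point lookup out of the parser means
the application can decide how to treat unknown entities.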

Eckhart

-- 
This is the NewtonTalk list - http://www.newtontalk.net/ for all inquiries
List FAQ/Etiquette/Terms: http://www.newtontalk.net/faq.html
Official Newton FAQ: http://www.chuma.org/newton/faq/

