[NTLK] ANN: unwrap

From: Damien Batstone (djb_at_er.dtu.dk)
Date: Sun Sep 07 2003 - 08:14:16 PDT


Dear All,

Available here on my brand new newted a/c:
http://newted.dyndns.org/users/damienb/

This is for those of us who use our Newt as a portable bookshelf.

I've written a very tiny C app for unwrapping wrapped text files. It works
well on text from PDF files, Project Gutenberg, etc. Before deciding whether
to keep a line break, it checks whether a grammatical end of sentence comes
just before it.
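
A minimal sketch of the kind of heuristic described above, not the code
shipped in the download. The names unwrap_text and ends_sentence are
hypothetical, and the rules are assumptions: '.', '!' and '?' count as
sentence ends, blank lines are kept as paragraph breaks, and line endings
are plain LF.

```c
#include <stddef.h>

/* Hypothetical helper: treat '.', '!' and '?' as sentence ends. */
static int ends_sentence(char c)
{
    return c == '.' || c == '!' || c == '?';
}

/*
 * Hypothetical sketch: copy `in` to `out`, keeping a line break only when
 * the character before it ends a sentence, when the break is part of a
 * blank line, or when it is the last character of the input. Every other
 * line break becomes a single space.
 * Returns the output length, or (size_t)-1 if `out` is too small.
 */
size_t unwrap_text(const char *in, char *out, size_t outsize)
{
    size_t n = 0;

    if (outsize == 0)
        return (size_t)-1;

    for (size_t i = 0; in[i] != '\0'; i++) {
        char c = in[i];

        if (c == '\n') {
            char prev = (i > 0) ? in[i - 1] : '\n';
            char next = in[i + 1];
            int keep = ends_sentence(prev)  /* sentence finished here */
                    || prev == '\n'         /* second break of a blank line */
                    || next == '\n'         /* first break of a blank line */
                    || next == '\0';        /* final line break */
            if (!keep)
                c = ' ';
        }
        if (n + 1 >= outsize)               /* leave room for the terminator */
            return (size_t)-1;
        out[n++] = c;
    }
    out[n] = '\0';
    return n;
}
```

A sketch this simple will still mis-handle abbreviations such as "Mr." at
the end of a line, and sentences that end with a closing quote.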

It's very useful, as I've never got Paperback's unwrap feature working
properly, and Paperback seems to hate text copied from PDFs unless it is
unwrapped first (often crashing the Newt). Source and a Win32 console
executable are included. Usage is "unwrap <inputfile>", e.g., "unwrap
test.txt". The input defaults to "input.txt", so if you drop the executable
in a directory containing "input.txt" and double-click it, it will unwrap
that file. It writes to "output.txt" and will overwrite an existing file
without warning. You have to extract the text from a PDF first (using Adobe,
Ghostview, or one of the many free tools) and then unwrap the resulting text
file.
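
The file-name behaviour described above could be sketched roughly as
follows. This is an illustration only: choose_input and open_files are
hypothetical names, not functions from the included source. Opening the
output with fopen mode "w" is what makes an existing "output.txt" get
replaced without any prompt.

```c
#include <stdio.h>

#define DEFAULT_INPUT "input.txt"   /* used when no argument is given */
#define OUTPUT_NAME   "output.txt"  /* fixed output name, as in the post */

/* Hypothetical helper: first command-line argument, else the default. */
const char *choose_input(int argc, char *argv[])
{
    return (argc > 1) ? argv[1] : DEFAULT_INPUT;
}

/*
 * Hypothetical helper: open the input for reading and the output for
 * writing. Mode "w" truncates an existing output file silently.
 * Returns 0 on success, -1 if either file cannot be opened.
 */
int open_files(const char *in_path, FILE **in, FILE **out)
{
    *in = fopen(in_path, "r");
    if (*in == NULL)
        return -1;

    *out = fopen(OUTPUT_NAME, "w");
    if (*out == NULL) {
        fclose(*in);
        *in = NULL;
        return -1;
    }
    return 0;
}
```

With these helpers, running the program with no arguments would read
"input.txt", and "unwrap test.txt" would read "test.txt".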

By the way, if anyone has the Paperback source, or specs for Paperback, I
would love to do some work on this app.

The source is included, so please feel free to change it in any way or
recompile it for other platforms. I haven't been able to find a similar app
on the web. I've only started C/C++ recently, so corrections would be
appreciated.

Cheers,

Damien

-- 
This is the NewtonTalk list - http://www.newtontalk.net/ for all inquiries
List FAQ/Etiquette/Terms: http://www.newtontalk.net/faq.html
Official Newton FAQ: http://www.chuma.org/newton/faq/


This archive was generated by hypermail 2.1.5 : Sun Sep 07 2003 - 08:30:00 PDT