[NTLK] Back to the topic of PDFNewt

From: John Charlton (johncharlton_at_mac.com)
Date: Tue Apr 16 2002 - 02:31:27 EDT


I'll add my two cents, since I've done a little bit with the Pdf format
(and gotten paid for it, most of the time!), generating it on the fly with
perl using my own modules.

The Pdf file format is open (more or less); anyone can read or write one.
(Thankfully MSoft hasn't seen it as a competitor and buggered it up with
proprietary 'extensions'.) It doesn't take a lot of horsepower for basic
stuff, but it can be abused by the unknowing or unwitting.

The design is based on the page metaphor, and the page dimensions are
hard-coded. When the fonts are embedded, or present on the target system,
it is, and was designed to be, WYSIWYG. For most people that's the point of
using Pdf. (If you want maximum portability use ASCII!) When a font is
missing, substitution falls back on default Serif, Sans-Serif, or
Mono-Spaced fonts, and the WYSIWYG breaks down a little.
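
To make the "anyone can write one" and "hard-coded page" points concrete,
here is a minimal sketch in Python (not my perl modules) that writes a
one-page Pdf by hand. The function name build_minimal_pdf and its defaults
are made up for illustration. The page size goes straight into the file as
the /MediaBox entry, in points (595x842 is roughly A4), and the font is a
non-embedded Helvetica, so it relies on the viewer having it or
substituting for it.

```python
def build_minimal_pdf(text, width=595, height=842):
    """Return the bytes of a one-page PDF showing `text` in 12pt Helvetica.

    Illustrative sketch only: Latin-1 text, one line, no compression.
    """
    # Backslashes and parentheses must be escaped inside a PDF string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    stream = f"BT /F1 12 Tf 72 {height - 72} Td ({escaped}) Tj ET".encode("latin-1")

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        # The page dimensions are fixed here, in points (1/72 inch).
        (f"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 {width} {height}] "
         "/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>").encode("ascii"),
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
        # Not embedded: the viewer supplies Helvetica or a substitute.
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of "N 0 obj", needed by xref
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry for the reserved object 0
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)
```

Writing the returned bytes to a file opened in binary mode should give
something a viewer will open. The fiddly part is that the cross-reference
table stores exact byte offsets, which is why the sketch builds everything
as bytes rather than text.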

Trying to take the content out and reformat it, however, is fraught
with danger, though the Pdf->Html converters do try. One reason is that
text can be written to the page from anywhere inside the Pdf file (it's not
linear), so the order of text in the file need not match the reading order.
It's not as bad as a Word file though. :-)
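
A rough sketch of what a converter has to do about the non-linear part,
in Python: assume you have already pulled out text fragments with their
page positions (the (x, y, text) tuples and the name reading_order are my
own invention here), then sort them back into reading order. Pdf
coordinates start at the bottom-left, so a larger y is higher on the page.

```python
def reading_order(fragments, line_tolerance=2.0):
    """Group (x, y, text) fragments into lines, top to bottom, left to right.

    `line_tolerance` is an assumed fudge factor in points: fragments whose
    baselines differ by no more than this are treated as the same line.
    """
    # Highest y first (top of page), then leftmost first.
    ordered = sorted(fragments, key=lambda f: (-f[1], f[0]))

    lines = []  # each entry: (baseline_y, [(x, text), ...])
    for x, y, text in ordered:
        if lines and abs(lines[-1][0] - y) <= line_tolerance:
            lines[-1][1].append((x, text))
        else:
            lines.append((y, [(x, text)]))

    # Within a line, order by x so fragments read left to right.
    return [" ".join(text for _, text in sorted(parts)) for _, parts in lines]


# Fragments deliberately listed out of order, as they might sit in the file.
fragments = [
    (200, 700.0, "world"),
    (72, 650.0, "second"),
    (72, 700.5, "Hello"),
    (150, 650.0, "line"),
]
print(reading_order(fragments))
```

This only handles a single column. Multi-column layouts, rotated text, and
captions sitting next to graphics would all defeat it, which is the
"fraught with danger" part.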

Having said that, it wouldn't be too difficult to extract text from a Pdf
document, ordered by the page it appears on. Graphics and images can also
be extracted and should scale reasonably well to the Newt screen (esp. the
2k series). Keeping the two together seems to me to be the problem. Lord of
your choice help you when the text relates to the graphic (as mine does):
you'll end up with a horrible mess, kind of like those Japanese
translations.
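
For the easy case, pulling text out can be as simple as the Python sketch
below, which finds the strings handed to the Tj (show text) operator in one
page's content stream. Big assumptions, all mine: the stream is already
uncompressed, the text uses plain literal strings with a simple encoding,
and the name show_text_strings is hypothetical. Real files usually
compress their streams and often use TJ arrays or custom font encodings,
none of which this handles.

```python
import re

# A literal string "( ... )" followed by the Tj operator. Inside the string,
# a backslash escapes the next byte; unescaped nested parentheses are not
# handled by this simplified pattern.
_TJ = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj")
_ESCAPED = re.compile(rb"\\([()\\])")


def show_text_strings(content_stream):
    """Return the strings shown with Tj, in the order they occur in the stream."""
    return [
        _ESCAPED.sub(rb"\1", match.group(1)).decode("latin-1")
        for match in _TJ.finditer(content_stream)
    ]


page = (b"BT /F1 12 Tf 72 700 Td (Hello \\(Newton\\)) Tj ET "
        b"BT /F1 12 Tf 72 680 Td (page 1) Tj ET")
print(show_text_strings(page))
```

Note that the result is in file order, not necessarily reading order, and
that nothing here says which graphic a given string belongs with.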

However, if the intent was to reproduce the page, as written, these
problems don't occur, font substitution notwithstanding. The problem
becomes: how usable is a document written for A4 (or 8.5x11) on a
480x320 screen? Horizontal scrolling is a pain, but doable. With
medium-sized fonts and horizontal orientation you may even be able to read
a whole line!
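
Some back-of-the-envelope arithmetic on that, as a Python sketch. The
assumptions are mine: an A4 page of about 595x842 points, one screen pixel
per point at scale 1.0, and the helper name fit_scale.

```python
A4_POINTS = (595, 842)      # approximate A4 size in PDF points
NEWTON_SCREEN = (480, 320)  # landscape orientation, in pixels


def fit_scale(page_w, page_h, screen_w, screen_h, whole_page=False):
    """Scale factor that fits the page width (or the whole page) on screen."""
    width_scale = screen_w / page_w
    if whole_page:
        # Both dimensions must fit, so the tighter one wins.
        return min(width_scale, screen_h / page_h)
    return width_scale


fit_width = fit_scale(*A4_POINTS, *NEWTON_SCREEN)
fit_page = fit_scale(*A4_POINTS, *NEWTON_SCREEN, whole_page=True)

# Apparent height in pixels of 12pt body text at each scale.
print(f"fit width: scale {fit_width:.2f}, 12pt text -> {12 * fit_width:.1f} px")
print(f"fit page:  scale {fit_page:.2f}, 12pt text -> {12 * fit_page:.1f} px")
```

Under those assumptions, fitting the width leaves 12pt text a little under
10 pixels tall with vertical scrolling only, while fitting the whole page
shrinks it to under 5 pixels, which is not readable.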

One 'solution' would be to create the Pdfs for A5 or A6 paper (you know,
I love how the Europeans set paper sizes: an A5 is an A4 folded in half,
and so on; makes so much sense...), but if you could do that you'd just
make a NewtonBook of it, right?
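
The folding rule is easy to sketch in Python: start from A0 at 841x1189 mm,
halve the long side and round down to whole millimetres for each step. The
function names are mine, and the comparison against a 320x480 screen
assumes one pixel per point.

```python
def a_series(max_n=6):
    """Return {n: (width_mm, height_mm)} for A0 through A<max_n>."""
    width, height = 841, 1189  # A0 in millimetres
    sizes = {0: (width, height)}
    for n in range(1, max_n + 1):
        # Fold in half: the old width becomes the new height.
        width, height = height // 2, width
        sizes[n] = (width, height)
    return sizes


def mm_to_points(mm):
    """Convert millimetres to PDF points (72 points per inch, 25.4 mm per inch)."""
    return mm * 72 / 25.4


sizes = a_series()
for n in (4, 5, 6):
    w_mm, h_mm = sizes[n]
    print(f"A{n}: {w_mm} x {h_mm} mm = "
          f"{mm_to_points(w_mm):.0f} x {mm_to_points(h_mm):.0f} pt")
```

By this arithmetic an A6 page comes out at roughly 298x420 points, which
would fit a 320x480 portrait screen without scaling.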

Converting the Pdf to HTML on the web and reading that in Newtscape (what
an awesome programming feat!) seems like the best hack for this, for those
times when it's necessary. Request an ASCII file where possible.

Sorry, I wasn't any help, it's late, hot, and I can't sleep.

jc

P.S. Anyone thought of a PostScript interpreter? Could Ghostscript be
ported? I envy those with the knowledge; I'm just another perl hacker.

-- 
Read the List FAQ/Etiquette: http://www.newtontalk.net/faq.html
Read the Newton FAQ: http://www.guns-media.com/mirrors/newton/faq/
This is the NewtonTalk mailing list - http://www.newtontalk.net


