Re: [NTLK] my Newt just experienced a hiccup of the most painful variety

From: Paul Guyot (pguyot_at_kallisys.net)
Date: Wed Dec 03 2003 - 10:52:32 PST


Nathan,

On or about 3/12/03 at 11:56 -0600, under the subject "Re: [NTLK] my
Newt just experienced a hiccup of the mos", Nathan Turnage took up
his finest pen to write the following words:
>I don't want to discourage anyone from producing software for the Newt.
>Paul and Eckhart rock. But I just lost *all* the important data on my
>Newton, all of it. And the two problems I had that led to that colossal
>tragedy (for me anyway) were seemingly mundane tasks. 1) I was trying to
>clear the cache in Courier when the -10606 error prevented me from even
>getting the preferences open, and there was no way of getting the cache
>cleared or even deleting the prefs files (used trashpak, prefs cleaner,
>soup mover). 2) on the CF card, all I did was delete a file, and the
>partition was rendered useless. Everything had been backed up to that
>partition before it went, *everything*, so that I could do a brain wipe.
>My newton wouldn't back up (again the -10606 error) and my only option
>was to backup to a card.

If I understand correctly, you had two problems.

First, you had a -10606 error when using Courier with a linear card.
Eckhart cannot be blamed for the -10606 error, as it occurs in the
kernel, in the storage engines, and I'm ready to bet that Courier
doesn't touch anything there or do anything that could trigger such
an error.

Please don't blame Eckhart for this. This error (unless you were
using an ATA card at that point) came from either a hardware problem
on your linear card or an unknown bug in the linear card storage
engine.

The second problem occurred when you moved everything to the ATA
card. It worked at first, but soon the problems came back and you had
this -10606 error again, this time related (as you seem to say) to
the ATA card.

What I don't understand is whether this error occurred because you
copied corrupted data from the linear card to the ATA card or whether
it occurred on its own. How did you copy the data from one card to
the other?

If it occurred on its own, and you were indeed using the latest
release (RC6.1), then you are entitled to say that ATA Support is not
safe. It would then be an extremely nasty bug that you discovered,
and I'm sorry it happened to you and your important data. Until now,
no one had lost a single bit of data with RC6.1, and such problems
didn't appear in my test suites. I won't bore you with theories of
computer science, but basically you cannot test everything, and I'm
just doing my best to test as many cases as possible to prevent this
kind of problem from happening. When such problems do happen, and
they have in the past, I do my best to fix them as fast as possible.

However, you might simply have moved the corruption from your linear
card to the ATA card, and the system may then have screwed up the ATA
card on its own. By "on its own" I mean that if an ATA card is
reported as not formatted, it could be because there was an error at
the very first stage of mounting it, an error that could be caused by
some corruption at the root of the store.

Until RC6.1 (or maybe it was RC6), the archive included a Warning
file saying that your data is not safe on ATA cards. I removed it
because I hadn't received a single e-mail from anyone mentioning that
they had lost data with recent releases; you're the very first.
Until now, I considered ATA Support at least as safe for data as
linear cards. It might indeed be, if the corruption came from moving
the data between the cards. Did any error occur when you moved the
data?

I want to stress four things:
(1) Newton storage is extremely robust because of journaling and
transactions.
(2) Newton storage is extremely fragile because if data corruption
occurs, everything can be screwed up. Nevertheless, because of (1),
data corruption can only come from bugs in the store engines.
(3) No software can be entirely bug-free, and I know at least one way
to lose all your data on a linear card because of a bug in the linear
storage cards (I don't know of any way on ATA cards, but there very
likely are problems there).
(4) Eckhart is not responsible for what happened to you: not because
Courier is bug-free, but because there is no bug in Courier that can
trigger the -10606 error on its own.

There is no easy way to get your data back from the ATA card.
Nevertheless, if you didn't reformat it, you can always send me a
disk image of your ATA card (you seem to have a Mac; Disk Copy can do
that if your Mac has a PCMCIA card slot or drive) and I'll try to
low-level delete the corrupted soup so that you can access your data
again, if this is possible at all. I make no warranty and I promise
no result before, say, the end of February. This is the best schedule
I can provide, considering that I also have to work on my PhD thesis
and that no tool exists to do that. Just put the image on the web and
send me the URL privately.

I'm just sorry I didn't react earlier (I have been too tired to read
NewtonTalk recently), because once you get a -10606 you have to react
immediately by backing up everything you can before the system (on
its own) corrupts more things. Regular tools usually aren't
sufficient. I recall saving the contents of a linear card that had
such a problem, at a Paris user meeting with Ronnie Simon, a few
minutes before any Newton would say about it: "This card doesn't
appear to be formatted, do you want to erase it". When ATA Support
says "Not Formatted" it means exactly the same.

Cheers,

Paul

-- 
Bathtub philosophy - consultations by appointment.
NPDS/NewtonOS: http://newton.kallisys.net:8080/
Apache/FreeBSD: http://www.kallisys.com/
-- 
This is the NewtonTalk list - http://www.newtontalk.net/ for all inquiries
List FAQ/Etiquette/Terms: http://www.newtontalk.net/faq.html
Official Newton FAQ: http://www.chuma.org/newton/faq/

