HP3000-L Archives

October 2000, Week 2

HP3000-L@RAVEN.UTC.EDU

Subject:
From:
"Emerson, Tom # El Monte" <[log in to unmask]>
Reply To:
Emerson, Tom # El Monte
Date:
Mon, 9 Oct 2000 22:10:37 -0400
That header just about says it all: we have a system failure each day when a
"cleanup" job hits a certain spoolfile -- fscheck dutifully indicates
corruption in the file label, and the ";FIX" option seems to have no
effect...

[and yes, purging the file via fscheck ALSO crashes the system]

This appears to be fallout from the SCSI controller problem from last week
[seems to be fixed now -- thanks for the tip on the controller being the
"likely culprit"!]  The "problem" was manifesting itself as a single-bit
error that did NOT trigger a read error; seems the controller was
mis-representing the "bit" in question because a later re-read of the same
file would NOT show the problem.
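
[A minimal sketch, not part of the original investigation: one way to pin down a transient single-bit error like the one described above is to read the same data twice and XOR the two copies, which leaves a 1 only where the reads disagree. The function name `differing_bits` and the sample byte values are hypothetical and chosen purely for illustration; this is plain Python, not an MPE/iX or fscheck facility.]

```python
def differing_bits(first_read: bytes, second_read: bytes):
    """Return (byte_offset, bit_index) pairs where two reads of the same data disagree."""
    if len(first_read) != len(second_read):
        raise ValueError("reads must be the same length")
    diffs = []
    for offset, (a, b) in enumerate(zip(first_read, second_read)):
        x = a ^ b  # XOR keeps only the bits that differ between the two reads
        bit = 0
        while x:
            if x & 1:
                diffs.append((offset, bit))
            x >>= 1
            bit += 1
    return diffs


# Hypothetical sample data: the second "read" has bit 2 of byte 1 flipped.
good = bytes([0x00, 0x41, 0xFF])
bad = bytes([0x00, 0x45, 0xFF])
print(differing_bits(good, bad))
```

[An empty result means the two reads matched, which is what a later re-read would report once the controller stopped misrepresenting the bit.]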

I suspect that during some "routine" operation of the system, data was read
from one location and copied to another [or used as the basis of writing
another sector].  With the "flipped bit" going by undetected, "bad data" got
into a routine via the side door and wasn't flushed out during the write to
the LABEL.  Later reads of that label now show "corruption", and attempts to
purge the file crash the system.

Now, the $64 question: how do I get rid of that file short of a reload?
[there's another file that fscheck reports an error on as well, but it's not
a spoolfile -- it's a data file in a test group/account, so would purging
the group "clean up" this file or is it likely to trigger the failure as
well?]

Tom Emerson
Sr. Systems Analyst
NDC | e COMMERCE
[log in to unmask]
626-258-4309
626-350-3832 FAX
