HP3000-L Archives

October 1999, Week 5

HP3000-L@RAVEN.UTC.EDU

Subject:
From: Scott Root <[log in to unmask]>
Reply To: Scott Root <[log in to unmask]>
Date: Sat, 30 Oct 1999 03:31:11 -0400

Need comments, please.
Scenario:
Thursday AM we discover a hardware problem: 3 printers and 2 mag drives, all
strung together off the same HPIB card, are down, and the system cannot find
the physical devices. So we have all users log off, take the system down,
power everything off and then reboot. No luck, so we take it down again and
swap out the card. Reboot: all devices are there, all drives are mounted and
available, and everything seems fine, so users log back on. Early AM Friday we
discover a problem with batch processes in two of the five databases on the
system. Somehow we have child records in the SVC_DETAIL dataset that have no
parent records in the MASTER dataset. Disc space throughout this time frame
was consistently above 22 million free sectors, with no one drive having less
than 1 million free. We checked the bootlog as well as other log files, to no
avail.
These datasets have not been purged and restored for nearly 6 months; if they
had been restored corrupted at that time, we would have had this problem then.
We were also able to rule out a security breach: only a limited few have the
access needed to cause such harm, and those same few people are the ones who
must fix the problem.
Adager reporting did not find any problems. The RC and our software vendor
suggested that the hardware problem (the HPIB card) caused some data
corruption, but neither could say for sure or give an example of how.
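
For anyone wanting to check for the same condition (child entries with no
parent), here is a minimal sketch in Python. It assumes the key values from
both datasets have already been exported to plain lists; the sample key values
and the function name find_orphans are hypothetical illustrations, not
anything from our system or from an Image utility.

```python
from collections import Counter


def find_orphans(master_keys, detail_keys):
    """Count detail key values that have no matching master entry.

    master_keys: iterable of key values exported from the master dataset
    detail_keys: iterable of key values exported from the detail dataset
    (both exports are assumed; how they are produced is not shown here)
    """
    masters = set(master_keys)
    # Keep only the detail keys that are missing from the master set.
    return Counter(key for key in detail_keys if key not in masters)


if __name__ == "__main__":
    # Hypothetical sample data standing in for the exported key values.
    master = ["A100", "A101", "A103"]
    detail = ["A100", "A100", "A102", "A103", "A102"]
    for key, count in sorted(find_orphans(master, detail).items()):
        print(f"{key}: {count} detail entries with no master")
```

Running the same comparison against each restored database would give a quick
cross-check alongside the standard lists.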

Anyone ever had a problem like this?
We're running MPE/iX 6.0 on a 957 with Image.

BTW, we purged all databases and restored from Wednesday night since no one
could say without a doubt that the corruption did not occur in the other
three databases also. Get to spend the weekend running all the batch
processes again and examining standard lists very closely.

Is this action enough? If this corruption was caused by the hardware problem,
is it possible we have corrupted files elsewhere on the system as well?
Probably should run an FSCHECK too, don't you think? Just thought of that.

Well that was a mouthful, but I know how to spell "corruption" now.

Scott Root
Prescom Industries Inc.
703-905-4551
