HP3000-L Archives

January 1997, Week 4

HP3000-L@RAVEN.UTC.EDU

Subject:
From:
David Largent <[log in to unmask]>
Reply To:
David Largent <[log in to unmask]>
Date:
Fri, 24 Jan 1997 19:26:16 -0500
On 23 Jan 97 at 22:55, I wrote:

> Greetings from a grumpy system manager.
>
> Earlier this evening, our 922LX (MPE/iX 4.0 - yeah, I know I'm
> behind, but I didn't have enough free space to upgrade until now)
> experienced the following...  :-(
>
>   System Abort 1101 from Subsystem 101
>   Secondary Status:  info=-44, subsys=113
>   System Halt 7, $044D
<Snip remaining gory details>

Well, just in case any of you were curious, the system came back to
life about 4 this afternoon - after being down just shy of 24 hours.
The bottom line is that we lost yesterday's work, since we had to
revert to the backup from yesterday morning.

I guess after almost seventeen years on HP3000s here at Gilbert, we
were "due" to lose some data.  To the best of my knowledge, this is
the first data loss we have experienced (that can't be traced to a
human doing something wrong) since we started using an HP3000 Series
30 way back in 1980!  I'd really rather it had been 20 or 30
years, though...

A few more details of today's events for the insanely curious...

We first tried replacing the controller - no difference.  We still
couldn't get the system to recognize the drive enough for us to even
think about pulling any data off of it.  Thus we concluded that it
was the drive, and replaced it.  At this point, I had high hopes
(wishful thinking) that we'd be able to bring up the system w/ the
new drive installed, and that it would recognize the data on the
original two drives, and I'd be able to back up at least that much
data.  Wrong again.  The system booted, but hung when the operator
logged on at the end of the boot process.  The processor showed 100%
utilization.  At this point we scratched our heads a bit, and decided
there was no other alternative than to use the SLT and reload the
system.

So, we did just that.  After booting (install) w/ the SLT, much to
our amazement, the system behaved in much the same way!  The
processor again showed 100% utilization at operator logon time.  At
this point we really scratched our heads!

Well, while we were thinking about that/waiting for it to get unbusy,
we took a look at a 2563B printer we've been having sporadic problems
with for the past few months.  We discovered that the HPIB address on
the printer was set completely wrong.  No, this is not what's been
wrong w/ the printer all along; it had just recently "set itself to a
random address".  It's been real flaky and done all sorts of
strange things.  As soon as the CE set the address correctly, the CPU
utilization dropped to 0.  We wondered aloud: has that been the
problem all along w/ the disk drive?  It seems the printer is on the
same HPIB channel as the disks, and maybe the wrong address was
confusing it.

Unfortunately, we didn't discover the printer problem until after
we'd gone past the point of no return w/ the SLT, and were thus
forced into a reload from yesterday morning's backup tape.  20/20
hindsight makes you wish you'd checked the completely unobvious
things first.  Oh well.

I guess someone was right when they said "There are only two kinds
of computer users: those who have lost data, and those who will."

Sorry to have rambled on for so long again.

-dll
David L. Largent                     "My thoughts are my own,
Information Services Manager          unless I choose to share them!"
The Gilbert Companies, Inc.          Phone: 317/284-4461
P.O. Box 1032                        Facs:  317/288-2079
Muncie, Indiana  47308-1032          Email: [log in to unmask]
