HP3000-L Archives

January 1996, Week 3

HP3000-L@RAVEN.UTC.EDU

Subject:
From:
"Rudderow, Evan" <[log in to unmask]>
Reply To:
Rudderow, Evan
Date:
Tue, 16 Jan 1996 13:36:00 EST
Content-Type:
text/plain
Hi all,
 
Stan and Goetz replied to my question by asking me:
 - how soon after doing the VSOPEN did I do the DSTAT ALL, and
 - What console messages were being displayed
 
I replied to Stan that I did several DSTAT ALLs, from immediately after the
VSOPEN to a couple of minutes after the VSOPEN; I also related that I was
not at the console, so I don't know what messages were displayed (and I'm
too lazy/busy to look through the system logs).  I also mentioned that I
would have another chance to mount these puppies today and I'd try to get a
better idea.
 
Well that time has come, so I thought I'd describe the rest of my most
excellent disc drive adventures...
 
Some background:
 
At about 2:30 this past Saturday morning I got a call from the night
operator that my 3000/957 was hung and that it appeared to be a disc drive
problem since, in addition to the little green light on the cabinet
containing Ldev 4, 5, and 12, there were also three little amber lights.  So
I went into the office.
 
Now it's unreasonable to assume that three drives would fail at the same
time, so I figured it might be the power supply; so I cycled power on the
cabinet.  The drives came back nicely and the backup (which was the only
thing running) resumed.  I had a few late night tasks to perform, so I stuck
around to do them; about 10 minutes later I heard a couple of those cute
little chimes that HP's SCSI disks make when they spin up and are recognized
by the system: the power supply had flaked out, but come back on its own.
 The power supply flaked out a few more times while the backup finished;
sometimes it came back to reality on its own, and sometimes it needed a
little help (i.e. my cycling power).
 
Now, while we are a 24 X 6-2/3 shop, we only have 5 X 8 coverage for
hardware; seeing as how Ldevs 4, 5, and 12 hold production data, and Ldevs
16, 17, and 18 hold development data, I took it upon myself to swap the
power supplies to minimize the effects of any downtime that the power
supply's threatened failure might cause.
 
After I was done I saw the DSTAT / VSOPEN described in my previous post on
this thread.
 
Well, the power supply died all the way at 11:00 yesterday morning
(actually, it didn't die *all* the way: it still did a workmanlike job of
supplying 5 volts -- enough to light the pretty little lights; it just
couldn't manage 12 volts, so the fans and discs wouldn't spin.  No doubt one
of you will suggest that I should have put in new batteries...).
 Interestingly, its failure hung the system (but then I've done memory
dumps where transient objects were dumped off user volumes -- nobody ever
explained *that* to me).  So I powered off Ldevs 13 through 18 (all
development), did a control-B TC, a START RECOVERY, and let the users back
on.
 
My CE arrived with a new power supply yesterday afternoon; but it didn't put
out anything (I guess HP forgot to replace the batteries, too); anyway, at
about 11:00 this morning he showed up (to the great dismay of the
development staff, who now must get back to work) with a working power
supply.
 
So, we put everything back together and power up the drives; I expect to see
something like this:
 
   LDEV-TYPE    STATUS      VOLUME (VOLUME SET - GEN)
  ----------   --------    --------------------------
  1-C3010M1     MASTER      MEMBER1         (MPEXL_SYSTEM_VOLUME_SET-0)
  2-C3010M1     MEMBER      MEMBER2         (MPEXL_SYSTEM_VOLUME_SET-0)
  3-C3010M1     MEMBER      MEMBER3         (MPEXL_SYSTEM_VOLUME_SET-0)
  4-C3010M1     MEMBER      MEMBER4         (MPEXL_SYSTEM_VOLUME_SET-0)
  5-C3010M1     MEMBER      MEMBER5         (MPEXL_SYSTEM_VOLUME_SET-0)
 12-C3010M1     MEMBER      MEMBER12        (MPEXL_SYSTEM_VOLUME_SET-0)
 13-C2490AM     LONER       MEMBER13        (CLOSET-0)
 14-C2490AM     LONER       MEMBER14        (CLOSET-0)
 15-C2490AM     LONER       MEMBER15        (CLOSET-0)
 16-C2490AM     LONER       MEMBER16        (CLOSET-0)
 17-C2490AM     LONER       MEMBER17        (CLOSET-0)
 18-C2490AM     LONER       MEMBER18        (UPSET-0)
 
But I don't; instead I see:
 
   LDEV-TYPE    STATUS      VOLUME (VOLUME SET - GEN)
  ----------   --------    --------------------------
  1-C3010M1     MASTER      MEMBER1         (MPEXL_SYSTEM_VOLUME_SET-0)
  2-C3010M1     MEMBER      MEMBER2         (MPEXL_SYSTEM_VOLUME_SET-0)
  3-C3010M1     MEMBER      MEMBER3         (MPEXL_SYSTEM_VOLUME_SET-0)
  4-C3010M1     MEMBER      MEMBER4         (MPEXL_SYSTEM_VOLUME_SET-0)
  5-C3010M1     MEMBER      MEMBER5         (MPEXL_SYSTEM_VOLUME_SET-0)
 12-C3010M1     MEMBER      MEMBER12        (MPEXL_SYSTEM_VOLUME_SET-0)
 13-C2490AM     MASTER      MEMBER13        (CLOSET-0)
 18-C2490AM     MASTER      MEMBER18        (UPSET-0)
 
No doubt because I had done a VMOUNT ON, AUTO in the small hours of Saturday
morning after swapping the power supplies around.  So I give it a few
minutes expecting, eventually, to see:
 
   LDEV-TYPE    STATUS      VOLUME (VOLUME SET - GEN)
  ----------   --------    --------------------------
  1-C3010M1     MASTER      MEMBER1         (MPEXL_SYSTEM_VOLUME_SET-0)
  2-C3010M1     MEMBER      MEMBER2         (MPEXL_SYSTEM_VOLUME_SET-0)
  3-C3010M1     MEMBER      MEMBER3         (MPEXL_SYSTEM_VOLUME_SET-0)
  4-C3010M1     MEMBER      MEMBER4         (MPEXL_SYSTEM_VOLUME_SET-0)
  5-C3010M1     MEMBER      MEMBER5         (MPEXL_SYSTEM_VOLUME_SET-0)
 12-C3010M1     MEMBER      MEMBER12        (MPEXL_SYSTEM_VOLUME_SET-0)
 13-C2490AM     MASTER      MEMBER13        (CLOSET-0)
 14-C2490AM     MEMBER      MEMBER14        (CLOSET-0)
 15-C2490AM     MEMBER      MEMBER15        (CLOSET-0)
 16-C2490AM     MEMBER      MEMBER16        (CLOSET-0)
 17-C2490AM     MEMBER      MEMBER17        (CLOSET-0)
 18-C2490AM     MASTER      MEMBER18        (UPSET-0)
 
But after 7 minutes all I see is this:
 
   LDEV-TYPE    STATUS      VOLUME (VOLUME SET - GEN)
  ----------   --------    --------------------------
  1-C3010M1     MASTER      MEMBER1         (MPEXL_SYSTEM_VOLUME_SET-0)
  2-C3010M1     MEMBER      MEMBER2         (MPEXL_SYSTEM_VOLUME_SET-0)
  3-C3010M1     MEMBER      MEMBER3         (MPEXL_SYSTEM_VOLUME_SET-0)
  4-C3010M1     MEMBER      MEMBER4         (MPEXL_SYSTEM_VOLUME_SET-0)
  5-C3010M1     MEMBER      MEMBER5         (MPEXL_SYSTEM_VOLUME_SET-0)
 12-C3010M1     MEMBER      MEMBER12        (MPEXL_SYSTEM_VOLUME_SET-0)
 13-C2490AM     MASTER      MEMBER13        (CLOSET-0)
 14-C2490AM     MEMBER      MEMBER14        (CLOSET-0)
 18-C2490AM     MASTER      MEMBER18        (UPSET-0)
 
I tried a VSCLOSE followed by a VSOPEN, then a VMOUNT ON, AUTO -- all to no
avail.
 
Nary a message on the console about Ldevs 15, 16, or 17.
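 
The "what I expected vs. what I got" comparison above can be done
mechanically.  What follows is only a rough sketch in Python, assuming the
DSTAT ALL output has been captured as text in the column layout shown in
this post; parse_dstat and missing_ldevs are made-up helper names, not part
of any HP tool, and the sample rows are abbreviated from the listings above.
 
```python
import re

# Matches volume rows such as
#   " 14-C2490AM     MEMBER      MEMBER14        (CLOSET-0)"
# Header and separator lines do not start with an Ldev number, so they are skipped.
ROW = re.compile(r"^\s*(\d+)-(\S+)\s+(\S+)\s+(\S+)\s+\((\S+)-(\d+)\)\s*$")


def parse_dstat(text):
    """Return {ldev: (status, volume_set)} for every volume row in a listing."""
    volumes = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            ldev, _dev_type, status, _volume, vset, _gen = m.groups()
            volumes[int(ldev)] = (status, vset)
    return volumes


def missing_ldevs(expected_text, actual_text):
    """Ldevs present in the expected listing but absent from the actual one."""
    return sorted(set(parse_dstat(expected_text)) - set(parse_dstat(actual_text)))


if __name__ == "__main__":
    # Abbreviated sample rows taken from the listings in this post.
    expected = """
 13-C2490AM     MASTER      MEMBER13        (CLOSET-0)
 14-C2490AM     MEMBER      MEMBER14        (CLOSET-0)
 15-C2490AM     MEMBER      MEMBER15        (CLOSET-0)
 16-C2490AM     MEMBER      MEMBER16        (CLOSET-0)
 17-C2490AM     MEMBER      MEMBER17        (CLOSET-0)
 18-C2490AM     MASTER      MEMBER18        (UPSET-0)
"""
    actual = """
 13-C2490AM     MASTER      MEMBER13        (CLOSET-0)
 14-C2490AM     MEMBER      MEMBER14        (CLOSET-0)
 18-C2490AM     MASTER      MEMBER18        (UPSET-0)
"""
    print("Ldevs not mounted:", missing_ldevs(expected, actual))
```
 
It only compares Ldev numbers; the status and volume set are parsed too, so
a MASTER / MEMBER / LONER mismatch could be checked the same way.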
 
So I tried SYSDIAG | SCSIDISK; it complained that it wasn't designed for a
C2490AM; my CE suggested SCSIDSK2; it voiced the same complaint, but what
the hell...  So we ran Sect 17 on Ldev 15; it reported an error, then the
diagnostic hung.  Did a Control-Y to get back to the DUI> prompt, aborted
the diagnostic, and exited SYSDIAG.  Tried another VSCLOSE CLOSET; now it
reported that it could not be closed because it was in the process of being
mounted -- and there were XM recovery messages on the console for Ldev 15.
 Did a DSTAT ALL, and now Ldev 15 was there.  So we did the same kind of
thing for Ldevs 16 and 17.
 
When all was said and done, all of the volumes were mounted and a SCSIDSK2
ACCESS LOGS showed no errors.
 
Weird.  I remarked to my CE that we learned something from this -- I just
didn't know what it was...
 
 
BTW - In the past there has been some discussion on this list about why HP
disc drives cost so much.  Someone authoritatively stated the reason has to
do with all of the neat mechanisms that ensure that a disc write completes
in the event the drive powerfails.  That's not it at all; the real reason is
that it takes 1/2 hour to put the stupid cover on the Series 6000 rackmount
cabinets: the premium we pay for HP discs is purely labor to do that one
task. ;-)
 
Anyway, much fun was had by all.
 
 -- Evan
