HP3000-L Archives

July 2011, Week 2

HP3000-L@RAVEN.UTC.EDU

Subject:
From: Craig Lalley <[log in to unmask]>
Reply To: Craig Lalley <[log in to unmask]>
Date: Thu, 14 Jul 2011 15:37:52 -0700
When in doubt check/replace the terminator.

IMHO,

-Craig



________________________________
From: Jack Connor <[log in to unmask]>
To: [log in to unmask]
Sent: Thursday, July 14, 2011 3:10 PM
Subject: Re: HP3000/918LX DR tales of woe

I would check the MFIO card, James; you can swap it with your other box.

When you boot the box, do you see what I assume is LDEV 7 in a SHOWDEV?

It may be worthwhile to check the SCSI address (normally 0 on the tape), although one would think it would pick up on that on bootup.

Fwiw...
jack

-----Original Message-----
From: HP-3000 Systems Discussion [mailto:[log in to unmask]] On Behalf Of James B. Byrne
Sent: Thursday, July 14, 2011 4:21 PM
To: [log in to unmask]
Subject: [HP3000-L] HP3000/918LX DR tales of woe

I wish to publicly thank Mr. Craig Lalley who went to extraordinary
lengths to assist us on Tuesday evening and Wednesday.  Thank you
sir.  I owe you several beers, if not something much stronger.

In addition, I am grateful for, and humbled by, the generosity
displayed to me by many on this list in the past 48 hours.  To all
of you a very heart-felt thanks.

Just so you realize how bad this could have been for us, I ask your
indulgence to entertain an after-action report.

The HP3000/918LX main system board failure occurred at approximately
07:40 local time.  At that moment the nightly backups to tape were
over 90% complete, but not finished.  The store-to-disk backup had
completed at about 04:45 but the ftp job to move those files to our
off site was still running.  So both current external backups were
corrupted and the store-to-disk originals were on an internal drive
in the downed system.

That was unfortunate but we had the previous day's backups, which
happened to be the full weekend SYSDUMP, and we have a warm spare
918 off site accessible over the Internet via ssh.  So after a reload
at the DR site we would have had to recreate an entire day's
processing in addition to our current work.  Not good but at least
doable.

However, sometime after the failure on Tuesday, our off-site DR
server also failed.  Recycling the power on the DR system did not
clear the problem.  So I had both the primary and the DR systems
down at approximately the same time.

We recovered the main system around 11:00 on Tuesday and immediately
ran into the Cognos licensing issue with PH839C.  Fortunately, we
keep 729C8 installed on both systems and we automatically rebuild
our object files from source several times a year against both
versions.  This means that moving our application (1000+ Quiz
reports, QTP runs and Quick screens) between 729C8 and 839C versions
of Powerhouse is trivial.

Therefore we reverted to PH729C8, which has no CPU licence
limitation; rebuilt the software from source to ensure the most
current versions were in use; and then enabled access to the HP3000.
Our business application was available for our users after about
13:00 on Tuesday afternoon, with the exception of anything that
depended upon the SYSDATE function, as we quickly discovered.
However, we were able to work around that limitation on an ad hoc
basis until close of business on Tuesday.

In total we lost approximately four hours of up-time, two of which
were solely due to Cognos licensing restrictions.

During Tuesday evening, through the kind courtesy of Mr. Lalley, the
HPSUSAN number on the new CPU was changed to the old number and I
was able to reconfigure the system to again use PH839C.  After
spending Wednesday tidying up loose ends from the previous day's
difficulties, the next thing required was to visit the DR site and
find out what was wrong there.

When I arrived at the DR site on Thursday morning I observed that
the attention light on the DR 918 was lit up and that both the front
panel lights on the DDS3 were lit.  Power cycling the 918 did not
change this; the same lights came back on as soon as power was
restored.

While the DR system remained powered up I pressed the eject button
on the DDS3 for the extended period required to reset it.  After a
time both the lights on the DDS went out.  Power cycling the 918
then resulted in a successful boot.  However, subsequently inserting
a blank tape into the DDS3 drive after the reboot resulted in the
same outcome, a non-responsive 918 with both DDS panel lights lit
and the attention light on the 918 lit.  After another reset of the
DDS and a power cycle reboot the DR system was back online.

So, I now have a DR system that is up and responsive, but whose tape
drive load bay is essentially a kill switch.  I have spares of
course and so tomorrow I will be replacing that DDS3 unit.

However, I have never encountered, or even heard of, this sort of
behaviour on an HP3000.  Does anyone have any idea of what might be
wrong with that tape unit?

--
***          E-Mail is NOT a SECURE channel          ***
James B. Byrne                mailto:[log in to unmask]
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3

* To join/leave the list, search archives, change list settings, *
* etc., please visit http://raven.utc.edu/archives/hp3000-l.html *
