HP3000-L Archives

February 2002, Week 1

HP3000-L@RAVEN.UTC.EDU

From: Christian Lheureux <[log in to unmask]>
Date: Wed, 6 Feb 2002 10:11:43 +0100

Oh, oh, please be careful about 1458s! All that number means is that a
process was aborted while it was critical, i.e. explicitly marked by MPE
not to see interrupts. A process will often be marked critical as a
necessary step of its life cycle, e.g. when it's completing a disk I/O or
something similar. What's not normal is when a critical process has to be
aborted anyway. That is the sign of some inconsistency. The trouble is that
there really are many, many reasons for an inconsistency that could end up
in a 1458.

Generally, we'll find as much information in the secondary info as in the
SA itself. In other words, you may view the 1458 as the problem and the
secondary info as the first-level cause. Then you have to look for a
second-level cause (in the case that was recently mentioned here, why the
heck was that VSM pointer invalid?), and so forth, until you reach a valid
root cause. Then, and only then, can you claim to have cracked the mystery.
Then, and only then, can you make an informed recommendation for a patch,
if any relevant patch exists.

All other patch recommendations are just guesses. I reckon that intuition
may lead you to the right root cause anyway, but then you're on your own,
and you'd better be lucky, because if you haven't installed the right
patch, you're going to run into the problem again; it's just a question of
time. Hmmmm ... are we talking about a sandbox-type system, or a production
system?

The beauty of 1458s and other general-purpose SAs is that they can walk the
dumpreader through lots of MPE modules, like Process Management (the one
which ultimately brought the box down by calling the 1458), VSM (a beautiful
baby with pointers all around, and the one that caused the -31/107 secondary
info), and probably others before that. They tend to be very diverse and
intellectually challenging.

What an informed search for root cause requires is, of course, a memory
dump. And a dumpreader.
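
To make the split between "the problem" and "the first-level cause" a bit
more concrete, here is a minimal Python sketch that pulls the abort number
and the secondary status out of console text. It assumes the banner looks
like the three lines Randy quotes further down in this thread; the names
AbortBanner and parse_banner are made up for illustration, and this is in no
way a substitute for a memory dump and a dumpreader.

```python
import re
from typing import NamedTuple, Optional


class AbortBanner(NamedTuple):
    """Fields of a system abort banner (hypothetical helper type)."""
    abort: int                      # e.g. 1458: "the problem"
    subsystem: int                  # subsystem that called the abort
    info: Optional[int]             # secondary info: the first-level cause
    info_subsystem: Optional[int]   # subsystem that reported the info


# Patterns assume the banner layout quoted in this thread.
ABORT_RE = re.compile(
    r"System Abort\s+(\d+)\s+from\s+Subsystem\s+(\d+)", re.IGNORECASE)
STATUS_RE = re.compile(
    r"Secondary Status:\s*Info\s*=\s*(-?\d+),\s*Subsys\s+(\d+)", re.IGNORECASE)


def parse_banner(text: str) -> Optional[AbortBanner]:
    """Return the parsed banner, or None if no abort line is found."""
    abort = ABORT_RE.search(text)
    if abort is None:
        return None
    status = STATUS_RE.search(text)
    return AbortBanner(
        abort=int(abort.group(1)),
        subsystem=int(abort.group(2)),
        info=int(status.group(1)) if status else None,
        info_subsystem=int(status.group(2)) if status else None,
    )


SAMPLE = """System Abort 1458 from  Subsystem 102
Secondary Status: Info=-300, Subsys 107
System Halt 7, $05B2"""

if __name__ == "__main__":
    print(parse_banner(SAMPLE))
```

Grouping logged aborts by (info, info_subsystem) rather than by abort number
alone is the point here: two 1458s with different secondary info are most
likely two different problems.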

Christian Lheureux
Head of Systems and Networks Department
APPIC R.H.
business partner hp invent
Tel : +33-1-69-80-97-22   /   Fax : +33-1-69-80-97-14 / e-mail :
[log in to unmask]
"The APPIC Group is hiring, contact us!"



> -----Original Message-----
> From: HP-3000 Systems Discussion [mailto:[log in to unmask]] On Behalf
> Of Hoxsie, Howard
> Sent: Tuesday, February 5, 2002 21:11
> To: [log in to unmask]
> Subject: Re: [HP3000-L] System Abort 1458
>
>
> I had that same problem a couple of weeks ago on my N4000
> running 7.0pp1 and I do see the bad patch in my HPSWINFO
> file.  No one at the RC mentioned anything about the bad
> patch or the one word fix.  Needless to say I'm following up.
>  Thanks for the tip!
>
> FYI,
> Howard Hoxsie
> HPe3000 Systems Administrator
> nordstrom.com
> www.nordstrom.com
> (206) 215-7069 voice (206) 215-7869 fax
> [log in to unmask]
> 600 University Street, Suite 600
> Seattle, WA   98101-4102
>
>
> -----Original Message-----
> From: Bill Cadier [mailto:[log in to unmask]]
> Sent: Tuesday, February 05, 2002 11:37 AM
> To: [log in to unmask]
> Subject: Re: [HP3000-L] System Abort 1458
>
>
> Randy writes:
>
> > John,
> >  Do you use StreamX from Vesoft? If yes, we had the following system
> > abort on 6.5 pp2 + patches:
> >
> > System Abort 1458 from  Subsystem 102
> > Secondary Status: Info=-300, Subsys 107
> > System Halt 7, $05B2
> >
> >  The system abort occurs when there is a typo in the job card and
> > StreamX is used. You will get the error message "Nil ccb pointer -
> > internal error. (CIERR 9016)" when streaming the job with the typo.
> > This corrupts the Job Master Table, and the next time a process
> > traverses this table, a system abort will occur. I think the command
> > SHOWJOB will traverse this table.
> >
> >  The problem was introduced with patch mpelxv2b, which is on pp3 for
> > 6.5. You may want to call the RC to see if the problem exists on 7.0.
> >
> > Hope this helps.
> >
> > Randy Breitfelder
> > Nielsen Media Research
>
> Forgot about that one!
>
> MPELXV2(C) for 7.0 was marked bad but it is worth being sure it's not
> installed. There's a one word binary patch that can be installed to
> correct the problem if any version of MPELXV2 is installed. The Response
> Center should have all the details.
>
> I should mention that just plain old MPE STREAM can also submit a job
> with a malformed JOB card and result in the SA1458 Randy mentioned.
>
> Hope this helps!
>
> Bill
> HP/CSY
> reply to: cadier at hpixec01 cup hp com
> (add the dots and make the 'at' an @ and it'll work!)
>
> * To join/leave the list, search archives, change list settings, *
> * etc., please visit http://raven.utc.edu/archives/hp3000-l.html *