HP3000-L Archives

April 2005, Week 4

HP3000-L@RAVEN.UTC.EDU

Subject:
Re: Hung Job
From:
"Paul, Guy (San/Storage Delivery)" <[log in to unmask]>
Reply To:
Paul, Guy (San/Storage Delivery)
Date:
Thu, 28 Apr 2005 14:43:40 -0600
Are you sure you are on 6.5PP1?

The trace you sent looks like the threaded plfd problem that was fixed
by 6.5 patch mpelxj6b and was fixed in 7.5.

The patches you reference are also 6.5-version patches, which makes me
wonder.
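
For anyone who wants to check a trace of their own for the same pattern, here is a rough sketch in Python. It is not an HP or MPE/iX tool; the names `FRAME_RE`, `parse_frames`, and `plfd_frames` are made up for illustration, and the frame layout it expects (`NM  7) SP=... RP=... name+$offset`) is assumed from the trace quoted below, with one frame per line or at least whitespace between fields. It only lists the frames whose procedure name mentions plfd; it cannot confirm which patch applies.

```python
import re

# Hypothetical pattern, inferred from the trace quoted in this thread:
#   NM  7) SP=41855d20 RP=a.00908188 is_wait_for_plfd+$2b0
# The procedure name and offset are optional, since some frames have none.
FRAME_RE = re.compile(
    r"NM\*?\s+([0-9a-f]+)\)"            # frame level, printed in hex
    r"\s+SP=([0-9a-f]+)"                # stack pointer
    r"\s+RP=([0-9a-f]+\.[0-9a-f]+)"     # return pointer (space.offset)
    r"(?:\s+(\??[A-Za-z_][\w.]*)\+\$([0-9a-f]+))?"  # optional name+$offset
)


def parse_frames(trace_text):
    """Return (level, procedure) pairs for each NM frame in the text.

    procedure is None when the frame carries no symbol name.
    """
    return [(int(m.group(1), 16), m.group(4))
            for m in FRAME_RE.finditer(trace_text)]


def plfd_frames(trace_text):
    """Return only the frames whose procedure name contains 'plfd'."""
    return [(level, proc) for level, proc in parse_frames(trace_text)
            if proc and "plfd" in proc.lower()]


# Excerpt of the trace from the original post, one frame per line.
SAMPLE = """\
NM  6) SP=41855de0 RP=a.0073ee0c put_wait_queue+$37c
NM  7) SP=41855d20 RP=a.00908188 is_wait_for_plfd+$2b0
NM  8) SP=41855be0 RP=a.00907834 is_get_plfd_ptr_and_lock+$1a8
NM  9) SP=41855a20 RP=a.0135352c FFILEINFO+$1e8
NM  d) SP=41854e20 RP=755.009d7cc0
"""

if __name__ == "__main__":
    for level, proc in plfd_frames(SAMPLE):
        print(f"frame {level:x}: {proc}")
```

On the excerpt above this picks out frames 7 and 8 (`is_wait_for_plfd` and `is_get_plfd_ptr_and_lock`), which are the ones that made me think of the plfd problem.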

> -----Original Message-----
> From: HP-3000 Systems Discussion 
> [mailto:[log in to unmask]] On Behalf Of John MacLerran
> Sent: Thursday, April 28, 2005 12:46 PM
> To: [log in to unmask]
> Subject: Re: Hung Job
> 
> oops,
> just as my finger was pressing 'send', my brain was screaming 
> 'wait!, you forgot something!'
> 
> we're on mpe 7.5 pp1.
> 
> John MacLerran wrote:
> 
> >dear hp3000-l folks,
> >
> >we currently have a hung job on our system. it won't go away with an
> >abortjob. i did a search of the archives and found some information
> >regarding patches that may help: mpelxb3a and mpelxf7a. looking on
> >itrc, a query on either of those leads me to mpemx76b, and i'm trying
> >to figure out if that patch would help.
> >
> >what we're seeing is that a process goes into a wait state and appears
> >to never come back.
> >
> >SOS shows the following on the hung process:
> >                   Process Detail
> >┃PIN : 1589 ┃ Prog: PHDISPAT.PHW849D.COGNOS | Pri: CL210 ┃     Switches
> >┃Job :  187 ┃ User: PHWEBJA,MANAGER.COGNOS  | Type:   NM ┃ CM->NM  0(0/s)┃
> >┃Ldev:   10 ┃ Fath: 1526 Bro:      Son:     | CM%:  0[ 0]┃ NM->CM  0(0/s)┃
> >           CPU Usage                          Disc I/O Usage
> >┃System % :      .0  [   .0] ┃ I/Os Total :  0[   0]   Rate Total:   0/sec┃
> >┃Ms Used  :       0[    582] ┃      Reads :  0[   0]        Read :   0/sec┃
> >┃Per Trans:        [       ] ┃      Writes:  0[   0]        Write:   0/sec┃
> >┣    Response/Transactions            Process Wait States
> >┃Prompt Resp:  -  [   - ] ┃ CPU:   [  ]  Mem:  [  ]  Dsc:  [  ]  Imp:  [  ]
> >┃First Resp :  -  [   - ] ┃ Pre:   [  ]  RIN:  [  ]  TWr:  [  ]  BIO:  [  ]
> >┃Trans Count:   0 [    0] ┃ Tim:   [  ]   FS:  [  ]  Msg:**[  ]  Oth:  [  ]
> >┃Trans/min  :      0 [    0] ┃ Current Wait: Wait queue
> >
> >
> >it's getting no time, consuming no i/os, and is in the wait queue for
> >some sort of a message, apparently.
> >
> >a trace on the process shows this:
> >
> >       PC=a.001869d8 enable_int+$2c
> >NM* 0) SP=41856360 RP=a.007620a0 notify_dispatcher.block_current_process+$338
> >NM  1) SP=41856360 RP=a.00763f00 notify_dispatcher+$268
> >NM  2) SP=418562e0 RP=a.00230444 wait_for_active_port+$e8
> >NM  3) SP=418561e0 RP=a.002310a8 receive_from_port+$544
> >NM  4) SP=41856160 RP=a.00738eec extend_receive+$5b4
> >NM  5) SP=41855f60 RP=a.0073e8e0 put_wait_queue.wait_for_resource+$b8
> >NM  6) SP=41855de0 RP=a.0073ee0c put_wait_queue+$37c
> >NM  7) SP=41855d20 RP=a.00908188 is_wait_for_plfd+$2b0
> >NM  8) SP=41855be0 RP=a.00907834 is_get_plfd_ptr_and_lock+$1a8
> >NM  9) SP=41855a20 RP=a.0135352c FFILEINFO+$1e8
> >NM  a) SP=41855960 RP=a.01353310 ?FFILEINFO+$8
> >         export stub: 15e.002329b8 _px_hpfopen+$540
> >NM  b) SP=418554e0 RP=15e.002300e4 _px_open+$2c
> >NM  c) SP=41854e60 RP=15e.002300a4 ?_px_open+$8
> >         export stub: 755.009e29f8
> >NM  d) SP=41854e20 RP=755.009d7cc0
> >NM  e) SP=41854de0 RP=755.009d7b3c
> >NM  f) SP=41854d60 RP=755.009d7b00
> >         export stub: 755.002fd99c
> >NM 10) SP=41854d20 RP=755.002fd920
> >         export stub: 778.000528f8
> >NM 11) SP=41854ca0 RP=778.00000000
> >     (end of NM stack)
> >
> >questions:
> >1) is there any way to tell if the above named patch will help this
> >situation?
> >
> >2) is there any way to abort this process?  i've tried abortjob and
> >abortproc.
> >
> >we've come across this several times in the last few days; each time,
> >it's taken a start norecovery to get the job to go away. we have an
> >open case with cognos, but i thought i'd find out if anyone here could
> >offer any insights.
> >
> >thanks!
> >john 'remove the nospam' maclerran
> >[log in to unmask]
> >
> >
> 
> --
> ----------------------------------------------------------------------
>   John MacLerran
>   IT Systems Analyst                       email:   [log in to unmask]
>   Idaho State University                             V(208) 282-2954
>   http://www.isu.edu/~macljohn                       F(208) 282-3673
> ----------------------------------------------------------------------
> 
> * To join/leave the list, search archives, change list settings, *
> * etc., please visit http://raven.utc.edu/archives/hp3000-l.html *
> 
