HP3000-L Archives

July 2008, Week 4

HP3000-L@RAVEN.UTC.EDU

From: Derek Drummond <[log in to unmask]>
Date: Wed, 23 Jul 2008 11:11:11 -0600

Sometimes you must also ABORTIO to "release" the job.
 

-----Original Message-----
From: HP-3000 Systems Discussion [mailto:[log in to unmask]] On Behalf Of Gary Robillard
Sent: Wednesday, July 23, 2008 11:01 AM
To: [log in to unmask]
Subject: Re: [HP3000-L] Job not responding...

Hello All,

Since you have Glance, you could trace the process(es) currently running in
the job to see what they are waiting on. Taking a memory dump could also help
determine what was going on with the system, but that depends on your support
provider (it sounds like it is HP). Since you were able to run Glance/iX and
issue commands from the console, the problem was likely local to the job, as
Olav indicated.

From Glance/iX you can use the ":" command to enter an MPE command (such as
SHOWJOB), but this depends on the "MPE" directive setting in the
GLANCNFG.PUB.SYS file. If it is set to "MPE NONE", MPE commands are disabled
within Glance/iX. It is probably best to do a :SHOWJOB before running Glance
and note the job number of the job that is having the problem.

Once in the Glance/iX main screen, issue the "J" command and enter the job
number. This should bring you to the Job/Session screen for the job (you
should see the job number and logon of the job in the Process Selection
Summary on the screen).

Below that will be the list of processes associated with the job. If none are
displayed, type "A" so that Glance/iX displays all processes; otherwise it
displays only the processes it detects as 'interesting'.

You will need to look at each PIN displayed for the Job/Session to see what
it is doing. If you are connected via Reflection, you can usually ignore the
VTSERVER process and start with the other processes. It is possible that the
VTSERVER is hung, but usually there is a child process.

You select the process with the "P" command. Once in the process screen you
should see the scheduling state (Long Wait, Short Wait, Executing, etc.;
Executing probably only if you are looking at Glance/iX's own PIN). You
should also see the Wait State of the process (Terminal Read, Timer, Message,
etc.). These give clues as to what the process is doing.

Next, you can use the F3 key (or the number 3) to perform a stack trace of
the process. Interpreting the stack trace does require some internals
knowledge, but you can make some educated guesses.

Using Olav's example of a process waiting on an empty message file, you
would see something like the following:

Procedure Trace for Pin   95 is: 

       PC=a.001a99d8 enable_int+$2c
NM* 0) SP=4184e2f0 RP=a.0145ef48 notify_dispatcher.block_current_process+$334
NM  1) SP=4184e2f0 RP=a.01460d94 notify_dispatcher+$268
NM  2) SP=4184e270 RP=a.00253444 wait_for_active_port+$e8
NM  3) SP=4184e170 RP=a.002540a8 receive_from_port+$544
NM  4) SP=4184e0f0 RP=a.00754ee8 ipc_wait_process+$3b0
NM  5) SP=4184def0 RP=a.00e7b4f0 tm_msg_var_buf_disc.tm_msg_long_wait+$538
NM  6) SP=4184dcf0 RP=a.00e82e94 tm_msg_var_buf_disc.tm_msg_complete_waited_read+$140
NM  7) SP=4184db70 RP=a.00e83864 tm_msg_var_buf_disc.tm_read+$32c
NM  8) SP=4184daf0 RP=a.00e855a8 tm_msg_var_buf_disc+$144
NM  9) SP=4184d970 RP=a.0117b690 fread_nm+$95c
NM  a) SP=4184d8b0 RP=a.013f5634 FREAD+$d4
NM  b) SP=4184d470 RP=a.013f552c ?FREAD+$8
         export stub: a.013c67ec tprint+$6a4
NM  c) SP=4184d370 RP=a.00fae80c hxprint+$2b0
NM  d) SP=4184cef0 RP=a.011ac26c exec_cmd+$b3c
NM  e) SP=4184ce30 RP=a.011ab6fc ?exec_cmd+$8
         export stub: a.011ae2cc try_exec_cmd+$c8
NM  f) SP=4184cdb0 RP=a.011ab474 command_interpret+$318
NM 10) SP=4184c930 RP=a.011ab128 ?command_interpret+$8
         export stub: a.011aee0c xeqcommand+$18c
NM 11) SP=4184c330 RP=a.011aec6c ?xeqcommand+$8
         export stub: 95.00006774 
NM 12) SP=4184c2b0 RP=95.00007460 
NM 13) SP=4184c230 RP=95.00000000 
     (end of NM stack)

While it is a long list, you can see that FREAD was called by hxprint (the CI
PRINT command executor), and we are blocked (waiting) for the read to
complete. Since the message file is empty, the process will wait until it is
aborted or a record is written to the MSG file. Glance/iX is also likely to
show "OTHER IO" as the wait, with the scheduling state as "Long Wait", in
this case.
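
If you save a trace like the one above to a text file, a short script can
pull out the procedure names so the interesting callers (FREAD, hxprint) are
easier to spot. The following is only a sketch in Python, run off the HP 3000
on whatever machine holds the saved text. The function name parse_frames, the
SAMPLE text, and the assumed line layout (frame index in hex, then SP=, RP=,
and an optional symbol+offset) are my own guesses based on this one trace,
not anything Glance/iX documents.

```python
import re

# Assumed frame layout, taken from the trace in this message:
#   "NM  a) SP=4184d8b0 RP=a.013f5634 FREAD+$d4"
# The frame index is hexadecimal and "*" marks the current frame.
FRAME_RE = re.compile(
    r"^NM\*?\s+(?P<index>[0-9a-f]+)\)\s+SP=(?P<sp>[0-9a-f]+)\s+"
    r"RP=(?P<rp>[0-9a-f]+\.[0-9a-f]+)(?:\s+(?P<symbol>\S+))?\s*$"
)


def parse_frames(trace_text):
    """Return a list of (frame_index, procedure, offset) for each NM frame."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # skips the PC= line, "export stub:" lines, and headers
        symbol = match.group("symbol")
        if symbol is None:
            name, offset = None, None  # frame with no symbol shown
        else:
            name, _, offset = symbol.partition("+")
        frames.append((int(match.group("index"), 16), name, offset))
    return frames


# A few lines copied from the trace above (wrapped lines rejoined).
SAMPLE = """\
       PC=a.001a99d8 enable_int+$2c
NM* 0) SP=4184e2f0 RP=a.0145ef48 notify_dispatcher.block_current_process+$334
NM  9) SP=4184d970 RP=a.0117b690 fread_nm+$95c
NM  a) SP=4184d8b0 RP=a.013f5634 FREAD+$d4
NM  b) SP=4184d470 RP=a.013f552c ?FREAD+$8
         export stub: a.013c67ec tprint+$6a4
NM  c) SP=4184d370 RP=a.00fae80c hxprint+$2b0
NM 12) SP=4184c2b0 RP=95.00007460
"""

for index, name, offset in parse_frames(SAMPLE):
    print(index, name or "(no symbol)", offset or "")
```

Reading the output from the top down gives the same story as reading the
trace by eye: the process is blocked in the dispatcher, underneath FREAD,
underneath hxprint.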

Thanks,

Gary Robillard



-----Original Message-----
From: HP-3000 Systems Discussion [mailto:[log in to unmask]] On Behalf Of Olav Kappert
Sent: Wednesday, July 23, 2008 10:30 AM
To: [log in to unmask]
Subject: Re: [HP3000-L] Job not responding...

Rao:

Maybe a deadlock situation.  Maybe the use of a message file that is 
empty.  Could be waiting upon a console reply.

So many possibilities, so little time. Well, it's too late now to be sure.
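
To illustrate the empty-message-file case, here is a rough analogy in Python
rather than MPE code. An in-process queue stands in for the message file, and
read_record is a made-up helper name. The point is only that a waited read on
an empty queue blocks until a writer adds a record, so from the outside the
reader looks hung while using no CPU.

```python
import queue


def read_record(message_queue, timeout_seconds=None):
    """Read one record from the queue.

    With timeout_seconds=None the call blocks until some writer puts a
    record on the queue, which is the "hung" behavior described above.
    With a timeout it gives up and returns None instead.
    """
    try:
        return message_queue.get(timeout=timeout_seconds)
    except queue.Empty:
        return None


# No writer ever puts a record here, so the read can only time out.
empty_queue = queue.Queue()
print(read_record(empty_queue, timeout_seconds=0.1))

# Once a record has been written, the same read returns immediately.
ready_queue = queue.Queue()
ready_queue.put("record 1")
print(read_record(ready_queue, timeout_seconds=0.1))
```

The timeout is there only so the example finishes; a job reading an empty
message file with no such limit would simply keep waiting.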

Regards, Olav.

Rao, Raghu wrote:

>Hi all, 
>
>We had a strange situation last night. We are a health plan using
>Amisys software (not Amisys Advance). Our production box is an HP3K N4000
>series.
>
>One of the claims payment extract jobs, which has been running fine for
>the past 8 years, all of a sudden died. It was supposed to run for a
>maximum of 3 hours and kept on running for 8 hours. When I logged in at
>about 2:30 AM to view this job, there were no production jobs running at
>that time other than this hung job and our other regular system jobs.
>The hung job is displayed by the SHOWJOB command at the MPE prompt;
>however, GLANCE shows it at 0% CPU time. The hung job did not respond
>to any commands. We aborted the job using the regular ABORTJOB command and
>got no response. The job still shows EXEC state. We tried ABORTPROC and it
>says that an abort is already pending on this job. We kept trying for 2
>hours to get any response from this job, and it was totally hung and
>non-responsive.
>
>Then we finally called HP at about 5:45 AM to find out whether they have
>any special ABORT JOB commands to kill this job. Their feedback was to
>SIMPLY REBOOT the machine, which was kind of surprising to us. But we
>had no other choice (as this job was one of our core jobs doing claims
>payment processing). We went ahead with the REBOOT, which finally knocked
>off that job.
>
>But we are clueless as to what happened to this job. The job run's
>STDLIST shows blank after 8:12 PM. No other logs are showing anything
>positive. We investigated 4 production claims which this job could
>possibly have been accessing at the moment it hung, but further testing
>on those claims this morning revealed nothing fishy.
>
>Has anyone been through this kind of deadlock before? Any tips, pointers, etc.
>would be really appreciated.
>
>Thanks and best regards.
>
>Raghu Rao
>
>* To join/leave the list, search archives, change list settings, *
>* etc., please visit http://raven.utc.edu/archives/hp3000-l.html *