Mark,
Did you check your network status?
LINKCONTROL @,a
and the standard NETTOOL.NET.SYS --> RESOURCE --> DISPLAY?
It's a possibility.
-Craig
--- On Sat, 9/10/11, Mark Ranft <[log in to unmask]> wrote:
From: Mark Ranft <[log in to unmask]>
Subject: Tidal OCS Express - shutting down
To: [log in to unmask]
Date: Saturday, September 10, 2011, 10:24 AM
After running for a year, my client's EXPRESSJ,MGR.EXPAGENT jobs shut down
last week. This occurred on two systems, both the production and the test
system. This was not expected. The jobs shut down normally and went to
:EOJ. I started the jobs again. After reviewing the console log messages, we
blamed the shutdown on network errors that occurred just prior to the jobs
stopping.
What else would cause both jobs to shut down like that?
Note the master scheduler is on a non-MPE system. These are just agents.
Now today, one week later (minus 18 minutes), both jobs stopped again.
Interestingly, the test system stopped about an hour later, which again was
about one week (minus 4 minutes) from when it was streamed.
What would make a job that normally runs constantly end after first one year
and then one week? Will it be stopping after one day next? I am a little
baffled.
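For anyone who wants to compare these run lengths against a timer, lease, or
timeout value, here is a small Python sketch that converts the two intervals
quoted above into seconds. The helper name and variable names are made up for
illustration, and nothing here is a claim about how the agent works
internally.

```python
from datetime import timedelta

def runtime_seconds(weeks=0, minus_minutes=0):
    """Length of a run of `weeks` weeks minus `minus_minutes` minutes, in seconds."""
    delta = timedelta(weeks=weeks) - timedelta(minutes=minus_minutes)
    return int(delta.total_seconds())

# Intervals as described in the message: one week minus 18 minutes,
# and one week minus 4 minutes (the test system).
first_interval = runtime_seconds(weeks=1, minus_minutes=18)
test_interval = runtime_seconds(weeks=1, minus_minutes=4)

print(first_interval, test_interval, test_interval - first_interval)
```

Both values fall a little short of a full week (604800 seconds), and they
differ from each other by 14 minutes.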
Here is the output from the Express Agent log.
2011 Sep 10 08:54:15 agentd MGR.EXPAGENT 4: Message received by pid 56295826: UPTM
2011 Sep 10 08:54:15 agentd MGR.EXPAGENT 4: Load is 99.9999 (#56295826)
2011 Sep 10 08:54:15 agentd MGR.EXPAGENT 4: Message received by pid 56295826: TIME
2011 Sep 10 08:54:15 agentd MGR.EXPAGENT 4: Time is 1315644855)
2011 Sep 10 08:54:16 cqd MGR.EXPAGENT 1: cqd: shut down (pid=86507785)
2011 Sep 10 08:54:16 jobd MGR.EXPAGENT 4: 0 children waiting
2011 Sep 10 08:54:16 jobd MGR.EXPAGENT 4: 0 children waiting
2011 Sep 10 08:54:16 jobd MGR.EXPAGENT 4: 0 children waiting
2011 Sep 10 08:54:16 jobd MGR.EXPAGENT 4: 0 children waiting
2011 Sep 10 08:54:16 jobd MGR.EXPAGENT 4: 0 children waiting
2011 Sep 10 08:54:16 efiled MGR.EXPAGENT 4: RecvMsg msg=0
2011 Sep 10 08:54:16 efiled MGR.EXPAGENT 4: shutdown msg -- going down
2011 Sep 10 08:54:16 jobd MGR.EXPAGENT 4: 0 children waiting
2011 Sep 10 08:54:16 clockd MGR.EXPAGENT 1: clockd message queue read error -1.
2011 Sep 10 08:54:16 jobd MGR.EXPAGENT 4: 0 children waiting
2011 Sep 10 08:54:16 efiled MGR.EXPAGENT 1: efiled shutting down
2011 Sep 10 08:54:16 efiled MGR.EXPAGENT 1: do_shutdown
2011 Sep 10 08:54:16 agentd MGR.EXPAGENT 4: Message received by pid 56295826:
2011 Sep 10 08:54:16 agentd MGR.EXPAGENT 4: Socket service terminated (#56295826)
2011 Sep 10 08:54:16 clockd MGR.EXPAGENT 1: Server 'clockd' shutdown.
2011 Sep 10 08:54:16 agentd MGR.EXPAGENT 4: Shutdown received (#61210855)
2011 Sep 10 08:54:16 agentd MGR.EXPAGENT 1: Server 'agentd' shutdown.
2011 Sep 10 08:54:16 jobd MGR.EXPAGENT 4: pt_del for 1174847
2011 Sep 10 08:54:16 jobd MGR.EXPAGENT 4: pt_del for 1174876
2011 Sep 10 08:54:16 jobd MGR.EXPAGENT 1: jobd (68878603) shutdown.
2011 Sep 10 08:54:16 efiled MGR.EXPAGENT 4: sending CLEARTIME_OP to clockq
2011 Sep 10 08:54:16 efiled MGR.EXPAGENT 4: closing efileq
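As a cross-check on the log, the value in the "Time is 1315644855" line can be
decoded as a Unix epoch timestamp. The Python sketch below does only that; the
constant and function names are illustrative, and reading the value as seconds
since 1970 in UTC is an assumption about what the agent reports.

```python
from datetime import datetime, timezone

# Value copied from the "Time is ..." line in the agent log.
LOG_EPOCH = 1315644855

def epoch_to_utc(seconds):
    """Interpret `seconds` as a Unix timestamp and return an aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

stamp = epoch_to_utc(LOG_EPOCH)
print(stamp.isoformat())
```

Decoded this way, the value is 2011-09-10 08:54:15 UTC, the same wall-clock
time as the log line that reports it. What that says about the time zone
settings on the system is not something the log alone establishes.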
Does anyone know what may be happening here?
Mark Ranft
Pro 3K
* To join/leave the list, search archives, change list settings, *
* etc., please visit http://raven.utc.edu/archives/hp3000-l.html *