Date: Thu, 29 Apr 1999 11:55:50 -0700
Scott writes:
> Bottom line, :ABORTPROCESS will not abort any process that is critical.
> Period.
Ergo, :ABORTPROCESS will not abort any process which :ABORTJOB could not
abort. It's just a more fine-grained :ABORTJOB, not a magical "process-
be-gone" command.
In my opinion this is a step backwards: it destabilizes the 3000
platform by allowing operations staff to take potshots at arbitrary
processes in a carefully designed and constructed process tree. The
developers of the system probably never tested what happens if arbitrary
processes get killed at arbitrary points in their execution, so use of
this new command increases the chances of data corruption and other
problems.
Further, the only mechanism available to a developer to protect herself
from :ABORTPROC is to set the process "critical" for its entire execution.
This means, of course, that the process will not be abortable *at all*,
and any error in the program will result in the entire system crashing
with an SA1458 (Process Aborting While Critical). Hardly an improvement.
Users are asking for a product that "Kills bugs dead permanently right
now", but, as has been pointed out, this is not practical.
The problem is not the inability to 'kill' certain processes. In fact
there are two problems:
1) Processes get stuck in states where they cannot be aborted.
This is not a deficiency in the :ABORT[JOB|PROC|whatever] commands.
It is a result of a complex system which is either buggy or not
designed to avoid these situations. If you don't want non-abortable
stuck processes, ask HP and the other software developers to ensure
that this doesn't happen. Of course if you want more money spent on
this, you'll have to expect something else to suffer.
2) Users can't tell *why* something is stuck and why they can't abort it.
There seems to be a standard human response of "if you can't understand
it, try to make it go away". Several people today have asked for more
information as to why processes are stuck. I suggest (as I already
suggested to HP several days ago) that a TELESUP-type utility which
people could run to learn why a process is not currently abortable
would practically eliminate the need for a "super" :ABORTPROC command.
Either users would accept the stuck nature of the process once they
understood exactly why it was stuck, or they would complain to HP (or
whomever) about the stuck process and ask for the associated "bug" to
be fixed.
The utility would give enough information for the user to feel
confident that they understand exactly what the process is doing and
why it is stuck (this means a textual explanation, not just a bunch
of stack traces) and also the technical information that a developer
would need to investigate and fix why the condition occurred in the
first place.
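
To make the idea concrete, here is a rough sketch (in Python, purely as
notation) of the kind of report such a utility might produce. Everything
in it is hypothetical: the reason codes, the ProcessSnapshot fields, the
explain() function and the sample data are invented for illustration and
do not correspond to any real MPE/iX interface or HP tool. The point is
only the shape of the output: a plain-language explanation first, with
the technical detail a developer would need attached to it.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical reason codes mapped to plain-language explanations.
# These names are invented for this sketch, not real system states.
EXPLANATIONS = {
    "CRITICAL": "it is marked critical, and critical processes are not aborted",
    "IO_PENDING": "it is waiting for an I/O request to complete",
    "RESOURCE_HELD": "it holds a system resource that must be released first",
}


@dataclass
class ProcessSnapshot:
    """Hypothetical view of one process as the utility might see it."""
    pin: int                               # process identification number
    program: str                           # program the process is running
    blocked_reason: Optional[str] = None   # None means it is abortable
    detail: str = ""                       # extra data for a developer


def explain(snap: ProcessSnapshot) -> str:
    """Return a textual explanation of why a process can or cannot be aborted."""
    who = f"Process {snap.pin} ({snap.program})"
    if snap.blocked_reason is None:
        return f"{who} can be aborted now."
    why = EXPLANATIONS.get(
        snap.blocked_reason, "the reason is not recognised by this utility"
    )
    text = f"{who} cannot be aborted right now because {why}."
    if snap.detail:
        # Developer-oriented information follows the user-oriented sentence.
        text += f"\n  Technical detail: {snap.detail}"
    return text


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    print(explain(ProcessSnapshot(42, "SOMEPROG", "IO_PENDING", "made-up detail")))
```

A real utility would of course have to obtain the state from the system
itself; the sketch only shows how one report could serve both the user
and the developer.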
G.