Doug writes:

> [...]I have always felt there is a need for a kill process
> command, with the caveat that a process that is set critical must
> be left alone.

And therein lies the technical problem in providing the :NUKEPROC command. I believe that *all* blocked processes are marked "critical". The only reason you're able to abort *any* blocked process today is that the abort commands know how to cancel certain kinds of pending I/O, thus allowing the process to wake up, cease to be critical, and then notice the abort request and act on it.

Once a process enters "system code", a system manager *cannot* know what the effect of just "aborting" that process will be. And because there is no way to block a process in user code, *all* hung (blocked as opposed to looping) processes are in system code while they are blocked. I realize many people feel it's their fundamental right to shoot themselves in the foot with the weapon of their choice whenever they feel like it, but in this case it's just not practical. If you want Unix, you know where to find it.

A related issue is that processes cannot be "killed"; they can only be "requested" to commit suicide at their earliest convenience. So it's easy to ask a process to go away, but the hard part is being able to wake up a blocked process so that it can act on that pending request.

Today's abort commands can abort processes that are blocked for certain reasons, most notably I/O of one sort or another. Unfortunately, there are lots of other resources that a process can block on (file and db locks, semaphores controlling many different kinds of operating system structures, etc.), and teaching the abort code how to extricate a process from each of these cases would be quite expensive compared to just finding the bugs that cause the hangs in the first place.

On the other hand, "looping" processes should be easier to kill, and the new HP :ABORTPROC[ESS?] command should take care of these unless they are continuously critical for some reason (though I've seen at least one major third-party tool that seems to stay critical all the time).

> The question I have is this. What difference in potential corruption is
> there with an abortproc command versus rebooting the system?

If you shut down (or even just halt) the system as a whole, then things like the Transaction Manager (XM) ensure that the system remains in a consistent state. If you just arbitrarily release resources owned by processes, then the structures protected by the locks you've just freed may be in an inconsistent state. If you continue to let the system run after that, all of your integrity and security may be out the window, and there's no way of knowing what will happen.

This is the same reason why virtually all unexpected failures and errors within the operating system itself result in instant System Aborts. It limits the damage that might be caused by the unknown state that the system has gotten itself into somehow.

> Historically, rebooting an MPE system to solve a problem was almost
> unheard of. Now it is commonplace, and worse, an accepted practice.

Historically, an HP3000 was a much simpler world than it is today. MPE plus COBOL, IMAGE, VPLUS, spoolfiles, printers, a tape drive, and serially connected terminals was all you needed to run a business. MPE/V systems quite happily ran large organizations with significant numbers of users on only a megabyte or two of memory and 1MHz or so of CPU.

Today we have all sorts of new things like networks, Posix, and all of the things that they bring with them.

The offer was made: Good, Fast, and Cheap; pick any two. People went for fast and cheap for some reason.

G.