Marek writes:
> When I came to work this morning I found our hp3000 out of running
> (937RX, MPE 5.0).
> The messages at the console were following:
>
> SYSTEM ABORT 1457 FROM SUBSYSTEM 102
> SYSTEM HALT 7, $ 05B1
>
Without a memory dump, we can't tell for sure. But...here's how to
decode the information you posted. (For a more detailed version,
see:
http://www.allegro.com/papers/dump.html)
Log on as MANAGER.SYS, then:
:debug
= errmsg (#1457, #98)
'A system process is being terminated due to a trap.'
= errmsg (#32765, #102)
'Process Manager'
...which you already knew.
The "#98" is a magic number, and is always used to decode the system
failure number.
The =errmsg (system_failure_number, #98) won't show messages for all
possible system failures, because the system message catalog isn't
complete. I recommend filing a bug report each and every time you
see a system failure number that *isn't* in the catalog!
The "#32765" is a magic number, and is used to get a "subsystem" name
from a subsystem number ... it doesn't always work, as not all subsystems
have their name in the system message catalog.
> And at the bottom left corner there were blinking by turns:
>
> FLT BF04
> FLT 0105
> FLT 02B1
> FLT DEAD
Ok..."BF04" surprises me a little; I'd expect "BF07". Anyway, look at
the next two lines, with 0105 and 02B1. That's saying "I have a
16-bit error number for you. Starting at the left edge of the number,
the first packet (01) is 05, and the second packet (02) is B1." Thus,
the 16-bit error number is hex 05B1, which is decimal 1457.
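If you'd rather not do the hex arithmetic by hand, here is a minimal
sketch of the same decoding in Python. The function name and the idea of
passing in only the numbered data packets (not the BFxx or DEAD codes)
are my own assumptions for illustration; this is not an HP tool.

```python
def decode_flt_packets(codes):
    """Combine numbered FLT display codes into one error number.

    Assumption: each code is four hex digits, where the first two are
    the packet sequence number (01, 02, ...) and the last two are one
    byte of the error number, most significant byte first.
    """
    parts = {}
    for code in codes:
        index = int(code[:2], 16)  # packet sequence number
        parts[index] = code[2:]    # payload byte as two hex digits
    # Join the payload bytes in packet order, then read the result as hex.
    hex_digits = "".join(parts[i] for i in sorted(parts))
    return int(hex_digits, 16)


if __name__ == "__main__":
    error = decode_flt_packets(["0105", "02B1"])
    print(f"hex {error:04X} = decimal {error}")
```

With the two packets from the console above, this prints
"hex 05B1 = decimal 1457", which matches the SYSTEM ABORT number.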
> I red in the Error Messages Manual that the subsystem 102 was the
> Process Manager but there were no error message '1457'. I performed
> the soft reset and the system has gone up without troubles.
A soft reset (control-B, then TC) is necessary if you are planning to
take a memory dump.
If you *aren't* taking a memory dump, a hard reset (control-B, then RS)
is recommended ... it initializes memory better.
--
Stan Sieler
http://www.allegro.com/sieler.html