HP3000-L Archives

July 1999, Week 5

HP3000-L@RAVEN.UTC.EDU

Subject:
From:
Jerry Fochtman <[log in to unmask]>
Reply To:
Jerry Fochtman <[log in to unmask]>
Date:
Thu, 29 Jul 1999 14:25:51 -0500
At 01:07 PM 7/29/99 -0400, Mike Hornsby wrote:
>At night it may be appropriate to "auto/dump/restart" but certainly
>during prime time doing a dump may be a total waste of time. Many
>dumps don't show anything anyway.

Sorry, but I have to disagree.  All too often the only way to get
enough information and a direction to resolve a problem is by analyzing
the dumps.  And in a number of instances, it takes several
dumps of a problem to really capture what is happening.  Granted there
always are exceptions to the rule, but more often than not, problems
are best resolved by using the dumps.

With a SA1458, a dump can be quite useful to help identify the cause.
The ability to examine the stack trace of the process involved in
the trap is also useful.  Yes, taking the dump can be painful in
terms of the amount of time it takes.  But on the other hand,
experiencing repeated system failures increases the potential of
corrupting a database. And the repair effort, especially for some
of the large databases, could easily exceed the effort of taking a
dump in the first place so the root cause could be identified/addressed.

The technique Mike offered for determining what process 'may' be
running may not be accurate, thereby sending one down the wrong path
to understand what occurred or in looking for a potential area to
study as to a cause.

I do agree with Mike in that it would be nice if there was an option
to have the dump process only capture the data relevant to the last
n-processes that were executing on each CPU along with all the
system-level structures/etc..  This may help provide enough information
to get a first-cut on the issue without incurring the longer time it
takes to capture the entire environment.  But this would probably take
a major re-write of the dump process itself, building-in knowledge of
the system architecture, which would be no simple task....

However, I don't believe an option to throw the console into DEBUG so
someone can look around is an effective way to minimize downtime.  Most
sites would not have anyone with the knowledge/experience/skill to
effectively use this mechanism.  And those individuals who do have
the skills probably already know how to trap a system failure and, if
necessary/appropriate, unwind it to the point of killing the process
without causing a system abort....

Just my $.02....

/jf
                              _\\///_
                             (' o-o ')
___________________________ooOo_( )_OOoo____________________________________

                        Thursday, July 29th

          Today in 1958 - National Aeronautics and Space Administration
                          was authorized by Congress.

___________________________________Oooo_____________________________________
                            oooO  (    )
                           (    )  )  /
                            \  (   (_/
                             \_)
