HP3000-L Archives

July 1995, Week 4

HP3000-L@RAVEN.UTC.EDU

Subject:
From:
Jeff Kell <[log in to unmask]>
Reply To:
Jeff Kell <[log in to unmask]>
Date:
Sat, 22 Jul 1995 22:37:17 EDT
On Fri, 21 Jul 1995 21:47:15 GMT Neil Kelly said:
>Can't answer your questions re. Posix perf but I can tell you why I was
>trapping from the shell.  I found the error to occur when disk space became
>highly fragmented.  Using the volutil contigvol command can help you generate
>needed contiguous disk space.
 
You are correct, but imprecise.  Until this incident, "fragmented" has meant
a lack of contiguous free space on the requested object (volume set, class,
etc).  Fragmentation was not a problem unless ALL candidates were fragmented.
So why does this Posix malloc() trap (traps 68) seem to croak when only ONE
volume is "fragmented"?
 
I have previously reported problems with malloc()/fork()/exec()/shell commands,
but narrowing it down, the common thread is still malloc(), which is called
internally by all of the above.
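
For anyone trying to narrow this down on their own system, a minimal C sketch
along these lines might help separate a malloc() failure from a fork() failure.
The helper names probe_malloc and probe_fork are made up for illustration, and
this is plain POSIX C with nothing MPE/iX-specific in it; it only reports errno
when a call fails cleanly and cannot catch or explain the trap itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: attempt one malloc() of `bytes` and touch the memory.
 * Returns 0 on success, otherwise an errno value (ENOMEM if none was set). */
int probe_malloc(size_t bytes)
{
    assert(bytes > 0);          /* malloc(0) may legally return NULL */
    errno = 0;
    void *p = malloc(bytes);
    if (p == NULL)
        return errno ? errno : ENOMEM;
    memset(p, 0, bytes);        /* make sure the pages are really usable */
    free(p);
    return 0;
}

/* Hypothetical helper: fork a child that exits at once, then reap it.
 * Returns 0 on success, otherwise the errno from fork() or waitpid(). */
int probe_fork(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return errno;           /* fork itself failed, e.g. EAGAIN or ENOMEM */
    if (pid == 0)
        _exit(0);               /* child: do nothing and leave */

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return errno;       /* a real waitpid() error */
    }
    return 0;
}
```

Calling probe_malloc() first and probe_fork() second, and printing
strerror() of any non-zero result, would at least show which of the two
calls is the one reporting trouble.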
 
>My understanding is that the problem was caused by process management not
>receiving an error condition correctly from vsm during a fork.  A patch,
>MPEHXB5, is available.  The patch causes the error condition to be processed
>correctly.  The result is that the fork fails rather than the malloc.
 
MPEHXB5 simply replaces the fragmentation-induced trap with an unfounded error
message (Resource busy, try again).  I experienced these aborts with >500K
sectors of contiguous free space available to transient, and 1.5M sectors of
aggregate free space available to transient.  And after installing MPEHXB5,
I was getting "Resource busy, try again" in the same environment.
 
Abort, error message, whatever.  You're treating a symptom, not the disease.
This started with the "new" disc space allocation business that tries to keep
ldev 1 free (coincidence?).  Find whatever piece of code MPEHXB5 patched to
intercept the trap and work backwards.  The trap is a false alarm.
 
Sorry for soap-box mode, but this one bit me hard and I guess I have a
microchip on my shoulder :-).  If you are having this error, I'd suggest
you refuse to do the recommended CONTIGVOL (if you are not REALLY fragmented)
and escalate your problem report.  After I "fixed" my system (w/CONTIGVOL on
one isolated system volume that was fragmented/full) I was brushed off since
the "symptom" went away.
 
[\] Jeff Kell <[log in to unmask]>
