HP3000-L Archives

April 1995, Week 1

HP3000-L@RAVEN.UTC.EDU

From: Stan Sieler <[log in to unmask]>
Date: Sat, 1 Apr 1995 23:55:48 GMT

Jerry Fochtman ([log in to unmask]) wrote:
: On Wed, 3/15, Tony Furnival <[log in to unmask]> published a list of
: the SIGMPE enhancements, including:
 
: > SIG ID  Description
: > 95P04   Need ability  to break IMAGE deadly embrace
 
: I would recommend dropping this item from the SIGMPE list based upon:
 
:   1)  I don't believe this is an MPE issue, but rather belongs to
:       IMAGE/SQL.
 
Actually, I think this is a definite MPE problem.  Deadly embraces
are generally caused by code that waits on semaphores of one kind or
another.  Within the operating system, that code is generally a routine
called "cb_lock" (or a variant).
 
None of these routines has a timeout parameter.  I think that if
semaphore locking had such a parameter, and returned "failed" once that
amount of time had elapsed, we'd never again have a deadly embrace.
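
A minimal sketch of the idea, in Python rather than anything
MPE-specific.  The name lock_with_timeout is hypothetical; it only
illustrates a lock call that reports failure instead of waiting forever,
using the timeout argument of Python's threading.Lock.acquire.

```python
import threading


def lock_with_timeout(lock: threading.Lock, timeout_s: float) -> bool:
    """Try to take `lock`; report failure once `timeout_s` seconds have passed.

    Hypothetical stand-in for a semaphore lock with a timeout parameter.
    Returns True if the lock was obtained, False if the wait timed out.
    """
    return lock.acquire(timeout=timeout_s)


sem = threading.Lock()

# Uncontended: the lock is granted at once.
first_try = lock_with_timeout(sem, 0.1)

# The lock is already held (threading.Lock is not reentrant), so this
# attempt waits for the timeout and then reports failure instead of hanging.
second_try = lock_with_timeout(sem, 0.1)

sem.release()
```

The caller gets a plain success/failure result, so a wait that would
otherwise last forever becomes a bounded wait followed by an error the
caller has to handle.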
 
Consider, for example, if IMAGE used such a timeout when doing a DBLOCK,
and if it set the timeout to 10 minutes.  The worst that would happen
if two MR-capable programs locked databases A and B (in one case) and
B and A (in the other case) is a 10-minute lockout of the databases ...
and then each program's second DBLOCK would return with a new IMAGE
error ("probable deadlock prevented").
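
The A/B versus B/A scenario above can be sketched with two Python
threads.  Everything here is an illustrative assumption: the locks
db_a and db_b stand in for the two databases, TIMEOUT_S stands in for
the 10 minutes, and run_program and demo are hypothetical names.  This
is not IMAGE code.

```python
import threading

DEADLOCK_PREVENTED = "probable deadlock prevented"
TIMEOUT_S = 0.2  # stands in for the 10-minute timeout in the text


def run_program(name, first, second, both_hold_first, results):
    """Lock `first`, then try `second` with a timeout (hypothetical sketch)."""
    with first:                                 # first lock is uncontended
        both_hold_first.wait()                  # force the deadly-embrace ordering
        if second.acquire(timeout=TIMEOUT_S):   # second lock, bounded wait
            second.release()
            results[name] = "ok"
        else:
            results[name] = DEADLOCK_PREVENTED
    # Leaving the `with` block releases the first lock in either case.


def demo():
    db_a, db_b = threading.Lock(), threading.Lock()
    barrier = threading.Barrier(2)
    results = {}
    p1 = threading.Thread(target=run_program,
                          args=("P1", db_a, db_b, barrier, results))
    p2 = threading.Thread(target=run_program,
                          args=("P2", db_b, db_a, barrier, results))
    p1.start()
    p2.start()
    p1.join(timeout=5)
    p2.join(timeout=5)
    still_stuck = p1.is_alive() or p2.is_alive()
    return results, still_stuck
```

In this sketch both threads always finish.  At least one of them sees
the timeout error; the other either sees it too or gets its second lock
once the first has backed off and released what it held.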
 
Additionally, *mandatory* use of such a parameter by the operating
system would preclude deadlocks between any combination of
things like: FLOCK, DBLOCK, GETSIR, cb_lock, OBTAIN.
 
Yes, converting to this scheme would cause a bit of work for the lab :)
but, if we added a non-omittable timeout parameter to cb_lock, GETSIR,
and OBTAIN, and made FLOCK & DBLOCK have implied timeout values (perhaps
values found in system_globals, and perhaps defaulting to 30 minutes),
the compilers would assist the lab in quickly finding the source code
that needs updating, since every call still missing the new parameter
would fail to compile.
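
A loose Python analogy of that split, under stated assumptions: the
functions below merely borrow the names cb_lock and dblock and are not
the MPE or IMAGE routines, and SYSTEM_GLOBALS is a hypothetical
dictionary standing in for system_globals.  Python is not compiled, so
an omitted required argument shows up as a TypeError at call time
rather than as a compile error.

```python
import threading

# Hypothetical stand-in for a value kept in system_globals (30 minutes).
SYSTEM_GLOBALS = {"default_lock_timeout_s": 30 * 60}


def cb_lock(lock: threading.Lock, *, timeout_s: float) -> bool:
    """Internal lock: the caller must state a timeout; there is no default."""
    return lock.acquire(timeout=timeout_s)


def dblock(lock: threading.Lock) -> bool:
    """User-facing lock: signature unchanged, timeout implied from globals."""
    return cb_lock(lock, timeout_s=SYSTEM_GLOBALS["default_lock_timeout_s"])


def call_without_timeout(lock: threading.Lock) -> str:
    """Show what happens to an old-style call that omits the timeout."""
    try:
        cb_lock(lock)  # missing the required keyword argument
    except TypeError:
        return "rejected"
    return "accepted"
```

Existing callers of the user-facing routine keep working unchanged,
while every internal call site that has not been updated is flagged
mechanically instead of having to be found by reading the source.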
 
The really major chunk of work in many cases is adding appropriate error
handling to the source code of the operating system.  In the case of
DBLOCK and FLOCK, there is probably a fairly obvious path to take when
the internal cb_lock fails (due to a timeout).
But in many parts of the operating system there isn't such a path...
because the authors (including me) of code that called cb_lock never
expected it to fail...because it *couldn't* fail, by design.
 
Anyway...this is a real MPE issue, not just a SIGImage issue.
 
Stan Sieler
[log in to unmask]
