HP3000-L Archives

November 1996, Week 1

HP3000-L@RAVEN.UTC.EDU

From: Stan Sieler <[log in to unmask]>
Date: Fri, 1 Nov 1996 09:16:46 -0800

Hi all,

I was filling out my responses to the survey, and noticed that on a large
number of them my response was:

    We already have that in the HP3000 community (usually from
    a third party vendor)....why spend effort duplicating & competing
    instead of innovating?

BACKUP:
   Things addressed by other vendors:
      >  Unattended Backup (stacker devices,       |
      >  Faster Backup and Recovery devices        |
      >  Higher Capacity backup and recovery       |
      >  Device Library & Media Mgmt. to Manage    |
      >  Remote Backup of HP 3000 to and from      |

   Things not addressed:
      >  Shared Tape devices between multiple hosts|_________ |_________

   So, for me the vote is simple in this area: put resources into
   useful areas where HP can provide new functionality.
   That would first be the "Shared Tape devices", followed by opening up
   the I/O architecture to make it easier to add third party I/O devices
   (Faster & Higher capacity)

HIGH AVAILABILITY

   This section was the best for offering options where HP could
   contribute new functionality.

>  Software Mirroring of System Volume Set   |

      Aside from a few debatable comments about boot-up problems to
      be solved, this sounds like a reasonably straightforward item
      to implement.

>  Improved Update Time (Stage/iX for new    |

      An item that requires HP's involvement.

>  Online Database Maintenance (Dynamic      |

      An item already promised and being worked on

>  Online Abort of Hung Jobs & Sessions      |_________ |_________

      Ah...Dole's 15% tax cut promise ... sounds good, but how you gonna do it?
      :)

      I won't mention that the ABORTJOB command already provides a means
      of aborting jobs/sessions online. :)

      The basic reason some jobs/sessions can't be aborted is that they
      are in a state where the abort request is postponed.  Sure, we
      could implement a Unix-like "kill -9 <processID>", where we would
      say "blow it away, no matter what"...but that generally wouldn't
      really solve the underlying problem, and would almost certainly cause
      other problems shortly afterwards.
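
      As a rough illustration of the postponed-abort state described
      above, here is a toy model in Python. It is not MPE/iX internals:
      the Process class, the critical_depth counter, and the method
      names are all hypothetical, chosen only to show why an abort
      request can sit unserviced forever when a process hangs inside a
      region where aborts are deferred.

```python
class Process:
    """Toy process whose abort requests are deferred inside critical regions."""

    def __init__(self, pid):
        self.pid = pid
        self.critical_depth = 0     # > 0 while aborts must be postponed
        self.abort_pending = False  # an abort arrived while postponed
        self.alive = True

    def enter_critical(self):
        self.critical_depth += 1

    def leave_critical(self):
        self.critical_depth -= 1
        # A postponed abort is honoured only when the last region is left.
        if self.critical_depth == 0 and self.abort_pending:
            self.alive = False

    def request_abort(self):
        if not self.alive:
            return "already gone"
        if self.critical_depth > 0:
            self.abort_pending = True
            return "postponed"
        self.alive = False
        return "aborted"


hung = Process(42)
hung.enter_critical()           # the process hangs here and never leaves
status = hung.request_abort()   # the request is recorded, not acted on
```

      In this model a forced kill would amount to setting alive to
      False without waiting for leave_critical, which leaves whatever
      the critical region was protecting in an unknown state.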

      I suggest we need to look at the underlying reasons this request is
      often seen.  These include:

         1) the hung job has a tape drive allocated, and we want to use that
            drive for something else.

            Ok...how about a command/utility like "deallocate tape".

            To be *safe*, it would probably have to introduce a bitmap where
            there is one bit per possible job number, and one bit per
            possible session number.  When we say "deallocate ldev=7 job=#J9",
            all pending I/Os to ldev 7 would be aborted, and bit 9 of
            ldev 7's "job deallocate map" would be set.  Then, even if
            one (or more) of the hung processes in job #j9 wake up later,
            and try to access the drive, sendio would reject the attempt
            because of the bitmap bit.

         2) the hung job has a file opened in some manner that prevents
            other processing that needs the file.

         3) the hung job is still applied against the LIMIT value,
            perturbing the normal flow of jobs.

            Ok, modify the job/session logic to say "we'll pretend this
            job/session is gone, and set a bit in a table somewhere so that
            if it ever un-hangs and terminates, we won't double decrement
            the number of current jobs/sessions by accident".

         4) the hung job is using CPU resources more than we'd like.

            Ok...how about a "remove job from CPU scheduling" command/utility?
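
      Items 1 and 3 above both rely on a small amount of bookkeeping
      that can be sketched in a few lines. The Python below is a
      minimal sketch under stated assumptions, not a description of
      how MPE/iX works: Device, send_io, JobLimiter, and write_off are
      hypothetical names, only the job bitmap is modelled (sessions
      would get a second map of the same shape), and a Python set
      stands in for the "bit in a table" of item 3.

```python
class Device:
    """Toy logical device with a per-device 'job deallocate map'."""

    def __init__(self, ldev):
        self.ldev = ldev
        self.pending = []       # queued (job_number, request) pairs
        self.job_dealloc = 0    # bit n set => job #Jn lost this device

    def deallocate(self, job_number):
        """Abort all pending I/Os and set the job's bit in the map."""
        aborted, self.pending = self.pending, []
        self.job_dealloc |= 1 << job_number
        return aborted

    def send_io(self, job_number, request):
        """Reject I/O from a job whose bit is set; otherwise queue it."""
        if (self.job_dealloc >> job_number) & 1:
            return False
        self.pending.append((job_number, request))
        return True


class JobLimiter:
    """Toy job counter that can write off a hung job exactly once."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0
        self.written_off = set()    # hung jobs already removed from the count

    def start(self, job_number):
        if self.current >= self.limit:
            return False
        self.current += 1
        return True

    def write_off(self, job_number):
        """Pretend the hung job is gone, remembering that we did so."""
        if job_number not in self.written_off:
            self.written_off.add(job_number)
            self.current -= 1

    def terminate(self, job_number):
        """Normal termination; skip the decrement if already written off."""
        if job_number in self.written_off:
            self.written_off.discard(job_number)
        else:
            self.current -= 1


tape = Device(7)
tape.send_io(9, "write record")     # job #J9 queues an I/O, then hangs
tape.deallocate(9)                  # like "deallocate ldev=7 job=#J9"
late = tape.send_io(9, "rewind")    # #J9 wakes up later: rejected
```

      The sketch never clears a bit once it is set; when that would be
      safe (for example, once the job finally terminates) is a design
      question the text above leaves open.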

>  Reduce System Interruptions due to        |
>      Network Failures & Hangs              |_________ |_________

      Sounds great, but I suspect it's similar to the "abort a job"
      item...probably harder to achieve than one would think (otherwise
      Eero & friends would have done it for us already :)

>  Immediate switching of volume sets to 2nd |
>      system in case of system failure      |_________ |_________

      This needs a better description, as I can see 3 explanations for
      it:

         1) allow the disks of MPEXL_SYSTEM_VOLUME_SET (let's call it SYS)
            to be moved from one computer and then connected (presumably by
            uncabling & recabling) to a new computer, as a new & separate
            volume set (not SYS)

            This would have repercussions on the design of volume sets,
            particularly the group/account directory stuff.
            However, it might lead to much needed rethinking of that aspect
            of volume sets.

         2) allow the disks of some volume set (other than SYS) to be
            moved (ditto) to another computer.   Note that this would require
            either work in volume set design, or that the second computer
            already have the group/accounts set up.

            Been there, done that.  This is already feasible today, with
            no extra hardware, no extra software.

         3) Design/build/sell a fancy hardware switch that allows
            specific individual disk drives on a SCSI (or HPIB?) chain
            to be switched from one computer to another.
            Note that this still requires the same software that #1 and #2 do.

Stan Sieler                                          [log in to unmask]
                                     http://www.allegro.com/sieler.html
