HP3000-L Archives

June 1995, Week 2

HP3000-L@RAVEN.UTC.EDU

Subject:
From:          Guy Smith <[log in to unmask]>
Reply-To:      Guy Smith <[log in to unmask]>
Date:          Tue, 13 Jun 1995 15:52:47 EDT
Content-Type:  text/plain

TI questions:  Our largest production database is driven by
an eight character ASCII field.  We have both automatic and
manual masters using this key as their primary index.  So
far so good.
 
The problem is that the algorithm used to generate this key
goes back to the days when we managed datasets with a few
thousand records, not a few million.  The key is basically a
capital letter followed by a six-digit serial number and a
check digit.  The letter increments after the serialization
is finished.
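
For illustration, here is a quick C sketch of the key layout
just described.  The simple mod-10 check digit below is only
a stand-in; the real check-digit routine doesn't matter for
this discussion.

#include <stdio.h>

/* Sketch only: capital letter + six-digit serial + check
   digit.  The mod-10 digit-sum check digit is an assumed
   placeholder, not our actual algorithm. */
static int check_digit(long serial)
{
    int sum = 0;
    while (serial > 0) {
        sum += (int)(serial % 10);
        serial /= 10;
    }
    return sum % 10;
}

static void make_key(char letter, long serial, char key[9])
{
    /* eight characters plus the terminating null */
    sprintf(key, "%c%06ld%d", letter, serial,
            check_digit(serial));
}

int main(void)
{
    char key[9];
    make_key('A', 123456L, key);
    printf("%s\n", key);    /* prints A1234561 */
    return 0;
}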
 
I'm beginning to believe that this key is less than
efficient.  Some master secondary lists go to 80+ records
(avg. chain about 1.7, 30% secondaries on datasets 60-80%
full).  I suspect this explains some of the disc queues we
have witnessed, especially on the masters where we are not
caching in just the key values (and has anyone discovered a
good way of verifying dataset disc usage, given that mapped
file access has eliminated the usefulness of FILERPT/RedWood
for measuring TI IO?).
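
As a rough yardstick, the textbook approximation for a
perfectly uniform hash (an assumption, not anything measured
from IMAGE itself) puts the expected share of secondaries at
fill factor a at 1 - (1 - e^-a)/a.  A few values:

#include <math.h>
#include <stdio.h>

/* Idealized yardstick only: assumes entries hash uniformly
   over the master's capacity.  At fill factor a (entries /
   capacity) the expected fraction of secondaries is
   1 - (1 - exp(-a)) / a. */
int main(void)
{
    static const double fill[] = { 0.60, 0.70, 0.80 };
    int i;

    for (i = 0; i < 3; i++) {
        double a = fill[i];
        double sec = 1.0 - (1.0 - exp(-a)) / a;
        printf("%.0f%% full: roughly %.0f%% secondaries\n",
               a * 100.0, sec * 100.0);
    }
    return 0;
}

By that yardstick, 30% secondaries at 60-80% full is roughly
what a uniform hash would give, so the 80+ record chains look
more like clustering on particular addresses than a symptom
of fill alone.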
 
Our applications programmers note that there is nothing
magical about the internal structure of the key.  No program
expects or dissects the format, so provided we stick with an
eight character field, we should be able to convert to
something more efficient with only minor mayhem.
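
Purely as an illustration of the sort of in-place conversion
I mean (and not a proposal), something as simple as reversing
the character order keeps the field at eight characters while
making consecutive serials differ in their leading positions;
whether anything like that actually helps the hashing is part
of what I'm asking below.

#include <stdio.h>

/* Hypothetical example only: derive a new eight-character
   key from the old one by reversing its characters, so that
   consecutive serial numbers no longer share a long common
   prefix.  One conceivable transformation among many. */
static void convert_key(const char old_key[9], char new_key[9])
{
    int i;

    for (i = 0; i < 8; i++)
        new_key[i] = old_key[7 - i];
    new_key[8] = '\0';
}

int main(void)
{
    char new_key[9];

    convert_key("A1234561", new_key);
    printf("%s\n", new_key);    /* prints 1654321A */
    return 0;
}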
 
So here are the questions, loaded as they are:
 
1)  Assuming the eight character restriction, what would be
the most efficient key we could design (note that key
duplication is not an option)?  We are in complete control
of the code, so any algorithm that can be coded could be
implemented.  Assume that we only have odd weekends for
capacity changes and might have to suffer *very* full
datasets.
 
2) How would the proposed key design affect chains, assuming
that we do no data conversion on the old keys, but instead
start generating the new key and adding it to the existing
datasets?
 
3) I've tended towards separating active automatics onto
different spindles under the assumption that removing
contention between datasets would reduce potential disc
queues.  Again, this assumption dates back to the days of
small datasets and slow drives.  I'm beginning to believe
that it would be better to take these masters and evenly
spread each across all spindles in the volume set.  Anyone
else with large datasets have experience or opinions on
this?
 
4) How did I even get by without HP3000-L?
 =======================================================================
Guy Smith                                Voice:  804-527-4000 ext 6664
Circuit City Stores, Inc.                  FAX:  804-527-4008
9950 Mayland Drive                      E-Mail:  [log in to unmask]
Richmond, VA 23233-1464         Private E-Mail:  [log in to unmask]
 
The thoughts expressed herein are mine and do not reflect those of my
employer, or anyone with common sense.
