LISTSERV - HP3000-L Archives

October 1997, Week 4

HP3000-L@RAVEN.UTC.EDU

Subject:	Re: New B-tree feature
From:	Wirt Atmar
Date:	Sat, 25 Oct 1997 18:01:25 -0400

I wrote earlier in reply to Ken Paul:

> The reason that IMAGE uses the right-most 32 bits is because that is where
> most of the information lies for an integer number. Integer-based formats
> begin counting in the right-most (least significant) bit. Real numbers, on
> the other hand, begin counting at the other end, more or less at the
> left-most (most significant) bits -- and that's where the greatest amount of
> "information" (unexpectedness) lies for a real number. IMAGE, unfortunately,
> doesn't discriminate between reals and integers and always hashes its
> address values based on the right-most bits, leaving us with the R4 problem.

Please allow me to add just a note or two about the problem with real
numbers.

After I wrote the above note, I read Ken Paul's comments last night in the
3000 News/Wire about the current limits in the size of IMAGE datasets. In the
News/Wire's October Flash Paper, Ken is reported to have said:

"Paul believes the most critical to-do item for the [IMAGE] labs is one of 'a
major need to address the current limitation issue. This will probably be my
crusade for the next several years.' The biggest HP3000 customers are
encountering real limits in databases, and the IMAGE lab, Paul believes 'will
probably do a small fix by making the block pointer an unsigned number
instead of a signed number, which will double the true limit of an IMAGE
dataset. But [Adager colleague] Fred White and I would like to see IMAGE
convert all pointers to 32-bit or even 64-bit and get it over with. This
would require a similar program like DBCONVert, which would also make it
difficult to move databases from one machine on a new version of IMAGE to a
machine on an older version of IMAGE. But we feel that the Lab will have to
bite the bullet sooner or later.'"

I wholly agree with what Ken and Fred have to say. Converting IMAGE and the
file system into a flat address space large enough to hold any imaginable
amount of information is clearly going to have to be one of the primary goals
of the next few years. Databases are going to grow very large during the next
several decades. "Chunking," as was done with Jumbo datasets, can only be
considered a clumsy, stopgap solution. And doubling dataset sizes by
gathering a bit here and there will only hold things off for a few years, at
most.
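
As a rough illustration of why reclaiming a sign bit only doubles the limit,
here is a minimal Python sketch. The field widths are hypothetical and chosen
only for illustration; neither the post nor the quoted article states the
actual width of IMAGE's block pointer.

```python
def max_pointer(bits: int, signed: bool) -> int:
    """Largest non-negative value that fits in a pointer field of `bits` bits."""
    # A signed field gives up one bit to the sign; an unsigned field uses them all.
    return (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1


# Hypothetical pointer widths, not taken from IMAGE itself.
for bits in (16, 24, 32, 64):
    s = max_pointer(bits, signed=True)
    u = max_pointer(bits, signed=False)
    print(f"{bits:2d} bits: signed max {s:,}  unsigned max {u:,}  gain x{u / s:.3f}")
```

Going from signed to unsigned buys exactly one bit, which is one doubling.
Widening the field buys one doubling for every added bit, which is the
argument for converting the pointers once rather than gathering bits
piecemeal.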

Making these changes to the internal structure of existing IMAGE databases
will, however, require a second version of the DBCONVERT program. But as long
as such a program is required for one functionality, it can repair other
primary deficiencies as well during its one-time execution.

One of those deficiencies is the real number problem that Ken spoke of, where
"hashing" is done off of the bottom 32 bits of real numbers. ("Hashing" is
used loosely here: the integer and real number conversions that IMAGE uses to
create an address into a master dataset are relatively simple. No matter how
a hash algorithm is implemented, though, the intent is to use the most
non-uniform, "unexpected" portion of the information held in the search item
value and convert it into a relatively uniformly distributed address
somewhere within the current capacity of the master dataset.) As I mentioned
earlier, real numbers longer than 32 bits don't act this way. The
informational portion of a real number lies in its high-order bytes, not its
low-order ones.
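
The effect can be sketched in a few lines of Python. This is an illustration
under stated assumptions, not IMAGE's actual algorithm: it assumes 64-bit
IEEE 754 reals (the HP3000's own real formats may lay out their bits
differently), it stands in for the address calculation with a simple modulo,
and the names low32, high32 and CAPACITY are made up for the example.

```python
import struct


def raw_bits(x: float) -> int:
    """Return the 64-bit pattern of x as an unsigned integer (IEEE 754 double)."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def low32(x: float) -> int:
    """Right-most 32 bits: the tail end of the mantissa."""
    return raw_bits(x) & 0xFFFFFFFF


def high32(x: float) -> int:
    """Left-most 32 bits: sign, exponent and leading mantissa bits."""
    return raw_bits(x) >> 32


CAPACITY = 1009  # hypothetical master dataset capacity

# A thousand whole-number keys stored as long reals (1.0, 2.0, ... 1000.0).
keys = [float(n) for n in range(1, 1001)]

# Stand-in address calculation: take 32 bits, reduce modulo the capacity.
low_addrs = {low32(k) % CAPACITY for k in keys}
high_addrs = {high32(k) % CAPACITY for k in keys}

print("distinct addresses from low-order 32 bits: ", len(low_addrs))
print("distinct addresses from high-order 32 bits:", len(high_addrs))
```

For whole-number keys of this size the low-order 32 bits are all zero, so
every key lands on the same address, while the high-order half still
distinguishes them. Keys with long binary fractions (19.99, say) do have
non-zero low-order bits, so how severe the clustering is depends on the data.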

Because disc space is no longer anywhere near as precious as it once was --
and because the use of real numbers has significant user-friendly advantages
over the use of integer numbers with implied decimal points, it seems only
reasonable to expect the increased use of long reals in the future. Indeed,
we're seeing that trend in all of the new databases that we encounter at
customer sites (except, of course, for those sites that use COBOL as their
primary application development language).

Correcting the poor choice of the "hashing" algorithm that currently exists
within IMAGE for long reals is a desirable thing to do -- and I
would like to add it as a rider to Ken's crusade. If we're going to
convert databases sometime in the near future, we might as well fix
everything we can in one fell swoop.

Wirt Atmar


PS: On a second subject, Cortlandt Wilson has a nice article/opinion piece in
the same issue of the News/Wire on the subject of the still-missing "908",
the "personal mainframe" HP3000. I not only agree with everything Cortlandt
has to say, I have for some time now shamelessly appropriated his "personal
mainframe" phrase, ever since I first heard him say it six months ago. Our best
customers take total psychological ownership of their HP3000s and treat them
exactly as if they were "personal mainframes," regardless of the machine's
actual size or the number of people they have to share it with.

However, over the years I've noticed that taking such total psychological
ownership of an HP3000 is a great deal easier if the machine is smaller
rather than larger -- and that our customers who started on micros and who
now run 987s still treat their 987s as if they were PC-like devices. They can
do anything on these machines that they wish because they are totally in
control. If I had to pick one single reason why I believe the 908 is so
important, it is to create a new generation of people who understand how to
use the HP3000 for imaginative and creative business purposes without getting
too entangled in all of the technical detail.

As I tell all of our customers, if you can run a micro, you can run a 997.
There is no difference between the machines -- other than the size of the
lease payments.
