HP3000-L Archives

February 1995, Week 2

HP3000-L@RAVEN.UTC.EDU

From: Wirt Atmar
Date: Tue, 14 Feb 1995 16:03:00 -0500

Melissa writes:
 
>Network database are not built for joins
>- the relational model is the one _designed_ to
>handle joins. (ref "An Introduction to Database
>Systems", vol. 1,  C.J. Date)
 
There is a certain level of mythology in every aspect of computing, and this
is certainly one of those areas. A network-structured database should be no
worse than a "relational" database in performing high-speed,
easy-to-implement joins, and indeed, in the case of IMAGE, it can be much
better.
 
Chris Date is a very prolific author, but there is no reason either (i) to
fawn over his statements or (ii) to dismiss them as bull, Melissa (please
take no offense at my poor sense of humor). Much of what he says, just like
anyone else, is opinion. And unfortunately, some (too much) portion of what he
says is commercially motivated.
 
The ease and speed with which joins can be accomplished will almost totally lie
in the design of the query language, not in the database file structure
itself. The primary contribution of the database's design structure to the
speed and ease of joining multiple tables will rest in its capacity to find
and isolate records easily and quickly. In that regard, no database structure
that I know of is better than IMAGE.
 
I, too, have a commercial interest in what I say, so my opinions should be
highly tempered with that knowledge. For the last ten years, I have been
designing a query language into our report writer (and, as with most
projects, 90% of the design was accomplished in the first few months ten
years ago). When I began, I did not know SQL, other than in the most general
sense. With the rising interest in SQL and its growing ubiquity, the primary question
that haunted me was, of course: "Did they know something that I didn't? Did
they do it better than we did? Did we really screw up and do this all wrong?"
 
Of the seven deadly sins (lust, gluttony, sloth, etc.), pride is the most
serious.  While our report writer is graphically oriented, and that's what
most people see of it initially, over the years, I have increasingly become
most proud of the decisions we made in the design of the query language of
our product (and thus have assured myself of a ride across the River Styx
into the Kingdom of Hell). We designed the report writer primarily for
business people (CFOs, accountants, bookkeepers, engineers, vice presidents,
etc.) and we tell our customers several things repeatedly, two of which are:
 
   o  If you are familiar with the concept of joins and multifinds, we want
you to forget everything you know. You're making the process far too
complicated if you think in those terms.
 
   o  The answer is always, "Yes."
 
My hubris (the Greek version of "pride goeth before a fall") in all of this
is derived from a number of years' experience now. Our customers, who are for
the most part non-technical people, now tend to believe that a report is not
getting "complicated" until they are extracting information from eight to ten
datasets (tables) -- and a few have generated reports that extract and "join"
information from as many as twenty or thirty datasets in one report -- and
sometimes they don't see that as particularly complicated.
 
And these reports run quickly. As Denys said earlier, the speed of the "join"
will depend on the quality of the optimizer. What makes our report writer so
fast for joins? Really nothing much other than care and the efficient use of
the information provided by the IMAGE database file structure. The primary
technique employed is "perpendicular" file reads. What "perpendicularity"
means is that as we track down the records in one dataset, we use the
qualifying value(s) found in the dataset to perform a highly optimized, but
independent "perpendicular" search in another dataset. And if need be, a
second perpendicular read, and so on.
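
For readers who have not written this kind of code against IMAGE, the idea can be sketched roughly as follows. This is only an illustration in Python, not the report writer's actual implementation: the dataset names (ORDERS, CUSTOMERS, LINE_ITEMS), the fields, and the helper functions are all hypothetical, a plain dictionary stands in for a keyed read of a master dataset, and a dictionary of lists stands in for a detail-dataset chain.

```python
from collections import defaultdict

# Hypothetical sample data, invented for this sketch.
# ORDERS: the dataset we walk first.
# CUSTOMERS: stands in for a master dataset keyed on cust_id.
# LINE_ITEMS: a second detail dataset, chained on order_id.
ORDERS = [
    {"order_id": 1, "cust_id": "C1", "status": "OPEN"},
    {"order_id": 2, "cust_id": "C2", "status": "CLOSED"},
    {"order_id": 3, "cust_id": "C1", "status": "OPEN"},
]
CUSTOMERS = {"C1": {"name": "Acme"}, "C2": {"name": "Globex"}}
LINE_ITEMS = [
    {"order_id": 1, "part": "bolt", "qty": 10},
    {"order_id": 1, "part": "nut", "qty": 10},
    {"order_id": 3, "part": "gear", "qty": 2},
]


def build_chains(records, key):
    """Group detail records by key value, standing in for a chain."""
    chains = defaultdict(list)
    for rec in records:
        chains[rec[key]].append(rec)
    return chains


def perpendicular_join(orders, customers, item_chains):
    """Yield one joined row per line item of every open order."""
    for order in orders:  # primary pass over the first dataset
        if order["status"] != "OPEN":
            continue
        # First perpendicular read: keyed lookup using the value just found.
        cust = customers.get(order["cust_id"])
        if cust is None:
            continue
        # Second perpendicular read: follow the chain for this order.
        for item in item_chains.get(order["order_id"], []):
            yield (order["order_id"], cust["name"], item["part"], item["qty"])


ITEM_CHAINS = build_chains(LINE_ITEMS, "order_id")
rows = list(perpendicular_join(ORDERS, CUSTOMERS, ITEM_CHAINS))
```

The point of the sketch is that each secondary lookup is driven directly by a value already in hand, so no dataset other than the first is ever scanned from end to end.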
 
This is the technique that everyone who programmed reports using IMAGE
intrinsics used 10-15 years ago; we just implemented every trick we knew to
make it as efficient as possible. And it is surprisingly fast. (There is no
easy way to perform comparisons, but we estimate, on average, that our query
language is three to eighty times faster than most relational database
queries, regardless of the commercial source -- and if b-trees are ever
implemented as a hard-core part of IMAGE, that disparity will become
significantly greater.)
 
As I said earlier, there is a lot of mythology in every aspect of computing,
but the notion that "network databases are not well constructed for joins" is
one that you absolutely shouldn't believe.
 
Wirt Atmar
