Date: Mon, 8 Jul 1996 12:48:44 -0400
Content-Type: text/plain
In a message dated 96-07-06 16:37:47 EDT, [log in to unmask] (Bill Lancaster)
writes:
<< I remember a paper several years ago by Dave Merit, who, at the time, was
at Bradmark. His studies showed, and were confirmed by others, that the
benefit of using a prime number wasn't that it was the best capacity;
rather, it was never the absolute worst in the range. So, in the absence of
doing master set capacity sampling, pick the prime. Just keep in mind that
if you desire better performance, you will want to do some sampling.
Also keep in mind that sampling is generally slow and does take resources to
perform.
>>
Actually, the first time I heard the debunking of primes was at the 1984
Anaheim conference. Someone gave a paper on the subject and showed that
while primes were usually (perhaps even always) acceptable, they were not
usually the best value for a given range. The paper went on to show that
powers of 2 were to be avoided at all costs. I do not believe that David
Meritt was at Bradmark in 1984, nor do I remember Bradmark as a player in the
industry at that time.
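The point about powers of 2 is easy to demonstrate with a small sketch. The
snippet below is purely illustrative (it uses simple modulo hashing on an
arithmetic run of keys, which is an assumption on my part, not IMAGE's actual
hashing algorithm): keys with a common stride collapse onto a fraction of the
buckets when the capacity shares a factor with the stride, which powers of 2
almost always do, while a nearby prime spreads them with no synonyms at all.

```python
def synonym_count(keys, capacity):
    """Count keys that land in an already-occupied bucket under key % capacity."""
    used = set()
    synonyms = 0
    for k in keys:
        h = k % capacity
        if h in used:
            synonyms += 1
        else:
            used.add(h)
    return synonyms

# 500 keys with stride 8 -- the kind of pattern real account numbers often have.
keys = [1000 + 8 * i for i in range(500)]

for capacity in (1024, 1021, 1009):  # a power of 2 versus two nearby primes
    print(capacity, synonym_count(keys, capacity))
# -> 1024 372   (only 128 distinct buckets ever get used)
# -> 1021 0
# -> 1009 0
```

With capacity 1024, every key is a multiple of 8 modulo 1024, so only 128
buckets are reachable and 372 of the 500 keys become synonyms; the primes are
coprime to the stride and produce none. That is the "never the absolute
worst" property in action, even though some non-prime capacities would do
equally well here.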
One last comment about sampling. Whilst it is an acceptable strategy for
hashing keys, it is useless for binary keys. With binary keys, you should be
able to calculate/deduce an exact capacity whereby there are no synonyms at
all even when all entries are used. Fred White wrote an excellent paper on
the subject a few years back and I also published some stuff on the subject
about 18 months ago in the HP Croc.
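To illustrate the calculate/deduce point for binary keys: if the keys form an
arithmetic sequence (a common case for assigned numbers), the smallest
synonym-free capacity can be computed directly rather than sampled. The
helper below is my own sketch of that idea, not Fred White's method; it
assumes simple modulo placement of an integer key.

```python
from math import gcd

def synonym_free_capacity(start, stride, count):
    """Smallest capacity >= count such that `count` keys of the form
    start + stride*i (i = 0..count-1) map to distinct buckets under
    key % capacity.

    The residues cycle with period capacity // gcd(stride, capacity),
    so they are all distinct exactly when that period is >= count.
    """
    cap = count
    while cap // gcd(stride, cap) < count:
        cap += 1
    return cap

# 500 keys starting at 1000 with stride 8: capacity 501 suffices,
# well below the nearest prime one might otherwise have picked.
print(synonym_free_capacity(1000, 8, 500))  # -> 501
```

Note that 501 = 3 x 167 is not prime, which is exactly the original point:
for binary keys an exact, often non-prime, capacity with zero synonyms can be
deduced from the key structure alone, with no sampling cost.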
Kind regards,
Denys. . .