HP3000-L Archives

December 2000, Week 1

HP3000-L@RAVEN.UTC.EDU

Subject:
From: Wirt Atmar <[log in to unmask]>
Date: Tue, 5 Dec 2000 14:10:04 EST

I'll break netiquette and repeat a private message that was sent to me by
Patrick Mullen this morning:

I earlier wrote:

>  >However, IF you really did want to store BLOBs in IMAGE, you can do that
>  >right now, at least up to a file length of about 4K, which is large enough
>  >for a signature or a small image. To do this, simply create a dataset with
>  >perhaps only two items, a unique ID number for the record and a 4K-long
>  >dataitem. IMAGE doesn't check to see if what you write into this very long
>  >dataitem makes any sense or not. It's just bits to IMAGE, but if your
>  >interpreting program thinks of it as a JPEG file, then that's what it is.
>  >

Patrick wrote in response:

>  hmmm...a detail dataset (BLOB-D) with 2 fields (blob-name & blob-data) one
>  key (blob-name) to an automatic master (BLOB-A)..key field type...doesn't
>  matter..
>
>  the user could write a program to fragment the blob into 'n' pieces of
>  length (blob-data) and assign a blob-name to each piece.  DBPUT could be
>  called to write each record...this way there is, essentially, a way to
>  store blobs of almost any size...
>
>  Upon retrieval, DBGET would retrieve the chain (mode 5) and, with the
>  program's sufficiently large buffer, the inverse algorithm could be
>  employed to re-create the blob...
>
>  What do you think?

I think Patrick's idea is an excellent one, and I'm surprised that I didn't
think of it earlier, because we already do something very similar for other
purposes. Without modifying IMAGE in any way, BLOBs of any size can be put
into an IMAGE database right now -- and in a generally very efficient
manner -- using "chunking."

As I mentioned earlier, BLOBs stored within IMAGE could come to be a
significant problem for backups, simply due to the amount of data that's
going to be stored in the database. But on the other hand, you would have all
of the data security that is intrinsic to IMAGE, and for which Donna was
asking. While Patrick's method would probably not be the first choice for
static BLOB data, it would be excellent for dynamically changing BLOBs. IMAGE
would do an exceptionally good job of managing deleted BLOB data records and
refilling that now unused space appropriately.

Let me expand a little bit on Patrick's idea. If I were to design BLOB
support into an IMAGE database, I would create three datasets: BLOB-ID (an
automatic master), BLOB-MASTER (a detail dataset), and BLOB-DATA (a detail
dataset), in this manner:


BLOB-ID     (automatic master)

   NAMEID          X20   (keyed item)


BLOB-MASTER (detail dataset)

   NAMEID          X20   (search item)
   FILETYPE        X4
   DESCRIPTION    4X80
   BYTECOUNT       I2


BLOB-DATA   (detail dataset)

   NAMEID          X20   (search item)
   SEQNUMBER       I1    (sort item)
   LENGTH          I1
   DATAFIELD       I2000


The FILETYPE dataitem in BLOB-MASTER would be used to store the BLOB's file
type (txt, jpeg, gif, midi, etc.), while the BYTECOUNT variable would be used
to store the length of the complete BLOB. DESCRIPTION would obviously be used
to hold a verbal description of the BLOB, while NAMEID would simply be a
guaranteed unique ID of your creation.
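To make the shape of a BLOB-MASTER record concrete, here is a minimal sketch in Python, purely for illustration -- on the 3000 the record would actually be written with the IMAGE DBPUT intrinsic, which is not shown, and the function name is my own invention:

```python
# Hypothetical sketch: assembling the BLOB-MASTER fields for one BLOB.
# Field names and sizes mirror the schema above; the actual write would
# go through the IMAGE DBPUT intrinsic (not shown here).

def make_master_record(nameid: str, data: bytes, filetype: str,
                       description: str) -> dict:
    """Return the BLOB-MASTER fields for one complete BLOB."""
    assert len(nameid) <= 20      # NAMEID is X20
    assert len(filetype) <= 4     # FILETYPE is X4 (txt, jpeg, gif, ...)
    return {
        "NAMEID": nameid,
        "FILETYPE": filetype,
        "DESCRIPTION": description[:320],  # 4X80 = 320 bytes total
        "BYTECOUNT": len(data),            # length of the complete BLOB
    }
```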

The SEQNUMBER field in the BLOB detail dataset would be used to load and
guarantee extraction of the BLOB chunks in order.

The LENGTH variable would be used to store the length of the data in the
current record's DATAFIELD. If the record was an intermediate chunk of the
BLOB, the LENGTH might be represented as a -1, indicating that the DATAFIELD
is not only full, but that more is to come. If the record was the last chunk
in the sequence, a positive-valued LENGTH would indicate not only that end
but also how much of this final chunk is to be read.
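The chunking and reassembly logic just described can be sketched as follows, in Python purely for illustration -- on the 3000 the records would actually be written and read with the DBPUT/DBGET intrinsics (a mode-5 chained read for retrieval), which are not shown, and the function names are hypothetical:

```python
# Sketch of the chunking scheme: LENGTH is -1 for every full
# intermediate chunk, and holds the byte count of the final chunk.

CHUNK = 4000  # bytes per DATAFIELD (I2000 = 2000 halfwords)

def split_blob(nameid: str, blob: bytes):
    """Yield BLOB-DATA records in SEQNUMBER order."""
    pieces = [blob[i:i + CHUNK] for i in range(0, len(blob), CHUNK)]
    for seq, piece in enumerate(pieces, start=1):
        last = (seq == len(pieces))
        yield {
            "NAMEID": nameid,
            "SEQNUMBER": seq,
            "LENGTH": len(piece) if last else -1,
            "DATAFIELD": piece.ljust(CHUNK, b"\x00"),  # pad to full size
        }

def join_blob(records) -> bytes:
    """Reassemble a BLOB from records retrieved along the NAMEID chain."""
    out = bytearray()
    for rec in sorted(records, key=lambda r: r["SEQNUMBER"]):
        if rec["LENGTH"] == -1:
            out += rec["DATAFIELD"]                  # full intermediate chunk
        else:
            out += rec["DATAFIELD"][:rec["LENGTH"]]  # final, partial chunk
    return bytes(out)
```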

While I specified a 4000-byte long DATAFIELD in this example (basically the
largest that you can use in IMAGE), if the average size of your BLOBs were
substantially smaller than that, a smaller size would be more efficient and
warranted.

But if the BLOBs were larger, say perhaps on average a 200-kilobyte BLOB
(which is actually a fairly large image), a 4000-byte datafield would result
in only 50 sequential records. Nonetheless, there is a point at which this
process should stop. A very high-resolution chest x-ray (say, 10 Mbytes)
would result in 2500 sequential records. It's these kinds of records that
should probably be stored outside of the HP3000, on some sort of easily
addressable, high-capacity disc storage device.
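Those record counts follow from simple ceiling division; a quick sanity check (illustrative Python, with a helper name of my own):

```python
import math

def records_needed(blob_bytes: int, chunk: int = 4000) -> int:
    """Number of BLOB-DATA records a BLOB of the given size requires."""
    return math.ceil(blob_bytes / chunk)

# 200 KB image -> 50 records; 10 MB chest x-ray -> 2500 records
```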

All in all, I think this is actually a very nice idea, and for the
appropriate circumstances, it should prove very useful.

Wirt Atmar
