HP3000-L Archives

February 2004, Week 3

HP3000-L@RAVEN.UTC.EDU

Subject:
From:
Walter Murray <[log in to unmask]>
Reply To:
Walter Murray <[log in to unmask]>
Date:
Wed, 18 Feb 2004 23:18:52 -0600
 [Moving this article to a thread of its own:]

"John Clogg"  wrote:
> Joe Andress asks:
> >Given that a record has binary data, regardless of the field definition,
> >what will happen when (due to bad luck) 2 bytes of binary data just happens
> >to equate to a record termination byte sequence of line feed, carriage
> >return?
>
> I haven't seen anyone offer an answer to this, so here is my understanding
> of the answer, based on rather sketchy knowledge of the subject.  Others
> will undoubtedly correct me if I misstate anything:
>
> Although the file system on Unix does not manage records as such, your
> program can still do so.  In other words, if your program defines a record
> length of 100 bytes, each READ you do will retrieve the next 100 bytes of
> data from the file.  The file system may not regard that as a record, but
> you really don't need it to; you are responsible for managing your own data.
> Demarcating the end of a record with a carriage return is only necessary
> when accepting free-form input, which will be ASCII data and should not
> contain "accidental" return characters, or when outputting data to a printer
> or terminal, which is also ASCII.
>
> The shortened version of the above is that binary files do not use a
> carriage return character (or CR-LF) as a record separator.  I'm not sure
> how variable-length records are handled.

This opens a can of worms, and you really have to understand how your
operating system and file system handle this issue.  It's definitely
something you have to be aware of when you have binary (COBOL COMP) data
items in your file records.

Using the terminology of the C language, there are text files and binary
files, and there are text streams and binary streams.  In MPE
record-oriented (non-byte-stream) terms, text files and streams correspond
to files built as ASCII; binary files and streams correspond to files built
as BINARY.  In POSIX/iX, as in UNIX in general, there is no distinction
between text and binary in this regard.  It's meaningless to ask the
operating system whether a file is text or binary.

So, in HP-UX, if you ask wc(1) how many "lines" are in a file, or you ask
vi(1) to open a file for editing, how does it know the difference between a
carriage-return and/or line-feed used to mark the end of a line, and a CR
and/or LF that happens to be embedded in a binary number?  It doesn't!
Programs that deal with lines treat any LF character as the end of a line.
Your program has to know where a record ends, based on its knowledge of the
structure of the record.
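
To make that concrete, here is a minimal sketch in Python.  The record
layout (a 4-byte ASCII name followed by a 16-bit big-endian integer, standing
in for a COBOL COMP item) and every name in it are assumptions made up for
illustration.  It shows that counting LF bytes says nothing about how many
records there are, while slicing by the known record length recovers them.

```python
import struct

# Hypothetical layout: 4 ASCII bytes + one 16-bit big-endian binary integer.
RECORD_FORMAT = ">4sh"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)  # 6 bytes, no padding with ">"


def pack_records(rows):
    """Concatenate fixed-length records with no separator between them."""
    return b"".join(struct.pack(RECORD_FORMAT, name, qty) for name, qty in rows)


def unpack_records(data):
    """Recover records purely from the known record length."""
    return [
        struct.unpack(RECORD_FORMAT, data[i:i + RECORD_SIZE])
        for i in range(0, len(data), RECORD_SIZE)
    ]


# 10 packs as 00 0A and 2570 packs as 0A 0A, so three LF bytes appear
# "by accident" inside the binary fields.
rows = [(b"AAAA", 10), (b"BBBB", 2570), (b"CCCC", 7), (b"DDDD", 300)]
data = pack_records(rows)

lf_count = data.count(b"\n")    # what a line-oriented tool would count
records = unpack_records(data)  # what the program knows is really there
```

Here lf_count is 3 even though the data holds 4 records and not one of those
LF bytes was meant as a line terminator.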

With fixed-length records, that's easy enough.  With variable-length
records, well, it's up to your program to adopt some convention, like maybe
a byte-count at the beginning of each record.
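
One possible byte-count convention, sketched in Python under stated
assumptions: a 2-byte big-endian length in front of each record.  The prefix
width, the byte order, and the function names are choices made for this
example, not anything the file system or a particular compiler prescribes.

```python
import io
import struct

# Assumed convention: each record is preceded by a 2-byte big-endian length.
LENGTH_PREFIX = ">H"
PREFIX_SIZE = struct.calcsize(LENGTH_PREFIX)  # 2 bytes


def write_record(stream, payload):
    """Write one variable-length record as <length><payload>."""
    stream.write(struct.pack(LENGTH_PREFIX, len(payload)))
    stream.write(payload)


def read_records(stream):
    """Yield payloads until end of stream, trusting only the byte counts."""
    while True:
        prefix = stream.read(PREFIX_SIZE)
        if not prefix:
            return  # clean end of file
        if len(prefix) < PREFIX_SIZE:
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack(LENGTH_PREFIX, prefix)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated record")
        yield payload


# Payloads may contain LF or CR-LF freely; they are never treated as separators.
payloads = [b"short", b"has\nLF\r\ninside", b""]
buf = io.BytesIO()
for p in payloads:
    write_record(buf, p)
buf.seek(0)
decoded = list(read_records(buf))
```

With a 2-byte prefix a record cannot exceed 65535 bytes; a wider prefix
trades a little space for a higher limit.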

By the way, in C and in UNIX, it's the line-feed character (ASCII code 10
decimal) that marks the end of a line.  A carriage-return character (ASCII
code 13 decimal) has nothing to do with it.  Other operating systems can and
do use other conventions, and may even give you a choice.
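
A small Python sketch of what that means in practice (the sample data and
the helper name are invented for illustration): splitting on LF alone, as a
UNIX line-oriented tool does, leaves any carriage return sitting at the end
of the line as ordinary data.

```python
def split_on_lf(data):
    """Split bytes into lines on LF only; CR gets no special treatment."""
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after the final LF
    return lines


unix_text = b"one\ntwo\n"      # LF-terminated lines
crlf_text = b"one\r\ntwo\r\n"  # CR-LF-terminated lines from another system

unix_lines = split_on_lf(unix_text)
crlf_lines = split_on_lf(crlf_text)

# Both inputs yield two lines, but in crlf_lines each line still ends in CR.
cr_left_behind = [line.endswith(b"\r") for line in crlf_lines]
```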

Walter (trying hard not to let too many worms out of this can :-)

* To join/leave the list, search archives, change list settings, *
* etc., please visit http://raven.utc.edu/archives/hp3000-l.html *
