HP3000-L Archives

November 1998, Week 1

HP3000-L@RAVEN.UTC.EDU

Subject:
From:
Mark Bixby <[log in to unmask]>
Reply To:
Date:
Wed, 4 Nov 1998 08:58:24 -0800
John Dunlop writes:
>
> Folks,
>
> I am dealing with an application that reads barcoded information into
> flat files onto the HP3000. Unfortunately, sometimes the odd character
> or two gets mis-read (we are only talking say 10 in 5,000) and this
> "invalid character" can cause the next application to refuse to load
> the file. Therefore, I have been experimenting with ways to scan the
> file to pinpoint these bad characters. The only valid characters are
> numbers 0-9, all upper case characters and spaces.

...complicated CI example deleted...

> This works fine but is slow and inefficient.
>
> I would be interested to hear from anyone who could suggest a
> better/faster/more efficient way of scanning each character of a
> datafile.

POSIX to the rescue:

        grep -v '^[0-9A-Z ]*$' DATAFILE

This says: select every line that does not (-v) consist entirely of zero or
more (*) digits (0-9), uppercase letters (A-Z), and spaces ( ) from the start
of the line (^) through the end of the line ($).
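
As a rough sketch of how this might look in practice, here is the same pattern
run against a small made-up file. The file name sample.dat and its contents are
hypothetical. The -n option (print line numbers) and the LC_ALL=C setting (keep
A-Z limited to the ASCII uppercase letters) are additions of mine, not part of
the command above, and the tr line is just one possible way to list the
offending characters themselves; I am assuming your shell provides all of them.

```shell
# Hypothetical sample data: lines 2 and 4 contain characters
# outside the allowed set (0-9, A-Z, space).
printf '%s\n' 'ABC 123' 'AB#456' 'XYZ 789' 'abC 000' > sample.dat

# -v selects the lines that do NOT match; -n prefixes each with its line number.
# LC_ALL=C keeps the A-Z range limited to the 26 ASCII uppercase letters.
bad_lines=$(LC_ALL=C grep -n -v '^[0-9A-Z ]*$' sample.dat)
printf '%s\n' "$bad_lines"

# Delete every allowed character (and newlines); whatever is left is invalid.
bad_chars=$(LC_ALL=C tr -d '0-9A-Z \n' < sample.dat)
printf '%s\n' "$bad_chars"
```

With that sample file the grep reports lines 2 and 4, so you know where to look
before handing the file to the next application.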

POSIX regular expression pattern matching (regexp) packs a lot of power in
just a few characters.

For full regexp documentation, go into sh.hpbin.sys and say "man regexp".
--
Mark Bixby                      E-mail: [log in to unmask]
Coast Community College Dist.   Web: http://www.cccd.edu/~markb/
District Information Services   1370 Adams Ave, Costa Mesa, CA, USA 92626-5429
Technical Support               Voice: +1 714 438-4647
"You can tune a file system, but you can't tune a fish." - tunefs(1M)
