HP3000-L Archives

June 2002, Week 3

HP3000-L@RAVEN.UTC.EDU

From: "Shahan, Ray" <[log in to unmask]>
Reply To: Shahan, Ray
Date: Wed, 19 Jun 2002 07:28:15 -0500

Using SUPRLINK from Robelle to 'merge' these two files prior to the CI
script would dramatically speed up the process too...

Ray Shahan

"Life is what happens while you're busy making plans", John Lennon

> -----Original Message-----
> From: Gavin Scott [SMTP:[log in to unmask]]
> Sent: Tuesday, June 18, 2002 7:07 PM
> To:   [log in to unmask]
> Subject:      Re: looping out of control
>
> Donald writes:
> > I want to compare file a(single value) to file b(multiple values) and
> > extract the multiple values into one record.
>
> Your last example command file would appear to be trying to read each
> record from file A then compare it against every record in File B.  This
> is a fine way to do it if you're not in a hurry, since, if your two
> 100,000 record files happened to be full it would require ten billion
> iterations through your inner loop and 100,000 FCOPYs of file B's input
> into its Message File.
>
> To get this to work in a reasonable amount of time (like before the sun
> burns out and it's too dark to read the output) you should probably sort
> both files on your "key" value, then perform a "merge" of the two files to
> create your output.  Basically you read a record from each file and based
> on comparing the two key values you read the "lesser" file until the keys
> match and then read the B file until the key changes while accumulating
> your output.  There are various boundary conditions to deal with when you
> hit the end of a file to make sure you don't lose the last record, etc.
>
> The "merge" solution will run in O(N+M) time rather than the original
> O(N*M), a factor of improvement similar to the number of records in the
> larger file.
>
> This is a complex enough task that most people would use a "real"
> programming language rather than a CI script, but it's certainly possible
> to implement using the CI if you insist.
>
> G.
>
> * To join/leave the list, search archives, change list settings, *
> * etc., please visit http://raven.utc.edu/archives/hp3000-l.html *
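
For readers following the thread, the sort-then-merge approach Gavin describes can be sketched in a "real" language roughly as follows. This is only an illustration in Python, not the CI script under discussion and not what Suprlink does internally. The record layout is an assumption: file A is taken to hold one unique key per record, file B to hold (key, value) pairs, and the name merge_join is made up for this example. Note that the up-front sorts add O(N log N) work on top of the linear merge pass.

```python
def merge_join(a_keys, b_records):
    """Merge two inputs that are already sorted on the key.

    a_keys:    sorted list of keys, one per record (assumed unique).
    b_records: list of (key, value) tuples, sorted by key.
    Returns one (key, [values...]) entry for each key found in both.
    """
    out = []
    i = j = 0
    while i < len(a_keys) and j < len(b_records):
        a_key = a_keys[i]
        b_key = b_records[j][0]
        if a_key < b_key:
            i += 1          # A is the "lesser" file: advance A
        elif b_key < a_key:
            j += 1          # B is the "lesser" file: advance B
        else:
            # Keys match: read B until the key changes, accumulating values.
            values = []
            while j < len(b_records) and b_records[j][0] == a_key:
                values.append(b_records[j][1])
                j += 1      # the bounds check above covers end-of-file on B
            out.append((a_key, values))
            i += 1
    return out


# Hypothetical sample data; sort both inputs on the key before merging.
file_a = sorted(["D", "A", "C"])
file_b = sorted([("D", "4"), ("A", "1"), ("B", "3"), ("A", "2")],
                key=lambda rec: rec[0])
print(merge_join(file_a, file_b))
# prints [('A', ['1', '2']), ('D', ['4'])]
```

Each record in either file is visited once, which is where the O(N+M) figure for the merge pass comes from.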
