HP3000-L Archives

From: "Paul H. Christidis" <[log in to unmask]>
Date: Mon, 15 Jul 2002 09:52:23 -0700

However, Jeff's suggestion, while it eliminates superfluous processes, will
NOT "solve" Allen's original request of producing a list of "unmatched"
records.  If my understanding of the "man" page for "sort" is correct, the
NODUPE file will consist of more than 4,949,999 records in the "best" case,
where all 50,000 records matched, and 5,050,000 records in the case where
none matched.
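
One way to actually produce the "unmatched" list (a sketch only, assuming
each file is free of internal duplicates, so any record appearing twice in
the combined stream is a matched one) is uniq's -u option, which prints
only the lines that occur exactly once:

shell/ix> sort OLDFILE NEWFILE | uniq -u >UNMATCHED

Here UNMATCHED is just an illustrative file name for the records present
in one file but not the other.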

Regards
Paul Christidis




At 02:42 PM 7/12/2002, Jeff Kell wrote:
>If you just want to remove the duplicates, it's easier:
>
>shell/ix> cat OLDFILE NEWFILE|sort|uniq>NODUPE

Since sort can take a list of files to sort as arguments, that's a wasted
use of cat.  And sort also has the -u option, which removes duplicate keys,
so that's a wasted use of uniq; use:

sort -u FILE [FILE2 ...] >NODUPE
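
Applied to the OLDFILE/NEWFILE example above, that becomes simply:

shell/ix> sort -u OLDFILE NEWFILE >NODUPE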

>You can specify keys for both sort and uniq, and you can trim out the keys
>beforehand with cut (among others).

The flexibility of sort, uniq, cut, grep, find, sed, and similar tools is a
big part of what makes the unix shells so powerful.
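
For instance, keyed versions might look like this (a sketch only; the
colon delimiter and the key living in field 1 are assumptions made just
for illustration):

shell/ix> sort -t: -k1,1 -u FILE >NODUPE      # keep one line per field-1 key
shell/ix> sort -k2 FILE | uniq -f 1 >NODUPE   # compare ignoring the first blank-separated field
shell/ix> cut -d: -f1 FILE | sort -u >KEYS    # trim to just the unique key values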

--
Jeff Woods
[log in to unmask]
[log in to unmask]
Quintessential School Systems

* To join/leave the list, search archives, change list settings, *
* etc., please visit http://raven.utc.edu/archives/hp3000-l.html *
