HP3000-L Archives

July 2002, Week 2

HP3000-L@RAVEN.UTC.EDU

Subject:
From:
Jeff Kell <[log in to unmask]>
Date:
Fri, 12 Jul 2002 17:42:32 -0400
Jeff Kell wrote (before thinking too hard):
>
> "Porter, Allen" wrote:
> >
> > I'm looking for opinions and experiences with deduping large fixed ASCII
> > files.

> I said almost this a few days ago :-)  If you can get the "keys" of both
> files (in your case, the name) into flat files, say OLDFILE and NEWFILE,
> then drop into the shell (:sh.hpbin.sys -L) and do:
>
> shell/ix> cat OLDFILE NEWFILE|sort|uniq -d>MATCHFIL
> shell/ix> cat NEWFILE MATCHFIL|sort|uniq -u>NOMATCH
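
To make the match/no-match trick concrete, here is a small sketch with made-up keys. The sample names and the use of mktemp are assumptions for illustration on a modern POSIX-like shell; the two pipelines are the ones quoted above. It relies on each key appearing at most once within each file, since a key repeated inside OLDFILE alone would also show up under uniq -d.

```shell
# Work in a scratch directory (mktemp -d is an assumption about your shell).
cd "$(mktemp -d)"

# Hypothetical key files, one key per line, no repeats within a file.
printf 'ADAMS\nBAKER\nCLARK\n' > OLDFILE
printf 'BAKER\nCLARK\nDAVIS\n' > NEWFILE

# Keys present in both files: after the merge they occur twice,
# and uniq -d prints one copy of each repeated line.
cat OLDFILE NEWFILE | sort | uniq -d > MATCHFIL

# Keys in NEWFILE with no match: matched keys now occur twice,
# so uniq -u (lines that occur exactly once) leaves only the new ones.
cat NEWFILE MATCHFIL | sort | uniq -u > NOMATCH
```

With these sample keys, MATCHFIL ends up holding BAKER and CLARK, and NOMATCH holds DAVIS.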

If you just want to remove the duplicates, it's easier:

shell/ix> cat OLDFILE NEWFILE|sort|uniq>NODUPES
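
As a sketch of what that produces, again with made-up keys: sort also has a -u option that should give the same result in one step when whole lines are compared. Whether a given shell's sort supports -u is an assumption here (it is a standard POSIX option), and NODUPES2 is a hypothetical file name used only for the comparison.

```shell
# Scratch directory and byte-wise collation so both forms compare lines the same way.
cd "$(mktemp -d)"
LC_ALL=C
export LC_ALL

# Hypothetical key files.
printf 'ADAMS\nBAKER\n' > OLDFILE
printf 'BAKER\nDAVIS\n' > NEWFILE

# The pipeline from the post: merge, sort, collapse adjacent duplicates.
cat OLDFILE NEWFILE | sort | uniq > NODUPES

# Assumed equivalent: let sort drop the duplicates itself.
sort -u OLDFILE NEWFILE > NODUPES2

# cmp exits 0 (and prints nothing) when the two files are identical.
cmp NODUPES NODUPES2
```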

You can restrict the comparison to a key in both sort (-k) and uniq (-f to
skip leading fields, -s to skip leading characters), and you can cut the
keys out beforehand with cut (among others).
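
A rough sketch of both ideas on fixed-width records. The record layout (name in columns 1-10), the file names OLDREC, NEWREC and BYKEY, and the sample data are all assumptions for illustration, not anything from the original files.

```shell
# Scratch directory and byte-wise collation for predictable ordering.
cd "$(mktemp -d)"
LC_ALL=C
export LC_ALL

# Hypothetical fixed-width records: name in columns 1-10, other data after.
printf 'ADAMS     0001 OLD\nBAKER     0002 OLD\n' > OLDREC
printf 'BAKER     0077 NEW\nDAVIS     0003 NEW\n' > NEWREC

# Option 1: pull out just the key columns with cut, then dedupe the keys.
cut -c1-10 OLDREC > OLDFILE
cut -c1-10 NEWREC > NEWFILE
cat OLDFILE NEWFILE | sort | uniq > NODUPES

# Option 2: keep whole records but compare on the first field only.
# With -u and a key, sort keeps one record per distinct key; which of the
# two BAKER records survives is not something to rely on.
cat OLDREC NEWREC | sort -u -k1,1 > BYKEY
```

Either way there should be three distinct names in the result; option 1 yields bare keys, option 2 yields one full record per key.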

Jeff

* To join/leave the list, search archives, change list settings, *
* etc., please visit http://raven.utc.edu/archives/hp3000-l.html *
