HP3000-L Archives

January 1995, Week 5

HP3000-L@RAVEN.UTC.EDU

Subject:
From:
Steve Butler <[log in to unmask]>
Reply To:
Steve Butler <[log in to unmask]>
Date:
Mon, 30 Jan 1995 12:30:17 -0800
On Fri, 27 Jan 1995, Stan Sieler wrote:
 
> Jeff Kell ([log in to unmask]) wrote:
> : Does anyone have, know of, or have source code for a "fuzzy" string
> : match?  Not just a phonetic key (Soundex), but rather one which could
> : tell you "how closely" two strings match?
>
 
In the mid-70's there was an article in one of the IEEE publications
about measuring the 'distance' between two words in a 'dictionary'.  It
was called the LEVENSHTEIN METRIC.  We implemented it in COBOL along with
the PETERSON SOUNDEX to look up medical record numbers on-line.  In
Southern California most names can be spelled several different ways.
Finally had it tuned so that if the name was in the database (KSAM file),
it would pop out in the top half dozen records returned.
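
For anyone who wants to see the idea, here is a minimal sketch of the edit-distance calculation in Python.  It is not the COBOL code described above, just the textbook dynamic-programming form of the metric.  The names levenshtein and closest_names, and the default limit of six, are assumptions made up for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the empty prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca from a
                current[j - 1] + 1,      # insert cb into a
                previous[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        previous = current
    return previous[-1]


def closest_names(query, names, limit=6):
    """Return up to `limit` names, nearest to `query` first.
    Hypothetical helper: the comparison ignores case, and ties keep
    their original order because sorted() is stable."""
    key = query.upper()
    return sorted(names, key=lambda n: levenshtein(key, n.upper()))[:limit]
```

Calling closest_names("GONZALES", list_of_names) would rank an exact match first and one-letter variants such as GONZALEZ right behind it.  A production lookup would probably narrow the candidates first (for example by a phonetic key) rather than scan every record.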
 
I may even have a BASIC version floating around someplace.  It's been a
few years, but I could probably re-develop the code without recourse to
the article.
 
E-mail me if you need more specifics.
 
+----------------------------------------------------+
| Steve Butler          Voice:  206-464-2998         |
| The Seattle Times       Fax:  206-464-2905         |
| PO Box 70          Internet:  [log in to unmask] |
| Seattle, WA 98111    Packet:  Not currently active |
+----------------------------------------------------+
All standard and non-standard disclaimers apply.
All other sources are anonymous.
