HP3000-L Archives

January 1995, Week 5

HP3000-L@RAVEN.UTC.EDU

Subject:
From:
John Clark <[log in to unmask]>
Reply To:
John Clark <[log in to unmask]>
Date:
Mon, 30 Jan 1995 12:19:00 WET
Content-Type:
text/plain
Wirt Atmar wrote, re Jeff Kell's request for a fuzzy string matching algorithm:
 
>The idea behind PAUP is that there initially existed a single source string
>(of DNA) that is now represented by a whole handful of existing strings,
>where each existing string has very likely been permuted multiple times by a
>number of simple character substitutions (called point mutations),
>translocations (sections of the string are rearranged), insertions,
>deletions, and/or inversions (parts of the strings inverted, including simple
>transpositions).
>
>The manner by which information gets screwed up genetically is exactly the
>same as the way information gets screwed up in computers: one error at a
>time.
>
>The problem you're asking to solve is quite a bit easier than the one PAUP is
>applied to. In yours, I'm presuming that you will have a dictionary of fixed
>strings to match an unknown string against. In a phylogenetic (family tree)
>analysis, the problem is inverted -- and more complex. You are given a whole
>dictionary of strings and your problem is to cluster the strings together so
>that the least number of changes are required (parsimony) to get the strings
>to converge back to a single (often hypothetical) ancestral root string.
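
The simpler dictionary-matching case described above can be sketched as
follows. This is only a minimal illustration, not PAUP and not anything
posted in the thread: it assumes each substitution, insertion, deletion,
and adjacent transposition costs one "error" (the optimal string alignment
variant of edit distance) and ignores translocations and longer inversions.
The names edit_distance and closest_matches and the max_distance cutoff are
hypothetical, chosen for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Count the single-character errors needed to turn a into b.

    Substitutions, insertions, deletions, and swaps of two adjacent
    characters each cost 1 (optimal string alignment distance).
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i          # delete all i characters of a
    for j in range(cols):
        d[0][j] = j          # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "ab" <-> "ba".
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def closest_matches(unknown, dictionary, max_distance=2):
    """Return (distance, word) pairs within max_distance, nearest first."""
    scored = [(edit_distance(unknown, word), word) for word in dictionary]
    return sorted(pair for pair in scored if pair[0] <= max_distance)
```

Calling closest_matches("chattanoga", ["chattanooga", "knoxville"]) would
rank "chattanooga" first, since it is a single insertion away. The cost is
proportional to the product of the two string lengths for every dictionary
entry, so a large dictionary would want some pre-filtering.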
 
Fascinating!  This is speculation, but I'll hazard that the full rigour of
PAUP-style analysis has already been brought to bear on the problem of
reconstructing root languages from a babel of dialects.  While you're
roaming about the UT campus, Jeff, you might check with the Linguistics
Department and see whether they have anything that's already adapted to
the problem of human language.  For that matter, look for where linguists
hang out on the net and post a query there.  I bet you'll get some answers.
 
John Clark                                Phone: (416) 366-4846
The National Ballet of Canada               Fax: (416) 366-1894
Internet: [log in to unmask]        Compuserve: 70521,2050
