LISTSERV - HP3000-L Archives

July 1995, Week 1

HP3000-L@RAVEN.UTC.EDU

Subject:	Re: Word extraction CI evaluator function
From:	Jeff Vance
Reply To:	Jeff Vance
Date:	Thu, 6 Jul 1995 11:03:23 -0700
Content-Type:	text/plain
Parts/Attachments:	text/plain (86 lines)

Hi Jeff,
 
On Jul 6,  9:23am, Jeff Kell wrote:
> Subject: Re: Word extraction CI evaluator function
> On Wed, 5 Jul 1995 11:50:30 -0700 you said:
 
> >   word(search_str,start [,nth] [,rtn_var] [,delims])
> >
> >where
> >   search_str is the string that the word is being extracted from
> >   start      is the beginning index in search_str to start looking for the
> >                 word
>
> Not sure if you need 'start' -- read on.
 
I wanted start to be available so that a subset of search_str can be specified.
I've often found that without a start parm I need to use lft or rht to extract
the part of the string that I want to operate on.  A start parm may eliminate
this extra step.
 
> >   delims     is a string containing characters that will be used to
> >                 delimit a word.  Default delimiters are:
> >                 blank,comma,semicolon,equalsign,
> >                 tab,parenthesis,brackets,single and double quotes.
>
> That's getting a little "too" sensitive.  I'd settle for blank, comma,
> semicolon, and tab.  Further, I'd consider strings inside parens, brackets,
> single and double quotes to constitute one "word".
 
My list comes from the delimiter list used by the native mode CI parser.  Fewer
is OK with me.  I agree that strings inside parens, brackets, single and double
quotes should constitute one "word"; however, are the parentheses, brackets or
quotes part of the word or just delimiters?  If they are in the delim list they
are delimiters; otherwise they are part of the word.
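
A Python sketch of that rule, to show the two outcomes side by side. Everything here is illustrative: split_words and PAIRS are hypothetical names, nesting is not handled, and an unterminated group is assumed to run to the end of the string.

```python
# Grouping characters and their closers (assumed set, from this thread).
PAIRS = {"(": ")", "[": "]", "'": "'", '"': '"'}

def split_words(s, delims=" ,;\t"):
    """Split s into words, treating a bracketed or quoted run as one unit.

    If the opening character is in `delims`, the enclosing pair is dropped
    and the contents become a word of their own. Otherwise the pair and
    its contents stay attached to the surrounding word.
    """
    words, cur = [], ""
    i, n = 0, len(s)
    while i < n:
        ch = s[i]
        if ch in PAIRS:
            end = s.find(PAIRS[ch], i + 1)
            if end == -1:              # unterminated: take the rest
                end = n
            if ch in delims:
                if cur:
                    words.append(cur)
                    cur = ""
                words.append(s[i + 1:end])   # contents only
            else:
                cur += s[i:end + 1]          # keep opener and closer
            i = end + 1
        elif ch in delims:
            if cur:
                words.append(cur)
                cur = ""
            i += 1
        else:
            cur += ch
            i += 1
    if cur:
        words.append(cur)
    return words
```

For example, split_words('ECHO "hello, world" done') keeps the quotes on the second word, while passing delims=' ,;\t"' returns the same three words with the quotes stripped.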
 
>
> >Example:
> >   setvar buffer input()
> >   setvar i 1
> >   while i <= len(buffer) do
> >      setvar next_word word(buffer,i,,i)
>                                     ^^^^ from above, should be(buffer,i,i)
                                                                        ^^^
Actually, I like this order better than my original suggestion.
 
> >      if next_word = "..." then ...
> >   endwhile
>
> How about an alternative:
>     setvar buffer input()
>     while words(buffer) > 0
 
Here is an example where word count could be used.
 
>        setvar next_word word(buffer,1)
>        do whatever
>        setvar buffer subword(buffer,2)
>     endwhile
>
> In this context you specify words(buffer) returns number of words in buffer
>                             word(buffer,n) returns n'th word of buffer
>                          subword(buffer,n[,m]) returns words n thru m, or
>                                                end of string if m omitted.
>
> No matter what you do, please consider "blank" to be "whitespace" -- one or
> more blanks.  The word() functions should ignore leading/trailing whitespace
> (and/or your other delimiter(s)) in the returned word(s).
 
I do, it does.
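
For comparison, a Python sketch of the words/word/subword trio as described in the quoted text, with runs of delimiters collapsed and leading/trailing ones ignored. The delimiter set (blank, comma, semicolon, tab) is the shorter list suggested above; the helper _spans and the function consume are hypothetical names of mine.

```python
DELIMS = " ,;\t"   # assumed: the shorter delimiter list suggested above

def _spans(s, delims=DELIMS):
    """Return (start, end) slices of every word in s."""
    spans, i, n = [], 0, len(s)
    while i < n:
        while i < n and s[i] in delims:      # skip a run of delimiters
            i += 1
        j = i
        while j < n and s[j] not in delims:  # scan one word
            j += 1
        if j > i:
            spans.append((i, j))
        i = j
    return spans

def words(s, delims=DELIMS):
    """Number of words in s."""
    return len(_spans(s, delims))

def word(s, n, delims=DELIMS):
    """The n'th word of s (1-based), or '' if there is none."""
    sp = _spans(s, delims)
    return s[sp[n - 1][0]:sp[n - 1][1]] if 1 <= n <= len(sp) else ""

def subword(s, n, m=None, delims=DELIMS):
    """Words n through m of s, or through the last word if m is omitted."""
    sp = _spans(s, delims)
    last = len(sp) if m is None else min(m, len(sp))
    if n < 1 or n > last:
        return ""
    return s[sp[n - 1][0]:sp[last - 1][1]]

def consume(buffer):
    """The quoted while-loop: take word 1, then keep words 2 onward."""
    seen = []
    while words(buffer) > 0:
        seen.append(word(buffer, 1))
        buffer = subword(buffer, 2)
    return seen
```

Note that this style rescans the buffer on every pass, whereas the start-index style picks up where the previous call left off.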
 
> >  pmatch()    - returns true if a pattern is found in a string.  Uses MPE
> >                   file name wildcard patterns.
>
> Awww, c'mon, how about a grep regular expression instead of a wildcard? :-)
 
When's the next 4 day weekend!?...
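
As a middle ground, the wildcard form can be expressed in terms of regular expressions. The Python sketch below is only an illustration, under these assumptions: @ matches zero or more alphanumerics, ? matches one alphanumeric, # matches one digit (the usual MPE file name wildcards), matching is case-insensitive, and the pattern must cover the whole string. The real pmatch() may well differ on each point.

```python
import re

# Assumed meaning of the MPE file name wildcards.
_WILDCARDS = {
    "@": "[A-Za-z0-9]*",   # zero or more alphanumerics
    "?": "[A-Za-z0-9]",    # exactly one alphanumeric
    "#": "[0-9]",          # exactly one digit
}

def pmatch(pattern, text):
    """Return True if text matches the wildcard pattern in full."""
    # Translate each wildcard; escape everything else literally.
    regex = "".join(_WILDCARDS.get(ch, re.escape(ch)) for ch in pattern)
    return re.fullmatch(regex, text, re.IGNORECASE) is not None
```

So pmatch("LOG####", "LOG0042") is true and pmatch("LOG####", "LOGFILE") is false.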
 
 
Jeff Vance
 
 