HP3000-L Archives

June 2000, Week 1

HP3000-L@RAVEN.UTC.EDU

Subject:
From: Ted Ashton <[log in to unmask]>
Date: Wed, 7 Jun 2000 19:47:16 -0400

Thus it was written in the epistle of Wirt Atmar,
>
> Languages are designed for the context in which they operate.

Well said, Sir.

> If you're going to handle strings as strings, then they should be manipulated
> in a consistent manner. If you're going to handle text data in the simpler
> format of a fixed memory allocation variable, as occurs in IMAGE, then that
> too has to be done in a consistent manner.

<*applause*>  Well said throughout.

Perl was designed to work in the Un*x world, where all files are byte streams
and trailing blanks are probably there on purpose.  Python shares that attribute.
What Benji did in his code was to pull in a library routine to do the work.
Doing the same in Perl would result in equally pretty code :-).
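
Benji's code is not quoted in this message, so the following is only a guess
at its general shape: a minimal Python sketch that lets a library routine
(str.rstrip) drop the trailing blanks and then slices off the last four
characters.  The function name and the sample field values are invented for
illustration.

```python
def last_four_stripped(field):
    """Return up to four trailing characters, ignoring trailing whitespace."""
    # rstrip() with no argument removes trailing spaces, tabs and newlines;
    # the [-4:] slice then keeps at most the last four remaining characters.
    return field.rstrip()[-4:]


print(repr(last_four_stripped("ACCT-0042      ")))  # '0042'
print(repr(last_four_stripped("ab   ")))            # 'ab'
```

No regular expression is needed here because the library call carries the
"ignore trailing whitespace" rule by itself.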

The "ugliness" about which Wirt commented was, I think, the regular expression,
not Perl, per se, as regexen are found in many other places (including Python).
Regular expressions are extremely compact because they must carry so much
information: if the tokens were the length of English words, the expressions
would prove unwieldy.  The sample he quoted was:
  /(.{0,3}\S)\s*$/

Working from the back to the front, we have:
  //  -- the outer slashes mark this as a regular expression match;
  $   -- the dollar sign anchors the regex to the end of the string or, if
         there is a newline at the end, just before the newline.
  \s* -- the \s matches any whitespace character, not just spaces but also
         tabs (and in some situations, other things as well).  The * means
         that there may be any number of these or none at all.
  ()  -- the parens delineate what I want to put in $last_four.
  \S  -- this character (the last of the last four) must NOT be whitespace.
  .{0,3} -- the dot matches any character except a newline and the {0,3} says
            that we may have 0 to 3 of them.  Keep the 0: Perl releases before
            5.34 read {,3} as literal text, not as a quantifier.
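
The breakdown above can be tried out directly.  The sketch below uses Python
rather than Perl, on the assumption that Python's re module treats this
particular pattern the same way; the function name and the sample field
values are made up for the example.

```python
import re

# The pattern from the post, unchanged.
LAST_FOUR = re.compile(r"(.{0,3}\S)\s*$")


def last_four(field):
    """Return what the parentheses capture, or '' if nothing matches."""
    match = LAST_FOUR.search(field)
    return match.group(1) if match else ""


print(repr(last_four("ACCT-0042      ")))  # '0042'
print(repr(last_four("ab   ")))            # 'ab'
print(repr(last_four("      ")))           # ''
```

The second call shows why the quantifier is {0,3} and not {3}: a field
holding fewer than four non-blank characters still matches.  The third shows
that an all-blank field does not match at all, since \S has nothing to
match.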

Trying to write that with words instead:

  regex(capturing(optional_char optional_char optional_char non_whitespace)
        anynumber(whitespace) eos)

might be tolerable for this example, but would be extremely cumbersome for
common usage.
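
One middle ground, not something proposed in this thread, is to keep the
compact tokens but spread them out with comments.  The sketch below assumes
Python's re.VERBOSE flag (Perl has a similar /x modifier); the constant names
are invented for the example.

```python
import re

# Compact form, as quoted in the post.
COMPACT = re.compile(r"(.{0,3}\S)\s*$")

# Same pattern in verbose mode: whitespace in the pattern is ignored and
# everything from '#' to the end of a line is a comment.
COMMENTED = re.compile(
    r"""
    (           # start of the captured group
      .{0,3}    # up to three characters of any kind (except newline)
      \S        # one character that is NOT whitespace
    )           # end of the captured group
    \s*         # any amount of trailing whitespace, possibly none
    $           # end of string, or just before a final newline
    """,
    re.VERBOSE,
)

sample = "ACCT-0042      "
print(COMPACT.search(sample).group(1) == COMMENTED.search(sample).group(1))
```

This keeps the tokens short while still letting a reader see what each one
is for.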

Ted
--
Ted Ashton ([log in to unmask]), Info Sys, Southern Adventist University
          ==========================================================
This is not mathematics, it is theology.
[On being exposed to Hilbert's work in invariant theory.]
                                        -- Gordon, P
          ==========================================================
         Deep thoughts to be found at http://www.southern.edu/~ashted
