HP3000-L Archives

September 2000, Week 3

HP3000-L@RAVEN.UTC.EDU

Subject: Re:
From: Stan Sieler <[log in to unmask]>
Reply To: Stan Sieler <[log in to unmask]>
Date: Fri, 15 Sep 2000 17:12:00 -0700
Content-Type: text/plain
Parts/Attachments: text/plain (131 lines)

> What is SPT/iX?

and

> What IS SPT/ix, and why should we buy it, eh? Esp with a bug 8~))...?

Good questions!

HP Software Performance Tuner/XL
...sort of a Native Mode "Sampler", if you knew that product on the
Classic HP 3000.

SPT monitors a single program, periodically asking "where is it executing"
(many times a second), and then lets you analyze that data.
This lets you find "hot spots", areas that use a lot of CPU.
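
To make the sampling idea concrete, here is a minimal Python sketch of the general technique. It is not SPT's actual implementation: each sampled address is attributed to the procedure whose start address is the closest one at or below it, and the hits are counted per procedure. The procedure names are borrowed from the output further down, but the start addresses and the sampled addresses are invented for illustration.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical procedure start addresses, sorted ascending.
# These values are made up; they are NOT taken from SPT or from XOVER.
PROCEDURES = [
    (0x1000, "xo_id"),
    (0x1400, "get_td_record"),
    (0x2400, "xo_release_buffer"),
]
STARTS = [start for start, _ in PROCEDURES]


def procedure_for(address):
    """Return the procedure whose start is the closest one at or below address."""
    index = bisect_right(STARTS, address) - 1
    if index < 0:
        return "<unknown>"
    return PROCEDURES[index][1]


def profile(samples):
    """Count how many sampled addresses fall in each procedure."""
    return Counter(procedure_for(address) for address in samples)


# Hypothetical sampled "where is it executing" addresses.
samples = [0x1404, 0x2104, 0x2110, 0x1008, 0x2104, 0x2404]
counts = profile(samples)

for name, count in counts.most_common():
    print(f"{name:20s} ({count:5d}) {100.0 * count / len(samples):6.1f}")
```

The more samples that land in one procedure, the more CPU time it is presumed to be using; that is all a "hot spot" means here.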

This is a *VERY* useful product, one that I highly recommend!

I used it to find some performance problems in the B-Trees code in
IMAGE, speeding up my first version by about 50%!

I used it on De-Frag/X, and sped it up significantly, too.

I have no idea how much it costs, sorry!

And, the bug affects people who combine long procedure names with
nested Pascal procedures (or SPLash! subroutines) that also have long names.
The workaround is to use a shorter name.


Here's a sample of one kind of output, where it's telling me where I
spent time within a given code file (in this case, the program file):

   Code file : XOVER.PUB.XOVER

                             Sample     % of                  % of
   # Procedure Name          Count    Samples       Transaction Process Time


   get_td_record            (   155)    68.9       8.4% |****************        |
   get_td_record$2$get_rec* (    12)     5.3       0.6% |*                       |
   xo_id                    (     4)     1.8       0.2% |                        |
   xo_get_saved_buffer      (     3)     1.3       0.2% |                        |
   (11391C)EX_get_td_recor* (     3)     1.3       0.2% |                        |
   xo_release_buffer        (     3)     1.3       0.2% |                        |
   ?init_read_td_file       (     2)     0.9       0.1% |                        |
   ?xo_hex32                (     2)     0.9       0.1% |                        |
   $$lr_wa_30               (     1)     0.4       0.0% |                        |

   (More)>

That says that of the time I spent in XOVER code (as opposed to, say, XL.PUB.SYS
or NL.PUB.SYS), the most popular CPU user was get_td_record.  The distant
second was a nested procedure within get_td_record, called "get_rec" something-or-other.
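
The "% of Samples" column is just each procedure's sample count divided by the total number of samples taken in that code file. A small Python check of the arithmetic follows. The total of 225 is an assumption: the listing never states it, and it is inferred here only because it is the value that reproduces every percentage shown. The nine rows on this first screen add up to 185, so the rest would be behind the "(More)>" prompt.

```python
# Sample counts copied from the first screen of the listing above.
counts = {
    "get_td_record": 155,
    "get_td_record$2$get_rec*": 12,
    "xo_id": 4,
    "xo_get_saved_buffer": 3,
    "(11391C)EX_get_td_recor*": 3,
    "xo_release_buffer": 3,
    "?init_read_td_file": 2,
    "?xo_hex32": 2,
    "$$lr_wa_30": 1,
}

# Assumption: inferred from the percentages, not printed by SPT.
TOTAL_FILE_SAMPLES = 225


def percent_of_samples(count, total=TOTAL_FILE_SAMPLES):
    """Share of the code file's samples, rounded to one decimal like the listing."""
    return round(100.0 * count / total, 1)


for name, count in counts.items():
    print(f"{name:26s} ({count:5d}) {percent_of_samples(count):6.1f}")

print("samples shown on this screen:", sum(counts.values()))
```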

I can then ask to "zoom in" on one of those, like get_td_record:

     Procedure: XOVER.PUB.XOVER:get_td_record

     Procedure    Procedure    Sample    % of
   Offset Range   Statement    Count    Samples


    $344/$350         ??      (    1)     0.6  |                                |
    $854/$860         ??      (    1)     0.6  |                                |
    $c74/$c80         ??      (    1)     0.6  |                                |
    $ca4/$cb0         ??      (    1)     0.6  |                                |
    $d04/$d10         ??      (   93)    60.0  |*******************             |
    $d14/$d20         ??      (   51)    32.9  |**********                      |
    $d44/$d50         ??      (    1)     0.6  |                                |
    $d54/$d60         ??      (    1)     0.6  |                                |
    $d94/$da0         ??      (    2)     1.3  |                                |
    $da4/$db0         ??      (    1)     0.6  |                                |
    $dc4/$dd0         ??      (    1)     0.6  |                                |
    $dd4/$de0         ??      (    1)     0.6  |                                |

Now, with a compiler listing, I can see where get_td_record + $d04 through
get_td_record + $d10 is ... that's the most "popular" section of that procedure.
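
Picking the hot ranges out of a zoom-in listing like this can also be done mechanically. The Python sketch below uses the offsets and counts copied from the listing above; the 10% cutoff is an arbitrary assumption chosen for illustration, not something SPT defines.

```python
# (start offset, end offset, sample count), copied from the zoom-in listing.
ranges = [
    (0x344, 0x350, 1), (0x854, 0x860, 1), (0xC74, 0xC80, 1), (0xCA4, 0xCB0, 1),
    (0xD04, 0xD10, 93), (0xD14, 0xD20, 51), (0xD44, 0xD50, 1), (0xD54, 0xD60, 1),
    (0xD94, 0xDA0, 2), (0xDA4, 0xDB0, 1), (0xDC4, 0xDD0, 1), (0xDD4, 0xDE0, 1),
]

# Matches the 155 samples reported for get_td_record.
total = sum(count for _, _, count in ranges)

# Assumption: call a range "hot" if it holds at least 10% of the samples.
THRESHOLD_PERCENT = 10.0

hot = [(start, end, count) for start, end, count in ranges
       if 100.0 * count / total >= THRESHOLD_PERCENT]
hot_share = 100.0 * sum(count for _, _, count in hot) / total

for start, end, count in hot:
    print(f"${start:x}/${end:x}  ({count:5d})  {100.0 * count / total:5.1f}")
print(f"hot ranges hold {hot_share:.1f}% of the procedure's samples")
```

Here the two adjacent ranges $d04/$d10 and $d14/$d20 together hold 144 of the 155 samples, so nearly all of the procedure's time is concentrated in one small stretch of code.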

Looking at a SPLash! compiler listing, I see that $d04/$d10 is part of the
code generated by source line 421, which happens to be a large MOVE statement.
So, I'm now thinking of ways to speed that up.  In this case, I can tell
SPLash! to use "millicode" for the move, instead of in-line code.  The SPLash!
move millicode was hand written in assembler and optimized by Jacques Van Damme.
So...
...

After that change, when I analyze XOVER again with SPT and do a "File Sample Profile",
I get:

                             Sample     % of                  % of
   # Procedure Name          Count    Samples       Transaction Process Time

   (12FB58)chunks           (    85)    58.2       8.1% |*************           |
   get_td_record            (    19)    13.0       1.8% |***                     |
   get_td_record$2$get_rec* (     8)     5.5       0.8% |*                       |
   xo_id                    (     3)     2.1       0.3% |                        |
...

In this case, I didn't gain a lot (you can't quite compare the numbers
between the two runs, because I didn't set up an apples-to-apples
comparison environment...sorry!)

BTW, (12FB58)chunks is SPT's "cute" way of saying "SPLash! millicode".

Now, in get_td_record, I see:

 $844/$850         ??      (    1)     5.3  |*                               |
 $994/$9a0         ??      (    1)     5.3  |*                               |
 $c64/$c70         ??      (    2)    10.5  |***                             |
 $cc4/$cd0         ??      (    1)     5.3  |*                               |
 $cd4/$ce0         ??      (    1)     5.3  |*                               |
 $ce4/$cf0         ??      (    2)    10.5  |***                             |
 $d04/$d10         ??      (    1)     5.3  |*                               |
 $d14/$d20         ??      (    2)    10.5  |***                             |
 $d34/$d40         ??      (    1)     5.3  |*                               |
 $d54/$d60         ??      (    1)     5.3  |*                               |
 $d64/$d70         ??      (    1)     5.3  |*                               |
 $d74/$d80         ??      (    1)     5.3  |*                               |

...i.e., there's no hot spot in get_td_record.
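
One rough way to put a number on "no hot spot" is to look at the largest share held by any single offset range, before and after the change. The Python sketch below uses the counts from the two zoom-in listings in this message. The second run's total of 19 is taken from the File Sample Profile line for get_td_record; the listing above accounts for 15 of those samples.

```python
# Per-range sample counts from the first zoom-in on get_td_record.
before_counts = [1, 1, 1, 1, 93, 51, 1, 1, 2, 1, 1, 1]
before_total = 155

# Per-range sample counts from the zoom-in after switching to millicode.
after_counts = [1, 1, 2, 1, 1, 2, 1, 2, 1, 1, 1, 1]
after_total = 19  # from the File Sample Profile; 15 of them are listed


def peak_share(counts, total):
    """Largest percentage of the procedure's samples held by one range."""
    return round(100.0 * max(counts) / total, 1)


print("peak share before:", peak_share(before_counts, before_total))
print("peak share after: ", peak_share(after_counts, after_total))
```

With so few samples left in the procedure, the second figure is noisy, but the contrast with the first run is still clear.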

I did some timing tests, and found that I've sped up XOVER by 12.5%
for some cases.

--
Stan Sieler                                           [log in to unmask]
www.allegro.com/sieler/wanted/index.html                  www.sieler.com
