HP3000-L Archives

May 2002, Week 2

HP3000-L@RAVEN.UTC.EDU

Subject:
From:
Gavin Scott <[log in to unmask]>
Reply To:
Gavin Scott <[log in to unmask]>
Date:
Wed, 8 May 2002 12:04:56 -0700
Wirt writes:
> Everything that Gavin writes is correct, but the 30% performance increase
> that I mentioned is the overall increase that we see in our own programs,
> when very carefully measured from the perspective of an outside observer.

I kind of suspect that Wirt's programs are more CPU-intensive than most,
since I think 30% is roughly the maximum improvement you can expect from
an OCTCOMP.

The CM emulator is *extremely* fast.  It's written in hand-coded PA-RISC
assembly code and uses all the tricks in the PA-RISC architecture.

An OCTed program is *not* a "native mode" program really.  It's simply one
in which the sequence of instructions that the CM emulator would execute to
emulate the CM program is exploded out into one big sequence of Native Mode
instructions that implement the CM operations in the original program.

This eliminates the instruction decode overhead, but your OCTed CM program
is still performing all of the same CM operations that the original program
did.  It still uses the CM stack, and it has all the same limits that a CM
program does.  In short, an OCTed program *is* a CM program, just one that
runs slightly faster.
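
To make the distinction concrete, here is a toy sketch in Python.  The
three-opcode stack machine is entirely made up for illustration (it is not
the real CM instruction set, and translate() is not how OCTCOMP is actually
implemented).  The point is only that the translated version does the same
stack operations as the interpreted one, minus the per-instruction decode.

```python
# Toy stack machine; these opcodes are hypothetical, not real CM instructions.
PUSH, ADD, MUL = 0, 1, 2


def interpret(program):
    """Emulator style: fetch and decode every instruction at run time."""
    stack = []
    for opcode, operand in program:
        if opcode == PUSH:
            stack.append(operand)
        elif opcode == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif opcode == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return stack


def translate(program):
    """Translator style: decode once, emit straight-line host code.

    Returns the generated function and its source text.
    """
    lines = ["def translated():", "    stack = []"]
    for opcode, operand in program:
        if opcode == PUSH:
            lines.append(f"    stack.append({operand!r})")
        elif opcode == ADD:
            lines.append("    b, a = stack.pop(), stack.pop(); stack.append(a + b)")
        elif opcode == MUL:
            lines.append("    b, a = stack.pop(), stack.pop(); stack.append(a * b)")
        else:
            raise ValueError(f"unknown opcode {opcode}")
    lines.append("    return stack")
    source = "\n".join(lines)
    namespace = {}
    exec(source, namespace)  # build the straight-line function
    return namespace["translated"], source


# (2 + 3) * 4 expressed as stack operations
program = [(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None)]
```

Note that the generated source has one line per original instruction, so
the translated form is larger than the program it came from, and it still
pushes and pops the same emulated stack.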

The CM emulator has to do some extra work to update things like the emulated
Condition Code bits, and OCTCOMP is smart enough to eliminate these
operations if the result is never used, but this only adds a small
incremental improvement, not orders of magnitude.
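
As a rough illustration of that kind of optimization, here is a sketch in
Python of a backward scan over a straight-line instruction sequence.  The
(name, sets_cc, reads_cc) instruction format and the function name are
assumptions of mine for the example; I am not claiming this is how OCTCOMP
does it internally.

```python
def mark_needed_cc_updates(instructions):
    """Decide which condition-code updates must actually be performed.

    Each instruction is a hypothetical (name, sets_cc, reads_cc) tuple.
    A CC update is needed only if a later instruction reads CC before
    another instruction overwrites it.  Handles straight-line code only.
    """
    needed = [False] * len(instructions)
    # Conservative assumption: whatever follows this sequence may read CC.
    cc_live = True
    for i in range(len(instructions) - 1, -1, -1):
        _name, sets_cc, reads_cc = instructions[i]
        if sets_cc:
            needed[i] = cc_live
            cc_live = False  # earlier CC values are overwritten here
        if reads_cc:
            cc_live = True   # this instruction consumes the incoming CC
    return needed


sequence = [
    ("ADD", True, False),              # CC overwritten by SUB: update skipped
    ("SUB", True, False),              # CC read by the branch: update kept
    ("BRANCH_IF_ZERO", False, True),
    ("MUL", True, False),              # last writer: kept, to be safe
]
```

With this sequence the ADD's condition-code update can be dropped, while
the SUB and MUL updates have to stay.  The saving is a few instructions per
eliminated update, which fits the "small incremental improvement" above.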

There are some other OCTCOMP tradeoffs that come into play that affect the
value of translating a program.  The OCTed program is generally about 10
times larger than the original CM code.  While disk space is cheap these
days, there is the possibility that the translated program will hog more of
the CPU cache and memory than the CM program would.  Jim Miller, one of the
authors of OCTCOMP, once told me he suspected that there might be cases in
which the CM code and the CM Emulator would both fit into cache and thus run
faster than the expanded OCTCOMP code, which might not have as good
locality as the old code.

> What that means is that I am sure that there are sections of code that are
> increased in overall performance by a 1000x[...]

I don't think this is possible.

As with all performance optimizations, you have to try it and carefully
measure the results in your own real-world environment.
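
For what it's worth, the general shape of such a measurement looks
something like the Python sketch below.  The function names and the choice
of taking the median of several wall-clock runs are my own assumptions, and
on a real system you would be timing the actual job both ways rather than
Python callables.

```python
import statistics
import time


def measure(func, repeats=5):
    """Return the median wall-clock seconds over several runs of func()."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline, candidate, repeats=5):
    """Ratio of baseline time to candidate time; above 1.0 means faster."""
    return measure(baseline, repeats) / measure(candidate, repeats)
```

Wall-clock time is used here because that is what an outside observer
sees, and the median keeps one unlucky run from skewing the answer.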

G.

* To join/leave the list, search archives, change list settings, *
* etc., please visit http://raven.utc.edu/archives/hp3000-l.html *
