HP3000-L Archives

January 1998, Week 4

HP3000-L@RAVEN.UTC.EDU

Subject:
From: WirtAtmar <[log in to unmask]>
Reply To: WirtAtmar <[log in to unmask]>
Date: Thu, 22 Jan 1998 15:47:11 EST
Content-Type: text/plain
Parts/Attachments: text/plain (86 lines)

Bill Lancaster writes:

> These are *very* good points Jeff makes.  It is difficult to educate some
>  users that this is so.  It is also hard to convince some that overall CPU
>  utilization isn't the whole picture and that you need to look further
>  before making any assumptions or decisions.  For example, a system which is
>  100% busy but is running mostly non-critical batch processing is a
>  completely different kettle of fish than a system running 100% busy the
>  majority of which is high-priority interactive processing.

Bill's comments rekindle one of my primary disgruntlements with simple
performance analyses. We manufacture a report writer, QueryCalc, in which we
do everything we can to drive CPU utilization to 100%. In fact, if we were to
do anything else, we would be derelict in our duty. One hundred percent CPU
utilization is not a sin in a report writer; it is a profound virtue.

The reasons why are easy to understand. If a report is being run in the middle
of the night, with only one job running (as jobs should be run in batch), the
ideal condition is for us to isolate and read the data off of the discs into
main memory so that the disc drives go "clunk" just one time -- and then go
quiet. Reading information out of main memory is 100,000 times faster than
reading it from a disc drive. More than that, when you read something off of a
disc, you're placing the disc heads where somebody else (even your own
processes) doesn't want them to be, and that represents a great deal of real
inertia. But that inertia completely disappears if the data is memory
resident. All you're doing now is changing an electronic pointer when you
switch from process to process.
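
To put rough numbers on that difference, here's a small back-of-the-envelope
sketch in Python. The latency and transfer figures are purely illustrative
assumptions, chosen only to make the shape of the argument visible; they are
not measurements from any particular machine:

# Back-of-the-envelope comparison of scattered disc reads versus one
# sequential sweep followed by memory-resident access.  All figures are
# illustrative assumptions, not measurements.

SEEK_MS = 12.0          # assumed average seek + rotational latency per scattered read
TRANSFER_MB_S = 5.0     # assumed sustained sequential transfer rate (MB/s)
MEM_ACCESS_US = 0.1     # assumed cost of touching a memory-resident record (microseconds)

RECORDS = 100_000       # records the report has to touch
RECORD_KB = 2           # size of each record in KB

# Strategy 1: fetch each record with its own disc access.
scattered_s = RECORDS * (SEEK_MS / 1000.0)

# Strategy 2: read the whole dataset sequentially once ("one clunk"),
# then satisfy every record touch from main memory.
dataset_mb = RECORDS * RECORD_KB / 1024.0
sweep_s = dataset_mb / TRANSFER_MB_S + RECORDS * (MEM_ACCESS_US / 1e6)

print(f"scattered reads : {scattered_s:8.1f} s")
print(f"one sweep + RAM : {sweep_s:8.1f} s")
print(f"speedup         : {scattered_s / sweep_s:8.0f}x")

The exact ratio isn't the point; the point is that one sequential sweep
followed by memory-resident access beats scattering the heads all over the
disc, every time.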

One of the questions that used to be asked of newcomers to user group meetings
ten years ago was: "Given your size of machine, how many jobs and sessions can
you be running simultaneously?" The answer was (regardless of your machine
size): "One." There was only one processor, and it could do only one thing at
a time. Every other process that is not currently active is
suspended. It's only because the HP3000 can cycle between the processes fast
enough that you're given the illusion that it's actually doing more than one
thing at a time. Although the answer may be different nowadays due to the
multiple-CPU machines, the moral remains the same.
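
A toy sketch of that cycling, with made-up processes and time slices, shows
the illusion at work -- exactly one process holds the processor in any given
slice, yet each appears to make steady progress:

from collections import deque

# Toy round-robin scheduler: one "CPU", many processes.  Only the process
# at the head of the queue does work in any given time slice; every other
# process is suspended, exactly as described above.

def round_robin(work_units, quantum=1):
    """work_units maps a process name to how many units of CPU it needs."""
    ready = deque(work_units.items())
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        slice_used = min(quantum, remaining)
        timeline.append(name)                 # the one thing the CPU is doing now
        remaining -= slice_used
        if remaining > 0:
            ready.append((name, remaining))   # suspend and requeue
    return timeline

print("".join(round_robin({"A": 3, "B": 3, "C": 3})))
# -> ABCABCABC : each process seems to "run" the whole time, but the
#    processor only ever executes one of them per slice.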

A report will consume a certain number of CPU cycles, whether it's run in
session mode or in batch. When running a report in batch, on a machine we have
all to ourselves, if we don't drive CPU utilization to 100%, it means nothing
more complicated than that the CPU is sitting idle while we wait for
information to be retrieved from the various discs -- and that's our fault.
We're not getting the data off of the discs in the most
efficient manner.

In such a situation, time "elongation" occurs. The report doesn't run in the
minimum amount of wall time. Quite clearly, if the CPU is not running, we're
not moving the report along. The CPU cycle count stays the same for the report
as a whole, but we simply aren't doing any useful work during those periods
when CPU utilization falls to zero while we're waiting for disc retrieval.
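
Stated arithmetically, with an invented 600-CPU-second report used only for
illustration: the CPU cost is fixed, so wall time is roughly that cost divided
by the fraction of the time the CPU is actually working on the report.

# Elongation: the report needs a fixed amount of CPU, but wall time grows
# as utilization falls, because the processor sits idle waiting on discs.
# The figures are illustrative assumptions.

cpu_seconds_needed = 600.0   # fixed CPU cost of the report

for utilization in (1.00, 0.75, 0.50, 0.25):
    wall_seconds = cpu_seconds_needed / utilization
    elongation = wall_seconds / cpu_seconds_needed
    print(f"{utilization:4.0%} busy -> {wall_seconds/60:5.1f} min wall time "
          f"({elongation:.1f}x elongation)")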

In that regard, elongation can be greatly aggravated when running multiple
large report jobs in parallel. If, by the time the fifth job executes within
its time slice, all of the first job's data has been pushed out of main memory,
we simply have to waste a good portion of the first job's time slice
rebuilding memory from disc. This not only adds significantly to the number of
non-productive CPU cycles expended, it also greatly lengthens the time that all
five parallel jobs take, compared with what they would have taken had they been
run sequentially, one after another.
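
A rough model of that effect, with invented numbers: if some fixed fraction of
every time slice goes to re-reading data the other jobs flushed, that overhead
is paid on every slice of the interleaved run and never at all in the
sequential run.

# Five identical report jobs, run sequentially versus interleaved.
# When interleaved, each time slice begins by rebuilding memory that the
# other jobs flushed.  All numbers are illustrative assumptions.

JOBS = 5
CPU_PER_JOB_S = 300.0      # useful CPU each job needs
SLICE_S = 1.0              # length of a time slice
REBUILD_FRACTION = 0.4     # assumed fraction of each slice wasted re-reading flushed data

# Sequentially: data stays resident, so the jobs just take their CPU time.
sequential_s = JOBS * CPU_PER_JOB_S

# Interleaved: only (1 - REBUILD_FRACTION) of every slice is useful work.
useful_per_slice = SLICE_S * (1 - REBUILD_FRACTION)
slices_needed = (JOBS * CPU_PER_JOB_S) / useful_per_slice
interleaved_s = slices_needed * SLICE_S

print(f"sequential  : {sequential_s/60:5.1f} min")
print(f"interleaved : {interleaved_s/60:5.1f} min "
      f"({interleaved_s/sequential_s:.2f}x the sequential total)")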

Similarly, if a large report is run in background, in E queue, during periods
of high C-queue session (human) activity, and we don't drive CPU utilization
to 100%, it also means that during those periods when the machine is quiet (no
human response-time processes are in progress), we don't have the data we need
to process the report in main memory. In this case, the fault lies either with
QueryCalc (we aren't isolating and getting that data into main memory
appropriately) or with "insufficient" memory in the machine, resulting in
at least a portion of our data being repeatedly flushed during those brief
periods when other, higher-priority processes grab command of the CPU.

In either case, a condition of non-100% CPU utilization is a first-order sin.
We very much want to drive utilization to 100% -- but we also want to be able
to give up processing at the drop of a hat (at the lowest possible inertia) and
let other, human-time processes get in and get out. If everything on an HP3000
is set up correctly, the users will almost never notice that a large report is
being processed in background. And the report itself will hardly notice the
users (elongation will be minimal).

The take-home moral from all of this is: "Don't let anybody look at a
performance tool's chart and see 100% CPU utilization and shake his head and
cluck his tongue." For a report writer, it's exactly what you want it to be
doing.

Wirt Atmar
