HP3000-L Archives

March 2004, Week 4

HP3000-L@RAVEN.UTC.EDU

Subject:
From:
John Clogg <[log in to unmask]>
Reply To:
John Clogg <[log in to unmask]>
Date:
Fri, 26 Mar 2004 10:27:02 -0800
I don't know who is right on this question (I suspect both Denys and Craig are to some degree), but I would like to continue the discussion, because it is very informative.  First, Craig, I'd like to say that I don't think there is really any reason to be offended when someone dares to disagree with you on a technical matter.  HP has never documented the internal behavior of the XM very well, so there are bound to be differing understandings of its functionality.

You stated that in your situation, 60 of the 66 seconds spent on a checkpoint were spent "scanning memory".  I would like to understand that better.  What was the purpose of this scan?  The stated purpose, "looking for dirty pages", does not fit well with my understanding of either the XM or disc caching.  Is this the only reason more memory has the potential to degrade performance?  Please explain your example (a large jumbo with the majority of users accessing it) and why it would lead to degraded performance with more memory.  Please understand, I am not challenging your assertions; I am seeking to understand them better so I can make informed choices about memory sizing, etc.  I know Bill Cadier monitors this list.  Perhaps he could add something to our understanding.

John Clogg

-----Original Message-----
From: Craig Lalley [mailto:[log in to unmask]]
Sent: March 26, 2004 9:08 AM
To: [log in to unmask]
Subject: Re: internals, and patches


Mr. Denys,

First of all, I must commend you on your typing skills.  You are indeed a very
verbose individual.

> I will say it again, unless something dramatic has changed in MPE over
> the last lustrum or so, MPE does NOT scan main memory during a check
> point looking for dirty pages to post to disk.  It never did, and now,
> it never will.

I can only defer to the one who taught me, a Mr. Kevin Cooper...  The last I
heard, he works for a company called HP.  (Mr. Cooper, are you still out there?)

>
> I said in response to Mr. Lalley's post on 3 February: "There are many
> parts to the XM.  During the life of MPE, there is an area in memory
> reserved for a queue to the master of the volume sets.  This queue gets
> emptied according to various parameters, one of them being activity and
> another one being time.  IIRC the maximum time between buffer flushing
> to the masters is about 500ms or half a second.  I believe this is
> called the serial write queue.  IMAGE/SQL and KSAM/XL are attached to
> the XM and can force the SWQ to flush to disk.  I believe most files on
> MPE are not attached to the XM.  I think it's only system tasks
> (directory and stuff,) IMAGE/SQL and KSAM/XL that are.  You can attach
> another file if you want to, but you have to program for that.

Remainder skipped

Mr. Denys, I agree with you that there are various reasons to create a
checkpoint.  However, the primary trigger is the log file filling up.  That is
why you will see a "heartbeat" signature on a disc I/O graph.

I also disagree with you: in certain situations, i.e. very large, mainly jumbo
files being accessed by a majority of users (kind of like how Ecometry works),
adding more memory will actually make matters worse.

A while back I worked with Mr. Cooper to identify this issue.  Mr. Cadier had a
program that would time the XM checkpoint intervals.  The checkpoints were 60
seconds apart, but the XM post was taking 66 seconds.  Since the disc subsystem
was an EMC frame with 8 GB of cache, posting 32MB certainly did not take 66
seconds.  Of those 66 seconds, HP was able to determine that 60 were spent
scanning memory and 6 were spent posting to disc (cache on the EMC frame).

The solution proposed by HP was to DECREASE memory from 10GB to 8GB.  My
proposed solution was to INCREASE the XM log file from 64MB (2 * 32MB halves)
to 192MB (2 * 96MB halves).

My thinking was that the XM checkpoints would occur less frequently (every
three minutes), the memory scan time would remain the same, and the post should
take only slightly longer due to the cache on the EMC.
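
The arithmetic behind that reasoning can be sketched as follows.  This is only
a rough model, not anything from HP: it assumes the log fills at a constant
rate (so tripling each half triples the checkpoint interval), that the scan
time stays at 60 seconds, and that the post time scales linearly with the half
size, which is pessimistic given the EMC cache.  The function and parameter
names are made up for illustration.

```python
def checkpoint_cost(half_mb, base_half_mb=32.0, base_interval_s=60.0,
                    scan_s=60.0, base_post_s=6.0):
    """Estimate XM checkpoint timing for a given log-half size.

    Baseline figures come from the case described above (32MB halves,
    60 s between checkpoints, 60 s scanning, 6 s posting).
    ASSUMPTIONS (not documented behavior): constant log fill rate,
    constant scan time, post time linear in the half size.
    """
    scale = half_mb / base_half_mb
    interval_s = base_interval_s * scale   # time to fill one log half
    post_s = base_post_s * scale           # worst case: no help from cache
    busy_s = scan_s + post_s               # time spent on each checkpoint
    return interval_s, busy_s, busy_s / interval_s


for half_mb in (32.0, 96.0):
    interval_s, busy_s, ratio = checkpoint_cost(half_mb)
    print(f"{half_mb:.0f}MB halves: checkpoint every {interval_s:.0f} s, "
          f"busy {busy_s:.0f} s, ratio {ratio:.2f}")
```

Under those assumptions, 32MB halves keep the XM busy for 66 seconds out of
every 60, while 96MB halves would keep it busy for about 78 seconds out of
every 180.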

The results were commendable; the customer even considered adding MORE memory,
although we (HP and I) discouraged them from doing so.

One of the beauties of an N-class 750 is that it can scan 16GB of memory in
about 20 seconds, reducing the overhead of the XM.
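
For a rough comparison of scan rates, here is a small sketch.  It assumes that
scan time is proportional to memory size and that the 60-second scan above
covered the full 10GB; neither is documented, and the helper names are
hypothetical.

```python
def scan_rate_gb_per_s(memory_gb, scan_s):
    """Scan rate implied by scanning memory_gb in scan_s seconds."""
    return memory_gb / scan_s


def scan_time_s(memory_gb, rate_gb_per_s):
    """Scan time for memory_gb, ASSUMING time is proportional to size."""
    return memory_gb / rate_gb_per_s


# Figures from the text: 10GB scanned in 60 s on the customer's system
# (assumed to cover all of memory), 16GB in about 20 s on an N-class 750.
customer_rate = scan_rate_gb_per_s(10.0, 60.0)
n_class_rate = scan_rate_gb_per_s(16.0, 20.0)

print(f"customer system: {customer_rate:.2f} GB/s")
print(f"N-class 750:     {n_class_rate:.2f} GB/s")
print(f"8GB at the customer's rate: {scan_time_s(8.0, customer_rate):.0f} s")
print(f"10GB at the N-class rate:   {scan_time_s(10.0, n_class_rate):.1f} s")
```

On this model, dropping from 10GB to 8GB would trim the scan from 60 seconds to
about 48, which is a much smaller gain than a faster scan rate would give.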

One other interesting side effect of this phenomenon is that the maximum I/O
queue length goes to over 2000.  This was only occurring on high-speed RAID
arrays.  The truth is the disc queue length actually DOES go above 2000, but it
only does so for a split second, just long enough to skew the data for the rest
of the interval.

Denys, I know it is very hard for you to accept some things that you don't
understand, so please take all of this in the spirit it was written.  :-)

Happy Friday,

-Craig

Now, could someone answer my original question?  What happened to patch
MPEMX34A?

* To join/leave the list, search archives, change list settings, *
* etc., please visit http://raven.utc.edu/archives/hp3000-l.html *
