HP3000-L Archives

September 2006, Week 4

HP3000-L@RAVEN.UTC.EDU

From: James Hofmeister
Date: Thu, 28 Sep 2006 01:02:42 -0400

Hello Wes, Larry,

>> What is the correlation between users and the inbound buffer pool? I
>> opened 15 sessions and didn't see a correlating increase in the pool
>> (maybe my notion that they are related is wrong).

The inbound buffer pool consists of "outstanding" frames that have been
received but not yet processed by an upper level protocol and, in turn, by a
service or user socket application.

The most common upper level protocol traffic for the inbound buffer pool on
the 3000 is TCP/IP, UDP/IP and AFCP (dtc/dts), as well as general broadcast
traffic that has not been discarded at the hardware and driver level.

In the case you mention of an "idle" session logged into MPE over Telnet or
NS-VT, as well as an idle FTP service, no "outstanding" frames (these are
all services that run over TCP/IP, so we could also say no "outstanding"
packets) would be held in the inbound buffer pool.  The number of
"outstanding" frames in the inbound buffer pool increases when the rate of
traffic inbound to the system exceeds the rate at which the receiving
service or application can process the frames.  The general performance of
the system can, of course, affect the performance of a service or an
application.

A simple example of an application which will result in an accumulation of
frames in the inbound buffer pool is FTP.  If an FTP client is sending data
inbound to our MPE FTP server faster than it can be processed, then a
limited number of frames will back up in the inbound buffer pool.  This is
especially noticeable when a client running on a fast system is sending data
to an MPE FTP server which is overburdened or outclassed in processor
capability.  Note: In this example it is not possible for a single FTP
connection between an FTP client and an FTP server to use up a significant
amount of the buffer pool, as the upper layer protocol TCP/IP controls the
number of outstanding packets and the maximum amount of outstanding bytes of
data.
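
To put a rough number on that limit, here is a small sketch.  It assumes
the receiver advertises a fixed TCP window and that every frame carries a
full payload; the 32768-byte window and 1460-byte payload are hypothetical
values chosen for illustration, not MPE defaults.

```python
import math

def max_outstanding_frames(recv_window_bytes, payload_bytes_per_frame):
    """Upper bound on full-size frames one TCP connection can have
    outstanding, given the window the receiver advertises."""
    return math.ceil(recv_window_bytes / payload_bytes_per_frame)

# Hypothetical values for illustration only (not MPE defaults).
print(max_outstanding_frames(32768, 1460))
```

A sender that uses many small segments could occupy more buffers for the
same number of bytes, so treat this as a bound for full-size frames only.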

Note: The interactive "terminal" services Telnet and NS-VT tend not to have
many "outstanding" frames in the inbound buffer pool, due to their "low"
volume of data transfer (80 bytes * 24 lines on a screen) as well as the
slow pacing implemented (or impeded) by the PIBKBAC human interface.  When I
look at a dump, I seldom see more than "one" frame in the buffer pool for a
Telnet or NS-VT connection, and if I find more, that highlights a problem.

A less simple example is an application which sends data inbound as UDP/IP
traffic.  The pace of this data, the number of outstanding packets and the
maximum amount of outstanding bytes of data are not controlled by an upper
level protocol.  It is important that an application receiving UDP data do
so in a timely manner.  It is also important that an application sending UDP
data take into consideration the amount of outstanding data at the
application layer and establish a method to receive acknowledgement of data
sent, as well as to retransmit data not acknowledged if it is important to
do so.
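
One way to picture application-level acknowledgement over UDP is a simple
stop-and-wait exchange: the sender keeps only one datagram outstanding and
retransmits it until the receiver echoes its sequence number.  This is a
minimal sketch using standard BSD-style sockets on the loopback address;
the function names, the 4-byte sequence header, the timeout and the retry
count are all assumptions made for the example, not anything MPE provides.

```python
import socket
import struct
import threading

def send_reliably(sock, dest, seq, payload, timeout=0.5, retries=5):
    """Send one datagram and wait for a matching ACK; retransmit on timeout.
    Returns the number of transmissions used, or raises TimeoutError."""
    packet = struct.pack("!I", seq) + payload
    sock.settimeout(timeout)
    for attempt in range(1, retries + 1):
        sock.sendto(packet, dest)
        try:
            ack, _ = sock.recvfrom(4)
        except socket.timeout:
            continue  # no ACK in time: retransmit
        if len(ack) == 4 and struct.unpack("!I", ack)[0] == seq:
            return attempt
    raise TimeoutError(f"no ACK for sequence {seq}")

def ack_server(sock, received, count):
    """Receive promptly, drop duplicates, and acknowledge every datagram."""
    seen = set()
    while len(seen) < count:
        data, sender = sock.recvfrom(2048)
        seq = data[:4]
        if seq not in seen:          # a retransmission is ACKed but not stored twice
            seen.add(seq)
            received.append(data[4:])
        sock.sendto(seq, sender)     # echo the sequence number as the ACK

def demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))    # let the OS pick a free port
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    received = []
    worker = threading.Thread(target=ack_server, args=(server, received, 3))
    worker.start()
    for seq, msg in enumerate([b"one", b"two", b"three"]):
        send_reliably(client, server.getsockname(), seq, msg)
    worker.join()
    client.close()
    server.close()
    return received

if __name__ == "__main__":
    print(demo())
```

Because the sender never has more than one unacknowledged datagram, it
cannot flood the receiver's buffers, at the cost of throughput.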

A somewhat common problem is a user socket application receiving TCP or
UDP packets which does not perform a socket "recv" in a timely manner.  An
example is an application which receives data on a socket and writes this
data to a database, then loops back to receiving data on the socket.  What
happens when a database lock is applied to the data set this application is
writing to?  The application waits on the lock and can't perform the socket
"recv", so any data frames that arrive for the duration of this database
lock back up in the inbound buffer pool.  Many other application problems
and design errors can contribute to this type of problem as well.
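
One possible way around the receive-then-write loop is to let one thread do
nothing but "recv" and hand the data to a second thread that does the slow
database work.  The sketch below is only an illustration of that idea: the
"database" is a plain list, the lock wait is imitated with a sleep, and all
function names are hypothetical.

```python
import queue
import socket
import threading
import time

def receiver(sock, work):
    """Drain the socket as data arrives; never touch the database here."""
    while True:
        data = sock.recv(4096)
        if not data:          # peer closed the connection
            work.put(None)    # tell the writer to finish
            return
        work.put(data)

def writer(work, store, delay):
    """Stand-in for the database writer; `delay` imitates waiting on a lock."""
    while True:
        item = work.get()
        if item is None:
            return
        time.sleep(delay)
        store.append(item)

def demo():
    a, b = socket.socketpair()            # connected pair of stream sockets
    work, store = queue.Queue(), []
    threads = [threading.Thread(target=receiver, args=(b, work)),
               threading.Thread(target=writer, args=(work, store, 0.05))]
    for t in threads:
        t.start()
    for i in range(5):
        a.sendall(b"record %d\n" % i)
    a.close()
    for t in threads:
        t.join()
    b.close()
    return b"".join(store)                # stream data may arrive coalesced

if __name__ == "__main__":
    print(demo())
```

Note that the backlog does not disappear; it moves from the system's inbound
buffer pool into the application's own queue, which should be bounded or
monitored in a real program.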

Here is a simplified perspective that has served me well concerning the
series of events of a frame received in the inbound buffer pool and its
traversing up the stack to the end service or user application.  Yes, it is
a simplified version of the 7 layer network protocol stack 8-) starting from
the bottom up!

The series of events is:

1. A frame is received on the LINK LAYER network interface card (NIC), which
uses a pre-allocated pointer to perform a DMA of the frame to the buffer
pool in real memory.

2. The pointer to this frame located in the buffer pool is passed to the
DRIVER LAYER PROTOCOL - Ethernet/802.3 or ADCP/DTS drivers are examples we
see on MPE.

3. The DRIVER LAYER PROTOCOL reads the frame in the buffer, identifies the
packet type and passes the pointer to the TRANSPORT LAYER PROTOCOL - TCP/IP,
UDP/IP or AFCP/DTS are examples we see on MPE.

4. The TRANSPORT LAYER PROTOCOL reads the packet, identifies the datagram
socket id type and passes the pointer to the SOCKET INTERFACE - TCP
Connection port, UDP Socket or DTS are examples we see on MPE.

5. It is at this point where the data is "copied" from the pointer to the
frame in the buffer pool to the "service" data structure or a user socket
application data array.  

  In the case of the "services" Telnet, NS-VT or FTP the data is copied with
a BSD socket recv call to an internal data structure.

  In the case of a user socket application program the data is copied with
a BSD socket recv or NETIPC IPCRECV call to a data array defined in the
user application.

6. After this copy is complete, the pointer to the frame in the buffer pool
is returned down the same protocol stack, but it may be held at any of the
layers to assure end to end acknowledgement and the ability to retransmit
data.  

7. Finally the pointer to the frame is passed down to the Link Layer to be
held in a pool for re-use for the next inbound frame DMA request. 
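
The series of events above can be modelled with a toy buffer pool.  This is
a sketch for intuition only: the class and method names are hypothetical,
the "pointer" is just a list index, and steps 3 and 4 (passing the pointer
up through the protocol layers) are reduced to handing that index around.

```python
class BufferPool:
    """Toy model of a fixed pool of frame buffers (names are hypothetical)."""

    def __init__(self, size):
        self.buffers = [None] * size      # storage for frame contents
        self.free = list(range(size))     # indexes available for DMA

    def dma_in(self, frame):
        """Steps 1-2: place a frame into a free buffer; return its index."""
        if not self.free:
            raise MemoryError("inbound buffer pool exhausted")
        index = self.free.pop()
        self.buffers[index] = frame
        return index

    def copy_out(self, index):
        """Step 5: copy the data to the application (the recv call)."""
        return bytes(self.buffers[index])

    def release(self, index):
        """Steps 6-7: hand the buffer back for the next inbound frame."""
        self.buffers[index] = None
        self.free.append(index)

    def outstanding(self):
        """Number of buffers currently holding unprocessed frames."""
        return len(self.buffers) - len(self.free)

pool = BufferPool(4)
ptr = pool.dma_in(b"hello")     # frame arrives
print(pool.outstanding())       # frame is held until the application reads it
data = pool.copy_out(ptr)       # application performs its recv
pool.release(ptr)               # buffer returns to the free list
print(pool.outstanding(), data)
```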

----------

So far I have told you how it "should" work and some cases of expected
frames buffered in the inbound buffer pool.

The cases where it does not work as it "should":
- If we are unlucky we end up with leaked frames orphaned in the inbound
buffer pool.  Eventually the pool fills up and no further frames can be
received (no further connections can be made to the system and existing
connections fail).
- Unluckier yet for you is that the buffer manager code catches most of
these leaks and protects its internal data structures with a SYSTEM ABORT
3890 (but in the case of an "SA3890" the problem is visible and tends to get
fixed).
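
To see why a slow leak can go unnoticed for a long time and then stop all
inbound traffic, here is a rough sketch that counts how many frames a pool
accepts before it fills when a fixed fraction of frames is never released.
The pool size and leak rate are made-up numbers, and the function name is
hypothetical.

```python
def frames_before_pool_fills(pool_size, leak_every):
    """Count frames accepted before a pool of `pool_size` buffers is full,
    when every `leak_every`-th frame is never released (the bug)."""
    free = pool_size
    accepted = 0
    while free > 0:
        free -= 1                      # buffer taken for an inbound frame
        accepted += 1
        if accepted % leak_every != 0:
            free += 1                  # normal path: buffer released after recv
    return accepted

# Hypothetical numbers: 512 buffers, one frame in every thousand leaked.
print(frames_before_pool_fills(512, 1000))
```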

</james climbs up on soap box>
This is what us technical folks in the industry call "bugs" fixed in, yes
you guessed it "patches"...   If we get sufficient testing of the fixes,
they are included in General Release patches.  If not, then they are
included in Beta Test patches and if we get no commitment from any customer
to install a Beta Test patch the problem evidently is not important enough
to fix.
</james climbs down from soap box>

>> I know it's difficult for anyone to diagnose from so little 
>> information, especially when introducing network traffic. We 
>> haven't changed anything on our network or on the 3K, and it 
>> just started to occur early last week. The assistance has 
>> been very helpful in looking at network settings and has at 
>> least given us some direction to pursue (whether or not it 
>> leads to resolution has yet to be determined).

Network troubleshooting is one of the things that the guys at the HPRC are
very good at. Most of them are quite a bit better than I am for the day to
day troubleshooting.  They have the years of experience in reading the
network log files, looking at the link & network state information, general
troubleshooting and reading network traces if necessary.  I defer any first
pass analysis to these guys who do it every day and they engage me for the
corner cases which usually require dump reading and such other nasties.

Bottom line: I would recommend you review and install the GR patches and, if
you have the opportunity, have the folks at the HPRC look over this
problem.

Regards,
  James Hofmeister

* To join/leave the list, search archives, change list settings, *
* etc., please visit http://raven.utc.edu/archives/hp3000-l.html *
