LISTSERV - HP3000-L Archives

HP3000-L Archives

August 1997, Week 1

HP3000-L@RAVEN.UTC.EDU

Subject:	Re: Connection Assurance Timeout w/telnet server.
From:	Jim Hofmeister <[log in to unmask]>
Reply To:	Jim Hofmeister <[log in to unmask]>
Date:	Thu, 31 Jul 1997 14:28:04 GMT

Jim Hofmeister ([log in to unmask]) wrote:
: Hello Friends,
: : Re: Connection Assurance Timeout w/telnet server.

: Two questions came up concerning this new Telnet Fix/Feature... and I
: thought I would answer them here for all to see...

Second question was:

> Jim Hofmeister wrote:
> > Re: Connection Assurance Timeout w/telnet server.
> >
> > The Fix/Enhancement is in for telnet connections to timeout when a TCP
> > connection is broken - in the same manner NS-VT connections are dropped
> > when a PC is powered down as an example.
>
> Hmmm... so the connection assurance (a.k.a. TCP keepalive) protocol is
> specific to individual services?  (Light bulb appears over my head)...

The Connection Assurance (CA) timeout is enabled by default in all of
the products coded in NETIPC (NS-VT, as an example).  The CA timeout is
disabled by default in all of the products coded with BSD Sockets
(Telnet, as an example).  The reason is that the BSD standard specifies
that SO_KEEPALIVE (which on the HP3000 we correlate to the CA timeout)
is disabled by default...  So, if your code does not make a call to
enable SO_KEEPALIVE, a BSD Sockets application gets no SO_KEEPALIVE,
which means the CA timeout is disabled on the 3000.  This new
enhancement to telnet adds the call that enables SO_KEEPALIVE.
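
To make the "default is disabled" point concrete, here is a minimal
sketch using the generic BSD Sockets interface as exposed by Python.
It is not the telnet server's code and was not run on MPE; the helper
name enable_keepalive is made up for this illustration.  It only shows
that a new TCP socket starts with SO_KEEPALIVE off and that the
application has to ask for it.

```python
import socket


def enable_keepalive(sock: socket.socket) -> None:
    """Turn on SO_KEEPALIVE for one socket (hypothetical helper name)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


def keepalive_enabled(sock: socket.socket) -> bool:
    """Report whether SO_KEEPALIVE is currently set on the socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("before:", keepalive_enabled(s))  # off unless the app asks for it
    enable_keepalive(s)
    print("after: ", keepalive_enabled(s))
    s.close()
```

The option is per socket, which is why each service has to make the
call itself.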

>
> > This fix follows the standard approach of using the configured CA
> > Timeout in NMMGR:
> >
> > Path:  NETXPORT.GPROT.TCP
> > [600  ]   Connection Assurance Interval (Secs)
> > [4  ]     Maximum Connection Assurance Retransmissions
> >
> > With the above default configuration, a session logon will drop in 40
> > minutes if the TCP connection is not responsive.
>
> If I recall correctly, if the above scenario happens on a NS/VT session
> it causes a logoff and a VT error 42 (Recall recent and past discussion
> of mysterious VTERRs aborting sessions).

Yes, with VT, the VTERR 42 is the common message for any case where
the remote TCP has stopped talking to us.

>
> Then there were the cases of long FTP sessions aborting...

Yes, and probably still are...  We have gone through several iterations
of trying to fix this after fixing problems where we prematurely
disconnected file transfers to IBM systems...  We found the problem here
to be that the IBM was not doing record-by-record ASCII to EBCDIC
conversion, but was doing the conversion at the end of the file transfer
instead, which resulted in a long hang during which the HP was more than
happy to disconnect...  The current status/operation of this is that FTP
servers are not timed out...  But a new command, "timeout", was
introduced in FTP on 5.5 patch FTPEDL1 (sr 4701-324616), which allows
the user to specify the timeout...  With the current operation, I have
heard few or no complaints about FTP server sessions aborting
prematurely or hanging around too long, so the new "timeout" command may
be of little use.  One note here...  This timeout was NOT a TCP timeout
and was not a CA timeout, but was actually a NETIPC timeout.

>
> Then we have the Image/SQL notes about lowering the timeout value to,
> say, 60 seconds so that hung ODBC sessions (possibly holding locks)
> timeout within a reasonable period of time.  This makes the timers more
> sensitive.

I am not overly familiar with the "Image/SQL" product, or with any
recommendations to reduce the CA timeout to 60 seconds to drop ODBC
sessions...  But I will say, I do not expect a CA timeout to clear
an image or semaphore lock.  If your process is waiting for a lock,
aborting the session will not clear this lock, and the session will
stay around until the holder of the lock frees it...

>
> If you are on a busy network, particularly across one or more routers,
> switches, or bridges; it is possible for the keepalive packet to be
> "dropped" by the router/switch/bridge.

Yes, but the algorithm for the CA timeout is to send a packet multiple
times...  The default configuration is to send a packet every 10 minutes
for 4 retransmissions (40 minutes total) on a TCP connection where
NO traffic has been seen.  If any traffic is seen on the TCP connection,
the CA timer is set back to 0.  So network traffic should have little
to do with CA timeouts... unless your values have been changed from the
defaults (600 & 4) or you have a really badly performing network, but
if that were the case, you would see MANY more TCP timeouts before you
would see CA timeouts.
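
The 40 minute figure is just the two NMMGR values multiplied together.
A small sketch of that arithmetic follows; the function name is made up
for illustration, and it assumes the simple "interval times
retransmissions" reading used above.

```python
def ca_drop_time_secs(interval_secs: int = 600,
                      max_retransmissions: int = 4) -> int:
    """Seconds of total silence before the connection is dropped,
    assuming drop time = interval * retransmissions (hypothetical helper)."""
    return interval_secs * max_retransmissions


if __name__ == "__main__":
    # NMMGR defaults: 600 second interval, 4 retransmissions.
    print(ca_drop_time_secs() / 60, "minutes")
    # A 60 second interval with the same retransmission count.
    print(ca_drop_time_secs(interval_secs=60) / 60, "minutes")
```

By the same arithmetic, lowering the interval to 60 seconds while
leaving the retransmissions at 4 gives a 4 minute drop time.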

> If things "work as designed" it takes <max-retransmissions> consecutive
> failures to abort a connection.  I wonder if the "retransmit counter" is
> really being reset after a successful keepalive request?  It would
> explain a lot of curious network problems I've experienced or heard
> about over the years specific to MPE.

The CA timeout operates separately from the TCP timers...  TCP timeouts
are based on timing out a connection on the basis of NO response from
the remote system to data sent...  I.e., the 3000 sent a TCP packet to a
PC and the PC did not respond with an ACK...  The 3000 retransmitted the
TCP packet and the PC did not respond with an ACK... etc.  After the
configured number of retransmissions has been sent, the connection will
be disconnected.

The CA timeout only operates when NO TCP traffic has been seen in
either direction (idle connection).  After the timeout is reached,
one packet is sent from the 3000 to the client.  If an ACK is
received, the timer is reset to 0.  If any TCP traffic between the
3000 and a client is seen, the timer is reset to 0.  If no ACK is
received, then the 3000 will wait for another timeout period and
perform the first retransmission... until the number of CA
retransmissions is reached and the connection and session are dropped.
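
The reset behavior described above can be sketched as a small model.
This is a simplified illustration, not the MPE transport code: the class
and method names are made up, and it assumes every unanswered CA packet
counts toward the configured limit, so that the defaults give the 40
minutes quoted earlier.

```python
class ConnectionAssurance:
    """Toy model of the CA timer on one idle TCP connection."""

    def __init__(self, interval_secs=600, max_retransmissions=4):
        self.interval_secs = interval_secs
        self.max_retransmissions = max_retransmissions
        self.unanswered = 0    # consecutive CA packets with no ACK
        self.silent_secs = 0   # seconds since traffic was last seen
        self.dropped = False

    def on_traffic(self):
        """Any TCP traffic (or an ACK to a CA packet) resets the timer."""
        self.unanswered = 0
        self.silent_secs = 0

    def on_idle_interval(self, peer_acked):
        """One full interval passed with no traffic, so a CA packet goes out.

        Returns True while the connection is still up.
        """
        if self.dropped:
            return False
        if peer_acked:
            self.on_traffic()
            return True
        self.silent_secs += self.interval_secs
        self.unanswered += 1
        if self.unanswered >= self.max_retransmissions:
            self.dropped = True  # connection and session go away
        return not self.dropped


if __name__ == "__main__":
    ca = ConnectionAssurance()
    for _ in range(3):
        ca.on_idle_interval(peer_acked=False)  # three misses in a row
    ca.on_idle_interval(peer_acked=True)       # one ACK clears the count
    print(ca.unanswered, ca.dropped)
```

In this model a single ACK wipes out any earlier misses, so only
consecutive unanswered packets can drop the connection.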

I hope everyone finds this information useful.

Regards,

James Hofmeister
[log in to unmask]
Hewlett Packard
Worldwide Technology Network Expert Center
P.S. My Ideals are my own, not necessarily my employer's.
