LISTSERV - HP3000-L Archives

Jeff Kell wrote:
: In a more general sense, does NS/VT use keepalives?  It seems not since
 
 - Everything that runs over TCP/IP on MPE/iX uses TCP keepalive.  It is not
   up to the application level to do anything about this (i.e. neither VT
   nor any other OSI layer 6/7 code) - there's not even an option for it.
 
   It's all in TCP's hands and NMMGR screen NETXPORT.GPROT.TCP has two
   parameters for TCP keepalive (or TCP Connection Assurance or CA) with
   the following default values:
 
   [600  ]   Connection Assurance Interval (Secs)
   [4  ]     Maximum Connection Assurance Retransmissions
 
   What this means is that every 600 seconds the timer will pop and a CA
   keepalive packet will be sent (yes, _every_ 10 minutes, regardless of
   whether there has been traffic or not).  I was under the impression (and
   have requested a change that never happened) that it would only get sent
   if there had been no traffic since the last keepalive timer pop.
 
   As to the lengthy timeout, yes - I agree.  It should not be doing what
   it does...  Currently (according to the TCP lab), TCP will send a CA
   packet whenever the timer expires, and if there is no response it will
   be retried the configured number of times.  The stupid thing about the
   retrying is that it does not use the standard retry algorithm, under
   which, once the CA is sent, retries would come at 4s, 8s, 16s...
   intervals after it (with default config).  Nope.  It'll retransmit
   __EVERY_TEN_MINUTES__(!!) - four times, i.e. worst case will be the
   initial CA + 4 retries with a 10 minute wait on each - i.e. 50 minutes
   to detect a dead peer.  Sigh.
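
   To make the arithmetic above concrete, here is a minimal sketch (not
   MPE/iX code; the function name ca_timeline is made up for illustration)
   that lists when the initial CA and its retries go out under the
   behaviour described, assuming each one waits a full CA interval:

```python
def ca_timeline(interval_secs, max_retransmissions):
    """Model the described behaviour: the initial CA packet and every
    retry are spaced one full CA interval apart.

    Returns (send_offsets, dead_at), both in seconds measured from the
    moment the initial CA packet is sent.
    """
    # Offset 0 is the initial CA; the rest are the retransmissions.
    sends = [n * interval_secs for n in range(max_retransmissions + 1)]
    # The last retry also waits one full interval before giving up.
    dead_at = sends[-1] + interval_secs
    return sends, dead_at


sends, dead_at = ca_timeline(600, 4)   # NMMGR defaults quoted above
print(sends)           # offsets of the initial CA and the 4 retries
print(dead_at // 60)   # worst-case minutes to detect a dead peer
```

   With the defaults this gives five packets ten minutes apart and a
   50 minute worst case, matching the figure above.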
 
   IMHO a much better algorithm would be to store our TCP's last ACK number
   in, say, LAST_CA_TIME_ACK_VALUE.  When the CA timer pops next time,
   compare current ack value to LAST_CA_TIME_ACK_VALUE and if different,
   that means there has been data from peer since last CA time - i.e. there's
   no need to send a CA-packet, just update the LAST_CA_TIME_ACK_VALUE.
 
   If the ack value matches LAST_CA_TIME_ACK_VALUE, there has been no data.
   Send a TCP CA packet, put it on the retransmission queue and do the
   normal retry sequence, i.e. time out in ~1..2 minutes if the remote is
   not responding.
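
   The proposal above could be sketched roughly like this.  It is only an
   illustration of the idea, not how MPE/iX TCP is implemented: the class
   and method names are hypothetical, and the normal retry sequence that
   would follow a sent CA packet is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    # Our TCP's current ACK number; it advances as data arrives from the peer.
    ack_value: int = 0
    # Snapshot of ack_value taken at the previous CA timer pop.
    last_ca_time_ack_value: int = 0

    def receive(self, nbytes: int) -> None:
        """Peer sent nbytes of data, so our ACK number moves forward."""
        self.ack_value = (self.ack_value + nbytes) % 2**32

    def on_ca_timer(self) -> bool:
        """Called when the CA timer pops.

        Returns True if a CA packet should be sent (no data from the peer
        since the last pop), False if the connection is known to be alive.
        """
        if self.ack_value != self.last_ca_time_ack_value:
            # Data arrived since the last pop: just update the snapshot.
            self.last_ca_time_ack_value = self.ack_value
            return False
        # No data since the last pop: probe the peer.
        return True


conn = Connection()
conn.receive(100)
print(conn.on_ca_timer())   # traffic seen since the last pop
print(conn.on_ca_timer())   # idle since the last pop
```

   The first timer pop sees fresh data and sends nothing; the second sees
   an unchanged ack value and asks for a CA packet.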
 
   It makes absolutely no sense to keep retrying a CA packet for 50 minutes,
   once every 10 minutes...  The only good thing about this is that at least
   those values are configurable.  In case someone feels like tweaking these
   numbers, keep in mind that X.25 (in case you have an X.25 link) uses an
   IDLE TIMER which defaults to 300(?) seconds (guessing, check in NMMGR).
   That timer should be longer than TCP's CA interval, since the X.25 IDLE
   TIMER closes X.25 Virtual Circuits that see no traffic for the idle
   timer period.
 
   Given that we'll have to live with this behaviour at least for the
   time being, and if you'd like to do something about it, maybe I should
   recommend some new values to try.  For example, a value of 180 secs
   (3 minutes) for the TCP CA timer and 240 secs (4 min) for the X.25 idle
   timer.  This should detect and kill TCP connection halves (with a dead
   peer) in about 15 minutes (initial CA + 4 retries at 3 minutes each),
   while still maintaining X.25 connections for healthy TCP connections,
   as CA packets will flow within the X.25 idle timer limits.  Since X.25
   makes you pay for every packet, one doesn't want to set the CA timer
   too low...
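
   For anyone trying other values, a small sanity check along these lines
   may help.  It is a sketch under the assumptions in this message only:
   the function name is made up, and the 300 second X.25 idle default is
   the unverified guess from above (check the real value in NMMGR).

```python
def check_ca_settings(ca_interval_secs, max_retransmissions, x25_idle_secs):
    """Return (detection_secs, keeps_vc_open) for a candidate config.

    detection_secs: worst-case time to detect a dead peer, assuming the
        initial CA and each retry wait one full CA interval.
    keeps_vc_open: True if CA packets flow more often than the X.25 idle
        timer fires, so an otherwise idle but healthy connection keeps
        its Virtual Circuit.
    """
    detection_secs = (1 + max_retransmissions) * ca_interval_secs
    keeps_vc_open = ca_interval_secs < x25_idle_secs
    return detection_secs, keeps_vc_open


# Suggested values: 180 s CA interval, 4 retries, 240 s X.25 idle timer.
print(check_ca_settings(180, 4, 240))
# Defaults, with the guessed (unverified) 300 s X.25 idle timer.
print(check_ca_settings(600, 4, 300))
```

   The suggested values give a 900 second (15 minute) worst case with the
   CA interval inside the idle timer; the defaults give 3000 seconds with
   the CA interval longer than the guessed idle timer.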
 
Hope this helps,
Eero - HP CSY Networking lab, NS services.