HP3000-L Archives

March 2003, Week 3

HP3000-L@RAVEN.UTC.EDU

Subject:
From:
"Atwood, Tim (DVM)" <[log in to unmask]>
Reply To:
Atwood, Tim (DVM)
Date:
Thu, 20 Mar 2003 11:42:31 -0800
Yesterday morning we started having severe problems on our network. It is
causing problems for the HP3000 computers also, so I will consider this
on-topic. Problem description and what we have tried, etc. is below.

We have some network experts coming out at $300 / hour. But maybe one of you
geniuses out there has some suggestions I could try myself. I would sure
look like a hero if I saved the company some of that $300 / hour.

Any suggestions on what to look for? Are there any HP3000-based tools to
sniff out the problem or get better error messages?

I will be the first to admit I am not a network guru (my expertise is
HP3000). This problem is stretching my network knowledge. So if anything I
say below is not worded quite right concerning networks, please forgive me
:-)

Our network is a mish-mash of old and new. The computer room backbone is IBM
Token Ring, with various bridges and hubs to Ethernet. Some of the hubs are
fibre optic and some are standard Ethernet.

The network has two segments: ###.###.248.### and ###.###.249.###.
The problem only seems to affect devices on the 249 segment.

On the HP3000, the symptoms are as follows: Every 5 to 7 minutes there is a
30-second period where multiple VT sessions get logged off. If a VT session
on the 249 segment happens to have activity during this 30-second period,
the user gets the message "VT Connection Terminated" and the session is
logged off. There are no console messages other than the logoff message. If
the VT session is idle during this period, it appears to stay logged on.
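
One way to pin down these 30-second windows from a workstation on the 249
segment would be to poll a host at a fixed interval and record when it stops
answering. The following is only a rough Python sketch under assumptions: the
host, port, and 5-second interval are placeholders I made up, and a plain TCP
connect is not the same thing as a live VT session, so it may not see exactly
what the VT users see.

```python
import socket
import time


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def outage_windows(samples):
    """Collapse (timestamp, ok) samples into (start, end) outage windows.

    A window starts at the first failed sample and ends at the next
    successful one. An outage still open at the end of the data is closed
    at the last timestamp seen.
    """
    windows = []
    start = None
    last_t = None
    for t, ok in samples:
        if not ok and start is None:
            start = t
        elif ok and start is not None:
            windows.append((start, t))
            start = None
        last_t = t
    if start is not None:
        windows.append((start, last_t))
    return windows


def poll(host, port, duration_s, interval_s=5.0):
    """Probe host:port every interval_s seconds for duration_s seconds.

    host and port are placeholders: use any address on the affected
    segment and any port known to be listening there.
    """
    samples = []
    end = time.time() + duration_s
    while time.time() < end:
        samples.append((time.time(), tcp_reachable(host, port)))
        time.sleep(interval_s)
    return outage_windows(samples)
```

Running poll() against one host on the 248 segment and one on the 249 segment
at the same time would show whether the dropouts really are limited to 249 and
how long each one lasts.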

The AS400 is also having dropped sessions and other network problems. It is
giving better messages. These messages have not helped us yet though. They
appear to be symptoms and as far as we can tell do not point to the cause.

AS400 Message:
Message ID . . . . . . :   CPI591A       Severity . . . . . . . :   70
Message type . . . . . :   Information
Date sent  . . . . . . :   03/20/03      Time sent  . . . . . . :   08:40:00
Message . . . . :   Controller on line TRLINE varied off or not recognized
by local system.
Cause . . . . . :   A remote station attempted to establish a local area
network connection with this system. There is no controller on the local
system that is varied on with remote adapter address (ADPTADR parameter)
444444444444, source service access point (SSAP parameter) 04, and
destination service access point (DSAP parameter) 04.
Recovery  . . . :   Verify the proper controllers are varied on.  The line
does not have to be varied off to be used again.  Have the remote system
operator try the request again.  If the line is configured to allow
automatic creation of APPC controller descriptions on LAN, repeat the
request again.  Also refer to the APPN
**OR**
Message ID . . . . . . :   CPF8B41       Severity . . . . . . . :   00
Message type . . . . . :   Information
Date sent  . . . . . . :   03/20/03      Time sent  . . . . . . :   05:13:41
Message . . . . :   Adapter has inserted or left the ring on line TRLINE.
Cause . . . . . :   An adapter has inserted or left the ring on line TRLINE.
The reporting adapter is 08005A433FFC and the nearest active upstream
neighbour (NAUN) of the reporting adapter is 1000F10F1F51.
Technical description . . . . . . . . :   Any adapter that inserts into or
removes itself from the ring causes a nearest active upstream neighbour
(NAUN) change report to be sent on the ring. These reports are sent as
medium access control (MAC) frames which are monitored by the configuration
report server that in turn reports them to the TRLAN manager.  The NAUN
change reports are logged to the system history log (QHST) by the TRLAN
manager.
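
If the CPF8B41 entries can be dumped from QHST to a text file, counting which
adapter addresses keep turning up might point at a station that is repeatedly
inserting into and leaving the ring. Below is a rough Python sketch. It assumes
the log text uses the same wording as the single message quoted above, which I
have not checked against other entries, and it does not cover how to get the
text out of QHST in the first place.

```python
import re
from collections import Counter

# Pattern based only on the wording of the one CPF8B41 message quoted above.
NAUN_RE = re.compile(
    r"reporting adapter is ([0-9A-Fa-f]{12}) and the nearest active upstream "
    r"neighbou?r \(NAUN\) of the reporting adapter is ([0-9A-Fa-f]{12})"
)


def tally_naun_changes(log_text):
    """Count (reporting adapter, NAUN) pairs found in the exported log text."""
    # Collapse line wraps and runs of spaces so wrapped messages still match.
    flat = " ".join(log_text.split())
    return Counter(NAUN_RE.findall(flat))


if __name__ == "__main__":
    sample = (
        "The reporting adapter is 08005A433FFC and the nearest active upstream\n"
        "neighbour (NAUN) of the reporting adapter is 1000F10F1F51.\n"
    )
    for (adapter, naun), count in tally_naun_changes(sample).most_common():
        print(adapter, naun, count)
```

An adapter/NAUN pair whose count jumps in step with the dropouts would be the
first place I would look.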

The HP9000 computers are also dropping network links and sessions.

At this point we have power cycled and/or rebooted every bridge, hub and
router we could think of. One bridge failed its self-test when it was brought
back up. It has been replaced, but that did not fix the problem.

Some of the bridges and hubs show significant network traffic on the same
cycle of roughly 5 to 10 minutes. Or at least their activity and/or collision
lights go on solid for several seconds every 5 to 10 minutes.

The timing between these events seems to be inconsistent. When we first
looked at things yesterday morning, it seemed to be 5 to 7 minute intervals.
My most recent review of all console logs showed time between events
anywhere from 5 to 25 minutes. We also had someone sit out at one of the
bridges and look at the lights for a while. One time the interval between
the "solid on" events was about 20 minutes, other times about 5 minutes.
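
To get a firmer handle on the spacing, the logoff times from the console logs
could be turned into a list of gaps. A small Python sketch follows. The
timestamps in it are made up for illustration and are not from our logs, and it
assumes all events fall on the same day and are written as HH:MM:SS.

```python
from datetime import datetime


def event_intervals(timestamps, fmt="%H:%M:%S"):
    """Return the gaps, in minutes, between consecutive event times.

    Assumes every timestamp is from the same day and matches fmt.
    """
    times = sorted(datetime.strptime(t, fmt) for t in timestamps)
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    # Hypothetical event times, for illustration only.
    sample = ["08:40:00", "08:45:30", "09:05:30", "09:12:00"]
    gaps = event_intervals(sample)
    print("gaps (min):", gaps)
    print("shortest:", min(gaps), "longest:", max(gaps))
```

Comparing the gaps from the HP3000 console, the AS400 history log and the
bridge observations would show whether all three are seeing the same events.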

Thanks for any help or suggestions anyone can give.

Timothy Atwood
Holtenwood Computing
http://www.holtenwood.bc.ca/computing/
for Domtar Vancouver Mill
(Opinions expressed are mine and do not reflect Domtar)
