Date: Tue, 15 Jan 2002 10:44:27 -0500
Gilles writes:
> The first thing to check is whether or not your system is being bombarded
> by lack of heartbeat signals from your dtc's.
>
> Type:
>
> :linkcontrol @;status=all
> Linkname: DTSLINK Linktype: IEEE8023 Linkstate: CONNECTED
> Physical Path: 56/56
> Current Station Address: 08-00-09-98-18-D3
> Default Station Address: 08-00-09-98-18-D3
> Current Receive Filter: broad(1) any(0) k_pckts(1) x_pckts(0)
> Current Multicast Addresses:
> 09-00-09-00-00-01 09-00-09-00-00-02 09-00-09-00-00-03
> 09-00-09-00-00-04
> Transmits no error 2472 Receives no error 7375
> Transmit byte count 332989 Receive byte count 951822
> Transmits error 0 Receives error 0
> Transmits deferred 1 Carrier losses 0
> Transmits 1 retry 0 CRC errors 0
> Transmits >1 retry 0 Frame losses 0
> Trans 16 collisions 0 Whole byte errors 0
> Trans late collision 0 Size range errors 0
> 802 chip restarts 0 Receives dropped 0
> Heartbeat losses 0 Receives broadcast 6605
> Receives multicast 0
>
> You should see Heartbeat losses of 0 or very close to 0.
Gilles is correct that you should check for Heartbeat losses on the LAN
card. Heartbeat losses on the system card cause slow network throughput, most
notably in large file transfers. But the LINKCONTROL statistics only tell
you whether the transceiver on the HP3000 system itself is failing to provide
SQE heartbeat.
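
If you capture the LINKCONTROL output to a text file, the counter can be pulled out with a few lines of Python. This is only a sketch: the function name and the sample text are my own, and it assumes the counter appears on a line labelled "Heartbeat losses" as in the output quoted above. A zero here says nothing about the DTCs, only about the host's own transceiver.

```python
import re

# A few lines in the format of the LINKCONTROL output quoted above.
SAMPLE = """\
Transmits >1 retry 0 Frame losses 0
802 chip restarts 0 Receives dropped 0
Heartbeat losses 0 Receives broadcast 6605
"""


def heartbeat_losses(linkcontrol_text):
    """Return the 'Heartbeat losses' counter, or None if no such line exists.

    Hypothetical helper: it only looks for the label followed by a number.
    """
    match = re.search(r"Heartbeat losses\s+(\d+)", linkcontrol_text)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print("Heartbeat losses:", heartbeat_losses(SAMPLE))
```
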
Lack of SQE Heartbeat on DTCs can cause system performance problems and is
not reported by the LINKCONTROL command. A DTC 'complains' to the host
system that it is missing SQE. The host system, your HP3000, will log the
heartbeat loss events to special log files stored on LDEV 1. These log
events occur continuously, resulting in an I/O bottleneck on the system disk.
On some systems you can actually hear the system disk getting constant
usage.
How do you diagnose whether you are subject to this problem? Frequently the
process that is logging the errors appears as the top DISC consumer in SOS
or Glance/iX. Or a system process will continually appear in the list of
active processes shown by the :SHOWQ command.
:showq;active
DORMANT RUNNING
Q PIN JOBNUM Q PIN JOBNUM
A 39
C M163 #S9136
C M183 #S9140
D U189 #J6036
A stack trace of PIN 39 would look something like this:
$8 ($a3) nmdebug > pin #40;tr,i,d
PC=a.0017399c enable_int+$2c
NM* 0) SP=41643df0 RP=a.00789004
notify_dispatcher.block_current_process+$338
NM 1) SP=41643df0 RP=a.00870cd8 find_obj_cache_desc+$170
NM 2) SP=41643d70 RP=a.001baa64 wait_for_active_port+$e8
NM 3) SP=41643c70 RP=a.001bb6c8 receive_from_port+$544
NM 4) SP=41643bf0 RP=a.0075f5e4 extend_receive+$494
NM 5) SP=416439f0 RP=a.00a6cce0 xm_w_commitrecord+$1a0
NM 6) SP=416438b0 RP=a.00954e2c xm_end_system_trans+$340
NM 7) SP=416437b0 RP=a.00990fec sm_pin_eof+$448
NM 8) SP=416436b0 RP=a.00a4245c sm_write_eof+$110 <--------------
NM 9) SP=41643530 RP=a.00a42688 sm_cntl_64+$104
NM a) SP=416434b0 RP=a.00ee0490 tm_control_common+$e2c
NM b) SP=416433f0 RP=a.01558b14 tm_ord_fix_buf_disc+$250
NM c) SP=416432f0 RP=a.0143676c fcontrol_nm+$fa8
NM d) SP=41643230 RP=a.01435790 ?fcontrol_nm+$8
export stub: a.01436f7c FCONTROL+$50 <----------------
NM e) SP=41642d70 RP=a.01436ef8 ?FCONTROL+$8
export stub: a.01e9440c tio_dtcm.p_write_eof+$f0
NM f) SP=41642cf0 RP=a.01ebff1c tio_dtcm.x_log+$638 <---------------
NM 10) SP=41642c70 RP=a.01ec51a0 tio_dtcm+$42a0
NM 11) SP=416424b0 RP=a.01ec0eec ?tio_dtcm+$8
export stub: a.00748d74 io_receive+$e0
NM 12) SP=41642330 RP=a.0074c818 io_mgr_process+$320
NM 13) SP=416422b0 RP=a.0099c358 outer_block+$154
NM 14) SP=41642130 RP=a.00000000
(end of NM stack)
Reviewing the stack points to the performance problem. Not only is this
process logging the heartbeat loss events, it is also forcing the records to
be posted to disk immediately via FCONTROL. This is where the performance
problem lies.
Another way to find out whether you have this problem is to check for the
log files themselves. The system will write one set of log files for each DTC
configured on the system. The names of the log files are HxxxxxxA.PUB.SYS
and HxxxxxxB.PUB.SYS where 'xxxxxx' represents the last 6 characters of the
12-digit Ethernet/MAC address of the DTC. For instance, if the MAC address is
08-00-09-00-75-BD then the file name will be H0075BDA.PUB.SYS.
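
If you have many DTCs, the naming rule is easy to apply mechanically. The following is a sketch of that rule only; the function name is made up for illustration, and it assumes the MAC address is given as 12 hex digits with optional separators.

```python
import re


def dtc_log_file_names(mac_address):
    """Return the two log file names for a DTC, given its MAC address.

    Hypothetical helper implementing the rule described above:
    'H' + last 6 hex characters of the MAC address + 'A' or 'B'.
    """
    # Drop separators such as '-' or ':' and normalise to upper case.
    hex_digits = re.sub(r"[^0-9A-Fa-f]", "", mac_address).upper()
    if len(hex_digits) != 12:
        raise ValueError("expected a 12-digit MAC address: %r" % mac_address)
    suffix = hex_digits[-6:]
    return ["H%sA.PUB.SYS" % suffix, "H%sB.PUB.SYS" % suffix]


if __name__ == "__main__":
    for name in dtc_log_file_names("08-00-09-00-75-BD"):
        print(name)
```
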
:listf [log in to unmask],2
ACCOUNT= SYS GROUP= PUB
FILENAME CODE ------------LOGICAL RECORD----------- ----SPACE----
SIZE TYP EOF LIMIT R/B SECTORS #X MX
H0075BDA* 1W FB 5 66010 1 256 1 *
If the EOF of this file is very large then you should verify the SQE
settings on the transceiver connected to that DTC.
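
To check a long listing, the EOF column can be read from each file line of captured LISTF,2 output. This is a rough sketch under assumptions: the helper names are invented, the line layout is taken from the single example above, and the threshold is something you must choose yourself since "very large" depends on your system.

```python
def listf_eof(line):
    """Return (file name, EOF, LIMIT) from one file line of LISTF,2 output.

    Hypothetical helper. Fields are counted from the right-hand end
    (MX, #X, SECTORS, R/B, LIMIT, EOF) so that an optional CODE column
    does not shift the result.
    """
    tokens = line.split()
    if len(tokens) < 7:
        raise ValueError("not a LISTF,2 file line: %r" % line)
    name = tokens[0].rstrip("*")  # a trailing '*' marks an open file
    return name, int(tokens[-6]), int(tokens[-5])


def eof_exceeds(line, threshold):
    """True if the EOF on this line is above the caller's chosen threshold."""
    _, eof, _ = listf_eof(line)
    return eof > threshold


if __name__ == "__main__":
    sample = "H0075BDA* 1W FB 5 66010 1 256 1 *"
    print(listf_eof(sample))
```
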
Doug.
Doug Werth Beechglen Development Inc.
[log in to unmask] Cincinnati, Ohio