Hi Robert,
You could also run stmshut.diag.sys, delete the memlog file, then run stmstart.diag.sys.
When diagmond restarts the memlogd process, if the memlog file does not exist it should be created.
I just prefer to have the memlogd process stopped instead of deleting the file out from under it...
Thanks,
Gary
> Date: Tue, 23 Feb 2010 14:25:32 -0800
> From: [log in to unmask]
> Subject: Re: How to clear Memory Log
> To: [log in to unmask]
>
> Thanks for the responses.
>
> I rebooted again today and found out there is no PDT option under the
> Service Menu.
>
> I found a reference to deleting the log file called memlog.
>
> It worked. Now STM tells me the memory log is empty.
>
> I will keep monitoring and replace the memory stick if it shows more
> problems.
>
> Procedure used to re-create a new memlog:
>
> Xeq Sh.Hpbin.Sys -L (get into posix)
> cd /var/stm/logs/os (change dir where the logs are)
> ls -l (look at the files)
> rm memlog.old (if you have an old one saved)
> mv memlog memlog.old (rename)
> touch memlog (create a new file)
> chmod 644 memlog (set the attributes)
> ls -l
> exit
>
> Btw, the same thing can be done for "log1.raw.cur".
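For anyone who wants to script this, here is a minimal sketch of the procedure above as a POSIX shell function. The function name rotate_stm_log and its two arguments are made up for illustration; the directory, file names and commands are the ones quoted above. It assumes the MPE/iX POSIX shell behaves like a standard sh for these commands.

```shell
#!/bin/sh
# Rotate an STM log file: keep one old copy and start a fresh, empty log.
# rotate_stm_log is a hypothetical helper name, not part of STM.
#   $1  directory holding the log (e.g. /var/stm/logs/os)
#   $2  log file name, defaults to memlog
rotate_stm_log() {
    log_dir=$1
    log_name=${2:-memlog}
    log_path="$log_dir/$log_name"

    if [ ! -f "$log_path" ]; then
        echo "no $log_name in $log_dir" >&2
        return 1
    fi

    rm -f "$log_path.old"           # drop the previously saved copy, if any
    mv "$log_path" "$log_path.old"  # keep the current log as .old
    touch "$log_path"               # create a new, empty log
    chmod 644 "$log_path"           # same attributes as in the procedure
}
```

Usage would be `rotate_stm_log /var/stm/logs/os` for memlog, or `rotate_stm_log /var/stm/logs/os log1.raw.cur` for the other log. Running stmshut.diag.sys first and stmstart.diag.sys afterwards avoids replacing the file while memlogd has it open.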
>
> Thanks again and Cheers!
>
> ~Robert
>
>
> -----Original Message-----
> From: HP-3000 Systems Discussion [mailto:[log in to unmask]] On Behalf
> Of Gary S Robillard
> Sent: Tuesday, February 23, 2010 8:17 AM
> To: [log in to unmask]
> Subject: Re: How to clear Memory Log
>
> Hello All,
>
>
>
> I don't believe that the 928LX has a PDT (Page Deallocation Table) in PDC
> (Processor Dependent Code). So the deallocation is due to the logs in the
> memlog file.
>
> Since the DIMM appears to have a solid repeatable single-bit error, it is
> going to keep being added to the memlog file and being deallocated by memory
> management software.
>
> You might want to consider replacing the DIMM with the error, or adding some
> thresholding to your CSTM job ...
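One way to add thresholding is to filter the memory log report before anyone reads it. A rough sketch, assuming the report is plain text in the layout shown in the original question (Slot, Page Status, Error Count and Error Addr lines between rows of "=") and arrives on standard input. How the report is produced from CSTM is not shown here, and the function name and default threshold are hypothetical.

```shell
#!/bin/sh
# Print only memory-log entries whose error count meets a threshold.
# report_memlog_over_threshold is a hypothetical name; the default of 5
# is an arbitrary example value, not an STM setting.
report_memlog_over_threshold() {
    threshold=${1:-5}
    awk -v min="$threshold" '
        /^[ \t]*Slot:/        { slot = $2 }
        /^[ \t]*Page Status:/ { status = $3; sub(/:$/, "", status) }
        /^[ \t]*Error Count:/ { count = $3 }
        /^[ \t]*Error Addr:/  { addr = $3 }
        /^[ \t]*=+[ \t]*$/ {
            # A separator line closes an entry if one has been collected.
            if (count != "") {
                if (count + 0 >= min)
                    printf "slot %s addr %s status %s count %d\n", slot, addr, status, count
                slot = status = addr = count = ""
            }
        }
    '
}
```

With a threshold of 5, an entry with an error count of 2 produces no output, so the daily job stays quiet until the count grows.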
>
>
>
> The PDT article in the MPE 5.0 Communicator:
>
> (Note the last paragraph "MPD and Current Systems", as it explains how the
> pages are deallocated on systems without a PDT):
>
> Chapter 10 Technical Articles
> Memory Page Deallocation (MPD)
> Steve Flynn
> Systems Technology Division
>
> MPD and Current Systems
>
> This article presents an overview of Memory Page Deallocation, a new
> feature available with MPE/iX Release 5.0. It does not cover detailed
> operation.
>
> When an HP 3000 is upgraded to MPE/iX 5.0, it also benefits from the MPD
> software. Most of the MPD operations described below operate in a
> similar manner. Please refer to the last section of this article for a
> discussion of the minor exceptions to MPD operation.
>
> Memory Failures. Memory boards are subject to two types of failures, hard
> errors and soft errors. Hard errors are caused by a single chip failure
> within a memory board, causing failures on all words associated with that
> chip. Soft errors occur when a bit within a word changes value. This is
> typically caused by decaying alpha particles from the surrounding casing
> material on the chip.
>
> HP's current memory design is single-bit correct, double-bit detect. It
> is important to note that our ECC design does not perform error
> correction on the memory cell itself, but fixes the value in the cache
> line. The memory cell still contains the failure. If this is a soft
> failure, the data in memory is corrected when the cache line is written
> back to memory. If this is a hard failure, the memory cell is always in
> error.
>
> In either case, if another failure were to occur on the same word, it
> would go from single-bit correct to double-bit detect and cause the
> system to fail the next time the word is read. The purpose of page
> deallocation is to permanently remove those pages from memory that
> contain single or double bit errors.
>
> Components of MPD
>
> MPD provides a mechanism where memory pages containing errors can be made
> unavailable for system use. A memory page is 4k bytes in size and is
> deallocated if it contains one of the following errors:
>
> * Solid single-bit error
>
> * A soft failure re-occurring within a 24-hour period
>
> * A double-bit error
>
> Numerous system components work together to implement memory page
> deallocation:
>
> Page Deallocation Table (PDT). This is a table that contains an entry for
> each memory page that has been deallocated, at some point in time, due to
> an error. Each entry contains the address and the nature of the error
> (single or double-bit).
>
> One important feature of this table is that it is implemented in
> Non-Volatile RAM, thus preserving deallocated pages between system boots.
>
> NOTE Older systems do not implement the PDT.
>
> Memory Selftest. Each time the system is reset, the memory selftest
> executes. If it finds a double-bit error, the address is entered into the
> PDT along with the fact that this was a double-bit error.
>
> MEMLOGP. The Memory Logging Process, MEMLOGP, is a process that
> periodically (every hour by default) checks the status of each memory
> controller on the system for occurrences of single-bit errors.
>
> MEMDIAG/LOGTOOL. Information about deallocated pages is kept in two
> places, the PDT, which is NVRAM based, and the MEMLOGP memory log file,
> which is disk based.
>
> MEMDIAG and LOGTOOL can be used to display the contents of the memory
> logfile. Information such as memory board slot number, physical address,
> page number and error type is displayed. The size of the PDT and number
> of entries currently in the table are also displayed.
>
> O/S Memory Manager. The O/S memory manager is involved during two phases,
> system boot and while the system is running.
>
> During the early portion of boot, the memory manager reads the PDT and
> deallocates any pages found there.
>
> Once the system is up, the memory manager provides services to MEMLOGP to
> allow pages to be deallocated online.
>
> Predictive. HP Predictive Support analyzes internal error logs on disk
> drives, system log files and memory logs for error trends. When an error
> rate exceeds its threshold, an EVENT is generated. HP Response Center
> Engineers and Customer Engineers analyze event information and take
> appropriate action to solve the problem.
>
> MEMSCAN is a software module within Predictive which scans system memory
> log files. MEMSCAN provides page deallocation trending information to
> support engineers such as PDT table size status and identification of
> boards or banks that have a significant number of pages deallocated.
> Bank deallocation or board replacement recommendations occur if the total
> number of deallocated pages exceeds a certain threshold.
>
> GENERAL OPERATION
>
> MPD comes into effect while the system is being started as well as when it
> is online.
>
> During system startup, memory is tested and any pages with bad locations
> are made unavailable to the system.
>
> While the system is online, an attempt is made to correct memory locations
> containing soft errors (scrubbing) and to deallocate, online, pages that
> contain solid errors.
>
> System Startup. The following shows the general system startup flow that
> occurs with respect to MPD.
>
> 1. Memory selftest executes. If any double-bit errors are discovered
> during testing, and there is not an entry in the PDT corresponding
> to this address, an entry is made.
>
> 2. During the boot process, the Operating System obtains the contents
> of the PDT. Each page in the PDT is made unavailable for
> allocation by the system's memory manager.
>
> 3. MEMLOGP reads the PDT and adds any new PDT entries (discovered by
> selftest) which are not contained in the memory logfile.
>
> Online Operation. The following shows the operation of MPD while the
> system is online.
>
> 1. MEMLOGP wakes up and reads the memory controller status register
> and determines whether a single-bit error has been logged.
>
> 2. MEMLOGP requests the O/S memory manager to release the page for
> testing.
>
> 3. If the O/S cannot release the page, MEMLOGP logs the error in the
> memory log file as it does today.
>
> 4. If the O/S does release the page, MEMLOGP performs a scrubbing
> operation (write/read test) on the page.
>
> 5. If the single-bit error is reproduced (hard error), the page is
> entered into the PDT and memory log file. A request is made to
> the O/S memory manager to make this page unavailable for system
> use.
>
> 6. If the single-bit error is not reproduced (soft error) and another
> soft error WAS DETECTED at this location within 24 hours, the page
> is entered into the PDT and memory log file. A request is made to
> the O/S memory manager to make this page unavailable for system
> use.
>
> MPD and Current Systems
>
> The one exception to MPD operation is that older systems were not
> designed with a Page Deallocation Table. Because of this, the system
> startup routine is slightly different. During system startup if the
> memory selftest detects a double-bit error, the system does not boot
> (same operation as today), unlike the 3000 991/995. But, while the
> system was running, MEMLOGP was keeping track of deallocated pages in its
> disk-based memory log file. During startup, these pages are deallocated
> before the system comes up.
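The online-operation steps in the article (3 through 6) boil down to a small decision table. Here is a sketch of that policy in shell, only to make the logic easier to follow; the function name, argument order and output words are all hypothetical, and this models the description in the article, not the actual MEMLOGP code. The article does not say what happens to a one-off soft error, so the last branch is an assumption.

```shell
#!/bin/sh
# Model of the MPD online policy described in steps 3 to 6 of the article.
# mpd_decide is a hypothetical name; nothing here calls real MPE/iX services.
#   $1 released    yes/no: did the O/S release the page for testing?
#   $2 reproduced  yes/no: did the write/read scrub reproduce the error?
#   $3 now         current time, in seconds
#   $4 last_soft   time of the previous soft error at this location, or empty
mpd_decide() {
    released=$1
    reproduced=$2
    now=$3
    last_soft=$4
    window=$((24 * 60 * 60))   # the 24-hour window from step 6

    if [ "$released" != yes ]; then
        echo "log"             # step 3: page not released, only log the error
    elif [ "$reproduced" = yes ]; then
        echo "deallocate"      # step 5: hard error
    elif [ -n "$last_soft" ] && [ $((now - last_soft)) -le "$window" ]; then
        echo "deallocate"      # step 6: second soft error within 24 hours
    else
        echo "keep"            # assumption: isolated soft error, page stays in use
    fi
}
```

For example, `mpd_decide yes no 100000 90000` prints `deallocate`, because the two soft errors are 10000 seconds apart, well inside the 24-hour window.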
>
>
>
>
>
>
>
> Thanks,
>
> Gary Robillard
> ----- Original Message -----
> From: "Raymond D Legault" <[log in to unmask]>
> To: [log in to unmask]
> Sent: Tuesday, February 23, 2010 6:43:23 AM GMT -07:00 US/Canada Mountain
> Subject: Re: [HP3000-L] How to clear Memory Log
>
> HP3000 A/N-Class - How to clear entries in PDT table?
>
> DocId: MPEKBRC00017083 Updated: 7/20/05 4:01:00 AM
>
> PROBLEM
>
> What is the procedure to clear the Page Deallocation Table (PDT) on the
> HP3000 A-Class and N-Class series servers?
>
> For a brief summary of Page Deallocation Table (PDT), refer to the document
> ID TCKBCA00000264 (Enabling/Disabling/Verifying Page Deallocation Table).
>
>
> CONFIGURATION
>
> A-Class N-Class
>
> RESOLUTION
>
> Shut system down.
> Restart system to the Boot Menu.
> At the Main Menu prompt, enter service.
> At the Service Menu prompt, enter pdt to display the PDT entries.
> At the Service Menu prompt, enter pdt clear to clear entries;
> the following is displayed:
>
>   Execution of this command will clear the Page Deallocation Table and then
>   hard boot the system (memory will be reconfigured on boot) Continue? (Y/N) >
>
> Enter y; the following is displayed:
>
>   Resetting ... .. ..
>   ********** VIRTUAL FRONT PANEL **********
>   SYSTEM BOOT DETECTED
>   LEDs:  RUN    ATTENTION  FAULT  REMOTE  POWER
>          FLASH  OFF        OFF    ON      ON
>   LED state: Running non-OS code (i.e. BOOT OR DIAGNOSTICS).
>
> Next, the Main Menu is displayed. From now on, restart the system like you
> normally do.
>
>
> Ray
>
> -----Original Message-----
> From: Robert Mpe [mailto:[log in to unmask]]
> Sent: Monday, February 22, 2010 2:22 PM
> Subject: How to clear Memory Log
>
> How to clear Memory Log
>
> Friends,
>
> I have 928LX on 6.5 PP3 with all patches applied.
>
> We had a memory error 2 weeks ago:
>
> Memory Controller in Slot 3A
> ==========================================================
> Slot: 3A
> Error Type: Single/hard: solid, repeatable single-bit error.
> Page Status: Deallocated: page is no longer in use.
> Bit Num / Bank: 29 / 0
> Logged By: Memlogd
> First Detected: Sat Feb 6 14:08:18 2010
> Last Detected: Sat Feb 6 14:10:18 2010
> Error Count: 2
> Error Addr: 0x4cd1068
> ==========================================================
>
> I have had other memory errors where the "Page Status" was "Active".
> For those I can get into CSTM, run logtool and use the 'CL' command to
> clear the log.
> But with the above Deallocated status, the ClearLog command does not work.
>
> I run a home-made STM diagnostic job every day to check the hardware
> status and I am getting tired of looking at this entry.
>
> The system has been rebooted twice since the memory error.
>
> Any idea how to clear this memory log?
>
> Thanks in Advance,
>
> ~Robert
>
> * To join/leave the list, search archives, change list settings, *
> * etc., please visit http://raven.utc.edu/archives/hp3000-l.html *