TITLE: HP Tru64 UNIX - v 5.1B-3 Multiple Virtual Memory Fixes
Copyright (c) Hewlett-Packard Company 2006. All rights reserved.
PRODUCT: HP Tru64 UNIX [R] V5.1B-3
SOURCE: Hewlett-Packard Company
ECO INFORMATION:
ECO Name: T64KIT1000910-V51BB26-E-20060928
ECO Kit Approximate Size: 4.39MB
Kit Applies To: HP Tru64 UNIX V5.1B-3 PK5 (BL26)
ECO Kit CHECKSUMS:
/usr/bin/sum results:
28933 4500
/usr/bin/cksum results:
2875295415 4608000
MD5 results:
02b397f6c90fde5305b2387f121b226c
SHA1 results:
f425f8fc206bcc844bda9cef055763e89948cc75
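As an illustration, the MD5 and SHA1 values above could be checked before
installing the kit. The following is a minimal sketch in Python, assuming the
kit has already been copied to a local file; the function names file_digests
and kit_matches are hypothetical and are not part of the kit or of dupatch.

```python
import hashlib

# Digests published in this notice for the ECO kit.
EXPECTED_MD5 = "02b397f6c90fde5305b2387f121b226c"
EXPECTED_SHA1 = "f425f8fc206bcc844bda9cef055763e89948cc75"


def file_digests(path, chunk_size=1 << 20):
    """Return (md5_hex, sha1_hex) for the file at path, read in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        # Read fixed-size chunks so a multi-megabyte kit is never held
        # in memory all at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def kit_matches(path):
    """True only if both digests equal the values published above."""
    return file_digests(path) == (EXPECTED_MD5, EXPECTED_SHA1)
```

Calling kit_matches() with the path of the downloaded kit returns False if
either digest differs, in which case the kit should be downloaded again
before it is installed.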
ECO KIT SUMMARY:
A dupatch-based, Early Release Patch kit exists for HP Tru64 UNIX V5.1B-3
that contains solutions to the following problem(s):
DESCRIPTION
This Early Release Patch (ERP) kit provides several virtual memory-related
fixes. Specifically, the ERP provides the following:
Enhancements to the vm_overflow tunable.
This tunable may be enabled to allow NUMA systems (GS80, GS160, GS320,
GS1280, ES47, ES80) to more easily allocate memory from other RADs
(Resource Affinity Domains). This is mostly a benefit for single
applications that use large amounts of memory. For other applications, the
added NUMA latency may actually degrade system performance.
To enable this behavior, set vm_overflow to 1. To disable it, set
vm_overflow to 0.
Changes to the way page migrations occur on NUMA systems to address poor
system performance due to excessive paging.
Corrections that address an incompatibility between cpus_in_rad and
gh_chunks/rad_gh_regions tunables that can result in the following boot
failures:
vm_mad_init[0]: unable to allocate vm_page array for region 0
vm_mad_init[0]: unable to allocate vm_page array for region 1
pmap_update_send: missing ack from cpu <n>
trap: invalid memory read access from kernel mode
Correction to a problem where a thread could be left waiting on the
original RAD when a page table page allocation requires an overflow to
another RAD. This problem presents itself as an increased elapsed time for
fork operations to complete.
Changes that increase the maximum value of cpus_in_rad to 64 and unhide
this tunable.
Correction for "ubc_wire: hash failed" panics on non-NUMA systems. The
following is a typical stack trace of this panic:
1 panic
2 ubc_wire
3 u_vp_oop_pagecontrol
4 u_anon_update_pmap
5 u_anon_fault_backed
6 u_anon_fault
7 u_anon_lockop
8 u_map_lockvas
9 plock
10 syscall
11 _Xsyscall
Corrections for the "not wired" panic with System V shared memory and
bigpages. This panic can occur if callers of the shmat syscall provide
addresses with different page alignments when attaching the same shared
memory region. A typical stack trace follows:
1 panic
2 pmap_lw_unwire_new
3 lw_unwire_new
4 vm_map_pageable
5 cfs_condio_issue_io
6 cfs_blkmap_directio
7 cfs_condio_rw
8 cfs_read
9 vn_read
10 rwuio
11 read
12 syscall
13 _Xsyscall
Corrections for a boot failure on systems with sparsely populated cpus.
Fixes for a Kernel Memory Fault panic in _OtsMove() called from various I/O
or filesystem routines. The following are typical stack traces:
Kernel Memory Fault
4 panic
5 trap
6 _XentMM
7 _OtsMove
8 bs_refpg_direct
9 fs_read_direct
10 fs_read
11 msfs_read
12 vn_pread
13 msfs_strategy
14 aio_rw
15 syscall
16 _Xsyscall
Kernel Memory Fault
0 stop_secondary_cpu
1 panic
2 event_timeout
3 printf
4 panic
5 trap
6 _XentMM
7 _OtsZero
8 cfs_condio_issue_io
9 cfs_blkmap_directio
10 cfs_condio_rw
11 cfs_read
12 vn_read
13 rwuio
14 read
15 syscall
16 _Xsyscall
Fixes for the vl_unwire panic when gh_chunks are in use.
Fixes the panic 'vm_pg_free: page wired' when bigpages are enabled.
Fixes a bug in the wiring code and lightweight wirings.
Fixes lock management issues within the UBC that can lead to "mcs_unlock:
current lock not found" and "mcs_lock: time limit exceeded" panics.
Fixes a "kernel memory fault" panic when the vm tunable 'anon_rss_enforce'
is set to the hard limit (2).
Fixes a panic in pmap_pagemove() when getblk() is invoked under certain
conditions.
Fixes a race condition in the ubc bigpage allocation routine.
The Patch Kit Installation Instructions and the Patch Summary and Release
Notes documents provide patch kit installation and removal instructions
and a summary of each patch. Please read these documents prior to
installing patches on your system.
The patches in this ERP kit will also be available in the next mainstream
patch kit - HP Tru64 UNIX V5.1B-4.
INSTALLATION NOTES:
1) Install this kit with the dupatch utility that is included in the patch
kit. You may need to baseline your system if you have manually changed
system files on your system. The dupatch utility provides the baselining
capability.
2) The patch in this ERP kit does not have any file intersections with any
other ERP available at this time for this product version.
3) This ERP kit will NOT install over any Customer Specific Patches (CSPs)
which have file intersections with this ERP kit. Contact your normal
Service Provider for assistance if the installation of this ERP kit is
blocked by any of your installed CSPs.
INSTALLATION PREREQUISITES:
You must have installed HP Tru64 UNIX V5.1B-3 PK5 (BL26) prior to
installing this Early Release Patch Kit.
SUPERSEDE INFORMATION:
None
KNOWN PROBLEMS WITH THE PATCH KIT:
None.
RELEASE NOTES FOR T64KIT1000910-V51BB26-E-20060928:
Release Notes
This document summarizes the contents and special instructions for the
Tru64 UNIX V5.1B patches contained in this kit.
For information about installing or removing patches, baselining,
and general patch management, see the Patch Kit Installation
Instructions document.
1 Release Notes
This Early Release Patch Kit Distribution contains:
- fixes that resolve the problem(s) reported in:
o 19082 19182 19283 19290 19399 19411 19415 19435 19491 19580
* for Tru64 UNIX V5.1B T64V51BB26AS0005-20050502.tar (BL26)
This kit includes a patch which requires system reboot.
The patches in this kit are being released early for general customer use.
Refer to the Release Notes for a summary of each patch and installation
prerequisites.
Patches in this kit are installed by running dupatch from the directory
in which the kit was untarred. For example, as root on the target system:
> mkdir -p /tmp/CSPkit1
> cd /tmp/CSPkit1
> <copy the kit to /tmp/CSPkit1>
> tar -xpvf DUV40D13-C0044900-1285-20000328.tar
> cd patch_kit
> ./dupatch
2 Special Instructions
SPECIAL INSTRUCTIONS for Tru64 UNIX V5.1B Patch C1884.00
In the V5.1B-3 release, performance enhancements for NUMA class systems have
been provided through two new tuning options.
The vm_overflow feature changes how large memory applications "borrow" memory
from other resource affinity domains (RADs) when their local memory resources
are exhausted. This can be used on all NUMA class systems.
For the ES47, ES80, and GS1280 systems the previously hidden cpus_in_rad
variable can now be set to more than two CPUs; this allows pooling together the
CPU, memory, and I/O resources of a set of CPUs and treating it as a single
resource affinity domain (RAD). Like vm_overflow, this can allow large memory
applications to have access to more memory that is managed as if it were all
physically "local".
generic: cpus_in_rad
With the default value of cpus_in_rad (zero) every cpu is in its own
resource affinity domain. This is equivalent to setting the value to 1.
When the value of cpus_in_rad is set larger than 1, certain configuration
restrictions must be considered:
1) Values for the cpus_in_rad tunable must be a power of two.
2) Use caution when setting cpus_in_rad to 64 on a 64 processor system. When
the number of RADs is decreased by increasing the value of cpus_in_rad,
the number of per-RAD locks needed to manage resources also decreases.
This may result in increased lock contention and may result in poor
performance or system panics. The system automatically adjusts the
generic: locktimeout tunable if cpus_in_rad is set to 64 on a 64
processor system, but depending on system load it may need to be
manually increased to avoid locktimeout panics. The maximum value
for locktimeout is 60 seconds. If the value needs to be increased, do
so in 5 second increments. If the maximum value is reached and the
system is unstable, reduce the value of cpus_in_rad and escalate the
problem through your support channels.
3) "Missing" cpus are included in the count of cpus in a rad.
Consider a system configured with cpus 0,1,4,5,8,9,12,13
Setting cpus_in_rad to 2 on this system would result in the following
resource affinity domain configuration:
RAD[0] - cpus 0, 1
RAD[2] - cpus 4, 5
RAD[4] - cpus 8, 9
RAD[6] - cpus 12, 13
Setting cpus_in_rad to 4 on this system would result in the following
resource affinity domain configuration:
RAD[0] - cpus 0, 1 (2 and 3 are missing)
RAD[1] - cpus 4, 5 (6 and 7 are missing)
RAD[2] - cpus 8, 9 (10 and 11 are missing)
RAD[3] - cpus 12, 13
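The restrictions and the RAD listings above can be sketched as a small model.
This is an illustration only, written in Python. It assumes that the RAD
index is the cpu number divided by cpus_in_rad (integer division), which
reproduces the two listings above but is not confirmed as the kernel's actual
method. The function names are hypothetical.

```python
def validate_cpus_in_rad(value):
    """Check the stated restrictions: 0 (default) or a power of two up to 64."""
    if value == 0:
        return True
    return 1 <= value <= 64 and (value & (value - 1)) == 0


def rad_layout(present_cpus, cpus_in_rad):
    """Group the cpus that are present into RADs, counting missing cpus.

    Assumption for illustration: RAD index = cpu number // cpus_in_rad.
    """
    size = cpus_in_rad or 1  # the default of 0 is equivalent to 1
    layout = {}
    for cpu in sorted(present_cpus):
        layout.setdefault(cpu // size, []).append(cpu)
    return layout


cpus = [0, 1, 4, 5, 8, 9, 12, 13]
print(rad_layout(cpus, 2))  # {0: [0, 1], 2: [4, 5], 4: [8, 9], 6: [12, 13]}
print(rad_layout(cpus, 4))  # {0: [0, 1], 1: [4, 5], 2: [8, 9], 3: [12, 13]}
```

Because missing cpus still occupy a position in the count, increasing
cpus_in_rad from 2 to 4 on this system renumbers the RADs without adding any
cpus to them.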
Interaction of cpus_in_rad and the rad_gh_regions tunables:
When cpus_in_rad is increased, the number of RADs configured decreases.
If the system is configured with settings for rad_gh_regions, those
settings must also be changed.
Consider a system configured with cpus 0,1,4,5,8,9,12,13 and rad_gh_regions
configured to allocate 4 Gigabytes of granularity hint memory. With the
default setting of cpus_in_rad (zero) or cpus_in_rad set to 1,
rad_gh_regions would have the following settings:
rad_gh_regions[0] = 512
rad_gh_regions[1] = 512
rad_gh_regions[4] = 512
rad_gh_regions[5] = 512
rad_gh_regions[8] = 512
rad_gh_regions[9] = 512
rad_gh_regions[12] = 512
rad_gh_regions[13] = 512
If the system is configured to place 2 cpus in a rad (cpus_in_rad=2),
the rad_gh_regions settings would need to be changed to the following:
rad_gh_regions[0] = 1024
rad_gh_regions[2] = 1024
rad_gh_regions[4] = 1024
rad_gh_regions[6] = 1024
Because of how missing cpus are handled, if cpus_in_rad is set to 4, the
RADs would still contain only 2 cpus (2 existing, 2 missing) but the
rad numbers change, so rad_gh_regions would have the following settings:
rad_gh_regions[0] = 1024
rad_gh_regions[1] = 1024
rad_gh_regions[2] = 1024
rad_gh_regions[3] = 1024
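The adjustment described above can be sketched as a short calculation. This
is an illustration in Python under stated assumptions: the total is split
evenly across the configured RADs (as in the examples above), the values use
the same units as those examples (4096 in total for 4 Gigabytes), and the RAD
index is the cpu number divided by cpus_in_rad. The function name
rad_gh_settings is hypothetical.

```python
def rad_gh_settings(present_cpus, cpus_in_rad, total_gh):
    """Return {rad_index: value} with total_gh split evenly across RADs.

    Assumptions for illustration: RAD index = cpu number // cpus_in_rad,
    and every configured RAD receives the same share.
    """
    size = cpus_in_rad or 1  # the default of 0 is equivalent to 1
    rads = sorted({cpu // size for cpu in present_cpus})
    per_rad, remainder = divmod(total_gh, len(rads))
    if remainder:
        raise ValueError("total does not divide evenly across RADs")
    return {rad: per_rad for rad in rads}


cpus = [0, 1, 4, 5, 8, 9, 12, 13]
for rad, value in rad_gh_settings(cpus, 2, 4096).items():
    print(f"rad_gh_regions[{rad}] = {value}")
# rad_gh_regions[0] = 1024
# rad_gh_regions[2] = 1024
# rad_gh_regions[4] = 1024
# rad_gh_regions[6] = 1024
```

The same call with cpus_in_rad set to 4 yields indexes 0 through 3 with 1024
each, matching the last listing above.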
vm: vm_overflow
When memory resources are depleted on a RAD in a NUMA system, the vm
subsystem will automatically overflow to another RAD to fulfill the
memory allocation request. The default overflow behavior is to:
1) Attempt an allocation from the "local" RAD.
2) If that fails, page out a page of memory and "steal" it.
3) If that fails, attempt an allocation from the "next" RAD.
4) If that fails, page out a page on that RAD and "steal" it.
This continues until allocation/stealing has been attempted on all
RADs.
Setting the vm_overflow tunable to 1 changes the order of page allocations
and page stealing:
1) Attempt an allocation from the "local" RAD.
2) If that fails, attempt an allocation from the "next" RAD.
This continues until allocation has been attempted on all RADs.
If the memory allocation is still not successful, the system reverts to
the original behavior stated above.
Using a setting of 1 may result in less paging activity for some
applications and improve performance.
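The two orderings described above can be compared with a small model. This is
a sketch in Python, not the kernel's implementation. It assumes that the
"next" RAD is the next higher-numbered RAD, wrapping around to 0, which this
notice does not specify. The function name allocation_attempts is
hypothetical.

```python
def allocation_attempts(local_rad, num_rads, vm_overflow):
    """Yield (action, rad) pairs in the order the text describes.

    Assumption for illustration: the "next" RAD is (rad + 1) % num_rads.
    """
    order = [(local_rad + i) % num_rads for i in range(num_rads)]
    if vm_overflow:
        # vm_overflow=1: try a plain allocation on every RAD first.
        for rad in order:
            yield ("allocate", rad)
    # Default behavior, also used as the fallback when vm_overflow=1:
    # allocate, then page out and steal, one RAD at a time.
    for rad in order:
        yield ("allocate", rad)
        yield ("steal", rad)


print(list(allocation_attempts(0, 2, 0)))
# [('allocate', 0), ('steal', 0), ('allocate', 1), ('steal', 1)]
print(list(allocation_attempts(0, 2, 1))[:2])
# [('allocate', 0), ('allocate', 1)]
```

In the model, the first page-out with vm_overflow set to 1 happens only after
every RAD has refused a plain allocation, which is why paging activity may
drop for applications whose memory needs exceed one RAD.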
3 Summary of CSPatches contained in this kit
Tru64 UNIX V5.1B
PatchId Summary Of Fix
----------------------------------------
C1884.00 Fixes to cpus_in_rad, gh_chunks and UBC
4 Additional information from Engineering
None
5 Affected system files
This patch delivers the following files:
Tru64 UNIX V5.1B
Patch C1884.00
./sys/BINARY/arch_alphapmap.mod
CHECKSUM: 52033 351
SUBSET: OSFHWBIN540
./sys/BINARY/generic.mod
CHECKSUM: 50039 12
SUBSET: OSFBIN540
./sys/BINARY/marvel_cpu.mod
CHECKSUM: 28870 147
SUBSET: OSFHWBIN540
./sys/BINARY/marvel_soc.mod
CHECKSUM: 10168 237
SUBSET: OSFHWBIN540
./sys/BINARY/vfs.mod
CHECKSUM: 51074 656
SUBSET: OSFBIN540
./sys/BINARY/vm.mod
CHECKSUM: 32272 674
SUBSET: OSFBIN540
./usr/sys/BINARY/alpha_init.o
CHECKSUM: 47020 142
SUBSET: OSFHWBIN540
./usr/sys/BINARY/pmap_init.o
CHECKSUM: 63788 145
SUBSET: OSFBIN540
[R] UNIX is a registered trademark in the United States and other countries
licensed exclusively through X/Open Company Limited.
Copyright Hewlett-Packard Company 2006. All Rights reserved.
This software is proprietary to and embodies the confidential technology
of Hewlett-Packard Company. Possession, use, or copying of this
software and media is authorized only pursuant to a valid written license
from Hewlett-Packard or an authorized sublicensor.
This ECO has not been through an exhaustive field test process.
Due to the experimental stage of this ECO/workaround, Hewlett-Packard
makes no representations regarding its use or performance. The
customer shall have the sole responsibility for adequate protection
and back-up data used in conjunction with this ECO/workaround.