Issue: High paging on AIX oracle DB server
Background/Observation:
i. AIX server with version 6.1 TL 7 (6100-07-06-1241)
running standalone Oracle DB 10.2.0.5
ii. Huge paging memory utilization found to be
caused by db processes mainly
#lsps -s
Total Paging Space Percent Used
30720MB 40%
#ps -ealf | head -1 ; ps -ealf | sort -rn +9 | head
F S UID
PID PPID C PRI NI ADDR SZ
WCHAN STIME TTY
TIME CMD
240001 A ora10g 3736258
1 0 60 20 f60b76590 152524 Jul 31 -
4:40 ora_dbw0_ORADB
240001 A ora10g 2883690
1 0 60 20 5c0bdc590 143288 Jul 31 -
4:29 ora_dbw1_ORADB
240001 A ora10g 1835488
1 0 60 20 ad0c2d590 125636 Jul 31 -
8:33 ora_lgwr_ORADB
240001 A ora10g 13041936 1
0 60 20 d95259590 118304 Jul 31 - 203:28 oracleORADB (LOCAL=NO)
240001 A ora10g 5046804 1 0 60 20 4321c3590 118292 Jul 31 - 176:28 oracleORADB (LOCAL=NO)
240001 A ora10g 9044626 1 5 62 20 1e5c9e590 118220 Jul 31
- 160:27 oracleORADB (LOCAL=NO)
240001 A ora10g 2228980 1 0 60 20 5513d5590 117520 Jul 31 - 176:38 oracleORADB (LOCAL=NO)
240001 A ora10g 11534512 1
0 60 20 5e7cde590 109812 Jul 31
- 185:14 oracleORADB (LOCAL=NO)
240001 A ora10g 6619604 1 0 60 20 6e0bee590 109796 Jul 31 -
2:12 ora_arc1_ORADB
240001 A ora10g 3801228 1 0 60 20 af0a2f590 109796 Jul 31
- 2:17 ora_arc0_ORADB
iii. Paging found to be getting reduced by few % on killing of old foreground oracle connection processes (LOCAL=NO)
iv. On clean
DB restart paging memory utilization getting wiped out till 4%
v. Post DB
restart with time lag of 1 week max. paging builds up again gradually
vi. Post 50%
of paging utilization application starts facing slowness
Analysis:
This phase
will help to understand aix paging & virtual memory manager concepts which are
backbone for fixing the paging issue. Later in this section optimal aix memory
relevant kernal parameter setting are explained
Understanding
AIX paging & caching
Swap/paging
allocation is controlled by the OS. Oracle is only requesting virtual memory to
the OS which gives whatever it can either from real memory or from
virtual/swaps pages.
AIX uses
caching method for OS memory allocation wherein once real free memory is
allocated it is being cached for re-use by other processes than releasing back
as free real memory. Release of memory pages to real memory happens only on
clean start/stop of database or server. Hence any graph or command would show
real free memory as 1-2 % but actual available memory for processes could be
more and calculated as below,
Actual
Physical RAM in server:
#prtconf
| egrep -i "Good Memory"
Good Memory Size: 57344 MB
Vmstat
command to get current memory utilization snap:
#vmstat 1 1
System configuration: lcpu=8 mem=57344MB
kthr
memory page faults cpu
-----
----------- ------------------------
------------ -----------
r
b avm fre re
pi po fr
sr cy in
sy cs us sy id wa
4 2 12629663 57902 0 1 0
12181 12992 0 2597 127129 6356 81 9
7 3
Here,
fre
shows real free memory 57902 , 57902/256
(page size 64*4) = 226.1 MB (As said earlier it is just 1% free which will
be reflected in graphs giving misinterpretation unless used NMON)
avm shows average memory used is 12629663
, 12629663/256 = 49334 MB = 48 GB
So, here
actual free memory including cache is
57344 (RAM) – 49334 (avm) = 8010 MB = 7.8 GB
Most of the
monitoring tools are unable to capture exact free memory on AIX os due to
caching method which end up showing 100% memory utilized adding confusion
AIX Virtual
Memory Manager (VMM)
The default values for the AIX VMM
are generally not appropriate for use with databases. The default values for
the AIX VMM will gradually allow up to 80% of physical memory to be used to
buffer file I/O. Since Oracle is already buffering file I/O in the SGA, the
same data is unnecessarily being buffered twice, and leaves only 20% of
physical memory to run the Oracle database(s) and all other programs. This
causes the majority of the Oracle database to be pushed out of physical memory
to paging space; thus greatly impacting database performance.
To check whether your system is
using the untuned default values for the AIX VMM, run the command:
/usr/sbin/vmo –a
/usr/sbin/vmo –a -F
If you do not have the /usr/sbin/vmo
file, you will need to have your AIX systems administrator load the AIX fileset
"bos.perf.tune".
The vmo command will list out all of
the VMM parameters and their current values. The parameters you want to examine
are the following:
MINPERM%, MAXPERM%, and MAXCLIENT%
Here is an example of the vmo
report:
#sudo vmo -a -F | egrep -i "PERM|MAXCLIENT"
maxperm =
12805458
minperm =
711413
minperm% = 5
maxclient% =
90
maxperm% =
90
numperm_global = 0
strict_maxclient = 1
strict_maxperm = 0
The untuned default settings are
MINPERM%=20%, MAXPERM%=80%, and MAXCLIENT%=80%. There is no "correct"
value for these parameters and only extensive testing will reveal the optimal
values. The optimal value may be different for different databases on the same
system, so keep this in mind when tuning the VMM and choose values which work
best for all of the databases you will be running on the system & do take
your AIX sys admin help.
Use the following values as a
starting point:
MINPERM% = 10-15%, MAXPERM% =
20-30%, MAXCLIENT% = MAXPERM%
To get a snapshot of how much
physical memory is being used by AIX to buffer file I/O, run the command:
/usr/bin/svmon -G
The svmon command is part of the
same AIX fileset "bos.perf.tune" that vmo belongs to. The last line
of the svmon output should be "in use". Add the values for "in
use / pers" and "in use /
clnt". Now divide the sum by the value for "memory / size". For best
database performance, this value should generally not be higher than 30%
(0.30).
Here is an example of the svmon
output:
#sudo svmon -G
size
inuse free pin
virtual
memory 131072
129432 1640 11704 50091
pg space 7864320 278498
pg space 7864320 278498
work pers clnt other
pin 11704 0 0 0
in use 47062 76126 6244 0
in use 47062 76126 6244 0
In this example, (in use / pers) 76126 plus (in use / clnt) 6244
equals 82370.
82370 divided by (memory / size) 131072 equals 0.628 or approximately 63% of physical memory being used by
AIX to buffer file I/O. This indicates the AIX VMM needs to be tuned to allow
more physical memory to be used by Oracle and other processes and less physical
memory to be used to buffer file I/O.
Remember that although AIX
associates this memory with the Oracle processes (because Oracle requested the
file I/O), all of the memory used to buffer file I/O is completely allocated
and controlled by AIX, not Oracle. If you need help checking, setting, or
tuning the AIX VMM, contact your AIX systems administrator and/or IBM Support.
IBM has introduced a new VMM
parameter which is also very helpful with this issue. The parameter is...
lru_file_repage
#sudo vmo -a -F | grep -i lru_file
lru_file_repage = 1
The default value is "1",
but it is recommended to set this to "0".This
setting hints to the VMM to only steal
file pages (from the AIX file buffer cache) and leave the computational
pages (from the SGA) alone.
This new lru_file_repage parameter
is only available on AIX 5.2 ML04 or higher and AIX 5.3 ML01 or higher.
NOTE: If you
are using an older AIX system which does not support the lru_file_repage
parameter, then you must use the "legacy" settings above. If you are
using a newer AIX system which does support the lru_file_repage parameter, then
you should use the revised settings below instead.
As documented in the IBM "VMM Tuning Tip" referenced below, if you are running on AIX 5.2 ML04 or higher, or AIX 5.3 ML01 or higher, IBM is now recommending the following VMM settings for use with programs which need to protect computational memory (like Oracle):
strict_maxperm=0 (default)
strict_maxclient=1 (default)
lru_file_repage=0
maxperm%=90
minperm%=5 (physical RAM <32 GB)
minperm%=10 (physical RAM >32 GB but <64 GB)
minperm%=20 (physical RAM >64 GB)
v_pinshm=1
maxpin%=percent_of_real_memory
strict_maxclient=1 (default)
lru_file_repage=0
maxperm%=90
minperm%=5 (physical RAM <32 GB)
minperm%=10 (physical RAM >32 GB but <64 GB)
minperm%=20 (physical RAM >64 GB)
v_pinshm=1
maxpin%=percent_of_real_memory
(IBM does not recommend decreasing
the maxpin% value)
Where "percent_of_real_memory" = ( (size of SGA / size of physical memory) *100) + 3
and
Set Oracle database parameter LOCK_SGA to TRUE in the pfile/spfile
and
Sum of all SGAs on the system do not exceed approximately 60% of physical memory
Where "percent_of_real_memory" = ( (size of SGA / size of physical memory) *100) + 3
and
Set Oracle database parameter LOCK_SGA to TRUE in the pfile/spfile
and
Sum of all SGAs on the system do not exceed approximately 60% of physical memory
References:
Excessive
Paging on IBM AIX POWER Systems (64-bit) After OS Upgrade (Doc ID 1503072.1)
AIX:
Database performance gets slower the longer the database is running (Doc ID
316533.1)
Solution:
Kernal parameter Settings:
numperm_global = 0
lru_file_repage = 0
minperm%=5
(physical RAM <32 GB)
minperm%=10
(physical RAM >32 GB but <64 GB)
minperm%=20 (physical RAM >64 GB)
minperm%=20 (physical RAM >64 GB)
minperm% = 5
maxperm% = 90
#sudo vmo -a -F | grep maxclient
maxclient% = 90
Solution 1, 2 & 3 are most effective.
No comments:
Post a Comment