Configure hugepages when using large SGA size:
Type: PROBLEM
Status: PUBLISHED
Last Major Update: 13-May-2014
Last Update: 22-Mar-2018
APPLIES TO:
Oracle Database - Enterprise Edition - Version 9.2.0.8 to 11.2.0.4 [Release 9.2 to 11.2]
Linux x86-64
SYMPTOMS
On a 9.2.0.8, 4-node RAC cluster on Linux x86-64 (RHEL 4.0), a node frequently becomes unresponsive and an instance is evicted with ORA-29740.
On 10g/11g, the symptom is frequent node eviction (node reboot) with instance termination and error ORA-29702.
CHANGES
CAUSE
For example, the ps output on node 1 shows that during the first 43 minutes of the 02:00 hour, kswapd0 used only 21 seconds of CPU. In the 4 minutes between 02:43 and 02:47, however, kswapd0 used 40 seconds of CPU, and between 02:52 and 02:59 it used another 33 seconds.
Similarly, on node 4, kswapd0 used only 16 seconds of CPU during the first 47 minutes of the 02:00 hour, but 1 minute 21 seconds during the 8 minutes between 02:47 and 02:55.
This indicates that kswapd0 is working hard during the CPU spikes. This can happen when a large number of memory pages must be maintained and hugepages is not configured.
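To make the contrast easier to see, the ps figures quoted above can be converted into an average share of one CPU per interval. The following is only an illustrative sketch: the interval boundaries are read off the text, and the `samples` list and `cpu_percent` helper are names invented for this example.

```python
# Each tuple: (node, interval, wall-clock minutes, kswapd0 CPU seconds),
# taken from the ps figures quoted in the text.
samples = [
    ("node 1", "02:00-02:43", 43, 21),
    ("node 1", "02:43-02:47", 4, 40),
    ("node 1", "02:52-02:59", 7, 33),
    ("node 4", "02:00-02:47", 47, 16),
    ("node 4", "02:47-02:55", 8, 81),  # 1 min 21 sec
]


def cpu_percent(minutes, cpu_seconds):
    """Average share of one CPU used over the interval, in percent."""
    return 100.0 * cpu_seconds / (minutes * 60)


for node, interval, minutes, cpu_seconds in samples:
    share = cpu_percent(minutes, cpu_seconds)
    print(f"{node} {interval}: {share:5.1f}% of one CPU")
```

Under these assumptions, kswapd0 averages below 1% of one CPU in the quiet periods and roughly 8% to 17% during the spike intervals.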
The server has 32GB of physical memory and the maximum SGA is 12GB, but hugepages is not used, so all memory is managed in 4k pages. The spinning in kernel mode caused by managing these memory pages can cause a CPU spike, which in turn can cause an instance eviction (or node eviction) when the heartbeat ping goes unanswered for lack of CPU.
https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=560618461853827&id=461662.1&_adf.ctrl-state=uaimt4tjs_441
6/1/2018 Document 461662.1
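The difference in page counts between the two page sizes can be worked out directly. This is a rough sketch that assumes "12GB" means 12 GiB and uses the 2048 kB Hugepagesize reported in /proc/meminfo; the variable names are illustrative only.

```python
KIB = 1024
GIB = 1024 ** 3

sga_bytes = 12 * GIB    # maximum SGA from the text (assumed to mean GiB)
small_page = 4 * KIB    # default 4k page size
huge_page = 2048 * KIB  # Hugepagesize: 2048 kB

# Number of pages needed to map the SGA with each page size.
small_pages = sga_bytes // small_page
huge_pages = -(-sga_bytes // huge_page)  # ceiling division

print(f"4k pages needed:   {small_pages}")
print(f"2 MB pages needed: {huge_pages}")
print(f"reduction factor:  {small_pages // huge_pages}")
```

With these inputs the SGA spans 3145728 pages of 4k but only 6144 hugepages, a factor of 512 fewer pages to manage. The value of 6146 used in this note is slightly above that minimum.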
SOLUTION
If the expected value of 6146 does not appear, the system will have to be rebooted, because there is either not enough memory or not enough physically contiguous free pages for the allocation. If 6146 appears, then we are done from an O/S standpoint.
4. After hugepages setup and instance restart, check the output of "cat /proc/meminfo". It should show a large number of hugepages being consumed, for example:
HugePages_Total: 6146
HugePages_Free: 3120 << this number decreases as more memory is consumed
Hugepagesize: 2048 kB
If instead the output looks like the following:
HugePages_Total: 6146
HugePages_Free: 6140 << this number does not decrease
Hugepagesize: 2048 kB
then it is likely that the hugepages setting is insufficient. Increase vm.nr_hugepages until the hugepages are consumed after the instance restart. Refer to the following note to obtain the recommended value:
NOTE:401749.1 Shell Script to Calculate Values Recommended Linux HugePages / HugeTLB Configuration
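The check in step 4 can also be scripted. The sketch below is not part of the Oracle note: `parse_hugepages`, `hugepages_in_use` and `read_meminfo` are hypothetical helper names, and the two sample texts simply repeat the example outputs shown above. It parses the HugePages fields and reports how many pages are in use.

```python
def parse_hugepages(meminfo_text):
    """Extract the hugepage fields from /proc/meminfo-style text as integers."""
    wanted = ("HugePages_Total", "HugePages_Free", "Hugepagesize")
    values = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key.strip() in wanted:
            # First token after the colon is the number; trailing units
            # or annotations are ignored.
            values[key.strip()] = int(rest.split()[0])
    return values


def hugepages_in_use(values):
    """Number of hugepages currently consumed (total minus free)."""
    return values["HugePages_Total"] - values["HugePages_Free"]


def read_meminfo(path="/proc/meminfo"):
    """Read the live file on a Linux host (not called in this example)."""
    with open(path) as f:
        return f.read()


CONSUMED = """HugePages_Total: 6146
HugePages_Free: 3120
Hugepagesize: 2048 kB"""

NOT_CONSUMED = """HugePages_Total: 6146
HugePages_Free: 6140
Hugepagesize: 2048 kB"""

for label, text in (("consumed", CONSUMED), ("not consumed", NOT_CONSUMED)):
    values = parse_hugepages(text)
    used = hugepages_in_use(values)
    print(f"{label}: {used} of {values['HugePages_Total']} hugepages in use")
```

On a live system, `parse_hugepages(read_meminfo())` would be run after the instance restart. A count in use that stays near zero corresponds to the second case above.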
Note: In a RHEL 3.0 environment, if a similar issue is experienced and the process consuming the most CPU is kscand0, consider setting kscand_work_percent=10 (default 100) in addition to configuring hugepages.
REFERENCES
NOTE:401749.1 - Shell Script to Calculate Values Recommended Linux HugePages / HugeTLB Configuration
Keywords
EVICTION; HEARTBEAT; HIGH CPU USAGE; HUGEPAGES; HUGETLB; INSTANCE EVICTION; NODE EVICTION
Errors
ORA-29702; ORA-29740
Copyright (c) 2018, Oracle. All rights reserved.