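HugePages is a feature integrated into the Linux 2.6 kernel. Enabling HugePages lets the operating system support memory pages larger than the default 4 KB. Very large pages improve system performance by reducing the system resources needed to access page table entries. HugePages is available on both 32-bit and 64-bit systems. HugePage sizes range from 2 MB to 256 MB, depending on the kernel version and the hardware architecture. For Oracle Database, using HugePages reduces the operating system's work of maintaining page state and increases the Translation Lookaside Buffer (TLB) hit ratio.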
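1. Using HugePages to optimize the SGA
Without HugePages, the operating system keeps each memory page at 4 KB. When it allocates pages to the SGA, the kernel must continually update the lifecycle state (dirty, free, mapped to a process, and so on) of every 4 KB page allocated to the SGA.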
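With HugePages, the operating system page table (the virtual-to-physical memory mapping) is smaller, because each page table entry points to a page of 2 MB to 256 MB. The kernel also has fewer pages whose lifecycle it must track. For example, if you use HugePages on 64-bit hardware and want to map 256 MB of memory, you may need only one page table entry (PTE). Without HugePages, mapping 256 MB requires 256 × 1024 KB / 4 KB = 65536 PTEs.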
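The PTE arithmetic in the 256 MB example can be checked with a small shell helper. This is only an illustrative sketch: the function name pte_count is made up for this example, and it simply divides the mapping size by the page size, rounding up.

```shell
# pte_count MAPPING_KB PAGE_KB  (hypothetical helper, not part of any tool)
# Prints how many page table entries are needed to map MAPPING_KB of memory
# with pages of PAGE_KB each, rounding up for a partial last page.
pte_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

pte_count $((256 * 1024)) 4                # 4 KB pages        -> 65536
pte_count $((256 * 1024)) 2048             # 2 MB huge pages   -> 128
pte_count $((256 * 1024)) $((256 * 1024))  # 256 MB huge page  -> 1
```

With the 2 MB page size reported later on this test machine, the same 256 MB mapping needs 128 entries instead of 65536.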
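HugePages provides the following advantages:
Better performance through a higher TLB hit ratio
Pages are locked in memory and never swapped out, which guarantees RAM for shared memory structures such as the SGA
Contiguous pages are preallocated and cannot be used for anything other than system shared memory such as the SGA
Less kernel bookkeeping overhead for that part of virtual memory, because the pages are larger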
2. Configuring HugePages on Linux
Run the following commands to determine whether the kernel supports HugePages:
[root@jyrac1 ~]# uname -r
2.6.18-164.el5
[root@jyrac1 ~]# grep Huge /proc/meminfo
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
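The same check can be scripted. The sketch below assumes that a kernel with HugePages support reports a Hugepagesize field in /proc/meminfo, as in the output above. The function name hugepage_size_kb and its optional file argument are hypothetical and exist only for this example.

```shell
# hugepage_size_kb [MEMINFO_FILE]  (hypothetical helper)
# Prints the Hugepagesize value in kB, or returns non-zero when the field
# is absent. MEMINFO_FILE defaults to /proc/meminfo.
hugepage_size_kb() {
    hp_file=${1:-/proc/meminfo}
    hp_kb=$(awk '$1 == "Hugepagesize:" { print $2 }' "$hp_file" 2>/dev/null)
    [ -n "$hp_kb" ] || return 1
    echo "$hp_kb"
}

if hp_size=$(hugepage_size_kb); then
    echo "HugePages supported, page size ${hp_size} kB"
else
    echo "No Hugepagesize field found; the kernel may lack HugePages support"
fi
```

Passing a file name makes it possible to run the helper against a saved copy of /proc/meminfo from another host.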
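Some Linux systems do not support HugePages by default. For such systems, build the Linux kernel with the CONFIG_HUGETLBFS and CONFIG_HUGETLB_PAGE configuration options. CONFIG_HUGETLBFS is located under File systems, and CONFIG_HUGETLB_PAGE is selected together with CONFIG_HUGETLBFS.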
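Edit the /etc/security/limits.conf file to set memlock. The memlock value is specified in KB. With HugePages enabled, the maximum locked memory limit should be set to at least 90% of the installed RAM; with HugePages disabled, it should be at least 3145728 KB (3 GB). For example, on a machine with 2 GB of RAM, add the following entries to raise the maximum locked memory address space: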
[root@jyrac1 ~]# vi /etc/security/limits.conf
grid soft memlock 2097152
grid hard memlock 2097152
oracle soft memlock 2097152
oracle hard memlock 2097152
Alternatively, set memlock to a value larger than the SGA size.
Log in as the grid user and run ulimit -l to verify that the new memlock setting has taken effect:
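As a rough aid for choosing the value, the sketch below computes 90% of a given RAM size and prints the matching limits.conf lines. Both function names are hypothetical, and the RAM size is passed in explicitly rather than detected.

```shell
# memlock_min_kb RAM_KB  (hypothetical helper)
# Prints 90% of RAM_KB as an integer number of KB.
memlock_min_kb() {
    echo $(( $1 * 90 / 100 ))
}

# memlock_lines USER LIMIT_KB  (hypothetical helper)
# Prints the soft and hard memlock entries for one user in limits.conf syntax.
memlock_lines() {
    printf '%s soft memlock %s\n%s hard memlock %s\n' "$1" "$2" "$1" "$2"
}

memlock_min_kb 2097152          # 2 GB of RAM -> 1887436
memlock_lines oracle 2097152    # the value used in this walkthrough
```

The walkthrough uses the full 2097152 KB, which is above the 90% figure of 1887436 KB.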
[grid@jyrac1 ~]$ ulimit -l
2097152
Log in as the oracle user and run ulimit -l to verify that the new memlock setting has taken effect:
[oracle@jyrac1 ~]$ ulimit -l
2097152
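The ulimit check can also be compared against the SGA size. The following is a minimal sketch, assuming the limit is reported in KB or as the word unlimited. The function name memlock_covers is hypothetical, and the 640 MB figure (671088640 bytes) is the sga_max_size used later in this walkthrough.

```shell
# memlock_covers LIMIT SGA_BYTES  (hypothetical helper)
# LIMIT is the output of `ulimit -l`: a number of KB or the word "unlimited".
# Succeeds when the limit is large enough to lock SGA_BYTES in memory.
memlock_covers() {
    [ "$1" = "unlimited" ] && return 0
    [ $(( $1 * 1024 )) -ge "$2" ]
}

if memlock_covers "$(ulimit -l)" 671088640; then
    echo "memlock limit covers a 640 MB SGA"
else
    echo "memlock limit is too small for a 640 MB SGA"
fi
```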
Run the following command to display the Hugepagesize value:
[oracle@jyrac1 ~]$ grep Hugepagesize /proc/meminfo
Hugepagesize: 2048 kB
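Next, create a script that computes the recommended hugepages configuration for the current shared memory segments. Create a file named hugepages_settings.sh with the following content: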
[root@jyrac1 /]# vi hugepages_settings.sh
#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
# on Oracle Linux
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by Doc ID 401749.1 from My Oracle Support
# http://support.oracle.com

# Welcome text
echo "
This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments on Oracle Linux. Before proceeding with the execution please note following:
 * For ASM instance, it needs to configure ASMM instead of AMM.
 * The 'pga_aggregate_target' is outside the SGA and
   you should accommodate this while calculating SGA size.
 * In case you changes the DB SGA size,
   as the new SGA will not fit in the previous HugePages configuration,
   it had better disable the whole HugePages,
   start the DB with new SGA size and run the script again.
And make sure that:
 * Oracle Database instance(s) are up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not setup
   (See Doc ID 749851.1)
 * The shared memory segments can be listed by command:
     # ipcs -m

Press Enter to proceed..."

read

# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`

# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`
if [ -z "$HPG_SZ" ];then
    echo "The hugepages may not be supported in the system where the script is being executed."
    exit 1
fi

# Initialize the counter
NUM_PG=0

# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | cut -c44-300 | awk '{print $1}' | grep "[0-9][0-9]*"`
do
    MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
    if [ $MIN_PG -gt 0 ]; then
        NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
    fi
done

RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`

# An SGA less than 100MB does not make sense
# Bail out if that is the case
if [ $RES_BYTES -lt 100000000 ]; then
    echo "***********"
    echo "** ERROR **"
    echo "***********"
    echo "Sorry! There are not enough total of shared memory segments allocated for
HugePages configuration. HugePages can only be used for shared memory segments
that you can list by command:

    # ipcs -m

of a size that can match an Oracle Database SGA. Please make sure that:
 * Oracle Database instance is up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not configured"
    exit 1
fi

# Finish with results
case $KERN in
    '2.2') echo "Kernel version $KERN is not supported. Exiting." ;;
    '2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
           echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
    '2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    '3.8') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    '3.10') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    '4.1') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
esac

# End
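The core of the script is its page-count loop: each shared memory segment contributes floor(segment bytes / huge page bytes) + 1 pages, and segments smaller than one huge page are skipped. The sketch below isolates that arithmetic using shell integer math instead of bc. The function name hugepages_needed and the segment sizes passed to it are made up for illustration and are not taken from ipcs output.

```shell
# hugepages_needed PAGE_KB SEG_BYTES...  (hypothetical helper)
# Mirrors the loop in hugepages_settings.sh: for every segment of at least
# one huge page, add floor(bytes / page_bytes) + 1 to the running total.
hugepages_needed() {
    hn_page_bytes=$(( $1 * 1024 ))
    shift
    hn_total=0
    for hn_seg in "$@"; do
        hn_min=$(( hn_seg / hn_page_bytes ))
        # Segments smaller than one huge page are ignored, as in the script.
        if [ "$hn_min" -gt 0 ]; then
            hn_total=$(( hn_total + hn_min + 1 ))
        fi
    done
    echo "$hn_total"
}

# One 640 MB segment plus one 4 KB segment, with 2048 kB huge pages -> 321
hugepages_needed 2048 671088640 4096
```

The extra page per segment is headroom: a segment that is an exact multiple of the huge page size still gets one more page than strictly needed.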
Run the following command to make the hugepages_settings.sh script executable:
[root@jyrac1 /]# chmod +x hugepages_settings.sh
Run the hugepages_settings.sh script to compute the hugepages parameter value:
[root@jyrac1 /]# ./hugepages_settings.sh

This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments on Oracle Linux. Before proceeding with the execution please note following:
 * For ASM instance, it needs to configure ASMM instead of AMM.
 * The 'pga_aggregate_target' is outside the SGA and
   you should accommodate this while calculating SGA size.
 * In case you changes the DB SGA size,
   as the new SGA will not fit in the previous HugePages configuration,
   it had better disable the whole HugePages,
   start the DB with new SGA size and run the script again.
And make sure that:
 * Oracle Database instance(s) are up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not setup
   (See Doc ID 749851.1)
 * The shared memory segments can be listed by command:
     # ipcs -m

Press Enter to proceed...

***********
** ERROR **
***********
Sorry! There are not enough total of shared memory segments allocated for
HugePages configuration. HugePages can only be used for shared memory segments
that you can list by command:

    # ipcs -m

of a size that can match an Oracle Database SGA. Please make sure that:
 * Oracle Database instance is up and running
 * Oracle Database 11g Automatic Memory Management (AMM) is not configured
The error message asks you to confirm that the Oracle instances are running and, on Oracle 11g, that AMM is not in use.
[root@jyrac1 ~]# ps -ef | grep pmon
grid      4116     1  0 Apr18 ?        00:00:03 asm_pmon_+ASM1
oracle    4944     1  0 Apr18 ?        00:00:03 ora_pmon_jyrac1
root     18184 29273  0 15:15 pts/1    00:00:00 grep pmon
The output shows that the Oracle instances are running.
[grid@jyrac1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.4.0 Production on Wed Apr 20 15:20:23 2016

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - Production
With the Real Application Clusters and Automatic Storage Management options

SQL> set long 900
SQL> set linesize 900
SQL> show parameter instance_name

NAME                                 TYPE                   VALUE
------------------------------------ ---------------------- ------------------------------
instance_name                        string                 +ASM1
SQL> show parameter memory

NAME                                 TYPE                   VALUE
------------------------------------ ---------------------- ------------------------------
memory_max_target                    big integer            1076M
memory_target                        big integer            1076M

[oracle@jyrac1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Wed Apr 20 15:21:04 2016

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> set long 900
SQL> set linesize 900
SQL> show parameter instance_name

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
instance_name                        string      jyrac1
SQL> show parameter memory

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
hi_shared_memory_address             integer     0
memory_max_target                    big integer 2G
memory_target                        big integer 2G
shared_memory_address                integer     0
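Both the ASM instance and the database instance do have AMM enabled. AMM must be disabled, but ASMM can be used in its place. First modify the ASM instance to disable AMM and use ASMM. In a RAC environment, every node must be changed.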
SQL> alter system set sga_max_size=640M scope=spfile sid='*';
System altered.

SQL> alter system set sga_target=640M scope=spfile sid='*';
System altered.

SQL> alter system set pga_aggregate_target=320M scope=spfile sid='*';
System altered.

SQL> alter system set memory_target=0 scope=spfile sid='*';
System altered.
For the ASM instance, memory_target must be set to 0 rather than reset. If it is reset, startup fails with the following errors:
SQL> startup
ORA-01078: failure in processing system parameters
ORA-00843: Parameter not taking MEMORY_MAX_TARGET into account
ORA-00849: SGA_TARGET 671088640 cannot be set to more than MEMORY_MAX_TARGET 0.

SQL> alter system reset memory_max_target scope=spfile sid='*';
System altered.
Then modify the database instance to disable AMM and use ASMM. In a RAC environment, every node must be changed.
SQL> alter system set sga_max_size=640M scope=spfile sid='*';
System altered.

SQL> alter system set sga_target=640M scope=spfile sid='*';
System altered.

SQL> alter system set pga_aggregate_target=320M scope=spfile sid='*';
System altered.

SQL> alter system reset memory_max_target scope=spfile sid='*';
System altered.

SQL> alter system reset memory_target scope=spfile sid='*';
System altered.
Restart the ASM and database instances. In a RAC environment, restart them on every node. First stop the ASM and database instances:
[grid@jyrac1 ~]$ srvctl stop asm -n jyrac1 -f
[grid@jyrac1 ~]$ srvctl stop asm -n jyrac2 -f
[grid@jyrac1 ~]$ srvctl stop database -d jyrac
[grid@jyrac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRSDG.dg
               OFFLINE OFFLINE      jyrac1
               OFFLINE OFFLINE      jyrac2
ora.DATADG.dg
               OFFLINE OFFLINE      jyrac1
               OFFLINE OFFLINE      jyrac2
ora.LISTENER.lsnr
               ONLINE  ONLINE       jyrac1
               ONLINE  ONLINE       jyrac2
ora.asm
               OFFLINE OFFLINE      jyrac1                   Instance Shutdown
               OFFLINE OFFLINE      jyrac2                   Instance Shutdown
ora.gsd
               ONLINE  OFFLINE      jyrac1
               ONLINE  OFFLINE      jyrac2
ora.net1.network
               ONLINE  ONLINE       jyrac1
               ONLINE  ONLINE       jyrac2
ora.ons
               ONLINE  ONLINE       jyrac1
               ONLINE  ONLINE       jyrac2
ora.registry.acfs
               OFFLINE OFFLINE      jyrac1
               OFFLINE OFFLINE      jyrac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       jyrac2
ora.cvu
      1        ONLINE  ONLINE       jyrac2
ora.jyrac.db
      1        OFFLINE OFFLINE                               Instance Shutdown
      2        OFFLINE OFFLINE                               Instance Shutdown
ora.jyrac1.vip
      1        ONLINE  ONLINE       jyrac1
ora.jyrac2.vip
      1        ONLINE  ONLINE       jyrac2
ora.oc4j
      1        ONLINE  ONLINE       jyrac2
ora.scan1.vip
      1        ONLINE  ONLINE       jyrac2
Start the ASM and database instances:
[grid@jyrac1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.4.0 Production on Wed Apr 20 17:48:32 2016

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to an idle instance.

SQL> startup
ASM instance started

Total System Global Area  669581312 bytes
Fixed Size                  1366724 bytes
Variable Size             643048764 bytes
ASM Cache                  25165824 bytes
ASM diskgroups mounted
ASM diskgroups volume enabled
SQL> show parameter instance_name

NAME                                 TYPE                   VALUE
------------------------------------ ---------------------- ------------------------------
instance_name                        string                 +ASM2
SQL> show parameter memory

NAME                                 TYPE                   VALUE
------------------------------------ ---------------------- ------------------------------
memory_max_target                    big integer            0
memory_target                        big integer            0
SQL> show parameter sga

NAME                                 TYPE                   VALUE
------------------------------------ ---------------------- ------------------------------
lock_sga                             boolean                FALSE
sga_max_size                         big integer            640M
sga_target                           big integer            640M

[grid@jyrac2 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.4.0 Production on Wed Apr 20 17:48:32 2016

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to an idle instance.

SQL> startup
ASM instance started

Total System Global Area  669581312 bytes
Fixed Size                  1366724 bytes
Variable Size             643048764 bytes
ASM Cache                  25165824 bytes
ASM diskgroups mounted
ASM diskgroups volume enabled
SQL> show parameter instance_name

NAME                                 TYPE                   VALUE
------------------------------------ ---------------------- ------------------------------
instance_name                        string                 +ASM2
SQL> show parameter memory

NAME                                 TYPE                   VALUE
------------------------------------ ---------------------- ------------------------------
memory_max_target                    big integer            0
memory_target                        big integer            0
SQL> show parameter sga

NAME                                 TYPE                   VALUE
------------------------------------ ---------------------- ------------------------------
lock_sga                             boolean                FALSE
sga_max_size                         big integer            640M
sga_target                           big integer            640M

[grid@jyrac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRSDG.dg
               ONLINE  ONLINE       jyrac1
               ONLINE  ONLINE       jyrac2
ora.DATADG.dg
               ONLINE  ONLINE       jyrac1
               ONLINE  ONLINE       jyrac2
ora.LISTENER.lsnr
               ONLINE  ONLINE       jyrac1
               ONLINE  ONLINE       jyrac2
ora.asm
               ONLINE  ONLINE       jyrac1                   Started
               ONLINE  ONLINE       jyrac2                   Started
ora.gsd
               ONLINE  OFFLINE      jyrac1
               ONLINE  OFFLINE      jyrac2
ora.net1.network
               ONLINE  ONLINE       jyrac1
               ONLINE  ONLINE       jyrac2
ora.ons
               ONLINE  ONLINE       jyrac1
               ONLINE  ONLINE       jyrac2
ora.registry.acfs
               ONLINE  ONLINE       jyrac1
               ONLINE  ONLINE       jyrac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       jyrac1
ora.cvu
      1        ONLINE  ONLINE       jyrac1
ora.jyrac.db
      1        OFFLINE OFFLINE                               Instance Shutdown
      2        OFFLINE OFFLINE                               Instance Shutdown
ora.jyrac1.vip
      1        ONLINE  ONLINE       jyrac1
ora.jyrac2.vip
      1        ONLINE  ONLINE       jyrac2
ora.oc4j
      1        ONLINE  ONLINE       jyrac1
ora.scan1.vip
      1        ONLINE  ONLINE       jyrac1
The output shows that the ASM instances have started with AMM disabled.
[grid@jyrac1 ~]$ srvctl start database -d jyrac
[grid@jyrac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRSDG.dg
               ONLINE  ONLINE       jyrac1
               ONLINE  ONLINE       jyrac2
ora.DATADG.dg
               ONLINE  ONLINE       jyrac1
               ONLINE  ONLINE       jyrac2
ora.LISTENER.lsnr
               ONLINE  ONLINE       jyrac1
               ONLINE  ONLINE       jyrac2
ora.asm
               ONLINE  ONLINE       jyrac1                   Started
               ONLINE  ONLINE       jyrac2                   Started
ora.gsd
               ONLINE  OFFLINE      jyrac1
               ONLINE  OFFLINE      jyrac2
ora.net1.network
               ONLINE  ONLINE       jyrac1
               ONLINE  ONLINE       jyrac2
ora.ons
               ONLINE  ONLINE       jyrac1
               ONLINE  ONLINE       jyrac2
ora.registry.acfs
               ONLINE  ONLINE       jyrac1
               ONLINE  ONLINE       jyrac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       jyrac1
ora.cvu
      1        ONLINE  ONLINE       jyrac1
ora.jyrac.db
      1        ONLINE  ONLINE       jyrac1                   Open
      2        ONLINE  ONLINE       jyrac2                   Open
ora.jyrac1.vip
      1        ONLINE  ONLINE       jyrac1
ora.jyrac2.vip
      1        ONLINE  ONLINE       jyrac2
ora.oc4j
      1        ONLINE  ONLINE       jyrac1
ora.scan1.vip
      1        ONLINE  ONLINE       jyrac1

SQL> show parameter instance_name

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
instance_name                        string      jyrac1
SQL> show parameter memory

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
hi_shared_memory_address             integer     0
memory_max_target                    big integer 0
memory_target                        big integer 0
shared_memory_address                integer     0
SQL> show parameter sga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 640M
sga_target                           big integer 640M

SQL> show parameter instance_name

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
instance_name                        string      jyrac2
SQL> show parameter memory

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
hi_shared_memory_address             integer     0
memory_max_target                    big integer 0
memory_target                        big integer 0
shared_memory_address                integer     0
SQL> show parameter sga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 640M
sga_target                           big integer 640M
The database has also started successfully with AMM disabled.
Run the hugepages_settings.sh script again to compute the HugePages setting:
[root@jyrac1 /]# ./hugepages_settings.sh
Recommended setting: vm.nr_hugepages = 649
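Edit the /etc/sysctl.conf file to add the parameter vm.nr_hugepages = 649, then run sysctl -p so that the change takes effect immediately. At this point the Oracle instances are not yet using HugePages, which is evident because HugePages_Total equals HugePages_Free.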
[root@jyrac1 /]# vi /etc/sysctl.conf
vm.nr_hugepages = 649
[root@jyrac1 /]# sysctl -p
[root@jyrac1 /]# grep Huge /proc/meminfo
HugePages_Total: 649
HugePages_Free: 649
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
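If the sysctl.conf edit needs to be repeatable, for example when the SGA size changes and the script recommends a new value, a small helper can replace any existing vm.nr_hugepages line instead of appending a duplicate. This is a sketch only: the function name set_nr_hugepages is hypothetical, and it is best tried on a copy of the file first.

```shell
# set_nr_hugepages CONF_FILE PAGES  (hypothetical helper)
# Removes any existing vm.nr_hugepages line from CONF_FILE and appends
# a single line with the new value. Other settings are left untouched.
set_nr_hugepages() {
    snh_conf=$1
    snh_pages=$2
    snh_tmp=$(mktemp) || return 1
    # grep -v exits non-zero when nothing is left to print; that is fine here.
    grep -v '^[[:space:]]*vm\.nr_hugepages[[:space:]]*=' "$snh_conf" > "$snh_tmp" || true
    printf 'vm.nr_hugepages = %s\n' "$snh_pages" >> "$snh_tmp"
    # Write back through cat so the original file keeps its owner and mode.
    cat "$snh_tmp" > "$snh_conf"
    rm -f "$snh_tmp"
}

# Example (as root): set_nr_hugepages /etc/sysctl.conf 649 && sysctl -p
```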
Restart the instances:
SQL> startup
ASM instance started

Total System Global Area  669581312 bytes
Fixed Size                  1366724 bytes
Variable Size             643048764 bytes
ASM Cache                  25165824 bytes
ASM diskgroups mounted
ASM diskgroups volume enabled
The ASM instance's alert_+ASM1.log contains the following:
Starting ORACLE instance (normal)
************************ Large Pages Information *******************
Per process system memlock (soft) limit = 2048 MB

Total Shared Global Region in Large Pages = 642 MB (100%)

Large Pages used by this instance: 321 (642 MB)
Large Pages unused system wide = 328 (656 MB)
Large Pages configured system wide = 649 (1298 MB)
Large Page size = 2048 KB
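To pull this section out of a long alert log, something like the following awk filter can help. It is a sketch that assumes the section starts at a line containing "Large Pages Information" and ends at the "Large Page size" line, as in the excerpt above. The function name large_pages_info is hypothetical, and the log path is whatever your instance uses.

```shell
# large_pages_info ALERT_LOG  (hypothetical helper)
# Prints the most recent "Large Pages Information" section of ALERT_LOG,
# from the banner line through the "Large Page size" line.
large_pages_info() {
    awk '
        /Large Pages Information/ { buf = ""; keep = 1 }   # new section starts
        keep                      { buf = buf $0 "\n" }    # collect its lines
        keep && /Large Page size/ { keep = 0; last = buf } # section complete
        END                       { printf "%s", last }
    ' "$1"
}
```

Because only the last complete section is printed, the output reflects the most recent instance startup recorded in the log.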
SQL> startup
ORACLE instance started.

Total System Global Area  669581312 bytes
Fixed Size                  1366724 bytes
Variable Size             243270972 bytes
Database Buffers          419430400 bytes
Redo Buffers                5513216 bytes
Database mounted.
Database opened.
The alert_jyrac1.log of instance jyrac1 contains the following:
Starting ORACLE instance (normal)
************************ Large Pages Information *******************
Per process system memlock (soft) limit = 2048 MB

Total Shared Global Region in Large Pages = 642 MB (100%)

Large Pages used by this instance: 321 (642 MB)
Large Pages unused system wide = 7 (14 MB)
Large Pages configured system wide = 649 (1298 MB)
Large Page size = 2048 KB

[root@jyrac1 /]# grep Huge /proc/meminfo
HugePages_Total: 649
HugePages_Free: 239
HugePages_Rsvd: 232
Hugepagesize: 2048 kB
The output above shows that HugePages is now in use.
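The /proc/meminfo counters can be reconciled with the alert logs. HugePages_Free still includes pages that are reserved but not yet touched (HugePages_Rsvd), so the number of pages committed to running instances is Total - Free + Rsvd. The sketch below computes that figure. The function name hugepages_committed and its optional file argument are hypothetical.

```shell
# hugepages_committed [MEMINFO_FILE]  (hypothetical helper)
# Prints HugePages_Total - HugePages_Free + HugePages_Rsvd, that is, the
# pages already allocated plus the pages reserved for later use.
hugepages_committed() {
    awk '
        $1 == "HugePages_Total:" { total = $2 }
        $1 == "HugePages_Free:"  { free  = $2 }
        $1 == "HugePages_Rsvd:"  { rsvd  = $2 }
        END { print total - free + rsvd }
    ' "${1:-/proc/meminfo}"
}
```

With the values shown above, 649 - 239 + 232 = 642 pages, which matches the two instances (ASM and database) at 321 pages each and leaves the 7 unused pages reported in alert_jyrac1.log.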
3. HugePages limitations
HugePages has the following limitations:
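a. For Oracle 11g and later, the memory_target and memory_max_target parameters of a database instance must be cleared with the alter system reset command. For an ASM instance, however, memory_target can only be set to 0.
b. AMM and HugePages are incompatible. With AMM, the entire SGA is allocated by creating files under /dev/shm, and HugePages are not reserved when the SGA is allocated this way.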
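c. If you use VLM on a 32-bit system, HugePages cannot be used for the database buffer cache, although other SGA components such as shared_pool and large_pool can still use HugePages. Memory for the VLM buffer cache is allocated through a shared memory file system (ramfs/tmpfs/shmfs).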
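d. HugePages are not subject to allocation or release after system startup unless a system administrator changes the HugePages configuration by modifying the number of available pages or the pool size. If the required space is not reserved in memory at system startup, HugePages allocation fails.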
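e. Make sure HugePages is sized sensibly. Memory reserved for HugePages cannot be used for anything else, so a pool that the application does not fully use can leave the system short of memory.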
f. If there are not enough HugePages when an instance starts and the use_large_pages parameter is set to only, Oracle Database fails to start and records the reason in alert.log.