ORA-4031错误深入解析

本文涉及的产品
公共DNS(含HTTPDNS解析),每月1000万次HTTP解析
全局流量管理 GTM,标准版 1个月
云解析 DNS,旗舰版 1个月
简介: Oracle共享池一个非常经典的错误,下面我们来详细解读,帮你理解共享结构。

报ORA-4031错误时,我们通常可以根据Oracle无法分配多少字节的内存,来判断共享池碎片的严重程度,以下是4031错误官方的解释:
[oracle@guoyj ~]$ oerr ORA 4031

04031, 00000, "unable to allocate %s bytes of shared memory (\"%s\",\"%s\",\"%s\",\"%s\")"

// *Cause: More shared memory is needed than was allocated in the shared

// pool or Streams pool.

// *Action: If the shared pool is out of memory, either use the

// DBMS_SHARED_POOL package to pin large packages,

// reduce your use of shared memory, or increase the amount of

// available shared memory by increasing the value of the

// initialization parameters SHARED_POOL_RESERVED_SIZE and

// SHARED_POOL_SIZE.

// If the large pool is out of memory, increase the initialization

// parameter LARGE_POOL_SIZE.

// If the error is issued from an Oracle Streams or XStream process,

// increase the initialization parameter STREAMS_POOL_SIZE or increase

// the capture or apply parameter MAX_SGA_SIZE.

一、重现ORA-04031错误:

1、第一个实验硬解析产生大量的碎片重现4031错误

2、当前的数据库版本

SQL> select * from v$version where rownum=1;

BANNER


Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

3、关闭ASMM,设置shared pool大小为120M

SQL> show parameter mem

NAME TYPE VALUE


hi_shared_memory_address integer 0

memory_max_target big integer 804M

memory_target big integer 804M

shared_memory_address integer 0

SQL> show parameter sga

NAME TYPE VALUE


lock_sga boolean FALSE

pre_page_sga boolean FALSE

sga_max_size big integer 804M

sga_target big integer 0

SQL>alter system set memory_target=0; --11g的新特性调整(SGA+PGA)

SQL>startup force;

SQL> alter system set sga_target=0;

SQL>startup force;

SQL> show parameter memory_target

NAME TYPE VALUE


memory_target big integer 0

SQL> show parameter sga_target

NAME TYPE VALUE


sga_target big integer 0

SQL> alter system set shared_pool_size=120M;

System altered.

4、创建表并插入数据

create table t1(id int,name varchar2(100));

begin

for i in 1 .. 100000 loop

insert into t1 values(i,'gyj'||i);

commit;

end loop;

end;

/

5、跑下面匿名块模拟产生大量的硬解析

declare

msql varchar2(500);

mcur number;

mstat number;

jg varchar2(4000);

cg number;

begin

mcur:=dbms_sql.open_cursor;

for i in 1..999999999 loop

msql:='select id from gyj.t1 where id='||to_char(i);

dbms_sql.parse(mcur,msql,dbms_sql.native);

dbms_sql.define_column(mcur,1,jg,4000);

mstat:=dbms_sql.execute(mcur);

end loop;

dbms_sql.close_cursor(mcur);

end;

/

6、这个实验可能要等好久才能看到4031错误,硬解析会占用共享池,但执行完后游标马上关闭,共享池内存也会随之释放,所以用这个实验来模拟4031错误不一定能成功,我也是偶尔做成,下面再来个实验,保证马上重现4031错误。。


8、第二个实验:一直打开游标不关闭,直到消耗完共享池导致4031错误

SQL> show parameter open_cursor

NAME TYPE VALUE


open_cursors integer 300

alter system set open_cursors=65535;

SQL> declare

2 msql varchar2(500);

3 mcur number;

4 mstat number;

5 jg varchar2(4000);

6 cg number;

7 begin

8 for i in 1..65535 loop

9 mcur:=dbms_sql.open_cursor;

10 msql:='select id from t1 where id='||to_char(i);

11 dbms_sql.parse(mcur,msql,dbms_sql.native);

12 dbms_sql.define_column(mcur,1,jg,4000);

13 mstat:=dbms_sql.execute(mcur);

14 dbms_sql.column_value(mcur,1,jg);

15 end loop;

16 end;

17 /

declare

*

ERROR at line 1:

ORA-01000: maximum open cursors exceeded

ORA-06512: at "SYS.DBMS_SQL", line 1199

ORA-06512: at line 11

如果报这个错误,说明alter system set open_cursors=65535;设置没生效,重启库

SQL> conn / as sysdba

Connected.

SQL> startup force;

ORACLE instance started.

Total System Global Area 455163904 bytes

Fixed Size 2229224 bytes

Variable Size 138415128 bytes

Database Buffers 310378496 bytes

Redo Buffers 4141056 bytes

Database mounted.

Database opened.

SQL> conn gyj/gyj

Connected.

再次执行

呵呵,耐心等待五分钟之内绝对报错。。。。

SQL> declare

2 msql varchar2(500);

3 mcur number;

4 mstat number;

5 jg varchar2(4000);

6 cg number;

7 begin

8 for i in 1..65535 loop

9 mcur:=dbms_sql.open_cursor;

10 msql:='select id from t1 where id='||to_char(i);

11 dbms_sql.parse(mcur,msql,dbms_sql.native);

12 dbms_sql.define_column(mcur,1,jg,4000);

13 mstat:=dbms_sql.execute(mcur);

14 dbms_sql.column_value(mcur,1,jg);

15 end loop;

16 end;

17 /

declare

*

ERROR at line 1:

ORA-04031: unable to allocate 80 bytes of shared memory ("shared pool","select job, nvl2(last_date, ...","SQLA^337fc737","ctxPlanSig:qksctxPlanSigGet")

ORA-06512: at "SYS.DBMS_SQL", line 1199

ORA-06512: at line 11

9、查看ORA-4031发生了几次

SQL> select indx,kghlurcr,kghlutrn,kghlufsh,kghluops,kghlunfu,kghlunfs

2 from sys.x$kghlu where inst_id = userenv('Instance');

select indx,kghlurcr,kghlutrn,kghlufsh,kghluops,kghlunfu,kghlunfs

*

ERROR at line 1:

ORA-00604: error occurred at recursive SQL level 1

ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","select /+ rule / bucket_cn...","SQLA","tmp")

现在不能做任何大的查询操作了,怎么办?

先退出回话,重新登录:

SQL> exit

Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

[oracle@guoyj ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Wed Dec 12 20:50:21 2012

Copyright (c) 1982, 2011, Oracle. All rights reserved.

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select indx,kghlurcr,kghlutrn,kghlufsh,kghluops,kghlunfu,kghlunfs

2 from sys.x$kghlu where inst_id = userenv('Instance');

INDX KGHLURCR KGHLUTRN KGHLUFSH KGHLUOPS KGHLUNFU KGHLUNFS


0 1233 30360 57700 518988 114 4096

kghlunfu:出现ORA-4031的次数,这里指出现了114次的4031错误

kghlunfs:最后一次出现ORA-4031时,申请的分配大小,这里是指最后申请共享池的大小是4096字节的chunk找不到,最终报错。。。。

实际你可以监控制告警日志alert_bxocp.log,里面会显示114次4031错误,为什么报这么多次错误呢,从不同的Free Lists里找空闲的chunk,最后到保留池(SHARED_POOL_RESERVED_SIZE)找,也没找到4096字节的chunk,最后在执 行sql的地方报错4031错误。

10、这里说到了free lists和SHARED_POOL_RESERVED_SIZE

首先说一下保留池,保留池的大小有个参数就是SHARED_POOL_RESERVED_SIZE

关于保留池的隐含参数:

SQL> @?/rdbms/admin/show_para

Enter value for p: SHARED_POOL_RESERVED

old 12: AND upper(i.ksppinm) LIKE upper('%&p%')

new 12: AND upper(i.ksppinm) LIKE upper('%SHARED_POOL_RESERVED%')

P_NAME P_DESCRIPTION P_VALUE ISDEFAULT ISMODIFIED ISADJ


_shared_pool_reserved_min_alloc minimum allocation size in bytes for reserved area 4400 TRUE FALSE FALSE

of shared pool

_shared_pool_reserved_pct percentage memory of the shared pool allocated for 5 TRUE FALSE FALSE

the reserved area

shared_pool_reserved_size size in bytes of reserved area of shared pool 6291456 TRUE FALSE FALSE

SHARED_POOL_RESERVED_SIZE,缺省是SHARED_POOL_SIZE的5%

_SHARED_POOL_RESERVED_PCT缺省 5%

_SHARED_POOL_RESERVED_MIN_ALLOC,缺省4400,超过这个大小才被认为是大对象

从_SHARED_POOL_RESERVED_MIN_ALLOC=4400字节看出申请4096字节的chunk不会到保留池去找,只有符合申请>=4400字节的chunk才会去保留池上找空闲chunk,申请4096字节只会在Free List去找,找不到就最后报错4031错误。

那么我们来看一下共享池的Free Lists管理free内存块(Chunk),把Free Lists链的内容dump出来看一下:

alter session set events 'immediate trace name heapdump level 2';

FREE LISTS:

Bucket 0 size=32

Chunk 074dca368 sz= 32 free " "

Chunk 07b000078 sz= 0 kghdsx

Bucket 1 size=40

Chunk 06e37cf50 sz= 40 free " "

Chunk 06de531b8 sz= 40 free " "

Chunk 071e9a318 sz= 40 free " "

Chunk 06f24c408 sz= 40 free " "

.省略...

.

.

Bucket 40 size=352

.省略...

.

.

Bucket 48 size=416

Chunk 0753eac30 sz= 416 free " "

Chunk 070da9858 sz= 416 free " "

Bucket 49 size=424

Chunk 070bf3000 sz= 424 free " "

Chunk 074197e10 sz= 424 free " "

Bucket 50 size=432

Bucket 51 size=440

Bucket 52 size=448

Chunk 074d45708 sz= 448 free " "

Bucket 53 size=456

Chunk 06e6b1e20 sz= 456 free " "

Chunk 06de86a80 sz= 456 free " "

Chunk 06e344648 sz= 456 free " "

Chunk 06e430e70 sz= 456 free " "

Bucket 54 size=464

Bucket 55 size=472

Bucket 56 size=480

Bucket 57 size=488

Chunk 06be70bc0 sz= 488 free " "

Bucket 58 size=496

Bucket 59 size=504

Chunk 06e405f48 sz= 504 free " "

Bucket 60 size=512

Bucket 61 size=520

Bucket 62 size=528

Bucket 63 size=536

Bucket 64 size=544

.省略...

.

.

Bucket 102 size=848

Bucket 103 size=856

Bucket 104 size=864

Bucket 105 size=872

.省略...

.

.

Bucket 252 size=16408

Bucket 253 size=32792

Bucket 254 size=65560

可以看出有一共有255个Free List链。。。。

每条Free List挂的空闲chunk大小具有一定的范围,从小到大。。。,结合我们上面的实验,产生114次的ORA-04031错误,是从这254链的某条Free List链开始找空闲chunk,直到找到最后一条为止,没找到4096字节的chunk。

二、4031错误产生原因

当尝试在共享池分配大块的连续内存失败(很多时候是由于碎片过多,而并非真是内存不足)时,Oracle首先清除共享池中当前没使用的所有对象,使空闲内存块合并.如果仍然没有足够大的单块内存可以满足需要,就会产生ORA-04031错误.

1、Oracle的bug

2、没使用绑定变量,sql没有足够的共享,产生大量不必要的解析

3、Shared pool设的太小

4、SubPool的划分可能也会导致各分区之间的协调问题,但从Oracle10g开始,允许内存请求在不同的SubPool之间进行切换,提高进请求成功的可能,但不能切换无限次,所以问题仍然存在)

5、open_cursors(0 to 65535) 设的过大

三、解决:4031错误的方法

1、刷共享池

alter system flush shared_pool;

不过这个办法只是暂时的,治标不治本。

2、保留池的设置

参数shared_pool_reserved_size,shared_pool_reserved_min_alloc

select REQUEST_MISSES from v$shared_pool_reserved;

 3、缓存大对象

select * from v$db_object_cache where sharable_mem > 50000

and (type = 'PACKAGE' or type='PACKAGE BODY' or type = 'FUNCTION' or type='PROCEDURE')

and kept = 'NO';

执行dbms_shared_pool.keep('对象名');

DBMS_SHARED_POOL

@?/rdbms/admin/dbmspool.sql

 4、参数open_cursor、session_cached_cursors的设置

5、共享SQL:应用绑定变量或修改参数cursor_sharing

a.使用绑定变量

declare v_sql varchar2(100);

begin for i in 1..30000 loop

v_sql := 'insert /bind/ into gyj.t1(id) values (:1)';

execute immediate v_sql using i;

end loop;

commit;

end;

/

b.cursor_sharing参数

如何找出不能共享cursor的sql

在v$sql查找执行次数较小的sql语句,观察这些sql语句是否是经常执行的。

select SQL_FULLTEXT from v$sql where EXECUTIONS=1 and sql_text like '%from gyj.t1%';

select SQL_FULLTEXT from v$sql where EXECUTIONS=1 order by sql_text;

alter system set cursor_sharing=SIMILAR | FORCE;

 6、增加shared pool空间

select sum(bytes/1024/1024) from v$sgastat where pool='shared pool';

select COMPONENT,CURRENT_SIZE from V$SGA_DYNAMIC_COMPONENTS;

show parameter sga_target

show parameter sga_max_size

alter system set shared_pool_size=300M scope=both;

五、查共享池碎片的一个脚本

SQL> set linesize 1000

SQL> col sga_heap format a15

SQL> col size format a10

SQL> select KSMCHIDX "SubPool", 'sga heap('||KSMCHIDX||',0)'sga_heap,ksmchcom ChunkComment,

2 decode(round(ksmchsiz/1000),0,'0-1K', 1,'1-2K', 2,'2-3K',3,'3-4K',

3 4,'4-5K',5,'5-6k',6,'6-7k',7,'7-8k',8,

4 '8-9k', 9,'9-10k','> 10K') "size",

5 count(*),ksmchcls Status, sum(ksmchsiz) Bytes

6 from x$ksmsp

7 where KSMCHCOM = 'free memory'

8 group by ksmchidx, ksmchcls,

9 'sga heap('||KSMCHIDX||',0)',ksmchcom, ksmchcls,decode(round(ksmchsiz/1000),0,'0-1K',

10 1,'1-2K', 2,'2-3K', 3,'3-4K',4,'4-5K',5,'5-6k',6,

11 '6-7k',7,'7-8k',8,'8-9k', 9,'9-10k','> 10K');

SubPool SGA_HEAP CHUNKCOMMENT size COUNT(*) STATUS BYTES


1 sga heap(1,0) free memory > 10K 94 R-free 19837104

1 sga heap(1,0) free memory 3-4K 54 free 144592

1 sga heap(1,0) free memory 1-2K 1 R-free 680

1 sga heap(1,0) free memory 3-4K 1 R-free 3344

1 sga heap(1,0) free memory 4-5K 151 free 606936

1 sga heap(1,0) free memory 8-9k 3 free 24576

1 sga heap(1,0) free memory 4-5K 2 R-free 7336

1 sga heap(1,0) free memory 0-1K 17490 free 2029992

1 sga heap(1,0) free memory 1-2K 11 free 5944

9 rows selected.

相关文章
|
索引 数据库管理 Linux
|
Oracle 关系型数据库
ORA-00060错误
今天有个同事说生产机有个PERL脚本老是报死锁的错误,脚本里面开启了并行度:错误如下: Failure: ORA-12801: error signaled in parallel query server P126, instance NMS_ODS_NDB1:wgods1 (1)ORA-000...
1326 0
|
SQL 关系型数据库 Oracle
ORA-00911错误
ORA-00911,"911"看着很霸气的错误号,虽然我还是Oracle的初学者,但每次碰到一个未见过的ORA错误号后,都有一种查案的赶脚,根据错误号、OERR、相关错误信息,判断错误原因以及找到解决方案或替代方案,虽然大部分可能还是参考前辈或官方,但碰到一次后,至少是似曾相识了,再次碰到时即使不记得,大概也能有个方向。
1094 0
|
Oracle 网络协议 关系型数据库
|
SQL Oracle 关系型数据库
|
SQL Oracle 关系型数据库
|
SQL Oracle 关系型数据库
|
SQL 关系型数据库 Oracle