[20180305]手工模拟buffer busy wait.txt

简介: [20180305]手工模拟buffer busy wait.txt --//一般出现buffer busy wait原因,主要是对热块,大量dml操作. --//一种提法:oracle读不会阻塞写,写不会阻塞读,实际上写一定程度会阻塞读,只不过时间很短罢了.

[20180305]手工模拟buffer busy wait.txt

--//一般出现buffer busy wait原因,主要是对热块,大量dml操作.
--//一种提法:oracle读不会阻塞写,写不会阻塞读,实际上写一定程度会阻塞读,只不过时间很短罢了.

--//以前vage提到过一种特殊情况导致buffer busy wait,那就是写日志缓慢,也会导致这种情况出现,昨天看
--//http://www.askmaclean.com/archives/why-slow-redo-write-cause-buffer-busy-wait.html
--//自己也重新测试看看:

1.环境:
SCOTT@book> @ &r/ver1
PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production


2.测试:
SCOTT@book> @ &r/s
SCOTT@book(41,2229)> @ &r/spid
       SID    SERIAL# PROCESS                  SERVER    SPID       PID  P_SERIAL# C50
---------- ---------- ------------------------ --------- ------ ------- ---------- --------------------------------------------------
        41       2229 11184                    DEDICATED 11185       27        241 alter system kill session '41,2229' immediate;

--//进程号=11185
SCOTT@book(41,2229)> create table deptx as select * from dept;
Table created.

SCOTT@book(41,2229)> select rowid,deptx.* from deptx where rownum=1;
ROWID                  DEPTNO DNAME          LOC
------------------ ---------- -------------- -------------
AAAWGMAAEAAAAIrAAA         10 ACCOUNTING     NEW YORK

SCOTT@book(41,2229)> @ &r/rowid AAAWGMAAEAAAAIrAAA
    OBJECT       FILE      BLOCK        ROW ROWID_DBA            DBA                  TEXT
---------- ---------- ---------- ---------- -------------------- -------------------- ----------------------------------------
     90508          4        555          0  0x100022B           4,555                alter system dump datafile 4 block 555 ;


--//终端:
gdb $(which oracle) 11185
...

(gdb) b kcrf_commit_force
Breakpoint 1 at 0x96cf4e4

--//session 1:
SCOTT@book(41,2229)> update deptx set dname=lower(dname) where deptno=10;
1 row updated.
SCOTT@book(41,2229)> commit ;
--//挂起!!

--//终端界面:
Breakpoint 1, 0x00000000096cf4e4 in kcrf_commit_force ()

--//session 2:
SCOTT@book> @ &r/s
SCOTT@book(54,1201)> @ &r/spid
       SID    SERIAL# PROCESS                  SERVER    SPID       PID  P_SERIAL# C50
---------- ---------- ------------------------ --------- ------ ------- ---------- --------------------------------------------------
        54       1201 11208                    DEDICATED 11209       28         38 alter system kill session '54,1201' immediate;

--//进程号=11209
SCOTT@book(54,1201)> @ &r/wait
no rows selected

SCOTT@book(54,1201)> select * from deptx;
--//挂起!!

--//session 3:
SYS@book> @ &r/wait
P1RAW            P2RAW            P3RAW                    P1         P2         P3        SID    SERIAL#       SEQ# EVENT                                    STATUS   STATE               WAIT_TIME_MICRO SECONDS_IN_WAIT WAIT_CLASS
---------------- ---------------- ---------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------------------------------- -------- ------------------- --------------- --------------- --------------------
0000000062657100 0000000000000001 00               1650815232          1          0         41       2229         85 SQL*Net message from client              ACTIVE   WAITED KNOWN TIME           6655081              21 Idle
0000000000000004 000000000000022B 0000000000000001          4        555          1         54       1201         51 buffer busy waits                        ACTIVE   WAITING                     3121034               3 Concurrency

SYS@book> @ &r/ev_name 'buffer busy waits'
    EVENT#   EVENT_ID NAME                                     PARAMETER1           PARAMETER2           PARAMETER3           WAIT_CLASS_ID WAIT_CLASS# WAIT_CLASS
---------- ---------- ---------------------------------------- -------------------- -------------------- -------------------- ------------- ----------- --------------------
        95 2161531084 buffer busy waits                        file#                block#               class#                  3875070507           4 Concurrency

--//sid=54出现buffer busy waits,奇怪的是sid=41,出现的是SQL*Net message from client.
--//你可以发现buffer busy waits 的P1,P2参数与前面查询访问的块号一致dba=4,555.


--//终端界面:
(gdb) bt
#0  0x00000000096cf4e4 in kcrf_commit_force ()
#1  0x00000000096cb7bf in kcrfw_redo_gen ()
#2  0x0000000000e3d60b in kcb_commit_main ()
#3  0x0000000000e3c949 in kcb_commit ()
#4  0x0000000000f6412f in ktiCommitCleanout ()
#5  0x0000000000f63806 in kticmt ()
#6  0x0000000000f1e6dc in ktucmt ()
#7  0x00000000094facf5 in ktcCommitTxn ()
#8  0x000000000304023c in ktdcmt ()
#9  0x000000000210f400 in k2lcom ()
#10 0x0000000002128f65 in k2send ()
#11 0x0000000000f05073 in xctctl ()
#12 0x0000000000f03392 in xctCommitTxn ()
#13 0x0000000001ed651b in kksExecuteCommand ()
#14 0x00000000095c8398 in opiexe ()
#15 0x0000000001ba5992 in kpoal8 ()
#16 0x00000000095bbdad in opiodr ()
#17 0x00000000097a629f in ttcpip ()
#18 0x000000000186470e in opitsk ()
#19 0x0000000001869235 in opiino ()
#20 0x00000000095bbdad in opiodr ()
#21 0x00000000018607ac in opidrv ()
#22 0x0000000001e3a48f in sou2o ()
#23 0x0000000000a29265 in opimai_real ()
#24 0x0000000001e407ad in ssthrdmain ()
#25 0x0000000000a291d1 in main ()

--//如果在一个块内大量的修改与查询,就很容易出现buffer busy waits,我这里在commit设置断点,很明显
--//要写redo 缓存到redo文件.如果在应用中出现log file sync+buffer busy wait优先解决log file sync.
--//当然一些dml的热块也要注意.

--//转载:http://www.askmaclean.com/archives/why-slow-redo-write-cause-buffer-busy-wait.html

3.简单来说这个示例说明了几点:

OLTP类型的小DML操作一般都会是immediate block cleanout的,这要求在commit之前对block做change kcbchg
在commit kcrf_commit_force完成前都不会释放对该block buffer的buffer pin
由上述2点造成的buffer pin最终会影响select和其他insert/update/delete 形成buffer busy wait
由于慢的lgwr写redo log会造成 kcrf_commit_force commit的缓慢,表现在等待事件上就是log file sync
由于block cleanout时pin block buffer且commit 慢,则会导致更长时间的buffer busy wait
若log file sync是由lgwr 写redo log慢(log file parallel write)引起的,则它的另一个效应就是buffer busy wait增多

若看到AWR中log file sync+buffer busy wait是主要等待事件,则优先解决log file sync ,因为buffer busy wait实际可能是受害者

AWR中与commit cleanout相关的 Instance activity 有好几个

commit cleanout failures: block lost
commit cleanout failures: buffer being written
commit cleanout failures: callback failure
commit cleanout failures: cannot pin
commit cleanouts
commit cleanouts successfully completed

4.补充:
--//以上测试等一会:
--//session 3:
SYS@book> @ &r/wait
P1RAW            P2RAW            P3RAW                    P1         P2         P3        SID    SERIAL#       SEQ# EVENT                                    STATUS   STATE               WAIT_TIME_MICRO SECONDS_IN_WAIT WAIT_CLASS
---------------- ---------------- ---------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------------------------------- -------- ------------------- --------------- --------------- --------------------
0000000062657100 0000000000000001 00               1650815232          1          0         41       2229         85 SQL*Net message from client              ACTIVE   WAITED KNOWN TIME           6655081             255 Idle
0000000000000004 000000000000022B 0000000000000001          4        555          1         54       1201         51 buffer busy waits                        ACTIVE   WAITING                   237183753             237 Concurrency
0000000000000003 0000000000000090 0000000000000013          3        144         19        222          1      25198 buffer busy waits                        ACTIVE   WAITING                   225469340             225 Concurrency

SYS@book> @ &r/ev_name 'buffer busy waits'
    EVENT#   EVENT_ID NAME                                     PARAMETER1           PARAMETER2           PARAMETER3           WAIT_CLASS_ID WAIT_CLASS# WAIT_CLASS
---------- ---------- ---------------------------------------- -------------------- -------------------- -------------------- ------------- ----------- --------------------
        95 2161531084 buffer busy waits                        file#                block#               class#                  3875070507           4 Concurrency

--//又出现一个会话阻塞.buffer busy waits的P1,P2的参数是file#,block#.

SYS@book> @ &r/bh 3 144
HLADDR           DBARFIL      DBABLK CLASS CLASS_TYPE         STATE             TCH CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ BA               OBJECT_NAME
---------------- ------- ----------- ----- ------------------ ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------- --------------------
0000000084F763C0       3         144    19                    xcur              173          0          0          0          0          0 000000007A33C000

SYS@book> @ &r/bh 4 555
HLADDR           DBARFIL      DBABLK CLASS CLASS_TYPE         STATE             TCH CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ BA               OBJECT_NAME
---------------- ------- ----------- ----- ------------------ ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------- --------------------
0000000084EB03C8       4         555     1 data block         xcur                2          0          0          0          0          0 00000000648CE000 DEPTX
0000000084EB03C8       4         555     1 data block         cr                  2  392461831          3          0          0          0 00000000648D0000 DEPTX
0000000084EB03C8       4         555     1 data block         cr                  1  392461277          3          0          0          0 00000000733B2000 DEPTX
0000000084EB03C8       4         555     1 data block         free                0          0          0          0          0          0 0000000070730000
--//HLADDR 不一样,不知道为什么出现buffer busy waits sid=222.

SYS@book> select PROGRAM,sid,serial#,sql_id from v$session where sid=222;
PROGRAM                           SID    SERIAL# SQL_ID
-------------------------- ---------- ---------- -------------
oracle@gxqyydg4 (MMON)            222          1

SYS@book> select PROGRAM,sid,serial#,sql_id,PREV_SQL_ID from v$session where sid=222;
PROGRAM                           SID    SERIAL# SQL_ID        PREV_SQL_ID
-------------------------- ---------- ---------- ------------- -------------
oracle@gxqyydg4 (MMON)            222          1               fsbqktj5vw6n9

SYS@book> @ &r/sql_id fsbqktj5vw6n9
old   1: select sql_id,sql_fulltext sqltext from gv$sqlarea where sql_id='&1'
new   1: select sql_id,sql_fulltext sqltext from gv$sqlarea where sql_id='fsbqktj5vw6n9'
SQL_ID        SQLTEXT
------------- --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
fsbqktj5vw6n9 select next_run_date, obj#, run_job, sch_job from (select decode(bitand(a.flags, 16384), 0, a.next_run_date,                a.last_enabled_time) next_run_date,       a.obj# obj#, decode(bitand(a.flags
              , 16384), 0, 0, 1) run_job, a.sch_job  sch_job  from  (select p.obj# obj#, p.flags flags, p.next_run_date next_run_date,      p.job_status job_status, p.class_oid class_oid,      p.last_enabled_time l
              ast_enabled_time, p.instance_id instance_id,      1 sch_job   from sys.scheduler$_job p   where bitand(p.job_status, 3) = 1    and ((bitand(p.flags, 134217728 + 268435456) = 0) or         (bitand(p.jo
              b_status, 1024) <> 0))    and bitand(p.flags, 4096) = 0    and p.instance_id is NULL    and (p.class_oid is null      or (p.class_oid is not null      and p.class_oid in (select b.obj# from sys.schedu
              ler$_class b                          where b.affinity is null)))   UNION ALL   select q.obj#, q.flags, q.next_run_date, q.job_status, q.class_oid,      q.last_enabled_time, q.instance_id, 1   from sy
              s.scheduler$_lightweight_job q   where bitand(q.job_status, 3) = 1    and ((bitand(q.flags, 134217728 + 268435456) = 0) or         (bitand(q.job_status, 1024) <> 0))    and bitand(q.flags, 4096) = 0
                and q.instance_id is NULL    and (q.class_oid is null      or (q.class_oid is not null      and q.class_oid in (select c.obj# from sys.scheduler$_class c                          where c.affinity is
               null)))   UNION ALL   select j.job, 0, from_tz(cast(j.next_date as timestamp),      to_char(systimestamp,'TZH:TZM')), 1, NULL,      from_tz(cast(j.next_date as timestamp),to_char(systimestamp,'TZH:TZ
              M')),     NULL, 0   from sys.job$ j   where (j.field1 is null or j.field1 = 0)    and j.this_date is null) a   order by 1)   where rownum = 1

--//放弃,不知道为什么出现后续的buffer busy waits的等待事件.
--//在终端界面上执行:
(gdb) c
Continuing.

Breakpoint 1, 0x00000000096cf4e4 in kcrf_commit_force ()
(gdb) c
Continuing.


--//session 1:
SCOTT@book(41,2229)> commit ;
Commit complete.
--//提交成功.

--//session 2:
SCOTT@book(54,1201)> select * from deptx;
    DEPTNO DNAME          LOC
---------- -------------- -------------
        10 accounting     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON

--//session 3:
SYS@book> @ &r/wait
no rows selected

相关实践学习
阿里云图数据库GDB入门与应用
图数据库(Graph Database,简称GDB)是一种支持Property Graph图模型、用于处理高度连接数据查询与存储的实时、可靠的在线数据库服务。它支持Apache TinkerPop Gremlin查询语言,可以帮您快速构建基于高度连接的数据集的应用程序。GDB非常适合社交网络、欺诈检测、推荐引擎、实时图谱、网络/IT运营这类高度互连数据集的场景。 GDB由阿里云自主研发,具备如下优势: 标准图查询语言:支持属性图,高度兼容Gremlin图查询语言。 高度优化的自研引擎:高度优化的自研图计算层和存储层,云盘多副本保障数据超高可靠,支持ACID事务。 服务高可用:支持高可用实例,节点故障迅速转移,保障业务连续性。 易运维:提供备份恢复、自动升级、监控告警、故障切换等丰富的运维功能,大幅降低运维成本。 产品主页:https://www.aliyun.com/product/gdb
目录
相关文章
|
4月前
|
监控 Linux C++
perf_event_open学习 —— mmap方式读取
perf_event_open学习 —— mmap方式读取
|
8月前
|
Windows
如何在关闭socket连接的时候跳过TIME_WAIT的等待状态
如何在关闭socket连接的时候跳过TIME_WAIT的等待状态
112 0
|
网络协议 PHP
php stream_set_blocking设置非阻塞模式,php stream_set_blocking影响函数fgets()和fread()
php stream_set_blocking设置非阻塞模式,php stream_set_blocking影响函数fgets()和fread()
354 0
|
流计算
四十四、nimbus,supervisor进程自动停止(Read a frame size of ..., which is bigger than the maximum allowable...)
四十四、nimbus,supervisor进程自动停止(Read a frame size of ..., which is bigger than the maximum allowable...)
四十四、nimbus,supervisor进程自动停止(Read a frame size of ..., which is bigger than the maximum allowable...)
|
SQL Oracle NoSQL
gdb调试之--read by other session 等待事件的重现
发生read by other session 等待事件,肯定是两个会话同时物理读同一个数据块,假设有两个会话,会话1发生物理读,会话2正要去物理读发现会话1正在物读理产生,此时会话2发生等待,此等待就是read by other session 等待事件。
|
数据采集 网络协议
一次TIME_WAIT和CLOSE_WAIT故障和解决办法
昨天解决了一个curl调用错误导致的服务器异常,具体过程如下: 里头的分析过程有提到,通过查看服务器网络状态检测到服务器有大量的CLOSE_WAIT的状态。   在服务器的日常维护过程中,会经常用到下面的命令:   它会显示例如下面的信息: TIME_WAIT 814CLOSE_WAIT 1FIN_WAIT1 1ESTABLISHED 634SYN_RECV 2LAST_ACK 1 常用的三个状态是:ESTABLISHED 表示正在通信,TIME_WAIT 表示主动关闭,CLOSE_WAIT 表示被动关闭。
2714 0