思维导图
概述
delete from t where create_time < to_date('2017-01-01','yyyy-mm-dd');
当我们看到这条语句时,会想到什么呢? 一条再简单不过的按照条件删除数据库的操作。 如果大量存在,会不会引起系统性能问题呢?
当这个表的数据足够大时,按照这样的方式来清除数据,代价无疑是非常高昂的。
我们提出使用分区的方式来解决这个问题。比如 保留历史数据的骨折是,删除最早一个月的数据,按照这个需求,将需要删除数据的表,改为分区表。 按照每个月一个分区的方式创建分区表。
这样做的结果是,以前的delete操作就变成了下面的操作:
alter table t drop partition p_name 或者 alter table t truncate partiton p_name ;
这个操作对已一个几GB数据的分区表来讲,执行时间不超过10S。
比起使用分区来操作数据,DELETE方式操作数据的问题是显而易见的,至少存在如下两点:
- 消耗大量的系统资源
- 无法释放空间
DELETE与系统资源
delete 操作消耗的资源
在DELETE操作中,SQL语句首先要扫描表或者索引找到符合条件的记录,然后把他们删除,这个过程中将消耗大量的CPU资源和产生大量的I/O,同时还会产生大量的undo数据。
下面是一个简单的栗子,我们可以看出delete相对于drop或者truncate对资源的消耗情况:
SQL> drop table t purge ; ---> 不放入回收站,直接删除 Table dropped SQL> create table t as select a.OBJECT_ID,a.OBJECT_NAME from dba_objects a; Table created SQL> exec dbms_stats.gather_table_stats(user,'t');---->表分析 PL/SQL procedure successfully completed SQL>
注释:执行drop table xx 语句
drop后的表被放在回收站(user_recyclebin)里,而不是直接删除掉。这样,回收站里的表信息就可以被恢复,或彻底清除。
通过查询回收站user_recyclebin获取被删除的表信息,然后使用语句
flashback table <user_recyclebin.object_name or user_recyclebin.original_name> to before drop [rename to
];
将回收站里的表恢复为原名称或指定新名称,表中数据不会丢失。
若要彻底删除表,则使用语句:drop table <table_name> purge;
清除回收站里的信息
清除指定表:purge table <table_name>;
清除当前用户的回收站:purge recyclebin;
SQL> purge recyclebin;
Done
清除所有用户的回收站:purge dba_recyclebin;
不放入回收站,直接删除则是:drop table xx purge;
oracle@entel1:[/oracle]$sqlplus / as sysdba SQL*Plus: Release 11.2.0.4.0 Production on Tue Jan 17 22:17:53 2017 Copyright (c) 1982, 2013, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options SQL> conn xgj/smart Connected. SQL> SQL> set autotrace trace exp stat; -->显示执行计划和统计结果 SQL> delete from t where t.object_id < 10000; 9811 rows deleted. Execution Plan ---------------------------------------------------------- Plan hash value: 3335594643 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | DELETE STATEMENT | | 4420 | 22100 | 43 (0)| 00:00:01 | | 1 | DELETE | T | | | | | |* 2 | TABLE ACCESS FULL| T | 4420 | 22100 | 43 (0)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("T"."OBJECT_ID"<10000) Statistics ---------------------------------------------------------- 18 recursive calls 10465 db block gets 162 consistent gets 0 physical reads 2640572 redo size 831 bytes sent via SQL*Net to client 797 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 2 sorts (memory) 0 sorts (disk) 9811 rows processed SQL>
SET AUTOT 用法
用法: SET AUTOT[RACE] {OFF | ON | TRACE[ONLY]} [EXP[LAIN]]
[STAT[ISTICS]]
SET AUTOT[RACE] OFF 停止AutoTrace SET AUTOT[RACE] ON
开启AutoTrace,显示AUTOTRACE信息和SQL执行结果 SET AUTOT[RACE] TRACEONLY
开启AutoTrace,仅显示AUTOTRACE信息 SET AUTOT[RACE] ON EXPLAIN
开启AutoTrace,仅显示AUTOTRACE的EXPLAIN信息 SET AUTOT[RACE] ON
STATISTICS开启AutoTrace,仅显示AUTOTRACE的STATISTICS信息
在这种情况下,DELETE操作产生了 10465 db block gets + 162 consistent gets 数据块读,同时产生了 2640572 redo size 的 redo日志 (2640572 /1024/1024 = 2.5M 大小)
即使我们给这个表创建了索引,依然无法避免这种资源消耗
SQL> rollback; Rollback complete. SQL> create index ind_t on t(object_id); Index created. SQL> exec dbms_stats.gather_index_stats(user,'ind_t'); PL/SQL procedure successfully completed. SQL> delete from t where t.object_id < 10000; 9811 rows deleted. Execution Plan ---------------------------------------------------------- Plan hash value: 3974964266 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | DELETE STATEMENT | | 4420 | 22100 | 11 (0)| 00:00:01 | | 1 | DELETE | T | | | | | |* 2 | INDEX RANGE SCAN| IND_T | 4420 | 22100 | 11 (0)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("T"."OBJECT_ID"<10000) Statistics ---------------------------------------------------------- 15 recursive calls 10596 db block gets 36 consistent gets 0 physical reads 2801592 redo size 837 bytes sent via SQL*Net to client 796 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 2 sorts (memory) 0 sorts (disk) 9811 rows processed SQL> rollback; Rollback complete.
可以看到即使加了索引,资源的消耗情况差不多,都是 10596 db block gets + 36 consistent gets 。
truncate 和 drop 操作消耗的资源
相反的 truncate 和 drop 操作消耗的资源就小很多。
案例
Connected to Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 Connected as xxx@xgj SQL> --->创建分区表 SQL> create table t_part (object_id int ,object_name varchar2(2000) ) 2 partition by range(object_id) 3 ( 4 partition p1 values less than (6000), 5 partition p2 values less than (12000), 6 partition p3 values less than (18000), 7 partition p4 values less than (24000), 8 partition p5 values less than (30000), 9 partition pm values less than (maxvalue) 10 ); Table created SQL> insert into t_part select object_id ,object_name from dba_objects; 35238 rows inserted SQL> commit; Commit complete SQL> select count(1) from t_part partition(p1); COUNT(1) ---------- 5995 SQL>
开启SQL跟踪后,会生成一个trace文件,通过初始化参数user_dump_dest配置其所在目录,该参数的值可以通过下面方法获取到:
SQL> select name, value from v$parameter where name = 'user_dump_dest' ; NAME VALUE --------------------------- ----------------------------- user_dump_dest /oracle/diag/rdbms/cc/cc/trace
以上是产生的trace原文件,我们通过tkprof来查看一下:
oracle@entel1:[/oracle/diag/rdbms/cc/cc/trace]$tkprof cc_ora_14104_xgj_20170118.trc cc_ora_14104_xgj_20170118.log explain=xxx/xgj sys=no ; TKPROF: Release 11.2.0.4.0 - Development on Wed Jan 18 16:35:18 2017 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. oracle@entel1:[/oracle/diag/rdbms/cc/cc/trace]$
以下是截取cc_ora_14104_xgj_20170118.log中的关键信息:
truncate partition p1
******************************************************************************** SQL ID: 4gj9z8z8707sx Plan Hash: 491940592 alter table t_part truncate partition p1 call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 5 0 0 Execute 1 0.01 0.38 20 1 69 0 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 2 0.01 0.38 20 6 69 0 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 38 (xxx) ********************************************************************************
truncate table
******************************************************************************** SQL ID: cv84y6hnt1dy2 Plan Hash: 491940592 truncate table t_part call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.02 1.54 80 4 266 0 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 2 0.02 1.54 80 4 266 0 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 38 (xxx) ********************************************************************************
drop partition p1
******************************************************************************** SQL ID: 7wwjnfwgcxgza Plan Hash: 0 alter table t_part drop call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.69 1 1 10 0 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 2 0.00 0.69 1 1 10 0 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 38 (xxx)
从结果可以看到,DDL对分区操作消耗的资源要远小于DML的
操作 | 数据块的读取次数 |
delete | 1000+ |
truncate | 6 |
drop | 1 |
可以看出DDL的操作要远小于DML消耗的资源。 DDL的操作主要是对数据字典的修改,这个值基本上是确定的,而DML会随着数据量的增加,消耗的资源也会相应的增加
DML操作会随着数据量的增加,消耗的资源也会相应的增加
oracle@entel1:[/oracle/diag/rdbms/cc/cc/trace]$sqlplus /nolog SQL*Plus: Release 11.2.0.4.0 Production on Wed Jan 18 18:04:57 2017 Copyright (c) 1982, 2013, Oracle. All rights reserved. SQL> conn xxx/xgj Connected. SQL> select count(1) from t; COUNT(1) ---------- 35249 SQL> set autotrace trace exp stat ; SQL> delete from t where object_id < 10000; 9811 rows deleted. Execution Plan ---------------------------------------------------------- Plan hash value: 3974964266 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | DELETE STATEMENT | | 4420 | 22100 | 11 (0)| 00:00:01 | | 1 | DELETE | T | | | | | |* 2 | INDEX RANGE SCAN| IND_T | 4420 | 22100 | 11 (0)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECT_ID"<10000) Statistics ---------------------------------------------------------- 45 recursive calls 10940 db block gets 52 consistent gets 0 physical reads 2806724 redo size 831 bytes sent via SQL*Net to client 794 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 3 sorts (memory) 0 sorts (disk) 9811 rows processed SQL> rollback; Rollback complete. SQL> delete from t where object_id < 20000; 13938 rows deleted. Execution Plan ---------------------------------------------------------- Plan hash value: 3974964266 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | DELETE STATEMENT | | 8842 | 44210 | 21 (0)| 00:00:01 | | 1 | DELETE | T | | | | | |* 2 | INDEX RANGE SCAN| IND_T | 8842 | 44210 | 21 (0)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECT_ID"<20000) Statistics ---------------------------------------------------------- 17 recursive calls 15016 db block gets 47 consistent gets 0 physical reads 3988132 redo size 836 bytes sent via SQL*Net to client 794 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 2 sorts (memory) 0 sorts (disk) 13938 rows processed SQL> rollback; Rollback complete. SQL> delete from t where object_id <30000; 14403 rows deleted. Execution Plan ---------------------------------------------------------- Plan hash value: 3974964266 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | DELETE STATEMENT | | 13263 | 66315 | 31 (0)| 00:00:01 | | 1 | DELETE | T | | | | | |* 2 | INDEX RANGE SCAN| IND_T | 13263 | 66315 | 31 (0)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECT_ID"<30000) Statistics ---------------------------------------------------------- 17 recursive calls 15508 db block gets 48 consistent gets 0 physical reads 4121320 redo size 837 bytes sent via SQL*Net to client 793 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 2 sorts (memory) 0 sorts (disk) 14403 rows processed SQL>
从结果可以看到,消耗的资源和删除的数据量是同步增长的
删除的数据量 | 数据块的读取次数 |
object_id<10000 | 10940 + 52 |
object_id<20000 | 15016 +47 |
object_id<30000 | 15508+48 |
truncate的消耗
t_part 表
t_part_2表
SQL> create table t_part_2 (object_id int ,object_name varchar2(2000) ) 2 partition by range(object_id) 3 ( 4 partition p1 values less than (20000), 5 partition pm values less than (maxvalue) 6 ); Table created SQL> insert into t_part_2 select object_id ,object_name from dba_objects; 35239 rows inserted SQL> commit; Commit complete SQL> alter session set tracefile_identifier='xgj_truncate'; Session altered SQL> alter session set sql_trace=true; Session altered SQL> alter table t_part_2 truncate partition p1; Table truncated SQL> alter session set sql_trace=false; Session altered SQL>
我们可以看到,truncate操作时,数据块读的次数10次左右,所以DDL操作并没有随着操作的数据增加而明显的增加,它只取决于需要操作的字典表的数据量。
DELETE与释放空间
DELETE操作并不能释放出空间,也就是说,删除了哪个表的数据,疼出来的空间还是只能是那个表使用,并不能让给其他的对象使用,因为delete操作并不能使高水位线下降。 如果需要通过清除一个表中的部分数据来腾出空间给其他的对象,这个操作时不行的。
而truncate或者drop操作,就可以释放出空间给其他的对象使用。
下面举例说明:
我们还是用我们上面创建的t_part分区表为例
SQL> select * from user_extents a where a.segment_name='T_PART'; SEGMENT_NAME PARTITION_NAME SEGMENT_TYPE TABLESPACE_NAME EXTENT_ID BYTES BLOCKS -------------------------------------------------------------------------------- ------------------------------ ------------------ ------------------------------ ---------- ---------- ---------- T_PART P1 TABLE PARTITION TAB_ZMC 0 8388608 1024 T_PART P2 TABLE PARTITION TAB_ZMC 0 8388608 1024 T_PART P3 TABLE PARTITION TAB_ZMC 0 8388608 1024 T_PART PM TABLE PARTITION TAB_ZMC 0 8388608 1024 T_PART P5 TABLE PARTITION TAB_ZMC 0 8388608 1024 SQL> select count(1) from user_extents a where a.segment_name='T_PART'; COUNT(1) ---------- 5 SQL>
最初的时候,T_PART表的数据占据了5个EXTENTS(区)
我们通过delete删除部分数据
SQL> select count(1) from t_part; COUNT(1) ---------- 35245 SQL> delete from t_part a where a.object_id<30000; 14403 rows deleted SQL> commit; Commit complete SQL> select count(1) from user_extents a where a.segment_name='T_PART'; COUNT(1) ---------- 5 SQL>
我们删除了14403 条数据后,发现T_PART仍然占据着5个EXTENDS,也就是说虽然表中的数据删除了,但是空间只能是T_PART自己使用,并没有将控件返回给数据库。
我们使用truncate 来试下 ,用T表的数据吧,比较多
SQL> select count(1) from user_extents a where a.segment_name='T'; COUNT(1) ---------- 17 SQL> select count(1) from t; COUNT(1) ---------- 35245 SQL> truncate table t; Table truncated SQL> select count(1) from user_extents a where a.segment_name='T'; COUNT(1) ---------- 1 SQL> select count(1) from 2 t; COUNT(1) ---------- 0 SQL>
可以看出 通过truncate 之后 T表占用的数据区从17个降为1个,大部分空间都返回给了数据库,这部分空间是可以分配给其他数据使用的。
还有一种方式,就是如果这个表上有许多碎片,可以通过整理表的空间来达到释放空间的目的,命令如下
alter table t move ;
我们用T表来做下试验,还原T表的数据
SQL> insert into t select object_id ,object_name from dba_objects; 35245 rows inserted SQL> commit; Commit complete SQL> select count(1) from user_extents a where a.segment_name='T'; COUNT(1) ---------- 17 SQL> delete from t ; --->通过delete方式删除数据 35245 rows deleted SQL> commit; Commit complete SQL> select count(1) from user_extents a where a.segment_name='T'; COUNT(1) ---------- 17 SQL> alter table t move;--->重新整理数据 Table altered SQL> select count(1) from user_extents a where a.segment_name='T'; COUNT(1) ---------- 1 SQL>
我们看到,使用alter table move 操作,表的空间基本上已经全部释放出来了,只剩下一个EXTENT(因为当前已经是一个空表了)。
通过alter table move 方式让表中的数据重新存储,这样可以释放出来一些空间,但是没这个操作基本上等同于将表中的数据重新分布一次,如果表中的数据比较庞大,这将是一个非常耗时的操作。
实际上delete的操作只适用删除非常少量的数据,并且需要在索引存在的情况下适用。
案例说明:
SQL> select count(1) from t; COUNT(1) ---------- 0 SQL> insert into t select object_id ,object_name from dba_objects; 35245 rows inserted SQL> commit; Commit complete SQL> select a.index_name ,a.index_type ,a.table_name from user_indexes a where a.table_name = 'T'; --->查看表上的索引 INDEX_NAME INDEX_TYPE TABLE_NAME ------------------------------ --------------------------- ------------------------------ IND_T NORMAL T SQL> drop index ind_t ;-->先删掉索引 Index dropped SQL> select a.index_name ,a.index_type ,a.table_name from user_indexes a where a.table_name = 'T'; INDEX_NAME INDEX_TYPE TABLE_NAME ------------------------------ --------------------------- ------------------------------ SQL>
看看执行计划吧
oracle@entel1:[/oracle]$sqlplus /nolog SQL*Plus: Release 11.2.0.4.0 Production on Wed Jan 18 20:40:16 2017 Copyright (c) 1982, 2013, Oracle. All rights reserved. SQL> conn xxx/xgj Connected. SQL> set autotrace trace exp stat; SQL> delete from t where object_id=888; 1 row deleted. Execution Plan ---------------------------------------------------------- Plan hash value: 3335594643 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | DELETE STATEMENT | | 1 | 5 | 43 (0)| 00:00:01 | | 1 | DELETE | T | | | | | |* 2 | TABLE ACCESS FULL| T | 1 | 5 | 43 (0)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("OBJECT_ID"=888) Statistics ---------------------------------------------------------- 4 recursive calls 4 db block gets 209 consistent gets 0 physical reads 0 redo size 830 bytes sent via SQL*Net to client 790 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 2 sorts (memory) 0 sorts (disk) 1 rows processed SQL>
在没有索引的情况下,即使删除很少的数据,Oracle也会将全表扫描一遍(分区表除外)。
此时,Oracle读取了209 个数据块,如果我们肩上索引,效果会好很多。
SQL> create index ind_t on t(object_id); Index created. SQL> delete from t where object_id=888; 1 row deleted. Execution Plan ---------------------------------------------------------- Plan hash value: 3974964266 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | DELETE STATEMENT | | 1 | 5 | 1 (0)| 00:00:01 | | 1 | DELETE | T | | | | | |* 2 | INDEX RANGE SCAN| IND_T | 1 | 5 | 1 (0)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECT_ID"=888) Statistics ---------------------------------------------------------- 1 recursive calls 5 db block gets 2 consistent gets 4 physical reads 0 redo size 834 bytes sent via SQL*Net to client 790 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 2 sorts (memory) 0 sorts (disk) 1 rows processed SQL>
我们可以看到仅仅读取了2个数据块,2个一致性读,非常高效。