[20170604]12c Top Frequency histogram 2

[20170604]12c Top Frequency histogram supplement.txt

1. Environment:
SCOTT@test01p> @ ver1
PORT_STRING                    VERSION        BANNER                                                                               CON_ID
------------------------------ -------------- -------------------------------------------------------------------------------- ----------
IBMPC/WIN_NT64-9.1.0           12.1.0.1.0     Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0

--//Several conditions must be met before a Top Frequency histogram is created:
--//Link: raajeshwaran.blogspot.co.id/2016/06/top-frequency-histogram-in-12c.html

The database creates a Top Frequency histogram when all of the following criteria are met:

NDV is greater than n, where n is the requested number of buckets (default 254).
The percentage of rows occupied by the top n most frequent values is greater than or equal to the threshold p, where p = (1-(1/n))*100.
The estimate_percent parameter of the dbms_stats gathering procedure is set to AUTO_SAMPLE_SIZE (the default).
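
--//A quick way to see what the threshold p means for a given bucket count n. This is only an arithmetic sketch run against dual; the bucket counts listed are illustrative choices, not values taken from the quoted article.

```sql
-- Threshold p = (1 - 1/n) * 100 for a few illustrative bucket counts n
select n,
       round((1 - 1/n) * 100, 3) as p_pct
  from (select 2 as n from dual union all
        select 10     from dual union all
        select 31     from dual union all
        select 254    from dual)
 order by n;
```

--//For n=10 this gives p=90, which is the 0.9 boundary used in the 10-bucket test further down.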

SCOTT@test01p> create table t as select * from dba_objects;
Table created.

select column_name,num_distinct,density,histogram,SAMPLE_SIZE
  from user_tab_col_statistics
  where table_name ='T'
  and column_name ='OWNER';

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32     .03125 NONE                  91695

--//In 12c, CTAS gathers statistics but does not create a histogram. density = 1/32 = .03125.
SCOTT@test01p> select count(*) from t;
  COUNT(*)
----------
     91695

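--//A quick ad hoc SQL statement:
--//n1 = rows per owner, n2 = total rows, n3 = running total of n1 (largest first),
--//x1 = n3/n2 = share of rows covered by the top rownum values, x2 = 1-1/rownum = threshold for that many buckets.
--//Owners with the same n1 show the same n3, because the default window frame (RANGE) includes all peer rows.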
with a as (select distinct owner,count(*) over(partition by owner) n1 ,count(*) over () n2 from t order by 2 desc ),
b as (select owner,n1,n2,sum(n1) over (order by n1 desc) n3  from a order by n1 desc)
select rownum,owner,n1,n2,n3,round(n3/n2,5) x1,round(1-1/rownum,5) x2 from b;

ROWNUM OWNER                N1         N2         N3         X1         X2
------ ----------------- ----- ---------- ---------- ---------- ----------
     1 SYS               41942      91695      41942     .45741          0
     2 PUBLIC            37142      91695      79084     .86247         .5
     3 APEX_040200        3405      91695      82489      .8996     .66667
     4 ORDSYS             3157      91695      85646     .93403        .75
     5 MDSYS              1819      91695      87465     .95387         .8
     6 XDB                 985      91695      88450     .96461     .83333
     7 SYSTEM              641      91695      89091      .9716     .85714
     8 CTXSYS              405      91695      89496     .97602       .875
     9 WMSYS               387      91695      89883     .98024     .88889
    10 DVSYS               352      91695      90235     .98408         .9
    11 SH                  309      91695      90544     .98745     .90909
    12 ORDDATA             292      91695      90836     .99063     .91667
    13 LBACSYS             209      91695      91045     .99291     .92308
    14 OE                  142      91695      91187     .99446     .92857
    15 SCOTT                96      91695      91283     .99551     .93333
    16 GSMADMIN_INTERNAL    77      91695      91360     .99635      .9375
    17 IX                   58      91695      91418     .99698     .94118
    18 DBSNMP               55      91695      91473     .99758     .94444
    19 PM                   44      91695      91517     .99806     .94737
    20 HR                   35      91695      91552     .99844        .95
    21 OLAPSYS              25      91695      91577     .99871     .95238
    22 OJVMSYS              23      91695      91600     .99896     .95455
    23 DVF                  19      91695      91619     .99917     .95652
    24 FLOWS_FILES          13      91695      91632     .99931     .95833
    25 AUDSYS               12      91695      91644     .99944        .96
    26 ORDPLUGINS           10      91695      91664     .99966     .96154
    27 OUTLN                10      91695      91664     .99966     .96296
    28 BI                    8      91695      91688     .99992     .96429
    29 ORACLE_OCM            8      91695      91688     .99992     .96552
    30 SI_INFORMTN_SCHEM     8      91695      91688     .99992     .96667
    31 APPQOSSYS             5      91695      91693     .99998     .96774
    32 TEST                  2      91695      91695          1     .96875
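
--//The same figures can also be produced with GROUP BY instead of DISTINCT plus analytic counts. This is a sketch, not the query that produced the output above: the column rn, the tie-break on owner and the meets_share flag are assumptions added for illustration. Because owner makes the ordering unique, owners with equal counts get distinct running totals here, so n3 will differ from the listing above on tied rows.

```sql
-- Cumulative share of the top-N owners compared with the 1 - 1/N threshold
select rn,
       owner,
       n1,
       n2,
       n3,
       round(n3 / n2, 5)    as x1,   -- share of rows covered by the top rn values
       round(1 - 1 / rn, 5) as x2,   -- threshold for rn buckets
       -- checks only the row-share condition, not NDV > n or the sampling mode
       case when n3 / n2 >= 1 - 1 / rn then 'Y' else 'N' end as meets_share
  from (select owner,
               count(*)                                           as n1,
               sum(count(*)) over ()                              as n2,
               sum(count(*)) over (order by count(*) desc, owner) as n3,
               row_number()  over (order by count(*) desc, owner) as rn
          from t
         group by owner)
 order by rn;
```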

D:\temp>cat a1.sql
exec  dbms_stats.gather_table_stats(ownname=>user,tabname=>'T',method_opt=>'for columns owner size &1');
select column_name,num_distinct,density,histogram,SAMPLE_SIZE from user_tab_col_statistics where table_name ='T' and column_name ='OWNER';

SCOTT@test01p> @ a1.sql 2
PL/SQL procedure successfully completed.

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32     .03125 HYBRID                 5500

SCOTT@test01p> @ a1.sql 3
PL/SQL procedure successfully completed.

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 5.4529E-06 TOP-FREQUENCY         91695

SCOTT@test01p> @ a1.sql 4
PL/SQL procedure successfully completed.

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 5.4529E-06 TOP-FREQUENCY         91695

SCOTT@test01p> @ a1.sql 31
PL/SQL procedure successfully completed.
COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 5.4529E-06 TOP-FREQUENCY         91695

SCOTT@test01p> @ a1.sql 32
PL/SQL procedure successfully completed.

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 5.4529E-06 FREQUENCY             91695

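--//Except for bucket counts 2 and 32, which produce HYBRID and FREQUENCY histograms respectively, every run creates a TOP-FREQUENCY histogram.
--//(With 32 buckets, n equals NDV, so the condition NDV > n no longer holds.)
--//Take 10 buckets as an example. Removing x rows that belong to the top 10 values lowers both the covered rows and the total rows by x.
--//Solving (90235-x)/(91695-x)=0.9 gives x=77095, so 77095 rows must be deleted to land exactly on the threshold.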
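
--//The equation can be rearranged to x = (c - p*t)/(1 - p), where c is the number of rows covered by the top 10 values, t is the total row count and p is the threshold. The letters c, t and p are only labels introduced here for the sketch; the numbers come from the listing above.

```sql
-- Solve (c - x) / (t - x) = p for x, with c = 90235, t = 91695, p = 0.9
select (90235 - 0.9 * 91695) / (1 - 0.9) as rows_to_remove
  from dual;
```

--//This evaluates to 77095, which is what the two deletes below remove in total (41000 + 36095).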

--//delete t where owner='SYS' and rownum<=41000;
--//delete t where owner='PUBLIC' and rownum<=36095;

SCOTT@test01p> delete t where owner='SYS' and rownum<=41000;
41000 rows deleted.

SCOTT@test01p> delete t where owner='PUBLIC' and rownum<=36095;
36095 rows deleted.

SCOTT@test01p> commit ;
Commit complete.

with a as (select distinct owner,count(*) over(partition by owner) n1 ,count(*) over () n2 from t order by 2 desc ),
b as (select owner,n1,n2,sum(n1) over (order by n1 desc) n3  from a order by n1 desc)
select rownum,owner,n1,n2,n3,round(n3/n2,5) x1,round(1-1/rownum,5) x2 from b where rownum<=11;

ROWNUM OWNER         N1         N2         N3         X1         X2
------ ----------- ---- ---------- ---------- ---------- ----------
     1 APEX_040200 3405      14600       3405     .23322          0
     2 ORDSYS      3157      14600       6562     .44945         .5
     3 MDSYS       1819      14600       8381     .57404     .66667
     4 PUBLIC      1047      14600       9428     .64575        .75
     5 XDB          985      14600      10413     .71322         .8
     6 SYS          942      14600      11355     .77774     .83333
     7 SYSTEM       641      14600      11996     .82164     .85714
     8 CTXSYS       405      14600      12401     .84938       .875
     9 WMSYS        387      14600      12788     .87589     .88889
    10 DVSYS        352      14600      13140         .9         .9
    11 SH           309      14600      13449     .92116     .90909
11 rows selected.
--//bucket=10: the top 10 values account for exactly 90% of the rows.

SCOTT@test01p> @ a1 10
PL/SQL procedure successfully completed.
COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 .000034247 TOP-FREQUENCY         14600
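
--//To see which values actually ended up in the histogram, the buckets can be read from user_tab_histograms. This is a sketch that was not part of the original test, so no output is shown; it assumes endpoint_number holds the cumulative row count (as it does for frequency-type histograms) and that endpoint_actual_value is populated for this column, which may not hold on every version.

```sql
-- One row per histogram bucket for T.OWNER; bucket_rows is the per-value count
select endpoint_number,
       endpoint_number
         - lag(endpoint_number, 1, 0) over (order by endpoint_number) as bucket_rows,
       endpoint_actual_value
  from user_tab_histograms
 where table_name  = 'T'
   and column_name = 'OWNER'
 order by endpoint_number;
```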

--//Delete one more row.
SCOTT@test01p> delete t where owner='SYS' and rownum<=1;
1 row deleted.

SCOTT@test01p> commit ;
Commit complete.

--//Rerunning the same cumulative-share query:
ROWNUM OWNER         N1         N2         N3         X1         X2
------ ----------- ---- ---------- ---------- ---------- ----------
     1 APEX_040200 3405      14599       3405     .23324          0
     2 ORDSYS      3157      14599       6562     .44948         .5
     3 MDSYS       1819      14599       8381     .57408     .66667
     4 PUBLIC      1047      14599       9428      .6458        .75
     5 XDB          985      14599      10413     .71327         .8
     6 SYS          941      14599      11354     .77772     .83333
     7 SYSTEM       641      14599      11995     .82163     .85714
     8 CTXSYS       405      14599      12400     .84937       .875
     9 WMSYS        387      14599      12787     .87588     .88889
    10 DVSYS        352      14599      13139     .89999         .9
    11 SH           309      14599      13448     .92116     .90909
11 rows selected.
--//The top 10 values now account for .89999, just below the 0.9 threshold.

SCOTT@test01p> @ a1 10
PL/SQL procedure successfully completed.

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32    .018378 HYBRID                14599

--//The histogram created is no longer TOP-FREQUENCY but HYBRID.
--//Insert one SYS row again to get back to TOP-FREQUENCY.
SCOTT@test01p> insert into t  select * from dba_objects where owner='SYS' and rownum=1;
1 row created.

SCOTT@test01p> commit ;
Commit complete.

SCOTT@test01p> @ a1 10
PL/SQL procedure successfully completed.

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 .000034247 TOP-FREQUENCY         14600

--//Everything above is from yesterday's test.
--//Earlier I mentioned that it may also fail when the sample size is not auto_sample_size. Let's test that.

2. Sample size: Estimate_Percent => NULL.
SCOTT@test01p> exec  dbms_stats.gather_table_stats(ownname=>user,tabname=>'T',method_opt=>'for columns owner size 10',Estimate_Percent  => NULL);
PL/SQL procedure successfully completed.

SCOTT@test01p> select column_name,num_distinct,density,histogram,SAMPLE_SIZE from user_tab_col_statistics where table_name ='T' and column_name ='OWNER';
COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32    .018379 HYBRID                14600

--//With a full sample (Estimate_Percent => NULL), a HYBRID histogram is created instead.

3. Sample size: Estimate_Percent => SYS.DBMS_STATS.AUTO_SAMPLE_SIZE, Block_sample => TRUE.
SCOTT@test01p> exec  dbms_stats.gather_table_stats(ownname=>user,tabname=>'T',method_opt=>'for columns owner size 10',Estimate_Percent  => SYS.DBMS_STATS.AUTO_SAMPLE_SIZE,Block_sample => TRUE);
PL/SQL procedure successfully completed.

SCOTT@test01p> select column_name,num_distinct,density,histogram,SAMPLE_SIZE from user_tab_col_statistics where table_name ='T' and column_name ='OWNER';
COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 .000034247 TOP-FREQUENCY         14600

--//With Estimate_Percent => SYS.DBMS_STATS.AUTO_SAMPLE_SIZE the result is still TOP-FREQUENCY, even when block sampling is requested.

4. Try Estimate_Percent => 100 and 90.
SCOTT@test01p> exec  dbms_stats.gather_table_stats(ownname=>user,tabname=>'T',method_opt=>'for columns owner size 10',Estimate_Percent  => 100);
PL/SQL procedure successfully completed.

SCOTT@test01p> select column_name,num_distinct,density,histogram,SAMPLE_SIZE from user_tab_col_statistics where table_name ='T' and column_name ='OWNER';
COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 .062908991 HEIGHT BALANCED       14600

SCOTT@test01p> exec  dbms_stats.gather_table_stats(ownname=>user,tabname=>'T',method_opt=>'for columns owner size 10',Estimate_Percent  => 90);
PL/SQL procedure successfully completed.

SCOTT@test01p> select column_name,num_distinct,density,histogram,SAMPLE_SIZE from user_tab_col_statistics where table_name ='T' and column_name ='OWNER';
COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32      .0625 HEIGHT BALANCED       13108

--//So it appears that only Estimate_Percent => SYS.DBMS_STATS.AUTO_SAMPLE_SIZE produces a TOP-FREQUENCY histogram.
SCOTT@test01p> exec  dbms_stats.gather_table_stats(ownname=>user,tabname=>'T',method_opt=>'for columns owner size 10',Estimate_Percent  => SYS.DBMS_STATS.AUTO_SAMPLE_SIZE);
PL/SQL procedure successfully completed.

SCOTT@test01p> select column_name,num_distinct,density,histogram,SAMPLE_SIZE from user_tab_col_statistics where table_name ='T' and column_name ='OWNER';
COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER                          32 .000034247 TOP-FREQUENCY         14600
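
--//a1.sql does not pass estimate_percent at all, so it relies on whatever default is in effect. A hedged check, assuming the stored preference is what gather_table_stats picks up when the parameter is omitted: DBMS_STATS.GET_PREFS shows the current setting for the table.

```sql
-- Effective ESTIMATE_PERCENT preference for table T in the current schema
select dbms_stats.get_prefs('ESTIMATE_PERCENT', user, 'T') as estimate_percent
  from dual;
```

--//If this returns anything other than DBMS_STATS.AUTO_SAMPLE_SIZE, the results in sections 2 and 4 suggest that a TOP-FREQUENCY histogram should not be expected.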

--//I will look into HYBRID histograms when I get the chance.
