开发者社区> 杰克.陈> 正文

MS SQL统计信息浅析下篇

简介: 原文:MS SQL统计信息浅析下篇    MS SQL统计信息浅析上篇对SQL SERVER 数据库统计信息做了一个整体的介绍,随着我对数据库统计信息的不断认识、理解,于是有了MS SQL统计信息浅析下篇。
+关注继续查看
原文:MS SQL统计信息浅析下篇

   MS SQL统计信息浅析上篇对SQL SERVER 数据库统计信息做了一个整体的介绍,随着我对数据库统计信息的不断认识、理解,于是有了MS SQL统计信息浅析下篇。 下面是我对SQL Server统计信息的一些探讨或认识,如有不对的地方,希望大家能够指正。


触发统计信息更新条件疑问

    关于这个触发统计信息更新的条件。因为我在很多资料上看到过,例如Microsoft  SQL Server 企业级平台管理实践。 我自己上篇也是这样解释的。

   1:普通表上,触发数据库自动更新统计信息的条件

           1、 在一个空表中有数据的改动。

           2、 当统计信息创建时,表的行数只有500或以下,且后来统计对象中的引导列(统计信息的第一个字段数据)的更改次数大于500.

           3、 当表的统计信息收集时,超过了500行,且统计对象的引导列(统计信息的第一个字段数据)后来更改次数超过500+表总行数的20%时

    2:临时表

          If the statistics object is defined on a temporary table, it is out of date as discussed above, except that there is an additional threshold for recomputation at 6 rows, with a test otherwise identical to test 2 in the previous list.。

    3: 表变量

            表变量没有统计信息

官方资料http://msdn.microsoft.com/en-us/library/dd535534%28v=sql.100%29.aspx 也是这样解释的。

A statistics object is considered out of date in the following cases:

  If the statistics is defined on a regular table, it is out of date if:

  1.        The table size has gone from 0 to >0 rows (test 1).
  2.        The number of rows in the table when the statistics were gathered was 500 or less, and the colmodctr of the leading column of the statistics object has changed by more than 500 since then (test 2).
  3.        The table had more than 500 rows when the statistics were gathered, and the colmodctr of the leading column of the statistics object has changed by more than 500 + 20% of the number of rows in the table when the statistics were gathered (test 3).

·         For filtered statistics, the colmodctr is first adjusted by the selectivity of the filter before these conditions are tested. For example, for filtered statistics with predicate selecting 50% of the rows, the colmodctr is multiplied by 0.5.

·         One limitation of the automatic update logic is that it tracks changes to columns in the statistics, but not changes to columns in the predicate. If there are many changes to the columns used in predicates of filtered statistics, consider using manual updates to keep up with the changes.

·         If the statistics object is defined on a temporary table, it is out of date as discussed above, except that there is an additional threshold for recomputation at 6 rows, with a test otherwise identical to test 2 in the previous list.

Table variables do not have statistics at all.

但是又一次我的实验显示不是那么一回事,有兴趣的可以按照下面SQL语句试试,

CREATE TABLE TEST1
(
 ID      INT           ,
 NAME    VARCHAR(8)    ,
 CONSTRAINT PK_TEST1 PRIMARY KEY(ID)
)
GO
 
 
SELECT    name AS index_name,
  STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated
FROM sys.indexes
WHERE OBJECT_ID = OBJECT_ID('dbo.TEST1')
GO
 
index_name                              StatsUpdated
------------------------------------ -----------------------
PK_TEST1                                      NULL
 
 
INSERT INTO TEST1
SELECT 1001, 'Kerry' ;

此时查看统计信息的更新日期,发现空表插入一条数据并没有触发数据库更新其统计信息。

clipboard

INSERT INTO TEST1
 
SELECT '1002', 'Jimmy' ;
 

即使我再插入一条或几条数据,统计信息依然不会更新,DBCC SHOW_STATISTICS(TEST1, PK_TEST1) 查看依然如此

clipboard[1]

这明显跟第一条规则:在一个空表中有数据的改动会触发统计信息明显不符。 Why? 难道官方文档有问题? 那么我就去验证第二条规则

DECLARE @Index INT;
SET @Index =2;
WHILE @Index <= 510
BEGIN
INSERT INTO TEST1 VALUES(@Index, 'k'+LTRIM(STR(@Index)));
    SET @Index = @Index + 1;
    END
 
SELECT    name AS index_name,
  STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated
FROM sys.indexes
WHERE OBJECT_ID = OBJECT_ID('dbo.TEST1')
 
 
 
DBCC SHOW_STATISTICS(TEST1, PK_TEST1)

查看居然发现统计信息还是没有更新。规则2似乎也没有生效,哇靠,怎么会这样呢? 估计有人会怀疑是不是我没有开启数据库”自动更新统计信息“和”自动创建统计信息“选项,其实当时我也这样怀疑过,甚至怀疑数据库版本问题,如果你按照这个做实验估计你也很纳闷,为什么呢? 其实出现这种情况,是因为还少了触发条件,大家可以先将整个表DROP掉,然后从简单开始,插入一条数据后,

CREATE TABLE TEST1
(
 ID     INT     ,
 NAME    VARCHAR(8)    ,
 CONSTRAINT PK_TEST1 PRIMARY KEY(ID)
)
GO
 
INSERT INTO TEST1
SELECT 1001, 'Kerry' ;

执行SQL语句后SELECT * FROM TEST1,此时查看统计信息是否更新,发现没有,但是如果执行SELECT * FROM TEST1 WHERE ID=1语句后,你会发现统计信息居然更新了。官方文档果然诚不欺我啊,呵呵。

clipboard[2]

其实也就是说统计信息的更新不光需要满足上述条件,还需特定的SQL触发, 例如上面两条语句

SELECT * FROM TEST1

SELECT * FROM TEST1 WHERE ID=1001

第一条SQL不会触发,而第二条SQL会触发。  它们之间的区别是就是查询条件,第二条语句查询条件ID包含在统计信息PK_TEST1里面。那么大家猜测一下

SELECT * FROM TEST1 WHERE Name='Kerry'

这条语句会不会触发统计信息的更新呢? 我的实验是不会。于是我设计了另外一个小实验,来验证我的另外一个想法, 如下所示:

IF EXISTS(SELECT 1 FROM SYSOBJECTS WHERE NAME = 'TEST1' AND XTYPE = 'U')
BEGIN
 DROP TABLE TEST1;
END
GO
CREATE TABLE TEST1
(
 ID     INT     ,
 NAME    VARCHAR(8)    ,
 Sex     VARCHAR(2)    ,
 CONSTRAINT PK_TEST1 PRIMARY KEY(ID,NAME)
)
GO
 
INSERT INTO TEST1
SELECT 1001, 'Kerry','男' ;
SELECT * FROM TEST1 WHERE Name='Kerry'

结果告诉我,上面这个SQL语句依然不能触发更新统计信息,那么可以总结归纳为,触发统计信息更新的SQL语句里面必须有统计信息的第一个字段作为条件才能成功触发统计信息的更新。

统计信息存储位置探讨

    关于统计信息存储在哪些系统表,我们先从简单的入手。一个简单的表Employee为示例

clipboard[3]

    使用SP_HELPSTATS 可以查看表拥有哪些统计信息,那么如果看过我上篇   ,就可以通过分析存储过程SP_HELPSTATS了解到其实这里查看的统计信息其实是从sys.stats取值,当然还需要关联sys.stats_columns 与sys.columns

clipboard[4]

    但是上面的统计信息位于哪里呢?在SQL SERVER 2000 里面,在sysindexes里面有一列statblob,统计信息应该就放在这里面.官方文档里面指明statblob字段就是统计信息的二进制大对象。

    select name, statblob from sysindexes where id= object_id('Login_Log')

clipboard[5]

但是从SQL SERVER 2005开始,这一列返回null值,statblob本身存储在一张内部目录表中。

那么如何查看统计信息呢? 其实可以用下面两种方式:

    方法1:DBCC SHOW_STATISTICS('Employee', 'PK_Employee_ID_Name') WITH STATS_STREAM

clipboard[6]

0x0100000002000000000000000000000084A41ED3000000001103000000000000B902000000000000380300003800000004000A00000000000000000000000000E7030480E7000000280000000000000024D0000000000000070000007E3A170111A300000F270000000000000F270000000000000000803F76BCD13876BCD1380000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003000000030000000200000014000000873ADE41003C1C460000000000008040873ABE4100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000190000000000000000000000000000005D0000000000000045010000000000004D0100000000000018000000000000002F000000000000004600000000000000100014000000803F000000000000803F10270000040000100014000000803F00301C460000803F1D4E0000040000100014000000803F000000000000803F1E4E00000400000300000078DC020011A30000000000008087C340778200000000000003000000000000C08E371A3F00000000000000000000000000000000000000000000000000000000000000000000000000000000A003CA000FA30000000000008087C340778200000000000003000000000000C08E371A3F0000000000000000000000000000000000000000000000000000000000000000000000000000000049805F010DA30000000000008087C340778200000000000003000000000000C08E371A3F000000000000000000000000000000000000000000000000000000000000000000000000000000000F27000000000000

方法2: 必须通过DAC方式登录数据库,才能查看到。

SELECT name, imageval
FROM sys.stats AS s
 INNER JOIN sys.sysobjvalues AS o
 ON s.object_id = o.objid
 AND s.stats_id = o.subobjid
WHERE
  s.object_id = OBJECT_ID('dbo.Employee');

clipboard[7]

0x070000007E3A170111A300000F270000000000000F270000000000000000803F76BCD13876BCD1380000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003000000030000000200000014000000873ADE41003C1C460000000000008040873ABE4100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000190000000000000000000000000000005D0000000000000045010000000000004D0100000000000018000000000000002F000000000000004600000000000000100014000000803F000000000000803F10270000040000100014000000803F00301C460000803F1D4E0000040000100014000000803F000000000000803F1E4E00000400000300000078DC020011A30000000000008087C340778200000000000003000000000000C08E371A3F00000000000000000000000000000000000000000000000000000000000000000000000000000000A003CA000FA30000000000008087C340778200000000000003000000000000C08E371A3F0000000000000000000000000000000000000000000000000000000000000000000000000000000049805F010DA30000000000008087C340778200000000000003000000000000C08E371A3F000000000000000000000000000000000000000000000000000000000000000000000000000000000F27000000000000

将上面两段二进制码分两行放到UE或文本编辑器,右对齐后发现下面下面的十六进制码是上面十六进制码的一部分。部分截图如下所示。

clipboard[8]

至于两种的区别,暂时还没有搞懂。

有个叫joe-chang的大神http://sqlblog.com/blogs/joe_chang/archive/2012/05/05/decoding-stats-stream.aspx  这篇博客介绍了一些如何解码统计信息(文章看起来确实很深奥、枯燥)。我用其提供的p_Stat2008c存储过程的执行结果与DBCC SHOW_STATISTICS的结果做了对比分析,如下所示,从里面可以找到很多对应信息,图片里面没法弄很多线条,大致整了几条。

DBCC SHOW_STATISTICS(Employee,PK_Employee_ID_Name);

exec [dbo].[p_Stat2008c] @Table='Employee', @Index ='PK_Employee_ID_Name'

clipboard[9]

也就是说统计信息基本上是可以解析出来的。只是相当复杂。

有效维护更新数据库统计信息

      如果要有效的维护、更新数据库的统计信息,下面有一些建议,仅供参考

1:一般建议开启“自动创建统计信息”和“自动更新统计信息”选项(默认开启)。让数据库自动维护、更新统计信息。在比较繁忙的OLTP系统中建议开启“自动异步更新统计信息”选项, 否则应该关闭这个选项,尤其是OLAP系统。

    关于自动异步更新统计信息开启的建议:Use asynchronous statistics update if synchronous update causes undesired delay

If you have a large database and an OLTP workload, and if you enable AUTO_UPDATE_STATISTICS, some transactions that normally run in a fraction of a second may very infrequently take several seconds or more because they cause statistics to be updated. If you want to avoid the possibility of this noticeable delay, enable AUTO_UPDATE_STATISTICS_ASYNC. For workloads with long-running queries, getting the best plan is more important than an infrequent delay in compilation. In such cases, use synchronous rather than asynchronous auto update statistics.

2: 如果需要,可以选项性的使用FULLSCAN更新统计信息。

更新统计信息是一件消耗资源的事情,尤其是对那些大表。很多时候SQL SERVER引擎会根据抽样更新统计信息,例如80%进行抽样更新统计信息。此时统计信息的准确性就跟采样的比例有很大的关系。尤其对某些特殊的表(数据分布严重不均)影响非常大。所以有时候可以选择性的使用FULLSCAN来更新统计信息。SQL SERVER一般会在统计信息的准确度度和资源合理消耗之间做一个平衡。其实有很多特殊例子,最明显就是数据分布非常不均时,此时统计信息的准确性对执行计划的影响就非常大。

 

3:Consider more frequent statistics gathering for ascending keys

考虑更频繁的收集ascending keys的统计数据。升序键列,如IDENTITY列或代表真实世界的时间戳datetime列,频繁的INSERT可能会导致表的统计信息不正确,因为新插入的值不在直方图之中。所以需要频繁的收集ascending keys的统计数据。

4:考虑少用表值函数和表变量

      对于表变量和表值函数,它们没有统计信息,所以数据库优化器需去猜测它们的基数,这样得到的执行计划就非常不可靠。

5:在脚本中考虑使用字符串代替局部变量,在存储过程考虑使用参数代替局部变量

如果你在查询谓词中使用局部变量而不是参数或字符串,那么优化器会诉诸于减少质量的估计或谓词的选择性的一个猜想选择性(If you use a local variable in a query predicate instead of a parameter or literal, the optimizer resorts to a reduced-quality estimate, or a guess for selectivity of the predicate)。在查询中使用参数或字符串而不是局部变量,优化器能选择一个更好的查询计划。请看下面例子。



declare @StartOrderDate datetime
set @StartOrderDate = '20040731'
select * from Sales.SalesOrderHeader h, Sales.SalesOrderDetail d
WHERE h.SalesOrderID = d.SalesOrderId
AND h.OrderDate >= @StartOrderDate


  


SELECT * FROM Sales.SalesOrderHeader h, Sales.SalesOrderDetail d
WHERE h.SalesOrderID = d.SalesOrderId
AND h.OrderDate >= '20040731'



  


  


  


  


  

clipboard[36]

两者的对比如下所示。使用字符串得到的执行计划明显优于使用局部变量(注意,不同版本的数据库或可用内存不同,得到的执行计划可能有所差异,请以各自实验为准)

clipboard[37]

consider (1) rewriting the query to use literals instead of variables, (2) using sp_executesql with parameters that replace your use of local variables, or (3) using a stored procedure with parameters that replace your use of local variables. Dynamic SQL via EXEC may also be useful for eliminating local variables, but it typically results in higher compilation overhead and more complex programming. A new enhancement in SQL Server 2008 is that the OPTION(RECOMPILE) hint

USE [YourSQLDba]
GO
/****** Object:  StoredProcedure [yMaint].[UpdateStats]    Script Date: 04/25/2014 14:26:10 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER proc [yMaint].[UpdateStats]
  @JobNo Int
, @SpreadUpdStatRun Int
as
Begin
  declare @seqStatNow  Int
  declare @cmptlevel Int
  declare @dbName sysname
  declare @sql nvarchar(max)
  declare @lockResult int
  Declare @seq Int                 -- row sequence for row by row processing
  Declare @scn sysname             -- schema name
  Declare @tb sysname              -- table name
  declare @sampling Int           -- page count to get an idea if the size of the table
  Declare @idx sysname             -- index name
  Declare @object_id int           -- a proof that an object exists
 
  Begin Try
 
  Create table #TableNames
  (
    scn sysname
  , tb sysname
  , sampling nvarchar(3)
  , seq int
  , primary key clustered (seq)
  )
 
  Update Maint.JobSeqUpdStat 
    Set @seqStatNow = (seq + 1) % @SpreadUpdStatRun, seq = @seqStatNow  
 
  Set @DbName = ''
  While(1 = 1) -- simple do loop
  Begin
    Select top 1 -- first next in alpha sequence after the last one.
      @DbName = DbName
    , @cmptLevel = CmptLevel  
    From #Db
    Where DbName > @DbName 
    Order By DbName 
    
    -- exit if nothing after the last one processed
    If @@rowcount = 0 Break -- 
 
    -- If database is not updatable, skip update stats for this database
    If DATABASEPROPERTYEX(@DbName, 'Updateability') = N'READ_ONLY'  
      Continue
 
    -- If database is in emrgency, skip update stats for this database
    If DatabasepropertyEx(@DbName, 'Status') IN (N'Emergency')
      Continue
 
    -- makes query boilerplate with replacable parameter identified by
    -- labels between "<" et ">"
    -- this query select table for which to perform update statistics
    truncate table #TableNames
    set @sql =
    '
     set nocount on
     ;With
       TableSizeStats as
     (
     select 
       object_schema_name(Ps.object_id, db_id("<DbName>")) as scn --collate <srvCol>
     , object_name(Ps.object_id, db_id("<DbName>")) as tb --collate <srvCol>
     , Sum(Ps.Page_count) as Pg
    From
      sys.dm_db_index_physical_stats (db_id("<DbName>"), NULL, NULL, NULL, "LIMITED") Ps
    Group by 
      Ps.object_id  
    )
    Insert into #tableNames (scn, tb, seq, sampling)
    Select 
      scn
    , tb
    , row_number() over (order by scn, tb) as seq
    , Case 
        When pg > 200001 Then "10"
        When Pg between 50001 and 200000 Then "20"
        When Pg between 5001 and 50000 Then "30"
        else "100"
      End  
    From 
      TableSizeStats
    where (abs(checksum(tb)) % <SpreadUpdStatRun>) = <seqStatNow>
    '  
    set @sql = replace(@sql,'<srvCol>',convert(nvarchar(100), Serverproperty('collation'))) 
    Set @sql = replace(@sql,'<seqStatNow>', convert(nvarchar(20), @seqStatNow))
    Set @sql = replace(@sql,'<SpreadUpdStatRun>', convert(nvarchar(20), @SpreadUpdStatRun))
    set @sql = replace(@sql,'"','''') -- to avoid doubling of quotes in boilerplate
    set @sql = replace(@sql,'<DbName>',@DbName) 
 
    Exec yExecNLog.LogAndOrExec 
      @context = 'yMaint.UpdateStats'
    , @Info = 'Table selection for update statistics'  
    , @sql = @sql
    , @JobNo = @JobNo
    , @forDiagOnly  = 1
 
    Exec yMaint.LockMaintDb @jobNo, 'U', @DbName, @LockResult output
    If @lockResult < 0 -- messages are issued from yMaint.LockMaintDb
      Continue
 
    set @seq = 0
    While (1 = 1)
    begin
      Select top 1 @scn = scn, @tb = tb, @sampling = sampling, @seq = seq
      from #TableNames where seq > @seq order by seq
      if @@rowcount = 0 break
 
 
      Set @sql = 'Select @object_id = object_id("<DbName>.<scn>.<tb>") '
      set @sql = replace (@sql, '<DbName>', @DbName)
      set @sql = replace (@sql, '<scn>', @scn)
      set @sql = replace (@sql, '<tb>', @tb)
      set @sql = replace (@sql, '"', '''')
      Exec sp_executeSql @Sql, N'@object_id  int output', @object_id output
 
      If @object_id is not null
      Begin
        Set @sql = 'update statistics [<DbName>].[<scn>].[<tb>] WITH sample <sampling> PERCENT'
        set @sql = replace (@sql, '<DbName>', @DbName)
        set @sql = replace (@sql, '<scn>', @scn)
        set @sql = replace (@sql, '<tb>', @tb)
        set @sql = replace (@sql, '<sampling>', @sampling)
        set @sql = replace (@sql, '"', '''')
        Exec yExecNLog.LogAndOrExec 
          @context = 'yMaint.UpdateStats'
        , @Info = 'update statistics selected'  
        , @sql = @sql
        , @JobNo = @JobNo
      End
    end -- While
 
    Exec yMaint.UnLockMaintDb @jobNo, @DbName
    
  End -- While boucle banque par banque
 
  End try
  Begin catch
    Exec yExecNLog.LogAndOrExec @jobNo = @jobNo, @context = 'yMaint.UpdateStats Error', @err = '?'
  End Catch
End -- yMaint.UpdateStats

 

6:Consider filtered statistics for heterogeneous data
Sometimes rows with different schema are mapped to a single physical table, with multipurpose columns such as ntext1, ntext2, bigint1, bigint2 storing semantically unrelated data. Typically, there is also a special-purpose rowtype column that defines what is the semantic meaning of the data stored in each column. Such design is useful for storing arbitrary user-defined lists without changing the underlying database schema. As a result, the same column may end up storing telephone numbers and city names, and a histogram on such column may not be very useful, due to the limit of 200 steps. To avoid this, define separate statistics for each rowtype in this table.:

7:Consider filtered statistics for partitioned tables

Statistics are defined at the table level. Changes to partitions affect statistics only indirectly, through the column modification counters. Switching in a partition is treated as an insert of its rows into the table, triggering statistics update based on the 20% rule, as outlined above. Filtered statistics, through their predicates, can target only rows in certain partition or partitions. There is no requirement to align to the boundaries of the partitions when defining the statistics.
Often, customers partition a table by the Date column, keeping a partition for every month, and updating only the last partition; older months receive the majority of the complex, read-only queries. In this scenario, creating separate, full-scan statistics on the read-only region of the table results in more accurate cardinality estimates. In order to benefit from the separate statistics object, queries must be contained within the read-only region. Similarly, separate statistics objects can be created for different regions based on the different access patterns.

8:定期自动去更新统计信息,我推荐使用YourSQLDba,这个工具开源、方便扩展。最重要的是你能了解是如何处理的。

USE [YourSQLDba]
GO
/****** Object:  StoredProcedure [yMaint].[UpdateStats]    Script Date: 04/25/2014 14:26:10 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER proc [yMaint].[UpdateStats]
  @JobNo Int
, @SpreadUpdStatRun Int
as
Begin
  declare @seqStatNow  Int
  declare @cmptlevel Int
  declare @dbName sysname
  declare @sql nvarchar(max)
  declare @lockResult int
  Declare @seq Int                 -- row sequence for row by row processing
  Declare @scn sysname             -- schema name
  Declare @tb sysname              -- table name
  declare @sampling Int           -- page count to get an idea if the size of the table
  Declare @idx sysname             -- index name
  Declare @object_id int           -- a proof that an object exists
 
  Begin Try
 
  Create table #TableNames
  (
    scn sysname
  , tb sysname
  , sampling nvarchar(3)
  , seq int
  , primary key clustered (seq)
  )
 
  Update Maint.JobSeqUpdStat 
    Set @seqStatNow = (seq + 1) % @SpreadUpdStatRun, seq = @seqStatNow  
 
  Set @DbName = ''
  While(1 = 1) -- simple do loop
  Begin
    Select top 1 -- first next in alpha sequence after the last one.
      @DbName = DbName
    , @cmptLevel = CmptLevel  
    From #Db
    Where DbName > @DbName 
    Order By DbName 
    
    -- exit if nothing after the last one processed
    If @@rowcount = 0 Break -- 
 
    -- If database is not updatable, skip update stats for this database
    If DATABASEPROPERTYEX(@DbName, 'Updateability') = N'READ_ONLY'  
      Continue
 
    -- If database is in emrgency, skip update stats for this database
    If DatabasepropertyEx(@DbName, 'Status') IN (N'Emergency')
      Continue
 
    -- makes query boilerplate with replacable parameter identified by
    -- labels between "<" et ">"
    -- this query select table for which to perform update statistics
    truncate table #TableNames
    set @sql =
    '
     set nocount on
     ;With
       TableSizeStats as
     (
     select 
       object_schema_name(Ps.object_id, db_id("<DbName>")) as scn --collate <srvCol>
     , object_name(Ps.object_id, db_id("<DbName>")) as tb --collate <srvCol>
     , Sum(Ps.Page_count) as Pg
    From
      sys.dm_db_index_physical_stats (db_id("<DbName>"), NULL, NULL, NULL, "LIMITED") Ps
    Group by 
      Ps.object_id  
    )
    Insert into #tableNames (scn, tb, seq, sampling)
    Select 
      scn
    , tb
    , row_number() over (order by scn, tb) as seq
    , Case 
        When pg > 200001 Then "10"
        When Pg between 50001 and 200000 Then "20"
        When Pg between 5001 and 50000 Then "30"
        else "100"
      End  
    From 
      TableSizeStats
    where (abs(checksum(tb)) % <SpreadUpdStatRun>) = <seqStatNow>
    '  
    set @sql = replace(@sql,'<srvCol>',convert(nvarchar(100), Serverproperty('collation'))) 
    Set @sql = replace(@sql,'<seqStatNow>', convert(nvarchar(20), @seqStatNow))
    Set @sql = replace(@sql,'<SpreadUpdStatRun>', convert(nvarchar(20), @SpreadUpdStatRun))
    set @sql = replace(@sql,'"','''') -- to avoid doubling of quotes in boilerplate
    set @sql = replace(@sql,'<DbName>',@DbName) 
 
    Exec yExecNLog.LogAndOrExec 
      @context = 'yMaint.UpdateStats'
    , @Info = 'Table selection for update statistics'  
    , @sql = @sql
    , @JobNo = @JobNo
    , @forDiagOnly  = 1
 
    Exec yMaint.LockMaintDb @jobNo, 'U', @DbName, @LockResult output
    If @lockResult < 0 -- messages are issued from yMaint.LockMaintDb
      Continue
 
    set @seq = 0
    While (1 = 1)
    begin
      Select top 1 @scn = scn, @tb = tb, @sampling = sampling, @seq = seq
      from #TableNames where seq > @seq order by seq
      if @@rowcount = 0 break
 
 
      Set @sql = 'Select @object_id = object_id("<DbName>.<scn>.<tb>") '
      set @sql = replace (@sql, '<DbName>', @DbName)
      set @sql = replace (@sql, '<scn>', @scn)
      set @sql = replace (@sql, '<tb>', @tb)
      set @sql = replace (@sql, '"', '''')
      Exec sp_executeSql @Sql, N'@object_id  int output', @object_id output
 
      If @object_id is not null
      Begin
        Set @sql = 'update statistics [<DbName>].[<scn>].[<tb>] WITH sample <sampling> PERCENT'
        set @sql = replace (@sql, '<DbName>', @DbName)
        set @sql = replace (@sql, '<scn>', @scn)
        set @sql = replace (@sql, '<tb>', @tb)
        set @sql = replace (@sql, '<sampling>', @sampling)
        set @sql = replace (@sql, '"', '''')
        Exec yExecNLog.LogAndOrExec 
          @context = 'yMaint.UpdateStats'
        , @Info = 'update statistics selected'  
        , @sql = @sql
        , @JobNo = @JobNo
      End
    end -- While
 
    Exec yMaint.UnLockMaintDb @jobNo, @DbName
    
  End -- While boucle banque par banque
 
  End try
  Begin catch
    Exec yExecNLog.LogAndOrExec @jobNo = @jobNo, @context = 'yMaint.UpdateStats Error', @err = '?'
  End Catch
End -- yMaint.UpdateStats

参考资料:

http://msdn.microsoft.com/en-us/library/dd535534.aspx

http://sqlblog.com/blogs/joe_chang/archive/2012/05/05/decoding-stats-stream.aspx 

版权声明:本文内容由阿里云实名注册用户自发贡献,版权归原作者所有,阿里云开发者社区不拥有其著作权,亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容,填写侵权投诉表单进行举报,一经查实,本社区将立刻删除涉嫌侵权内容。

相关文章
大数据常见问题:数据倾斜
大数据常见问题:数据倾斜
5 0
建议收藏丨sql行转列的一千种写法!!
建议收藏丨sql行转列的一千种写法!!
10 0
想知道大厂面试都问什么吗,附最强面试技巧!!(大数据开发岗)
想知道大厂面试都问什么吗,附最强面试技巧!!(大数据开发岗)
7 0
Hive小文件问题:如何产生、造成影响、解放办法
Hive小文件问题:如何产生、造成影响、解放办法
7 0
〖Docker指南⑤〗学习Dockerfile,看这一篇就够了
Dockerfile是用来构建Docker镜像的构建文件,是由一系列命令和参数构成的脚本。
7 0
我绘制文章插图的三个神级工具
今天来给大家介绍三个我在制作文章插图时经常使用的堪称神级的工具,每一个工具都可以绘制非常精美图片,对于经常输出文章的朋友来说,绝对不容错过~
6 0
万字长文丨7个经典问题,助你拿下Java面试(建议收藏)
万字长文丨7个经典问题,助你拿下Java面试(建议收藏)
8 0
10行Python代码能做出哪些有趣的事情?
Python 凭借语法的易学性,代码的简洁性以及类库的丰富性,赢得了众多开发者的喜爱。下面我们来看看,用不超过10行代码能实现些什么有趣的功能
15 0
+关注
杰克.陈
一个安静的程序猿~
10427
文章
2
问答
文章排行榜
最热
最新
相关电子书
更多
OceanBase 入门到实战教程
立即下载
阿里云图数据库GDB,加速开启“图智”未来.ppt
立即下载
实时数仓Hologres技术实战一本通2.0版(下)
立即下载