今天查看Job的History,发现Job 运行失败,错误信息是:“The transaction log for database 'xxxx' is full due to 'ACTIVE_TRANSACTION'.”
错误消息表明:数据库的事务日志文件空间耗尽,log 文件不能再存储新的transaction log。
SQL Server将事务日志文件在逻辑上划分为多个VLF(Virtual Log Files),将这些VLF组成一个的环形结构,以VLF为重用单元。如果一个VLF 中存在Active Transaction,那么该VLF就不能被截断和重用。如果事务日志文件没有可用的VLF,那么SQL Server就不能处理新增的事务,并抛出事务日志文件耗尽的错误消息。
那为什么Active Transaction 会导致事务日志文件耗尽?
1,如果数据库的事务日志文件太大,将整个Disk Space耗尽,那么就要考虑是什么原因造成事务日志文件大量增长,定期做事务日志备份能够截断事务日志文件。
2,如果数据库的事务日志文件本身不是很大,可能的原因是SQL Server 无法为事务日志文件分配Disk Space。
3,查看数据库中活动的事务,如果是由于一个事务运行时间太长,没有关闭,导致事务日志的VLF不能重用,那么必须修改应用程序。
如果数据库中某一个 Transaction 运行的时间太长,导致其他transaction虽然被commint,但是其占用的VLF仍然被标记为Active,不能被truncate和reuse,当log文件中没有可用的VLF,而SQL Server又要处理新增的Transaction时,SQL Server就会报错。
step1,查看事务日志文件的大小
查看日志文件的 size_gb 和 max_size_gb 字段,发现该事务日志文件的大小没有达到最大值,并且事务日志文件占用的Disk Space并不是很大,我猜想,很可能是日志文件所在的Disk Space 被使用殆尽,没有剩余的free space。
select db.name as database_name, db.is_auto_shrink_on, db.recovery_model_desc, mf.file_id, mf.type_desc, mf.name as logic_file_name, mf.size*8/1024/1024 as size_gb, mf.physical_name, iif(mf.max_size=-1,-1,mf.max_size*8/1024/1024) as max_size_gb, mf.growth, mf.is_percent_growth, mf.state_desc from sys.databases db inner join sys.master_files mf on db.database_id=mf.database_id where mf.size*8/1024/1024>1 -- GB and db.name='database name' and mf.type=0 order by size_gb desc
step2,查看Disk的Free Space
查询结果显示,D盘空间仅仅剩下9MB,正是事务日志文件所在的Disk。
exec sys.xp_fixeddrives
step3,Disk Space 用尽,必须想办法将大的数据文件压缩,或者将事务日志文件截断。
由于数据库的恢复模式是simple,会自动截断事务日志文件,因此,最大的可能是disk space耗尽。
1,查看数据库空间的使用情况
exec sys.sp_spaceused
unallocated space 空闲很大,必须压缩数据库,以释放disk space
2,收缩(shrink)数据库文件
use target_database_name go select file_id, type, type_desc, data_space_id, name, size*8/1024/1024 as size_gb, growth, is_percent_growth, physical_name, max_size from sys.database_files dbcc shrinkfile('file logcial name',0,notruncate) dbcc shrinkfile('file logcial name',target_size_mb,truncateonly)
3,对数据库中的 table 和 index 压缩存储
3.1, 查看数据库中,占用存储空间非常大的table;
use target_database_name go select t.name, sum(case when ps.index_id<2 then ps.row_count else 0 end) as row_count, sum(ps.reserved_page_count)*8/1024/1024 as reserved_gb, sum(ps.used_page_count)*8/1024/1024 as used_gb, sum( case when ps.index_id<2 then ps.in_row_data_page_count+ps.lob_used_page_count+ps.row_overflow_used_page_count else 0 end )*8/1024/1024 as data_used_gb, sum(case when ps.index_id>=2 then ps.in_row_data_page_count+ps.lob_used_page_count+ps.row_overflow_used_page_count else 0 end )*8/1024/1024 as index_used_gb from sys.dm_db_partition_stats ps inner join sys.tables t on ps.object_id=t.object_id group by t.object_id, t.name order by used_gb desc
3.2, 查看table及其Index是否被压缩过
select p.partition_id,object_name(p.object_id) as ObjectName, p.index_id, p.rows, p.data_compression, p.data_compression_desc, au.Type, au.Type_desc, au.total_pages, au.used_pages, au.data_pages from sys.partitions p inner join sys.allocation_units au on p.partition_id=au.container_id where p.object_id=object_id('[dbo].[table_name]',N'U')
3.3,估计压缩能够节省的存储空间
exec sys.sp_estimate_data_compression_savings @schema_name='dbo', @object_name='table_name', @index_id=1, @partition_number=null, @data_compression ='page'
3.4, 对table及其index进行数据压缩
对table 及其index 进行 rebuild,SQL Server将重新分配存储空间,慎重:rebuild 反而会增加数据库占用的存储空间。在数据压缩存储之后,必须shrink 数据库文件,才能释放数据库所占用的存储空间,增加Disk的Free Space。
alter table [dbo].table_name rebuild with(data_compression=page) alter index index_name on [dbo].table_name rebuild with(data_compression=page)
4,增加事务日志文件
参考:《The transaction log for database 'tempdb' is full due to 'ACTIVE_TRANSACTION'》
Appendix:《Log Reuse Waits Explained: ACTIVE_TRANSACTION》
SQL Server will return a log_reuse_wait_desc value of ACTIVE_ TRANSACTION if it runs out of virtual log files because of an open transaction. Open transactions prevent virtual log file reuse, because the information in the log records for that transaction might be required to execute a rollback operation.
To prevent this log reuse wait type, make sure you design you transactions to be as short lived as possible and never require end user interaction while a transaction is open.
To resolve this wait, you have to commit or rollback all transactions. The safest strategy is to just wait until the transactions finish themselves. Well-designed transactions are usually short lived, but there are many reasons that can turn a normal transaction into a log running one. If you cannot afford to wait for an extra-long running transaction to finish, you might have to kill its session. However, that will cause that transaction to be rolled back. Keep this in mind when designing your application and try to keep all transactions as short as possible.
One common design mistake that can lead to very long running transactions is to require user interaction while the transaction is open. If the person that started the transaction went to lunch while the system is waiting for a response, this transaction can turn into a very-long-running transaction. During this time other transactions, if they are not blocked by this one, will eventually fill up the log and cause the log file to grow.