今晚办公楼停电维护,需要提前关闭服务器,为防止异常关闭导致的各种问题,有个测试库,使用shutdown normal停库,结果就是很常见的hang住了。
操作顺序:
1. shutdown normal,然后关闭了当前sqlplus窗口。从alert日志中看:
Mon Jun 22 16:50:22 2015
Shutting down instance (normal)
Stopping background process SMCO
Shutting down instance: further logons disabled
这里涉及到shutdown normal的原理,稍后引述。
2. 此时重新登录,sqlplus / as sysdba,执行startup或shutdown immediate命令都提示失败,
SQL*Plus: Release 11.2.0.1.0 Production on Mon Jun 22 17:03:06 2015 Copyright (c) 1982, 2009, Oracle. All rights reserved. Connected. SQL> startup ORA-01012: not logged on SQL> shutdown immediate ORA-24324: service handle not initialized ORA-24323: value not allowed ORA-01090: shutdown in progress - connection is not permitted Mon Jun 22 16:50:24 2015 Stopping background process CJQ0 Stopping background process QMNC Stopping background process MMNL Stopping background process MMON License high water mark = 125
ORA-01090提示说正在执行关闭操作,不允许其他连接的操作。
3. 其实这涉及到normal关闭的原理,他需要等待所有已连接用户中断连接,换句话说,如果仍有连接到库的用户,shutdown的操作就一直等待。这是最完全的关闭方式,但同时是变数最大的,因为可能你不知其他用户什么时候中断。
首先尝试查找出所有连接用户,用kill -9直接杀进程。
可以使用ps -ef查找所有(LOCAL=NO)的进程,LOCAL=NO表示连接不是本地,而是远程。
ps -ef|grep ora|grep -v grep|grep -v ora_|grep LOCAL=NO|awk '{print $2}',然后kill -9 进程号 或者 ps -ef|grep ora|grep -v grep|grep -v ora_|grep LOCAL=NO|awk '{print $2}'|xargs kill从alert日志看:
Mon Jun 22 16:55:26 2015
Active process 27446 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 27402 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 27555 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 11697 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 14942 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 27559 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 27513 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 26911 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 31993 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 30810 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 27557 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 11684 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 11666 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 27510 user 'oracle11g' program 'oracle@dcsopenNode1'
Active process 11688 user 'oracle11g' program 'oracle@dcsopenNode1'
SHUTDOWN: waiting for logins to complete.
Mon Jun 22 17:01:29 2015
All dispatchers and shared servers shutdown
是提示了所有dispatcher和共享服务关闭,但sqlplus登录后仍是上面的提示。
4. 尝试关闭监听服务,lsnrctl stop。
问题依旧。
5. 重登陆执行shutdown abort,强制关闭。
从alert日志看:
USER (ospid: 28558): terminating the instance
Instance terminated by USER, pid = 28558
看样子是关闭了实例。
重新执行sqlplus / as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on Mon Jun 22 17:43:25 2015 Copyright (c) 1982, 2009, Oracle. All rights reserved. Connected to an idle instance.
再次执行startup-shutdown normal,
SQL> startup ORACLE instance started. Total System Global Area 3290345472 bytes Fixed Size 2217832 bytes Variable Size 2499807384 bytes Database Buffers 771751936 bytes Redo Buffers 16568320 bytes Database mounted. Database opened. SQL> shutdown normal Database closed. Database dismounted. ORACLE instance shut down.由于现在已经没有连接的用户了,正常启动,正常关闭了。
从alert日志看,
Mon Jun 22 17:46:01 2015 Starting ORACLE instance (normal) LICENSE_MAX_SESSION = 0 LICENSE_SESSIONS_WARNING = 0 Picked latch-free SCN scheme 3 Using LOG_ARCHIVE_DEST_1 parameter default value as /oracle/ora11gR2/product/11.2.0/dbhome_1/dbs/arch Autotune of undo retention is turned on. IMODE=BR ILAT =27 LICENSE_MAX_USERS = 0 SYS auditing is disabled Starting up: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options. Using parameter settings in server-side spfile /oracle/ora11gR2/product/11.2.0/dbhome_1/dbs/spfiledcsopen.ora System parameters with non-default values: processes = 150 memory_target = 3152M control_files = "/oracle/ora11gR2/oradata/dcsopen/control01.ctl" control_files = "/oracle/ora11gR2/oradata/dcsopen/control02.ctl" db_block_size = 8192 compatible = "11.2.0.0.0" undo_tablespace = "UNDOTBS1" remote_login_passwordfile= "EXCLUSIVE" db_domain = "" dispatchers = "(PROTOCOL=TCP) (SERVICE=dcsopenXDB)" audit_file_dest = "/oracle/ora11gR2/admin/dcsopen/adump" audit_trail = "DB" db_name = "dcsopen" open_cursors = 300 diagnostic_dest = "/oracle/ora11gR2" Mon Jun 22 17:46:03 2015 PMON started with pid=2, OS id=30699 Mon Jun 22 17:46:03 2015 VKTM started with pid=3, OS id=30701 at elevated priority VKTM running at (10)millisec precision with DBRM quantum (100)ms Mon Jun 22 17:46:03 2015 GEN0 started with pid=4, OS id=30705 Mon Jun 22 17:46:03 2015 DIAG started with pid=5, OS id=30707 Mon Jun 22 17:46:03 2015 DBRM started with pid=6, OS id=30709 Mon Jun 22 17:46:03 2015 PSP0 started with pid=7, OS id=30711 Mon Jun 22 17:46:03 2015 DIA0 started with pid=8, OS id=30713 Mon Jun 22 17:46:03 2015 MMAN started with pid=9, OS id=30715 Mon Jun 22 17:46:03 2015 DBW0 started with pid=10, OS id=30717 Mon Jun 22 17:46:03 2015 LGWR started with pid=11, OS id=30721 Mon Jun 22 17:46:03 2015 CKPT started with pid=12, OS id=30723 Mon Jun 22 17:46:03 2015 SMON started with pid=13, OS id=30725 Mon Jun 22 17:46:03 2015 RECO started with pid=14, OS id=30727 Mon Jun 22 17:46:03 2015 MMON started with pid=15, OS id=30729 starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'... Mon Jun 22 17:46:03 2015 MMNL started with pid=16, OS id=30731 starting up 1 shared server(s) ... ORACLE_BASE from environment = /oracle/ora11gR2 Mon Jun 22 17:46:04 2015 ALTER DATABASE MOUNT Successful mount of redo thread 1, with mount id 2809595100 Database mounted in Exclusive Mode Lost write protection disabled Completed: ALTER DATABASE MOUNT Mon Jun 22 17:46:08 2015 ALTER DATABASE OPEN Thread 1 opened at log sequence 1279 Current log# 1 seq# 1279 mem# 0: /oracle/ora11gR2/oradata/dcsopen/redo01.log Successful open of redo thread 1 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set SMON: enabling cache recovery Successfully onlined Undo Tablespace 2. Verifying file header compatibility for 11g tablespace encryption.. Verifying 11g file header compatibility for tablespace encryption completed SMON: enabling tx recovery Database Characterset is ZHS16GBK No Resource Manager plan active replication_dependency_tracking turned off (no async multimaster replication found) Starting background process QMNC Mon Jun 22 17:46:09 2015 QMNC started with pid=20, OS id=30789 Completed: ALTER DATABASE OPEN Starting background process CJQ0 Mon Jun 22 17:46:11 2015 CJQ0 started with pid=22, OS id=30806 Mon Jun 22 17:46:18 2015 Shutting down instance (normal) Shutting down instance: further logons disabled Stopping background process QMNC Stopping background process CJQ0 Stopping background process MMNL Stopping background process MMON License high water mark = 5 All dispatchers and shared servers shutdown ALTER DATABASE CLOSE NORMAL Mon Jun 22 17:46:22 2015 SMON: disabling tx recovery SMON: disabling cache recovery Mon Jun 22 17:46:22 2015 Shutting down archive processes Archiving is disabled Archive process shutdown avoided: 0 active Thread 1 closed at log sequence 1279 Successful close of redo thread 1 Completed: ALTER DATABASE CLOSE NORMAL ALTER DATABASE DISMOUNT Completed: ALTER DATABASE DISMOUNT ARCH: Archival disabled due to shutdown: 1090 Shutting down archive processes Archiving is disabled Archive process shutdown avoided: 0 active Mon Jun 22 17:46:23 2015 Stopping background process VKTM: ARCH: Archival disabled due to shutdown: 1090 Shutting down archive processes Archiving is disabled Archive process shutdown avoided: 0 active Mon Jun 22 17:46:25 2015 Instance shutdown complete
借鉴《Concept》,一些知识点:
1. 如果用户试图访问一个正在关闭的数据库,会得到错误提示:ORA-01090: shutdown in progress - connection is not permitted.
2. 关闭数据库,必须用SYSOPER或SYSDBA的角色。
3. 关闭数据库,是有超时时间的,如果用户未中断连接,或者交易未完成,超过一小时,则shutdown命令会取消,提示错误:ORA-01013: user requested cancel of current operation.
4. 几种关闭库的参数,
shutdown normal:
默认的关闭参数,需要两个条件:
(1) 执行语句后,不允许新的连接。
(2) 数据库关闭之前,数据库会等待所有已连接用户中断连接。
下一次启动时不需要实例恢复。
shutdown immediate:
使用场景:
(1) 初始化一个自动,无人值守的备份。
(2) 马上就要断电。
(3) 数据库或应用工作不正常,你不能马上联系到用户退出登录或他们无法退出登录。
条件:
(1) 不允许新的连接,不允许新的交易。
(2) 任何未提交的事务会回滚(如果此时有个长交易,未提交,那么不会像这种关闭名称immediate那样迅速地关闭)。
(3) 不会等待已连接用户退出登录。数据库会隐式回滚活动事务,中断连接用户。
下一次启动时不需要实例恢复。
shutdown transactional:
适用于计划停机,允许活动交易处理完成后再停止实例的场景。
条件:
(1) 不允许新的连接,不允许新的交易。
(2) 所有交易完成后,会中断所有和库的连接。
(3) 在这个时间点,关闭实例就像执行了shutdown immediate。
下一次启动时不需要实例恢复。
transactional参数主要会防止用户丢失交易,同时不需要所有用户退出登录。
shutdown abort:
适用场景:
数据库或应用不能正常工作,并且没有其它类型的关闭操作正在进行。
(1) 需要立即关闭数据库(例如,一分钟后电源会被关闭)。
(2) 启动实例时碰到了问题。
条件:
(1) 不允许新的连接,不允许新的交易。
(2) 正在被Oracle处理的客户端SQL语句会被立即中断。
(3) 未提交事务不会回滚。
(4) Oracle不会等待正保持连接的客户端退出登录。数据库会隐式地中断所有连接。
下一次启动时需要进行实例恢复。
总结:
以上四种参数会适合于不同的场景,简单讲,shutdown normal是默认的关闭方式,最完整的关闭方式,缺点是需要被动等待所有交易完成,所有用户退出登录。shutdown immediate只要不存在较长的需要回滚的事务,其关闭时间会快。shutdown transactional会最大限度地保证交易的完成。前三种都不需要实例恢复。shutdown abort则是最暴力的关闭,关闭时间最快,但代价是启动需要实例恢复,因为关闭时存在未回滚未提交的事务。