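Shortly before 11 a.m., a colleague reported that the database in the test environment would not come up: connecting with sqlplus worked, but the startup command hung. The alert log only had entries up to 8:23 that morning; none of the later startup attempts had been recorded.
The log looked roughly like this: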
ORA-00445: background process "J000" did not start after 120 seconds
Thu Oct 27 07:20:21 2011
Errors in file /apsarapangu/disk1/opt/oracle/diag/rdbms/orcl/orcl/trace/orcl_smco_654.trc (incident=10194):
ORA-00445: background process "W000" did not start after 120 seconds
Thu Oct 27 07:25:04 2011
Dumping diagnostic data in directory=[cdmp_20111027072504], requested by (instance=1, sid=654 (SMCO)), summary=[incident=10194].
Thu Oct 27 07:25:40 2011
kkjcre1p: unable to spawn jobq slave process
Errors in file /apsarapangu/disk1/opt/oracle/diag/rdbms/orcl/orcl/trace/orcl_cjq0_400.trc:
Thu Oct 27 07:32:59 2011
Process q000 died, see its trace file
Process 0x0xab06fe628 appears to be hung while dumping
Attempting to kill process 0x0xab06fe628 with OS pid = 32572
OSD kill succeeded for process 0xab06fe628
Process 0x0xab0707aa8 appears to be hung while dumping
Attempting to kill process 0x0xab0707aa8 with OS pid = 400
OSD kill succeeded for process 0xab0707aa8
Process 0x0xac8713678 appears to be hung while dumping
Attempting to kill process 0x0xac8713678 with OS pid = 654
OSD kill succeeded for process 0xac8713678
Thu Oct 27 07:59:20 2011
Restarting dead background process CJQ0
Thu Oct 27 07:59:38 2011
Restarting dead background process MMON
Restarting dead background process SMCO
Thu Oct 27 07:59:50 2011
Starting background process CJQ0
Starting background process MMON
Thu Oct 27 08:00:59 2011
Process q000 died, see its trace file
Thu Oct 27 08:01:02 2011
Starting background process SMCO
Thu Oct 27 08:02:21 2011
Process SMCO died, see its trace file
Errors in file /apsarapangu/disk1/opt/oracle/diag/rdbms/orcl/orcl/trace/orcl_pmon_32498.trc:
ORA-00443: background process "SMCO" did not start
Starting background process CJQ0
Starting background process MMON
Thu Oct 27 08:02:39 2011
Process q000 died, see its trace file
Thu Oct 27 08:02:39 2011
Restarting dead background process SMCO
Thu Oct 27 08:03:55 2011
Process SMCO died, see its trace file
Errors in file /apsarapangu/disk1/opt/oracle/diag/rdbms/orcl/orcl/trace/orcl_pmon_32498.trc:
ORA-00443: background process "SMCO" did not start
Two errors show up, ORA-00443 and ORA-00445:
ORA-00443: background process "SMCO" did not start
ORA-00445: background process "W000" did not start after 120 seconds
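To see at a glance which ORA- codes dominate a long alert log, the codes can be counted with a few lines of Python. This is only a minimal sketch: the function name count_ora_errors and the sample text are made up for illustration, and the real input would be the contents of your own alert log.

```python
import re
from collections import Counter

# Matches error codes such as ORA-00443 or ORA-00445.
ORA_RE = re.compile(r"\bORA-\d{5}\b")


def count_ora_errors(log_text):
    """Count how often each ORA-NNNNN code appears in the alert log text."""
    return Counter(ORA_RE.findall(log_text))


if __name__ == "__main__":
    # A few lines in the style of the excerpt above (illustrative sample only).
    sample = (
        'ORA-00445: background process "J000" did not start after 120 seconds\n'
        "Process SMCO died, see its trace file\n"
        'ORA-00443: background process "SMCO" did not start\n'
        'ORA-00443: background process "SMCO" did not start\n'
    )
    for code, hits in count_ora_errors(sample).most_common():
        print(code, hits)
```

In practice the text would come from reading the alert log file instead of the inline sample.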
The official description of ORA-00443:
Cause: The specified process did not start.
Action: Check that the executable image is in the correct place with the
correct protections and that there is enough memory.
The official description of ORA-00445:
Cause: The specified process did not start.
Action: Check and, if necessary, correct problems indicated by one or more of the following:
the size of the SGA
the operating system-specific initialization parameters
accompanying messages
the background trace file
the executable image is not in the right location with the correct protections
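Both descriptions point at memory (whether there is enough of it, and the size of the SGA), so the next step was to check Oracle's memory settings. The instance was not up and the database uses an spfile, so I dumped the parameters to a pfile: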
SQL> create pfile from spfile;
File created.
The memory-related settings in the pfile:
orcl.__db_cache_size=21877489664 ~20.375G
orcl.__java_pool_size=134217728 ~128M
orcl.__large_pool_size=134217728 ~128M
orcl.__pga_aggregate_target=17179869184 ~16G
orcl.__sga_target=25769803776 ~24G
orcl.__shared_io_pool_size=0
orcl.__shared_pool_size=3221225472 ~3G
orcl.__streams_pool_size=134217728 ~128M
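*.memory_target=44445899345920 ~41393.47G, while the whole PC server has only 141G of memory!!
The memory_target setting above is far larger than the physical memory of the machine, and that is what caused the errors in the alert log.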
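The byte-to-gigabyte conversion above is easy to get wrong by hand, so here is a minimal Python sketch that redoes it and flags any value larger than physical memory. The function names parse_pfile_sizes and oversized are made up for this example, the 141G figure is simply the server memory quoted above, and the parser assumes plain byte counts (it skips values written with a suffix such as 45G).

```python
GIB = 1024 ** 3


def parse_pfile_sizes(pfile_text):
    """Return {parameter: bytes} for lines of the form name=<integer>."""
    sizes = {}
    for line in pfile_text.splitlines():
        name, sep, value = line.strip().partition("=")
        value = value.strip()
        # Only plain byte counts are handled; other values are skipped.
        if sep and value.isdigit():
            sizes[name.strip()] = int(value)
    return sizes


def oversized(sizes, physical_bytes):
    """Return the parameters whose value exceeds the physical memory."""
    return {k: v for k, v in sizes.items() if v > physical_bytes}


if __name__ == "__main__":
    # Values copied from the pfile listing above.
    pfile_text = """\
orcl.__db_cache_size=21877489664
orcl.__pga_aggregate_target=17179869184
orcl.__sga_target=25769803776
orcl.__shared_pool_size=3221225472
*.memory_target=44445899345920
"""
    sizes = parse_pfile_sizes(pfile_text)
    for name, value in sizes.items():
        print(f"{name}: {value / GIB:.2f} GiB")
    # Assumption: 141 GiB of physical memory, as stated for this server.
    print("larger than physical memory:", sorted(oversized(sizes, 141 * GIB)))
```

With these inputs only memory_target is flagged; every other value fits comfortably within the server's memory.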
Edit the pfile to set memory_target to 45G, then start the instance from that pfile:
SQL> startup nomount pfile='/apsarapangu/disk1/opt/oracle/products/11.2.0/dbs/initorcl.ora';
ORACLE instance started.
Total System Global Area 4.5698E+10 bytes
Fixed Size 2236784 bytes
Variable Size 2.3757E+10 bytes
Database Buffers 2.1877E+10 bytes
Redo Buffers 61263872 bytes
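The pfile edit made before this restart was done by hand in an editor. If you would rather script it, the following is a minimal sketch: the helper name set_memory_target is hypothetical, it only transforms text passed to it, and it refuses to guess when the parameter is missing or duplicated. Reading and writing the actual pfile is left to the caller.

```python
import re

GIB = 1024 ** 3

# Matches the whole "*.memory_target=..." line and captures the part up to "=".
MEMORY_TARGET_RE = re.compile(r"^(\*\.memory_target\s*=\s*).*$", re.MULTILINE)


def set_memory_target(pfile_text, new_bytes):
    """Return pfile_text with *.memory_target set to new_bytes (in bytes)."""
    new_text, hits = MEMORY_TARGET_RE.subn(
        lambda m: m.group(1) + str(new_bytes), pfile_text
    )
    if hits != 1:
        raise ValueError(f"expected one *.memory_target line, found {hits}")
    return new_text


if __name__ == "__main__":
    before = "orcl.__sga_target=25769803776\n*.memory_target=44445899345920\n"
    # 45G expressed in bytes, matching the value chosen above.
    print(set_memory_target(before, 45 * GIB), end="")
```

Writing the value in bytes keeps the file consistent with the other entries in the pfile.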
Create a new spfile from the pfile (back up the old spfile first):
SQL> create spfile from pfile='/apsarapangu/disk1/opt/oracle/products/11.2.0/dbs/initorcl.ora';
File created.
SQL> shutdown immediate
ORA-01507: database not mounted
ORACLE instance shut down.
Start from the spfile; this time it works:
SQL> startup nomount
ORACLE instance started.
Total System Global Area 4.5698E+10 bytes
Fixed Size 2236784 bytes
Variable Size 2.3757E+10 bytes
Database Buffers 2.1877E+10 bytes
Redo Buffers 61263872 bytes
SQL> alter database mount;
Database altered.
SQL> alter database open;
Database altered.
SQL> exit
The database is up again.