Dataguard搭建问题小记

简介:

由于希望测试一个DG问题,所以在Vmware上搭建测试环境,primary和standby放在同一台主机上。   
搭建过程就不赘述,但是在最后,遇到了一些问题,这里记录一下,以备后续查阅。

----搭建过程中的一些命令-----   

1
2
3
4
5
6
7
1 . backup database format  '/tmp/bk_%U' ;
2 . backup current controlfile  for  standby format  '/tmp/stdbyctl.bkp' ;
3 . catalog start  with  '/tmp/' ;
4 set  controlfile autobackup format  for  device type disk to  '/tmp/%F' ;
    restore standby controlfile from  '/tmp/stdbyctl.bkp'
5 . RMAN> CONNECT TARGET SYS/oracle@db;
RMAN> CONNECT AUXILIARY SYS/oracle@stddb;

6. 最会std的pfile文件内容    

1
2
3
4
5
6
7
8
9
10
11
12
13
db_file_name_convert=  '+data/db/' , '+reco/stddb/' 
log_file_name_convert=  '+RECO/db/archivelog/' , '+RECO/stddb/archivelog_std/' 
log_archive_format=%t_%s_%r.arc 
standby_file_management=auto 
compatible= '11.2.0.0.0' 
control_files= '+RECO/stddb/controlfile/current.260.834947597'     <<<<<<在恢复control之前,先设定为control_files= '+RECO' 恢复之后,根据实际位置更改 
db_block_size= 8192 
processes= 150 
remote_login_passwordfile= 'EXCLUSIVE' 
undo_tablespace= 'UNDOTBS1' 
undo_management=auto 
log_archive_config= 'dg_config=(primdb,stbdb)' 
log_archive_dest_1= 'location=+RECO/stddb/archivelog_std/  valid_for=(all_logfiles,all_roles)  db_unique_name=stbdb'

问题一:

在设置listener的过程中,由于设定静态监听,在standby启动之后,同时又会注册一个动态监听,目前有两个监听,其中动态监听状态为blocked。   
遇到的问题就是,在duplicate连接的时候,就会发生ORA-12528错误,而不能正常连接到standby。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[oracle@OEL ~]$ lsnrctl status
LSNRCTL  for  Linux: Version  11.2 . 0.1 . 0  - Production on  30 -DEC- 2013  10 : 46 : 03
Copyright (c)  1991 2009 , Oracle.  All rights reserved.
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT= 1521 ))
STATUS of the LISTENER 
------------------------ 
Alias                     LISTENER 
Version                   TNSLSNR  for  Linux: Version  11.2 . 0.1 . 0  - Production 
Start  Date                 27 -NOV- 2013  12 : 40 : 26 
Uptime                     32  days  22  hr.  5  min.  37  sec 
Trace Level               off 
Security                  ON: Local OS Authentication 
SNMP                      OFF 
Listener Parameter File   /u01/app/ 11.2 . 0 /grid/network/admin/listener.ora 
Listener Log File         /u01/app/oracle/diag/tnslsnr/OEL/listener/alert/log.xml 
Listening Endpoints Summary... 
   (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521))) 
   (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=OEL.localdomain)(PORT= 1521 ))) 
Services Summary... 
Service  "+ASM"  has  1  instance(s). 
   Instance  "+ASM" , status READY, has  1  handler(s)  for  this  service... 
Service  "PRIMDB"  has  1  instance(s). 
   Instance  "db" , status READY, has  1  handler(s)  for  this  service... 
Service  "dbXDB"  has  1  instance(s). 
   Instance  "db" , status READY, has  1  handler(s)  for  this  service... 
Service  "stbdb"  has  1  instance(s). 
   Instance  "stbdb" , status UNKNOWN, has  1  handler(s)  for  this  service...    <<<<<<<<<<<<<<<<<<<<<< 
Service  "stddb"  has  1  instance(s). 
   Instance  "stddb" , status READY, has  1  handler(s)  for  this  service...     <<<<<<<<<<<<<<<<<<<<<< 
The command completed successfully 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

通过设置tnsnames.ora,可以解决这个问题。   

1
2
3
4
5
6
7
8
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
stddb = 
(DESCRIPTION = 
   (ADDRESS_LIST = 
     (ADDRESS = (PROTOCOL = TCP)(HOST = OEL.localdomain)(PORT =  1521 )) 
  
  (CONNECT_DATA = (SERVICE_NAME = stddb)(UR=A)) <--------------------In order to avoid error ORA- 12528  )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

问题二:

在执行下面命令之后,直接primary就crash掉,经过查询alert发现,是redo broken导致的   
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

查询standby alert,发现如下信息:   
(在ALTER DATABASE RECOVER...开始之后,第一件事情就是clear redo log,这样就直接导致primary crash)    

1
2
3
4
5
6
7
8
9
10
11
12
13
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
# tail -f alert_stddb.log 
Clearing online redo logfile  1  complete 
Clearing online redo logfile  2  +DATA/db/redo02.log 
Clearing online log  2  of thread  1  sequence number  5 
Clearing online redo logfile  2  complete 
Clearing online redo logfile  3  +DATA/db/redo03.log 
Clearing online log  3  of thread  1  sequence number  3 
Tue Dec  24  14 : 32 : 49  2013 
Clearing online redo logfile  3  complete 
Tue Dec  24  14 : 32 : 49  2013 
Media Recovery Waiting  for  thread  1  sequence  4 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

在重建几次之后,问题依然。   
由于datafile和archivelog都使用convert参数转换了,但是redo是无法转换的,如何解决这个问题呢?    
经过分析,发现在duplicate的最后,有一些警告信息:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
~~~~~~~~~duplicate the standby databsae~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
RMAN> CONNECT TARGET SYS/oracle@db; 
RMAN> CONNECT AUXILIARY SYS/oracle@stddb;
connected to auxiliary database: DB (not mounted)
RMAN> duplicate target database  for  standby;
Starting Duplicate Db at  24 -DEC- 13
allocated channel: ORA_AUX_DISK_1 
channel ORA_AUX_DISK_1: SID= 13  device type=DISK
contents of Memory Script:
    restore clone standby controlfile; 
executing Memory Script
Starting restore at  24 -DEC- 13
using channel ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: starting datafile backup  set  restore
channel ORA_AUX_DISK_1: restoring control file 
channel ORA_AUX_DISK_1: reading from backup piece /tmp/stdbyctl.bkp 
channel ORA_AUX_DISK_1: piece handle=/tmp/stdbyctl.bkp tag=TAG20131224T133449 
channel ORA_AUX_DISK_1: restored backup piece  1 
channel ORA_AUX_DISK_1: restore complete, elapsed time:  00 : 00 : 08 
output file name=+RECO/stddb/controlfile/current. 259.835018533 
Finished restore at  24 -DEC- 13
contents of Memory Script:
    sql clone  'alter database mount standby database'
executing Memory Script
sql statement: alter database mount standby database
contents of Memory Script:
    set  newname  for  tempfile   1  to 
  "+RECO/stddb/temp01.dbf"
    switch  clone tempfile all; 
    set  newname  for  datafile   1  to 
  "+RECO/stddb/system01.dbf"
    set  newname  for  datafile   2  to 
  "+RECO/stddb/sysaux01.dbf"
    set  newname  for  datafile   3  to 
  "+RECO/stddb/undotbs01.dbf"
    set  newname  for  datafile   4  to 
  "+RECO/stddb/users01.dbf"
    restore 
    clone database 
   
executing Memory Script
executing command: SET NEWNAME
renamed tempfile  1  to +RECO/stddb/temp01.dbf  in  control file
executing command: SET NEWNAME
executing command: SET NEWNAME
executing command: SET NEWNAME
executing command: SET NEWNAME
Starting restore at  24 -DEC- 13
using channel ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: starting datafile backup  set  restore
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup  set 
channel ORA_AUX_DISK_1: restoring datafile  00001  to +RECO/stddb/system01.dbf 
channel ORA_AUX_DISK_1: restoring datafile  00002  to +RECO/stddb/sysaux01.dbf 
channel ORA_AUX_DISK_1: restoring datafile  00003  to +RECO/stddb/undotbs01.dbf 
channel ORA_AUX_DISK_1: restoring datafile  00004  to +RECO/stddb/users01.dbf 
channel ORA_AUX_DISK_1: reading from backup piece /tmp/bk_01osanei_1_1 
channel ORA_AUX_DISK_1: piece handle=/tmp/bk_01osanei_1_1 tag=TAG20131224T132953 
channel ORA_AUX_DISK_1: restored backup piece  1 
channel ORA_AUX_DISK_1: restore complete, elapsed time:  00 : 04 : 27 
Finished restore at  24 -DEC- 13
contents of Memory Script:
    switch  clone datafile all; 
executing Memory Script
datafile  1  switched to datafile copy
input datafile copy RECID= 1  STAMP= 835019033  file name=+RECO/stddb/system01.dbf 
datafile  2  switched to datafile copy 
input datafile copy RECID= 2  STAMP= 835019033  file name=+RECO/stddb/sysaux01.dbf 
datafile  3  switched to datafile copy 
input datafile copy RECID= 3  STAMP= 835019033  file name=+RECO/stddb/undotbs01.dbf 
datafile  4  switched to datafile copy 
input datafile copy RECID= 4  STAMP= 835019033  file name=+RECO/stddb/users01.dbf 
ORACLE error from auxiliary database: ORA- 01511 : error  in  renaming log/data files      <<<<<<<<<<<<<<<<<<<<<<<< 
ORA- 01275 : Operation RENAME  is  not allowed  if  standby file management  is  automatic.    <<<<<<<<<<<<<<<<<<<<<<<<
RMAN- 05535 : WARNING: All redo log files were not defined properly.
ORACLE error from auxiliary database: ORA- 01511 : error  in  renaming log/data files 
ORA- 01275 : Operation RENAME  is  not allowed  if  standby file management  is  automatic.
RMAN- 05535 : WARNING: All redo log files were not defined properly.
ORACLE error from auxiliary database: ORA- 01511 : error  in  renaming log/data files 
ORA- 01275 : Operation RENAME  is  not allowed  if  standby file management  is  automatic.
RMAN- 05535 : WARNING: All redo log files were not defined properly.
Finished Duplicate Db at  24 -DEC- 13 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

        后手动修改当standby_file_management为MANUAL模式,可以duplicate并且rename成功,并rename正broken0~2.   
(这个是我在pfile中有这个standby_file_management=auto)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
   checkpoint  is  977553
   last deallocation scn  is  959057 
   Undo Optimization current scn  is  974128 
Tue Dec  24  15 : 16 : 57  2013 
Switch of datafile  1  complete to datafile copy 
   checkpoint  is  977553 
Switch of datafile  2  complete to datafile copy 
   checkpoint  is  977553 
Switch of datafile  3  complete to datafile copy 
   checkpoint  is  977553 
Switch of datafile  4  complete to datafile copy 
   checkpoint  is  977553 
alter database rename file  '+DATA/db/redo01.log'  to  'broken0' 
Completed: alter database rename file  '+DATA/db/redo01.log'  to  'broken0'    <<<<<<<< 
alter database rename file  '+DATA/db/redo02.log'  to  'broken1' 
Completed: alter database rename file  '+DATA/db/redo02.log'  to  'broken1' 
alter database rename file  '+DATA/db/redo03.log'  to  'broken2' 
Completed: alter database rename file  '+DATA/db/redo03.log'  to  'broken2' 
RFS connections are allowed

     在执行ALTER DATABASE RECOVER...的时候,依然需要去清空redo,只不过,应经将redo rename了,所以不会影响主库的redo log了

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
~~~~~~~~~~~~~~~~~~~~~
Tue Dec  24  15 : 45 : 37  2013 
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION 
Attempt to start background Managed Standby Recovery process (stddb) 
Tue Dec  24  15 : 45 : 37  2013 
MRP0 started  with  pid= 21 , OS id= 21343 
MRP0: Background Managed Standby Recovery process started (stddb) 
  started logmerger process 
Tue Dec  24  15 : 45 : 43  2013 
Managed Standby Recovery not using Real Time Apply 
Parallel Media Recovery started  with  2  slaves 
Waiting  for  all non-current ORLs to be archived... 
All non-current ORLs have been archived. 
Errors  in  file /u01/app/oracle/product/ 11.2 . 0 /dbhome_1/log/diag/rdbms/stddb/stddb/ trace /stddb_mrp0_21343.trc: 
ORA- 00313 : open failed  for  members of log group  1  of thread  1 
ORA- 00312 : online log  1  thread  1 '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/broken0'      <<<<<<<<<<<<<<<< 
ORA- 27037 : unable to obtain file status 
Linux-x86_64 Error:  2 : No such file or directory 
Additional information:  3 
...... 
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION 
Clearing online redo logfile  1  complete 
Errors  in  file /u01/app/oracle/product/ 11.2 . 0 /dbhome_1/log/diag/rdbms/stddb/stddb/ trace /stddb_mrp0_21343.trc: 
ORA- 00313 : open failed  for  members of log group  2  of thread  1 
ORA- 00312 : online log  2  thread  1 '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/broken1' 
ORA- 27037 : unable to obtain file status 
Linux-x86_64 Error:  2 : No such file or directory 
Additional information:  3 
...... 
Tue Dec  24  15 : 45 : 48  2013 
Clearing online redo logfile  2  complete 
Errors  in  file /u01/app/oracle/product/ 11.2 . 0 /dbhome_1/log/diag/rdbms/stddb/stddb/ trace /stddb_mrp0_21343.trc: 
ORA- 00313 : open failed  for  members of log group  3  of thread  1 
ORA- 00312 : online log  3  thread  1 '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/broken2' 
ORA- 27037 : unable to obtain file status 
Linux-x86_64 Error:  2 : No such file or directory 
Additional information:  3 
...... 
Clearing online redo logfile  3  complete 
Media Recovery Waiting  for  thread  1  sequence  4


        总结:由于之前的DG搭建,都是在两个不同的主机上实施,也没有关心过这个问题,因为standby清空primary redo的动作一定是失败的。但是现在是在一个主机上,standby能发现原primary的redo,并且由于rename失败,所以就直接clear primary redo导致primary 无法启动。    
       而root cause就是我在编辑pfile的时候,多写了standby_file_management=auto,其实不写的话,默认值是manual的,就不会出现这个问题。










本文转自 hsbxxl 51CTO博客,原文链接:http://blog.51cto.com/hsbxxl/1346635,如需转载请自行联系原作者

目录
相关文章
|
3月前
|
Oracle 关系型数据库 数据库
手把手教你Oracle DataGuard主备切换(switchover)
手把手教你Oracle DataGuard主备切换(switchover)
483 4
|
数据库
dataguard 搭建
dataguard 搭建
157 0
|
SQL Oracle 网络协议
实战不停机搭建ORACLE DataGuard
一、基础信息 操作系统:Red Hat Enterprise Linux Server release 6.3 (Santiago) 数据库:11.
1050 1
|
Oracle 关系型数据库 数据库
RAC到单机搭建DataGuard
一、环境信息 1、服务器:Cetnos 7.22、数据库版本:11.2.0.43、RAC -> 单机 二、环境准备      目标是搭建RAC到单机的Activity DataGuard,首先RAC环境是准备完善的,接下来需要搭建一个单机环境。
1970 0
|
Oracle 网络协议 关系型数据库
|
SQL Oracle 关系型数据库