由于希望测试一个DG问题,所以在Vmware上搭建测试环境,primary和standby放在同一台主机上。
搭建过程就不赘述,但是在最后,遇到了一些问题,这里记录一下,以备后续查阅。
----搭建过程中的一些命令-----
1
2
3
4
5
6
7
|
1
. backup database format
'/tmp/bk_%U'
;
2
. backup current controlfile
for
standby format
'/tmp/stdbyctl.bkp'
;
3
. catalog start
with
'/tmp/'
;
4
.
set
controlfile autobackup format
for
device type disk to
'/tmp/%F'
;
restore standby controlfile from
'/tmp/stdbyctl.bkp'
;
5
. RMAN> CONNECT TARGET SYS/oracle@db;
RMAN> CONNECT AUXILIARY SYS/oracle@stddb;
|
6. 最会std的pfile文件内容
1
2
3
4
5
6
7
8
9
10
11
12
13
|
db_file_name_convert=
'+data/db/'
,
'+reco/stddb/'
log_file_name_convert=
'+RECO/db/archivelog/'
,
'+RECO/stddb/archivelog_std/'
log_archive_format=%t_%s_%r.arc
standby_file_management=auto
compatible=
'11.2.0.0.0'
control_files=
'+RECO/stddb/controlfile/current.260.834947597'
<<<<<<在恢复control之前,先设定为control_files=
'+RECO'
恢复之后,根据实际位置更改
db_block_size=
8192
processes=
150
remote_login_passwordfile=
'EXCLUSIVE'
undo_tablespace=
'UNDOTBS1'
undo_management=auto
log_archive_config=
'dg_config=(primdb,stbdb)'
log_archive_dest_1=
'location=+RECO/stddb/archivelog_std/ valid_for=(all_logfiles,all_roles) db_unique_name=stbdb'
|
问题一:
在设置listener的过程中,由于设定静态监听,在standby启动之后,同时又会注册一个动态监听,目前有两个监听,其中动态监听状态为blocked。
遇到的问题就是,在duplicate连接的时候,就会发生ORA-12528错误,而不能正常连接到standby。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[oracle@OEL ~]$ lsnrctl status
LSNRCTL
for
Linux: Version
11.2
.
0.1
.
0
- Production on
30
-DEC-
2013
10
:
46
:
03
Copyright (c)
1991
,
2009
, Oracle. All rights reserved.
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=
1521
))
STATUS of the LISTENER
------------------------
Alias LISTENER
Version TNSLSNR
for
Linux: Version
11.2
.
0.1
.
0
- Production
Start
Date
27
-NOV-
2013
12
:
40
:
26
Uptime
32
days
22
hr.
5
min.
37
sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /u01/app/
11.2
.
0
/grid/network/admin/listener.ora
Listener Log File /u01/app/oracle/diag/tnslsnr/OEL/listener/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=OEL.localdomain)(PORT=
1521
)))
Services Summary...
Service
"+ASM"
has
1
instance(s).
Instance
"+ASM"
, status READY, has
1
handler(s)
for
this
service...
Service
"PRIMDB"
has
1
instance(s).
Instance
"db"
, status READY, has
1
handler(s)
for
this
service...
Service
"dbXDB"
has
1
instance(s).
Instance
"db"
, status READY, has
1
handler(s)
for
this
service...
Service
"stbdb"
has
1
instance(s).
Instance
"stbdb"
, status UNKNOWN, has
1
handler(s)
for
this
service... <<<<<<<<<<<<<<<<<<<<<<
Service
"stddb"
has
1
instance(s).
Instance
"stddb"
, status READY, has
1
handler(s)
for
this
service... <<<<<<<<<<<<<<<<<<<<<<
The command completed successfully
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
通过设置tnsnames.ora,可以解决这个问题。
1
2
3
4
5
6
7
8
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stddb =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = OEL.localdomain)(PORT =
1521
))
)
(CONNECT_DATA = (SERVICE_NAME = stddb)(UR=A)) <--------------------In order to avoid error ORA-
12528
)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
问题二:
在执行下面命令之后,直接primary就crash掉,经过查询alert发现,是redo broken导致的
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
查询standby alert,发现如下信息:
(在ALTER DATABASE RECOVER...开始之后,第一件事情就是clear redo log,这样就直接导致primary crash)
1
2
3
4
5
6
7
8
9
10
11
12
13
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# tail -f alert_stddb.log
Clearing online redo logfile
1
complete
Clearing online redo logfile
2
+DATA/db/redo02.log
Clearing online log
2
of thread
1
sequence number
5
Clearing online redo logfile
2
complete
Clearing online redo logfile
3
+DATA/db/redo03.log
Clearing online log
3
of thread
1
sequence number
3
Tue Dec
24
14
:
32
:
49
2013
Clearing online redo logfile
3
complete
Tue Dec
24
14
:
32
:
49
2013
Media Recovery Waiting
for
thread
1
sequence
4
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
在重建几次之后,问题依然。
由于datafile和archivelog都使用convert参数转换了,但是redo是无法转换的,如何解决这个问题呢?
经过分析,发现在duplicate的最后,有一些警告信息:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
|
~~~~~~~~~duplicate the standby databsae~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RMAN> CONNECT TARGET SYS/oracle@db;
RMAN> CONNECT AUXILIARY SYS/oracle@stddb;
connected to auxiliary database: DB (not mounted)
RMAN> duplicate target database
for
standby;
Starting Duplicate Db at
24
-DEC-
13
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=
13
device type=DISK
contents of Memory Script:
{
restore clone standby controlfile;
}
executing Memory Script
Starting restore at
24
-DEC-
13
using channel ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: starting datafile backup
set
restore
channel ORA_AUX_DISK_1: restoring control file
channel ORA_AUX_DISK_1: reading from backup piece /tmp/stdbyctl.bkp
channel ORA_AUX_DISK_1: piece handle=/tmp/stdbyctl.bkp tag=TAG20131224T133449
channel ORA_AUX_DISK_1: restored backup piece
1
channel ORA_AUX_DISK_1: restore complete, elapsed time:
00
:
00
:
08
output file name=+RECO/stddb/controlfile/current.
259.835018533
Finished restore at
24
-DEC-
13
contents of Memory Script:
{
sql clone
'alter database mount standby database'
;
}
executing Memory Script
sql statement: alter database mount standby database
contents of Memory Script:
{
set
newname
for
tempfile
1
to
"+RECO/stddb/temp01.dbf"
;
switch
clone tempfile all;
set
newname
for
datafile
1
to
"+RECO/stddb/system01.dbf"
;
set
newname
for
datafile
2
to
"+RECO/stddb/sysaux01.dbf"
;
set
newname
for
datafile
3
to
"+RECO/stddb/undotbs01.dbf"
;
set
newname
for
datafile
4
to
"+RECO/stddb/users01.dbf"
;
restore
clone database
;
}
executing Memory Script
executing command: SET NEWNAME
renamed tempfile
1
to +RECO/stddb/temp01.dbf
in
control file
executing command: SET NEWNAME
executing command: SET NEWNAME
executing command: SET NEWNAME
executing command: SET NEWNAME
Starting restore at
24
-DEC-
13
using channel ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: starting datafile backup
set
restore
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup
set
channel ORA_AUX_DISK_1: restoring datafile
00001
to +RECO/stddb/system01.dbf
channel ORA_AUX_DISK_1: restoring datafile
00002
to +RECO/stddb/sysaux01.dbf
channel ORA_AUX_DISK_1: restoring datafile
00003
to +RECO/stddb/undotbs01.dbf
channel ORA_AUX_DISK_1: restoring datafile
00004
to +RECO/stddb/users01.dbf
channel ORA_AUX_DISK_1: reading from backup piece /tmp/bk_01osanei_1_1
channel ORA_AUX_DISK_1: piece handle=/tmp/bk_01osanei_1_1 tag=TAG20131224T132953
channel ORA_AUX_DISK_1: restored backup piece
1
channel ORA_AUX_DISK_1: restore complete, elapsed time:
00
:
04
:
27
Finished restore at
24
-DEC-
13
contents of Memory Script:
{
switch
clone datafile all;
}
executing Memory Script
datafile
1
switched to datafile copy
input datafile copy RECID=
1
STAMP=
835019033
file name=+RECO/stddb/system01.dbf
datafile
2
switched to datafile copy
input datafile copy RECID=
2
STAMP=
835019033
file name=+RECO/stddb/sysaux01.dbf
datafile
3
switched to datafile copy
input datafile copy RECID=
3
STAMP=
835019033
file name=+RECO/stddb/undotbs01.dbf
datafile
4
switched to datafile copy
input datafile copy RECID=
4
STAMP=
835019033
file name=+RECO/stddb/users01.dbf
ORACLE error from auxiliary database: ORA-
01511
: error
in
renaming log/data files <<<<<<<<<<<<<<<<<<<<<<<<
ORA-
01275
: Operation RENAME
is
not allowed
if
standby file management
is
automatic. <<<<<<<<<<<<<<<<<<<<<<<<
RMAN-
05535
: WARNING: All redo log files were not defined properly.
ORACLE error from auxiliary database: ORA-
01511
: error
in
renaming log/data files
ORA-
01275
: Operation RENAME
is
not allowed
if
standby file management
is
automatic.
RMAN-
05535
: WARNING: All redo log files were not defined properly.
ORACLE error from auxiliary database: ORA-
01511
: error
in
renaming log/data files
ORA-
01275
: Operation RENAME
is
not allowed
if
standby file management
is
automatic.
RMAN-
05535
: WARNING: All redo log files were not defined properly.
Finished Duplicate Db at
24
-DEC-
13
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
后手动修改当standby_file_management为MANUAL模式,可以duplicate并且rename成功,并rename正broken0~2.
(这个是我在pfile中有这个standby_file_management=auto)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
|
checkpoint
is
977553
last deallocation scn
is
959057
Undo Optimization current scn
is
974128
Tue Dec
24
15
:
16
:
57
2013
Switch of datafile
1
complete to datafile copy
checkpoint
is
977553
Switch of datafile
2
complete to datafile copy
checkpoint
is
977553
Switch of datafile
3
complete to datafile copy
checkpoint
is
977553
Switch of datafile
4
complete to datafile copy
checkpoint
is
977553
alter database rename file
'+DATA/db/redo01.log'
to
'broken0'
Completed: alter database rename file
'+DATA/db/redo01.log'
to
'broken0'
<<<<<<<<
alter database rename file
'+DATA/db/redo02.log'
to
'broken1'
Completed: alter database rename file
'+DATA/db/redo02.log'
to
'broken1'
alter database rename file
'+DATA/db/redo03.log'
to
'broken2'
Completed: alter database rename file
'+DATA/db/redo03.log'
to
'broken2'
RFS connections are allowed
|
在执行ALTER DATABASE RECOVER...的时候,依然需要去清空redo,只不过,应经将redo rename了,所以不会影响主库的redo log了
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
|
~~~~~~~~~~~~~~~~~~~~~
Tue Dec
24
15
:
45
:
37
2013
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION
Attempt to start background Managed Standby Recovery process (stddb)
Tue Dec
24
15
:
45
:
37
2013
MRP0 started
with
pid=
21
, OS id=
21343
MRP0: Background Managed Standby Recovery process started (stddb)
started logmerger process
Tue Dec
24
15
:
45
:
43
2013
Managed Standby Recovery not using Real Time Apply
Parallel Media Recovery started
with
2
slaves
Waiting
for
all non-current ORLs to be archived...
All non-current ORLs have been archived.
Errors
in
file /u01/app/oracle/product/
11.2
.
0
/dbhome_1/log/diag/rdbms/stddb/stddb/
trace
/stddb_mrp0_21343.trc:
ORA-
00313
: open failed
for
members of log group
1
of thread
1
ORA-
00312
: online log
1
thread
1
:
'/u01/app/oracle/product/11.2.0/dbhome_1/dbs/broken0'
<<<<<<<<<<<<<<<<
ORA-
27037
: unable to obtain file status
Linux-x86_64 Error:
2
: No such file or directory
Additional information:
3
......
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION
Clearing online redo logfile
1
complete
Errors
in
file /u01/app/oracle/product/
11.2
.
0
/dbhome_1/log/diag/rdbms/stddb/stddb/
trace
/stddb_mrp0_21343.trc:
ORA-
00313
: open failed
for
members of log group
2
of thread
1
ORA-
00312
: online log
2
thread
1
:
'/u01/app/oracle/product/11.2.0/dbhome_1/dbs/broken1'
ORA-
27037
: unable to obtain file status
Linux-x86_64 Error:
2
: No such file or directory
Additional information:
3
......
Tue Dec
24
15
:
45
:
48
2013
Clearing online redo logfile
2
complete
Errors
in
file /u01/app/oracle/product/
11.2
.
0
/dbhome_1/log/diag/rdbms/stddb/stddb/
trace
/stddb_mrp0_21343.trc:
ORA-
00313
: open failed
for
members of log group
3
of thread
1
ORA-
00312
: online log
3
thread
1
:
'/u01/app/oracle/product/11.2.0/dbhome_1/dbs/broken2'
ORA-
27037
: unable to obtain file status
Linux-x86_64 Error:
2
: No such file or directory
Additional information:
3
......
Clearing online redo logfile
3
complete
Media Recovery Waiting
for
thread
1
sequence
4
|
总结:由于之前的DG搭建,都是在两个不同的主机上实施,也没有关心过这个问题,因为standby清空primary redo的动作一定是失败的。但是现在是在一个主机上,standby能发现原primary的redo,并且由于rename失败,所以就直接clear primary redo导致primary 无法启动。
而root cause就是我在编辑pfile的时候,多写了standby_file_management=auto,其实不写的话,默认值是manual的,就不会出现这个问题。