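How to keep GoldenGate replicating correctly after redo threads change in a RAC environment
When the nodes of a RAC cluster change, for example when a node is added or removed, the redo log threads belonging to those nodes are added or removed as well. This scrambles the order in which GoldenGate maps its threads to the database log threads, so any change of this kind needs special handling on the GoldenGate side too.
For example, before a node is added, GoldenGate maps the log threads as follows (the database log thread is on the right, here and below):
1 -> 1 (assume sequence 100, rba 1001)
2 -> 2 (assume sequence 88, rba 3009)
After the node is added, the mapping becomes:
1 -> 3 (sequence 88, rba 3009)
2 -> 1 (sequence 100, rba 1001)
3 -> 2 (new)
When OGG starts working again, the mapping order has changed, so the extract's recorded progress no longer lines up with the right threads.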
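To make the mismatch concrete, the sketch below tabulates the example figures above. For each database thread it reports whether the position attached to it after the change still matches the position recorded before. This is only one reading of the example: the dictionary layout and the name `find_mismatches` are assumptions made for illustration, not anything GoldenGate provides.

```python
# Example figures from the text: OGG slot -> (database thread, position).
# A position is (sequence, rba); None means nothing is recorded ("new").
before = {1: (1, (100, 1001)), 2: (2, (88, 3009))}
after = {1: (3, (88, 3009)), 2: (1, (100, 1001)), 3: (2, None)}


def find_mismatches(before, after):
    """Return {db_thread: (expected, actual)} for threads whose position differs."""
    expected = {thread: pos for thread, pos in before.values()}
    actual = {thread: pos for thread, pos in after.values()}
    mismatches = {}
    for thread in sorted(set(expected) | set(actual)):
        exp, act = expected.get(thread), actual.get(thread)
        if exp != act:
            mismatches[thread] = (exp, act)
    return mismatches


if __name__ == "__main__":
    for thread, (exp, act) in find_mismatches(before, after).items():
        print(f"db thread {thread}: expected {exp}, got {act}")
```

Under this reading, database thread 1 keeps its position, thread 2 loses its position, and the new thread 3 inherits a position that was never its own.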
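If there is enough time, the simple and safe approach is to stop the application on the source side, delete the extract process, then configure a new extract process and start extracting from the current point. Every step must be carried out while the application is stopped. Otherwise data will be lost or become inconsistent, and you will have to repair it by hand or re-initialize.
What if the application cannot be stopped? In that case we can set the new extract's progress to the state it had before it was stopped, which avoids application downtime during the operation.
The checkpoints we need to record are: Current Checkpoint, Recovery Checkpoint, and Write Checkpoint.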
1. Stop the extract normally (omitted).
2. Get the extract's checkpoint records.
GGSCI (node1) 21> info ext_r1 showch
EXTRACT EXT_R1 Last Started 2011-08-16 22:35 Status STOPPED
Checkpoint Lag 00:00:00 (updated 00:06:21 ago)
Log Read Checkpoint Oracle Redo Logs
2011-08-17 03:32:48 Thread 1, Seqno 62, RBA 29890576
Log Read Checkpoint Oracle Redo Logs
2011-08-17 03:32:34 Thread 2, Seqno 42, RBA 18674704
Current Checkpoint Detail:
Read Checkpoint #1
Oracle RAC Redo Log
Startup Checkpoint (starting position in the data source):
Thread #: 1
Sequence #: 61
RBA: 32112656
Timestamp: 2011-08-16 22:34:53.000000
SCN: 0.3743980 (3743980)
Redo File: +DATA1/my/onlinelog/group_1.261.758327805
Recovery Checkpoint (position of oldest unprocessed transaction in the data source):
Thread #: 1
Sequence #: 62
RBA: 29890576
Timestamp: 2011-08-17 03:32:48.000000
SCN: 0.3811675 (3811675)
Redo File: +DATA1/my/onlinelog/group_2.262.758327805
Current Checkpoint (position of last record read in the data source):
Thread #: 1
Sequence #: 62
RBA: 29890576
Timestamp: 2011-08-17 03:32:48.000000
SCN: 0.3811675 (3811675)
Redo File: +DATA1/my/onlinelog/group_2.262.758327805
BR Previous Recovery Checkpoint:
Thread #: 1
Sequence #: 0
RBA: 0
Timestamp: 2011-08-16 22:35:09.416136
SCN: Not available
Redo File:
BR Begin Recovery Checkpoint:
Thread #: 1
Sequence #: 62
RBA: 22437392
Timestamp: 2011-08-17 02:35:11.000000
SCN: 0.3798208 (3798208)
Redo File:
BR End Recovery Checkpoint:
Thread #: 1
Sequence #: 62
RBA: 24120320
Timestamp: 2011-08-17 02:35:16.000000
SCN: 0.3801192 (3801192)
Redo File:
Read Checkpoint #2
Oracle RAC Redo Log
Startup Checkpoint (starting position in the data source):
Thread #: 2
Sequence #: 41
RBA: 25323024
Timestamp: 2011-08-16 22:34:40.000000
SCN: 0.3743980 (3743980)
Redo File: +DATA1/my/onlinelog/group_3.266.758328125
Recovery Checkpoint (position of oldest unprocessed transaction in the data source):
Thread #: 2
Sequence #: 42
RBA: 18674704
Timestamp: 2011-08-17 03:32:34.000000
SCN: 0.3811674 (3811674)
Redo File: +DATA1/my/onlinelog/group_4.267.758328125
Current Checkpoint (position of last record read in the data source):
Thread #: 2
Sequence #: 42
RBA: 18674704
Timestamp: 2011-08-17 03:32:34.000000
SCN: 0.3811674 (3811674)
Redo File: +DATA1/my/onlinelog/group_4.267.758328125
BR Previous Recovery Checkpoint:
Thread #: 2
Sequence #: 0
RBA: 0
Timestamp: 2011-08-16 22:35:09.416136
SCN: Not available
Redo File:
BR Begin Recovery Checkpoint:
Thread #: 2
Sequence #: 42
RBA: 15242240
Timestamp: 2011-08-17 02:35:02.000000
SCN: 0.3800455 (3800455)
Redo File:
BR End Recovery Checkpoint:
Thread #: 2
Sequence #: 42
RBA: 15242240
Timestamp: 2011-08-17 02:35:02.000000
SCN: 0.3800455 (3800455)
Redo File:
Write Checkpoint #1
GGS Log Trail
Current Checkpoint (current write position):
Sequence #: 3
RBA: 51132
Timestamp: 2011-08-17 03:32:48.695373
Extract Trail: /opt/ggs/dirdat/r1/ex
Header:
Version = 2
Record Source = A
Type = 6
# Input Checkpoints = 2
# Output Checkpoints = 1
File Information:
Block Size = 2048
Max Blocks = 100
Record Length = 4096
Current Offset = 0
Configuration:
Data Source = 3
Transaction Integrity = 1
Task Type = 0
Status:
Start Time = 2011-08-16 22:35:10
Last Update Time = 2011-08-17 03:32:48
Stop Status = G
Last Result = 402
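Copying these numbers by hand is error-prone, so the sketch below pulls the Recovery and Current read checkpoints for each thread, and the trail's write checkpoint, out of saved `showch` text. It assumes the output has one field per line, as GGSCI prints it. The name `parse_showch` and the shape of the returned dictionary are hypothetical choices for this illustration and are not part of GoldenGate.

```python
import re

# Fields of interest inside a checkpoint block, e.g. "Sequence #: 62".
FIELD = re.compile(r"^(Thread #|Sequence #|RBA|Extract Trail):\s*(\S+)\s*$")

# Block headers we keep; every other "... Checkpoint" header is skipped.
KINDS = {"Recovery Checkpoint (": "recovery", "Current Checkpoint (": "current"}


def parse_showch(text):
    """Return {"read": {thread: {kind: (seqno, rba)}}, "write": [{...}]}."""
    result = {"read": {}, "write": []}
    section, kind, block = None, None, {}

    def flush():
        # Store the block collected so far, if it is one we want.
        if kind is None or "Sequence #" not in block or "RBA" not in block:
            return
        pos = (int(block["Sequence #"]), int(block["RBA"]))
        if section == "read" and "Thread #" in block:
            result["read"].setdefault(int(block["Thread #"]), {})[kind] = pos
        elif section == "write" and kind == "current":
            result["write"].append(
                {"seqno": pos[0], "rba": pos[1], "trail": block.get("Extract Trail")}
            )

    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(("Read Checkpoint #", "Write Checkpoint #")):
            flush()
            section = "read" if line.startswith("Read") else "write"
            kind, block = None, {}
        elif line.endswith(":") and "Checkpoint" in line:
            flush()
            kind = next((k for p, k in KINDS.items() if line.startswith(p)), None)
            block = {}
        else:
            match = FIELD.match(line)
            if match:
                block[match.group(1)] = match.group(2)
    flush()
    return result
```

Fed the output above, `parse_showch(text)["read"][1]["current"]` would be the pair to use for thread 1 in step 4. Check the parsed values against the raw output before relying on them.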
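3. Create the new extract process (this assumes the old ext_r1 has already been deleted, since the name is reused).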
GGSCI (node1) 34> ADD EXT ext_r1, BEGIN NOW, TRANLOG, THREADS 3
2011-08-17 03:52:26 INFO OGG-01749 Successfully registered EXTRACT EXT_R1 to start managing log retention at SCN 3826107.
EXTRACT added.
4. Set the current checkpoint (note: do this for every thread recorded in step 2).
GGSCI (node1) 35> alter EXTRACT ext_r1, TRANLOG, EXTSEQNO 62, EXTRBA 29890576,thread 1
EXTRACT altered.
GGSCI (node1) 36> alter EXTRACT ext_r1, TRANLOG, EXTSEQNO 42, EXTRBA 18674704,thread 2
EXTRACT altered.
5. Set the recovery checkpoint (note: do this for every thread recorded in step 2).
GGSCI (node1) 42> ALTER EXTRACT ext_r1, IOEXTSEQNO 62, IOEXTRBA 29890576,thread 1
GGSCI (node1) 42> ALTER EXTRACT ext_r1, IOEXTSEQNO 42, IOEXTRBA 18674704,thread 2
6. Set the write checkpoint of the exttrail or rmttrail.
GGSCI (node1) 47> ADD EXTTRAIL /opt/ggs/dirdat/r1/ex,SEQNO 3, RBA 51132, EXTRACT ext_r1
EXTTRAIL added.
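Steps 4 to 6 repeat the same command shapes once per thread and once per trail, so the commands can be generated from the recorded checkpoints instead of typed by hand. The sketch below is a minimal illustration: `build_restore_commands` and its input layout are hypothetical, and the command text simply mirrors the commands shown in the steps above. Review the generated lines before running them in GGSCI.

```python
def build_restore_commands(extract, read_checkpoints, write_checkpoints):
    """Return GGSCI command strings that restore the recorded positions.

    read_checkpoints:  {thread: {"current": (seqno, rba), "recovery": (seqno, rba)}}
    write_checkpoints: [{"trail": path, "seqno": int, "rba": int}, ...]
    """
    commands = []
    # Step 4: current checkpoint, one command per recorded thread.
    for thread in sorted(read_checkpoints):
        seqno, rba = read_checkpoints[thread]["current"]
        commands.append(
            f"ALTER EXTRACT {extract}, TRANLOG, EXTSEQNO {seqno}, "
            f"EXTRBA {rba}, THREAD {thread}"
        )
    # Step 5: recovery checkpoint, one command per recorded thread.
    for thread in sorted(read_checkpoints):
        seqno, rba = read_checkpoints[thread]["recovery"]
        commands.append(
            f"ALTER EXTRACT {extract}, IOEXTSEQNO {seqno}, "
            f"IOEXTRBA {rba}, THREAD {thread}"
        )
    # Step 6: write checkpoint of each trail.
    for wc in write_checkpoints:
        commands.append(
            f"ADD EXTTRAIL {wc['trail']}, SEQNO {wc['seqno']}, "
            f"RBA {wc['rba']}, EXTRACT {extract}"
        )
    return commands
```

The newly added thread has no recorded checkpoint, so no command is generated for it, matching the steps above, where only threads 1 and 2 are altered.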
7. Verify that the checkpoints were changed successfully (use showch; omitted).
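The verification amounts to comparing the positions recorded in step 2 with those reported by the new extract. A small sketch of that comparison follows. `checkpoint_diffs` is a hypothetical helper, and it assumes both sides have already been reduced to a `{thread: {kind: (seqno, rba)}}` dictionary. Threads that exist only on the new side, such as the added thread 3, are ignored.

```python
def checkpoint_diffs(expected, actual):
    """List (thread, kind, expected_pos, actual_pos) for every mismatch."""
    diffs = []
    for thread in sorted(expected):
        for kind in sorted(expected[thread]):
            want = expected[thread][kind]
            got = actual.get(thread, {}).get(kind)
            if want != got:
                diffs.append((thread, kind, want, got))
    return diffs
```

An empty list means every recorded thread position was carried over. Any entry names the thread and checkpoint kind that still needs correcting before the extract is restarted.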
8. Restart the extract (omitted).