测试使用的机器IP及主机名,其中slave1位mha的manager节点,测试过程中使用的软件包为:
链接:http://pan.baidu.com/s/1dD76D93 密码:gg7m
1
2
3
4
|
# cat /etc/hosts
172.16.25.108 master
172.16.25.109 backup
172.16.25.110 slave
|
测试使用172.16.25.100作为VIP。
测试拓扑架构为:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
|
#+BEGIN_SRC ditaa :file images/mha001.png
+------------+
| App |
+-----+------+
|
|
+---------+----------+
+-------+ VIP:172.16.25.100 +------+
| +---------+----------+ |
| | |
| | |
+------+-----+ +------+-----+ +------+-----+
| | | | | |
| {s} | | {s} | | {s} |
| Master | | Backup | | Slave1 |
| | | | | |
| | | | | |
+------------+ +------------+ +------------+
172.16.25.108 172.16.25.109 172.16.25.110
#+END_SRC
|
在所有机器上执行,设置SSH无密码通信:
1
|
# ssh-keygen -t rsa
|
在所有机器上执行,分发SSH公钥,
1
2
3
|
# ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.16.25.108
# ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.16.25.109
# ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.16.25.110
|
在所有机器上执行,安装依赖包:
1
|
# yum install -y libaio
|
在所有机器上执行,增加MySQL组及用户:
1
2
|
# groupadd -g 601 mysql
# useradd -c "mysql software owner" -g mysql -u 601 mysql
|
在所有机器上执行,解压MySQL到/usr/local目录
1
|
# tar -xf mysql-advanced-5.5.32-linux2.6-x86_64.tar.gz -C /usr/local/
|
1
2
3
4
5
6
7
8
|
# cd /usr/local
# ln -s mysql-advanced-5.5.32-linux2.6-x86_64/ mysql
# chown -R mysql.mysql /usr/local/mysql
# cd /usr/local/mysql
# ./scripts/mysql_install_db --user=mysql
# cp support-files/mysql.server /etc/init.d/mysqld
# chmod 755 /etc/init.d/mysqld
# service mysqld start
|
在所有机器上执行,设置环境变量:
1
2
|
# echo "export PATH=$PATH:/usr/local/mysql/bin" >> ~/.bashrc
# . ~/.bashrc
|
在所有机器上进行操作,登录MySQL,删除默认用户:
1
2
3
4
|
# mysql -S /var/lib/mysql/mysql.sock
mysql> delete from mysql.user;
mysql> grant all on *.* to
'root'
@
'%'
identified by
'111111'
with grant option;
mysql> flush privileges;
|
在master及backup机器上进行操作:
1
2
3
4
5
6
7
8
9
|
mysql> GRANT REPLICATION SLAVE ON *.* TO
'repl'
@
'%'
IDENTIFIED BY
'replpass'
;
mysql> flush privileges;
mysql> show master status;
+------------------+----------+--------------+------------------+
| File |Position |Binlog_Do_DB |Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000007| 107 | | |
+------------------+----------+--------------+------------------+
1 row
in
set
(0.00 sec)
|
在backup及slave1机器上进行操作:
1
|
mysql> CHANGE MASTER TO MASTER_HOST=
'172.16.25.108'
,MASTER_PORT=3306,MASTER_USER=
'repl'
,MASTER_PASSWORD=
'replpass'
,MASTER_LOG_FILE=
'mysql-bin.000007'
,MASTER_LOG_POS=107;
|
在backup机器上设置read_only参数,由于backup有可能会成为主库,因此,它的配置文件里的read_only是关闭的,就是为了当backup成为主库时使用。
但在backup上需要动态设置其read_only选项为ON状态。而slave1机器为从库,且规定其永不成为主库,所以在配置文件里明确指定其read_only是开启的。
在backup机器上执行,
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
|
mysql> show variables like
'read_only'
;
+---------------+-------+
|Variable_name | Value |
+---------------+-------+
|read_only | OFF |
+---------------+-------+
1 row
in
set
(0.00 sec)
mysql>
set
global read_only=ON;
Query OK, 0 rows affected (0.00 sec)
mysql> show variables like
'read_only'
;
+---------------+-------+
|Variable_name |Value |
+---------------+-------+
|read_only |ON |
+---------------+-------+
1 row
in
set
(0.00 sec)
|
==============================================================
mha的一些限制,MySQL版本在5.1以上,本次使用的是5.5.32;mysqlbinlog版本在3.3以上,在安装
之前有必要进行验证
在所有机器上执行,
1
2
3
4
|
# mysqlbinlog --version
mysqlbinlog Ver 3.3
for
linux2.6 at x86_64
# mysql --version
mysql Ver 14.14 Distrib 5.5.32,
for
linux2.6 (x86_64) using EditLine wrapper
|
在所有机器上执行,安装mha的node节点:
安装依赖包:
1
2
|
# yum install -y perl-DBD-MySQL
# rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
|
在slave1机器上执行,安装mha的manager节点
management节点:
1
2
3
4
5
6
7
8
9
10
|
# yum install -y perl-Config-Tiny
# yum install -y perl-DBD-MySQL
# yum install -y perl-MIME-Lite
# yum install -y perl-Params-Validate
# yum install -y perl-Time-HiRes
# rpm -ivh perl-Mail-Sender-0.8.16-3.el6.noarch.rpm
# rpm -ivh perl-Mail-Sendmail-0.79-12.el6.noarch.rpm
# rpm -ivh perl-Log-Dispatch-2.27-1.el6.noarch.rpm
# rpm -ivh perl-Parallel-ForkManager-0.7.9-1.el6.noarch.rpm
# rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm
|
安装完毕,可以准备MHA的配置文件了,配置文件请参考附录4、MHA配置文件。比如配置文件中的一些目录需要手工进行创建。本次搭建MHA过程中,使用了master_ip_failover及master_ip_online_change两个脚本,脚本内容见附录5和附录6,请把两个脚本放到/usr/local/bin目录下,
1
2
3
4
5
|
# mkdir -p /var/log/masterha/app1
# mkdir /etc/mha
# ls /etc/mha/app1.conf
# ls /usr/local/bin/master_ip_failover
# ls /usr/local/bin/master_ip_change_online
|
mha配置文件的内容为:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
|
# cat /etc/mha/app1.conf
[server default]
manager_log=
/var/log/masterha/app1/manager
.log
manager_workdir=
/var/log/masterha/app1
master_binlog_dir=
/app/mysql
master_ip_failover_script=
/usr/local/bin/master_ip_failover
master_ip_online_change_script=
/usr/local/bin/master_ip_online_change
ping_interval=1
remote_workdir=
/tmp
repl_password=111111
repl_user=repl
secondary_check_script=
/usr/local/bin/masterha_secondary_check
-s backup -s master --user=root --master_host=master --master_ip=172.16.25.108 --master_port=3306
ssh_user=root
user=mha
password=3kfRCZdc
[server1]
hostname
=master
port=3306
[server2]
candidate_master=1
check_repl_delay=0
hostname
=backup
port=3306
[server3]
hostname
=slave
port=3306
|
MHA脚本之master_ip_failover内容:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
|
#!/usr/bin/env perl
use strict;
use warnings FATAL =>
'all'
;
use Getopt::Long;
my (
$
command
, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
my $vip =
'172.16.25.100'
;
my $key =
'0'
;
my $ssh_start_vip =
"/sbin/ifconfig eth0:$key $vip"
;
my $ssh_stop_vip =
"/sbin/ifconfig eth0:$key down"
;
GetOptions(
'command=s'
=>\$
command
,
'ssh_user=s'
=>\$ssh_user,
'orig_master_host=s'
=> \$orig_master_host,
'orig_master_ip=s'
=>\$orig_master_ip,
'orig_master_port=i'
=> \$orig_master_port,
'new_master_host=s'
=>\$new_master_host,
'new_master_ip=s'
=>\$new_master_ip,
'new_master_port=i'
=>\$new_master_port,
);
exit
&main();
sub main {
print
"\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n"
;
if
( $
command
eq
"stop"
|| $
command
eq
"stopssh"
) {
my $exit_code = 1;
eval
{
print
"Disabling the VIP on old master: $orig_master_host \n"
;
&stop_vip();
$exit_code = 0;
};
if
($@) {
warn
"Got Error: $@\n"
;
exit
$exit_code;
}
exit
$exit_code;
}
elsif ( $
command
eq
"start"
) {
my $exit_code = 10;
eval
{
print
"Enabling the VIP - $vip on the new master - $new_master_host\n"
;
&start_vip();
$exit_code = 0;
};
if
($@) {
warn $@;
exit
$exit_code;
}
exit
$exit_code;
}
elsif ( $
command
eq
"status"
) {
print
"Checking the Status of the script.. OK \n"
;
exit
0;
}
else
{
&usage();
exit
1;
}
}
sub start_vip() {
`
ssh
$ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
return
0 unless ($ssh_user);
`
ssh
$ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status--orig_master_host=host --orig_master_ip=ip --orig_master_port=port--new_master_host=host --new_master_ip=ip --new_master_port=port\n"
;
}
|
MHA脚本之master_ip_online_change脚本内容:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
|
#!/usr/bin/env perl
## Note: This is a sample script and is notcomplete. Modify the script based on your environment.
use strict;
use warnings FATAL =>
'all'
;
use Getopt::Long;
use MHA::DBHelper;
use MHA::NodeUtil;
use Time::HiRes qw(
sleep
gettimeofdaytv_interval );
use Data::Dumper;
my $_tstart;
my $_running_interval = 0.1;
my (
$
command
, $orig_master_host, $orig_master_ip,
$orig_master_port, $orig_master_user,
$new_master_host, $new_master_ip, $new_master_port,
$new_master_user,
);
my $vip =
'172.16.25.100'
;
# Virtual IP
my $key =
"0"
;
my $ssh_start_vip =
"/sbin/ifconfig eth0:$key $vip"
;
my $ssh_stop_vip =
"/sbin/ifconfig eth0:$key down"
;
my $ssh_user =
"root"
;
my $new_master_password =
"111111"
;
my $orig_master_password =
"111111"
;
GetOptions(
'command=s'
=>\$
command
,
#'ssh_user=s' => \$ssh_user,
'orig_master_host=s'
=>\$orig_master_host,
'orig_master_ip=s'
=>\$orig_master_ip,
'orig_master_port=i'
=>\$orig_master_port,
'orig_master_user=s'
=>\$orig_master_user,
#'orig_master_password=s' => \$orig_master_password,
'new_master_host=s'
=>\$new_master_host,
'new_master_ip=s'
=>\$new_master_ip,
'new_master_port=i'
=>\$new_master_port,
'new_master_user=s'
=>\$new_master_user,
#'new_master_password=s' =>\$new_master_password,
);
exit
&main();
sub current_time_us {
my ($sec, $microsec ) = gettimeofday();
my$curdate = localtime($sec);
return
$curdate .
" "
. sprintf(
"%06d"
, $microsec);
}
sub sleep_until {
my$elapsed = tv_interval($_tstart);
if
($_running_interval > $elapsed ) {
sleep
( $_running_interval - $elapsed );
}
}
sub get_threads_util {
my$dbh =
shift
;
my$my_connection_id =
shift
;
my$running_time_threshold =
shift
;
my$
type
=
shift
;
$running_time_threshold = 0 unless ($running_time_threshold);
$
type
= 0 unless($
type
);
my@threads;
my$sth = $dbh->prepare(
"SHOW PROCESSLIST"
);
$sth->execute();
while
( my $ref = $sth->fetchrow_hashref() ) {
my$
id
= $ref->{Id};
my$user = $ref->{User};
my$host = $ref->{Host};
my$
command
= $ref->{Command};
my$state = $ref->{State};
my$query_time = $ref->{Time};
my$info = $ref->{Info};
$info =~ s/^\s*(.*?)\s*$/$1/
if
defined($info);
next
if
( $my_connection_id == $
id
);
next
if
( defined($query_time) && $query_time <$running_time_threshold );
next
if
( defined($
command
) && $
command
eq
"Binlog Dump"
);
next
if
( defined($user) && $user
eq
"system user"
);
next
if
( defined($
command
)
&& $
command
eq
"Sleep"
&& defined($query_time)
&& $query_time >= 1 );
if
( $
type
>= 1 ) {
next
if
( defined($
command
) && $
command
eq
"Sleep"
);
nextif ( defined($
command
) && $
command
eq
"Connect"
);
}
if
( $
type
>= 2 ) {
next
if
( defined($info) && $info =~ m/^
select
/i
);
next
if
( defined($info) && $info =~ m/^show
/i
);
}
push @threads, $ref;
}
return
@threads;
}
sub main {
if
($
command
eq
"stop"
) {
##Gracefully killing connections on the current master
#1. Set read_only= 1 on the new master
#2. DROP USER so that no app user can establish new connections
#3. Set read_only= 1 on the current master
#4. Kill current queries
#* Any database access failure will result in script die.
my$exit_code = 1;
eval
{
## Setting read_only=1 on the new master (to avoid accident)
my $new_master_handler = new MHA::DBHelper();
# args: hostname, port, user, password, raise_error(die_on_error)_or_not
$new_master_handler->connect( $new_master_ip, $new_master_port,
$new_master_user, $new_master_password, 1 );
print current_time_us() .
" Set read_only on the new master.."
;
$new_master_handler->enable_read_only();
if
( $new_master_handler->is_read_only() ) {
print
"ok.\n"
;
}
else
{
die
"Failed!\n"
;
}
$new_master_handler->disconnect();
# Connecting to the orig master, die if any database error happens
my $orig_master_handler = new MHA::DBHelper();
$orig_master_handler->connect( $orig_master_ip, $orig_master_port,
$orig_master_user, $orig_master_password, 1 );
## Drop application user so that nobodycan connect. Disabling per-session binlog beforehand
#$orig_master_handler->disable_log_bin_local();
#print current_time_us() . " Drpping app user on the origmaster..\n";
#FIXME_xxx_drop_app_user($orig_master_handler);
## Waiting for N * 100 milliseconds so that current connections can exit
my $time_until_read_only = 15;
$_tstart = [gettimeofday];
my @threads = get_threads_util( $orig_master_handler->{dbh},
$orig_master_handler->{connection_id} );
while
( $time_until_read_only > 0 && $
#threads >= 0 ) {
if
( $time_until_read_only % 5 == 0 ) {
printf
"%s Waiting all running %d threads aredisconnected.. (max %d milliseconds)\n"
,
current_time_us(), $
#threads + 1, $time_until_read_only * 100;
if
( $
#threads < 5 ) {
print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump .
"\n"
foreach (@threads);
}
}
sleep_until();
$_tstart = [gettimeofday];
$time_until_read_only--;
@threads = get_threads_util( $orig_master_handler->{dbh},
$orig_master_handler->{connection_id} );
}
## Setting read_only=1 on the current master so that nobody(exceptSUPER) can write
print current_time_us() .
" Set read_only=1 on the orig master.."
;
$orig_master_handler->enable_read_only();
if
( $orig_master_handler->is_read_only() ) {
print
"ok.\n"
;
}
else
{
die
"Failed!\n"
;
}
## Waiting for M * 100 milliseconds so that current update queries cancomplete
my $time_until_kill_threads = 5;
@threads = get_threads_util( $orig_master_handler->{dbh},
$orig_master_handler->{connection_id} );
while
( $time_until_kill_threads > 0 && $
#threads >= 0 ) {
if
( $time_until_kill_threads % 5 == 0 ) {
printf
"%s Waiting all running %d queries aredisconnected.. (max %d milliseconds)\n"
,
current_time_us(), $
#threads + 1, $time_until_kill_threads * 100;
if
( $
#threads < 5 ) {
print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump .
"\n"
foreach (@threads);
}
}
sleep_until();
$_tstart = [gettimeofday];
$time_until_kill_threads--;
@threads = get_threads_util( $orig_master_handler->{dbh},
$orig_master_handler->{connection_id} );
}
print
"Disabling the VIPon old master: $orig_master_host \n"
;
&stop_vip();
## Terminating all threads
print current_time_us() .
" Killing all applicationthreads..\n"
;
$orig_master_handler->kill_threads(@threads)
if
( $
#threads >= 0);
print current_time_us() .
" done.\n"
;
#$orig_master_handler->enable_log_bin_local();
$orig_master_handler->disconnect();
## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
$exit_code = 0;
};
if
($@) {
warn
"Got Error: $@\n"
;
exit
$exit_code;
}
exit
$exit_code;
}
elsif ( $
command
eq
"start"
) {
##Activating master ip on the new master
#1. Create app user with write privileges
#2. Moving backup script if needed
#3. Register new master's ip to the catalog database
# We don't return error even thoughactivating updatable accounts/ip failed so that we don't interrupt slaves'recovery.
# If exit code is 0 or 10, MHA does notabort
my$exit_code = 10;
eval
{
my $new_master_handler = new MHA::DBHelper();
# args: hostname, port, user, password, raise_error_or_not
$new_master_handler->connect( $new_master_ip, $new_master_port,
$new_master_user, $new_master_password, 1 );
## Set read_only=0 on the new master
#$new_master_handler->disable_log_bin_local();
print current_time_us() .
" Set read_only=0 on the newmaster.\n"
;
$new_master_handler->disable_read_only();
## Creating an app user on the new master
#print current_time_us() . " Creating app user on the newmaster..\n";
#FIXME_xxx_create_app_user($new_master_handler);
#$new_master_handler->enable_log_bin_local();
$new_master_handler->disconnect();
## Update master ip on the catalog database, etc
print
"Enabling the VIP -$vip on the new master - $new_master_host \n"
;
&start_vip();
$exit_code = 0;
};
if
($@) {
warn
"Got Error: $@\n"
;
exit
$exit_code;
}
exit
$exit_code;
}
elsif ( $
command
eq
"status"
) {
#do nothing
exit
0;
}
else
{
&usage();
exit
1;
}
}
# A simple system call that enable the VIPon the new master
sub start_vip() {
`
ssh
$ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIPon the old_master
sub stop_vip() {
`
ssh
$ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_online_change --command=start|stop|status--orig_master_host=host --orig_master_ip=ip --orig_master_port=port--new_master_host=host --new_master_ip=ip --new_master_port=port\n"
;
die;
}
|
检查SSH连接
在Slave上进行SSH连接的检查,
1
|
#masterha_check_ssh --conf=/etc/mha/app1.conf
|
检查成功,会有“[info] All SSH connection tests passedsuccessfully.”的输出信息。
检查MySQL主从同步监控状态
在Slave上进行MySQL主从同步的检查,
1
|
#masterha_check_repl --conf=/etc/mha/app1.conf
|
检查成功,会有“MySQL Replication Health is OK.”的输出信息。