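■ Problems with the traditional MySQL master-slave architecture
Single point of failure
● MHA is a mature software suite for failover and master-slave replication in MySQL high-availability environments
When MySQL fails, MHA can complete the failover automatically within 0 to 30 seconds
MHA Manager (management node)
MHA Node (data node)
● During automatic failover, MHA tries to save the binary logs from the crashed master server, so that as little data as possible is lost
● Using semi-synchronous replication greatly reduces the risk of data loss
● MHA currently supports a one-master, multiple-slave architecture with at least three servers, that is, one master and two slaves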
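Case topology
1. MHA architecture
1) Database installation
2) One master, two slaves
3) MHA setup
2. Failure simulation
1) The master fails
2) The candidate master becomes the master
3) Slave 2 switches to the candidate master as its master
3. Failure recovery
1) Repair the failed database and start it
2) Set up a new master-slave relationship on the repaired database
3) Edit the manager configuration file and add the record for the repaired database
4) Restart MHA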
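1. Case environment
Host         OS                   IP address      Role
MHA-manager  CentOS 7.3 (64-bit)  192.168.8.100   Management node, manager component installed
Mysql1       CentOS 7.3 (64-bit)  192.168.8.134   Master node, node component installed
Mysql2       CentOS 7.3 (64-bit)  192.168.8.136   Slave node, node component installed
Mysql3       CentOS 7.3 (64-bit)  192.168.8.139   Slave node, node component installed
The MySQL version used in this example can be downloaded from the official site. Download MHA from other related resources, because the latest version on the official Google site is still 0.55 from 2012, and 0.55 only supports up to CentOS 6. The operating system here is CentOS 7, so MHA 0.57 is used. The MHA architecture is shown in Figure 7.1.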
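2. Case requirements
MHA must monitor the MySQL databases and switch over automatically when a failure occurs, without affecting the business.
3. Implementation approach
1) Install the MySQL database
2) Configure MySQL with one master and two slaves
3) Install the MHA software
4) Configure passwordless authentication
5) Configure MySQL MHA high availability
6) Simulate a master failover
Install the database on each of the three MySQL nodes, using MySQL 5.6.36 and cmake 2.8.6. Only Mysql1 is shown below; the installation proceeds as follows.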
[root@Mysql1 ~]# yum -y install ncurses-devel gcc-c++ perl-Module-Install
[root@Mysql1 ~]# tar zxvf cmake-2.8.6.tar.gz
[root@Mysql1 ~]# cd cmake-2.8.6
[root@Mysql1 cmake-2.8.6]# ./configure
[root@Mysql1 cmake-2.8.6]# gmake && gmake install
[root@Mysql1 ~]# tar -zxvf mysql-5.6.36.tar.gz
[root@Mysql1 ~]# cd mysql-5.6.36
[root@Mysql1 mysql-5.6.36]# cmake -DCMAKE_INSTALL_PREFIX=/usr/local/mysql -DDEFAULT_CHARSET=utf8 -DDEFAULT_COLLATION=utf8_general_ci -DWITH_EXTRA_CHARSETS=all -DSYSCONFDIR=/etc
[root@Mysql1 mysql-5.6.36]# make && make install
[root@Mysql1 mysql-5.6.36]# cp support-files/my-default.cnf /etc/my.cnf
[root@Mysql1 mysql-5.6.36]# cp support-files/mysql.server /etc/rc.d/init.d/mysqld
[root@Mysql1 ~]# chmod +x /etc/rc.d/init.d/mysqld
[root@Mysql1 ~]# chkconfig --add mysqld
[root@Mysql1 ~]# echo 'export PATH=$PATH:/usr/local/mysql/bin' >> /etc/profile
[root@Mysql1 ~]# source /etc/profile
[root@Mysql1 ~]# groupadd mysql
[root@Mysql1 ~]# useradd -M -s /sbin/nologin mysql -g mysql
[root@Mysql1 ~]# chown -R mysql:mysql /usr/local/mysql
[root@Mysql1 ~]# mkdir -p /data/mysql
[root@Mysql1 ~]# /usr/local/mysql/scripts/mysql_install_db --basedir=/usr/local/mysql --datadir=/usr/local/mysql/data --user=mysql
[root@Mysql1 ~]# cat /etc/my.cnf
[mysqld]
server-id = 1
log_bin = master-bin
log-slave-updates = true
Configure the slave servers:
Modify or add the following content in /etc/my.cnf.
[root@Mysql2 ~]# vim /etc/my.cnf
[mysqld]
server-id = 2 # added
log_bin = master-bin
relay-log = relay-log-bin # added
relay-log-index = slave-relay-bin.index # added
Note that server-id must be different on every node.
Any utf8 character-set statement must be removed from the configuration file.
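Because duplicate server-id values break replication, a quick check before starting the servers can help. The following is a minimal sketch, not part of MySQL or MHA: the helper names get_server_id and check_unique_server_ids are hypothetical, and the id 3 for Mysql3 is an assumption, since only the values for Mysql1 and Mysql2 are shown above.

```shell
# Hypothetical helper: extract the server-id value from a my.cnf file.
get_server_id() {
  sed -n 's/^[[:space:]]*server-id[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1"
}

# Hypothetical helper: succeeds only if every server-id passed in is distinct.
# Arguments: the server-id values collected from each node's /etc/my.cnf.
check_unique_server_ids() {
  dups=$(printf '%s\n' "$@" | sort | uniq -d)
  if [ -n "$dups" ]; then
    echo "duplicate server-id: $dups"
    return 1
  fi
  echo "server-id values are unique"
}

# Example: ids 1 and 2 come from the listings above, 3 is assumed for Mysql3.
check_unique_server_ids 1 2 3
```

On a real node, get_server_id /etc/my.cnf would supply each value to compare.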
[root@Mysql1 ~]# ln -s /usr/local/mysql/bin/mysql /usr/sbin/
[root@Mysql1 ~]# ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/
[root@Mysql1 ~]# /usr/local/mysql/bin/mysqld_safe --user=mysql &
Grant two users on all database nodes: one used by the slaves for replication, the other used by the manager.
mysql> grant replication slave on *.* to 'myslave'@'192.168.8.%' identified by '123';
mysql> grant all privileges on *.* to 'mha'@'192.168.8.%' identified by 'manager';
mysql> flush privileges;
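MySQL replication may report an error saying that the two slaves cannot connect to the master by host name, so add the following grants on all databases.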
mysql> grant all privileges on *.* to 'mha'@'Mysql1' identified by 'manager';
mysql> grant all privileges on *.* to 'mha'@'Mysql2' identified by 'manager';
mysql> grant all privileges on *.* to 'mha'@'Mysql3' identified by 'manager';
mysql> change master to master_host='192.168.8.134',master_user='myslave',master_password='123',master_log_file='master-bin.000001',master_log_pos=675;
(Keep the log file name and position consistent with the previous chapter.)
mysql> start slave;
mysql> show slave status\G
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
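The two "Yes" values are the ones to watch on each slave. As a hedged sketch, the check can be scripted; slave_threads_ok is a hypothetical helper name, and it only inspects text passed on standard input.

```shell
# Hypothetical helper: reads "show slave status\G" output on stdin and
# succeeds only if both replication threads report Yes.
slave_threads_ok() {
  awk -F': *' '
    /Slave_IO_Running:/  { io  = $2 }
    /Slave_SQL_Running:/ { sql = $2 }
    END {
      gsub(/[[:space:]]/, "", io); gsub(/[[:space:]]/, "", sql)
      exit !(io == "Yes" && sql == "Yes")
    }
  '
}

# Example with the two lines shown above; on a real slave the input would
# come from: mysql -e 'show slave status\G'
printf 'Slave_IO_Running: Yes\nSlave_SQL_Running: Yes\n' | slave_threads_ok \
  && echo "replication threads running"
```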
Both slaves must be set to read-only mode:
mysql> set global read_only=1;
mysql> create database test_db;
Query OK, 1 row affected (0.00 sec)
mysql> use test_db;
Database changed
mysql> create table test(id int);
Query OK, 0 rows affected (0.13 sec)
mysql> insert into test(id) values (1);
Query OK, 1 row affected (0.03 sec)
mysql> select * from test_db.test;
+------+
| id   |
+------+
|    1 |
+------+
1 row in set (0.00 sec)
[root@MHA-manager ~]# yum install epel-release --nogpgcheck
[root@MHA-manager ~]# yum install -y perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker perl-CPAN
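The node component must be installed on all servers first, and the manager component is installed last on the MHA-manager node, because manager depends on node. The steps below show the node installation on Mysql1.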
[root@Mysql1 ~]# tar zxvf mha4mysql-node-0.57.tar.gz
[root@Mysql1 ~]# cd mha4mysql-node-0.57
[root@Mysql1 mha4mysql-node-0.57]# perl Makefile.PL
[root@Mysql1 mha4mysql-node-0.57]# make
[root@Mysql1 mha4mysql-node-0.57]# make install
[root@MHA-manager ~]# tar zxvf mha4mysql-manager-0.57.tar.gz
[root@MHA-manager ~]# cd mha4mysql-manager-0.57
[root@MHA-manager mha4mysql-manager-0.57]# perl Makefile.PL
*** Module::AutoInstall version 1.06
*** Checking for Perl dependencies...
[Core Features]
- DBI ...loaded. (1.627)
- DBD::mysql ...loaded. (4.023)
- Time::HiRes ...loaded. (1.9725)
- Config::Tiny ...loaded. (2.14)
- Log::Dispatch ...loaded. (2.41)
- Parallel::ForkManager ...loaded. (7.18)
- MHA::NodeConst ...loaded. (0.57)
*** Module::AutoInstall configuration finished.
Checking if your kit is complete...
Looks good
Writing Makefile for mha4mysql::manager
[root@MHA-manager mha4mysql-manager-0.57]# make
[root@MHA-manager mha4mysql-manager-0.57]# make install
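After the manager is installed, several tools are generated under /usr/local/bin, mainly the following:
masterha_check_ssh checks the SSH configuration of MHA
masterha_check_repl checks the MySQL replication status
masterha_manager script that starts the manager
masterha_check_status checks the current MHA running state
masterha_master_monitor detects whether the master is down
masterha_master_switch controls failover (automatic or manual)
masterha_conf_host adds or removes configured server entries
masterha_stop stops the manager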
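The node component also installs several scripts (normally triggered by the MHA Manager scripts, with no manual operation required), mainly the following:
save_binary_logs saves and copies the master's binary logs
apply_diff_relay_logs identifies differing relay log events and applies them to the other slaves
filter_mysqlbinlog removes unnecessary ROLLBACK events (MHA no longer uses this tool)
purge_relay_logs purges relay logs (without blocking the SQL thread)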
[root@MHA-manager ~]# ssh-keygen -t rsa //press Enter at every prompt
[root@MHA-manager ~]# ssh-copy-id 192.168.8.134
[root@MHA-manager ~]# ssh-copy-id 192.168.8.136
[root@MHA-manager ~]# ssh-copy-id 192.168.8.139
[root@Mysql1 ~]# ssh-keygen -t rsa
[root@Mysql1 ~]# ssh-copy-id 192.168.8.136
[root@Mysql1 ~]# ssh-copy-id 192.168.8.139
[root@Mysql2 ~]# ssh-keygen -t rsa
[root@Mysql2 ~]# ssh-copy-id 192.168.8.134
[root@Mysql2 ~]# ssh-copy-id 192.168.8.139
[root@Mysql3 ~]# ssh-keygen -t rsa
[root@Mysql3 ~]# ssh-copy-id 192.168.8.134
[root@Mysql3 ~]# ssh-copy-id 192.168.8.136
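The commands above give every database node passwordless access to every other database node (the manager additionally copies its key to all three). A small sketch, assuming nothing beyond the three addresses used here, prints that plan so it can be reviewed before it is run by hand; print_ssh_copy_plan is a hypothetical helper and executes no ssh command itself.

```shell
# Hypothetical helper: print (not run) the ssh-copy-id command each node
# needs so that every listed host can reach every other listed host.
print_ssh_copy_plan() {
  for src in "$@"; do
    for dst in "$@"; do
      if [ "$src" != "$dst" ]; then
        echo "on $src: ssh-copy-id $dst"
      fi
    done
  done
}

print_ssh_copy_plan 192.168.8.134 192.168.8.136 192.168.8.139
```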
[root@MHA-manager ~]# cp -ra /root/mha4mysql-manager-0.57/samples/scripts /usr/local/bin
After copying, there are four executable files:
[root@atlas ~]# ll /usr/local/bin/scripts/
total 32
-rwxr-xr-x 1 mysql mysql  3648 May 31  2015 master_ip_failover
-rwxr-xr-x 1 mysql mysql  9872 May 25 09:07 master_ip_online_change
-rwxr-xr-x 1 mysql mysql 11867 May 31  2015 power_manager
-rwxr-xr-x 1 mysql mysql  1360 May 31  2015 send_report
master_ip_failover #script that manages the VIP during automatic failover
master_ip_online_change #manages the VIP during an online switchover
power_manager #script that powers off the host after a failure
send_report #script that sends an alert after a failover
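Here the VIP is managed by a script, which is also a recommended approach; keepalived is not really recommended for production environments.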
[root@MHA-manager ~]# cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin
[root@MHA-manager ~]# cat /usr/local/bin/master_ip_failover
#!/usr/bin/env perl
# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
## Note: This is a sample script and is not complete. Modify the script based on your environment.
use strict;
use warnings FATAL =>'all';
use Getopt::Long;
use MHA::DBHelper;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip, $orig_master_port,
$new_master_host, $new_master_ip, $new_master_port,
$new_master_user, $new_master_password
);
my $vip = '192.168.8.200/24';
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
my $exit_code = 0;
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
'new_master_user=s' => \$new_master_user,
'new_master_password=s' => \$new_master_password,
);
exit &main();
sub main {
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
# updating global catalog, etc
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
my $new_master_handler = new MHA::DBHelper();
# args: hostname, port, user, password, raise_error_or_not
$new_master_handler->connect( $new_master_ip, $new_master_port,
$new_master_user, $new_master_password, 1 );
## Set read_only=0 on the new master
$new_master_handler->disable_log_bin_local();
print "Set read_only=0 on the new master.\n";
$new_master_handler->disable_read_only();
## Creating an app user on the new master
print "Creating app user on the new master..\n";
FIXME_xxx_create_user( $new_master_handler->{dbh} );
$new_master_handler->enable_log_bin_local();
$new_master_handler->disconnect();
## Update master ip on the catalog database, etc
#FIXME_xxx;
$exit_code = 0;
};
if ($@) {
warn $@;
# If you want to continue failover, exit 10.
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
# do nothing
exit 0;
}
else {
&usage();
exit 1;
}
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
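The listing defines $ssh_start_vip and $ssh_stop_vip, but as printed it never calls them. One common way to use such commands is to run them over SSH on the old and new master. The sketch below only prints the commands that would be run; vip_command is a hypothetical helper, and wiring it into the stop and start branches is an assumption, not something the listing above shows.

```shell
# Values copied from the variables defined in the Perl script above.
vip='192.168.8.200/24'
key=1
nic=ens33

# Hypothetical helper: print the remote command that would bring the VIP up
# on the new master or take it down on the old one. Nothing is executed.
vip_command() {
  action=$1
  ssh_user=$2
  host=$3
  case "$action" in
    start) echo "ssh $ssh_user@$host /sbin/ifconfig $nic:$key $vip" ;;
    stop)  echo "ssh $ssh_user@$host /sbin/ifconfig $nic:$key down" ;;
    *)     echo "usage: vip_command start|stop user host" >&2; return 1 ;;
  esac
}

vip_command stop root 192.168.8.134
vip_command start root 192.168.8.136
```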
[root@MHA-manager ~]# mkdir /etc/masterha
[root@MHA-manager ~]# cp /root/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha
[root@MHA-manager ~]# cat /etc/masterha/app1.cnf
[server default]
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/manager.log
master_binlog_dir=/usr/local/mysql/data
master_ip_failover_script= /usr/local/bin/master_ip_failover
master_ip_online_change_script= /usr/local/bin/master_ip_online_change
password=manager
user=mha
ping_interval=1
remote_workdir=/tmp
repl_password=123
repl_user=myslave
secondary_check_script= /usr/local/bin/masterha_secondary_check -s 192.168.8.136 -s 192.168.8.139
shutdown_script=""
ssh_user=root
[server1]
hostname=192.168.8.134
port=3306
[server2]
hostname=192.168.8.136
port=3306
candidate_master=1
check_repl_delay=0
[server3]
hostname=192.168.8.139
port=3306
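Before running the MHA checks, it can be useful to confirm which hosts the file actually declares. The following is a sketch under the assumption that each [serverN] block contains a hostname= line written without spaces, as in the file above; list_mha_servers is a hypothetical helper, not an MHA tool.

```shell
# Hypothetical helper: print "section hostname" for each [serverN] block
# of an MHA application configuration file.
list_mha_servers() {
  awk -F= '
    /^\[server[0-9]+\]/ { section = substr($0, 2, index($0, "]") - 2) }
    /^hostname=/ && section != "" { print section, $2 }
  ' "$1"
}

# Example on a shortened copy of the configuration shown above.
conf=$(mktemp)
cat > "$conf" <<'EOF'
[server default]
user=mha
[server1]
hostname=192.168.8.134
[server2]
hostname=192.168.8.136
candidate_master=1
[server3]
hostname=192.168.8.139
EOF
list_mha_servers "$conf"
rm -f "$conf"
```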
[root@MHA-manager ~]# masterha_check_ssh -conf=/etc/masterha/app1.cnf
Thu May 17 14:07:29 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu May 17 14:07:29 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu May 17 14:07:29 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu May 17 14:07:29 2018 - [info] Starting SSH connection tests..
Thu May 17 14:07:30 2018 - [debug]
Thu May 17 14:07:29 2018 - [debug] Connecting via SSH from root@192.168.8.134(192.168.8.134:22) to root@192.168.8.136(192.168.8.136:22)..
Thu May 17 14:07:30 2018 - [debug] ok.
Thu May 17 14:07:30 2018 - [debug] Connecting via SSH from root@192.168.8.134(192.168.8.134:22) to root@192.168.8.139(192.168.8.139:22)..
Thu May 17 14:07:30 2018 - [debug] ok.
Thu May 17 14:07:31 2018 - [debug]
Thu May 17 14:07:30 2018 - [debug] Connecting via SSH from root@192.168.8.136(192.168.8.136:22) to root@192.168.8.134(192.168.8.134:22)..
Thu May 17 14:07:30 2018 - [debug] ok.
Thu May 17 14:07:30 2018 - [debug] Connecting via SSH from root@192.168.8.136(192.168.8.136:22) to root@192.168.8.139(192.168.8.139:22)..
Thu May 17 14:07:30 2018 - [debug] ok.
Thu May 17 14:07:32 2018 - [debug]
Thu May 17 14:07:30 2018 - [debug] Connecting via SSH from root@192.168.8.139(192.168.8.139:22) to root@192.168.8.134(192.168.8.134:22)..
Thu May 17 14:07:31 2018 - [debug] ok.
Thu May 17 14:07:31 2018 - [debug] Connecting via SSH from root@192.168.8.139(192.168.8.139:22) to root@192.168.8.136(192.168.8.136:22)..
Thu May 17 14:07:31 2018 - [debug] ok.
Thu May 17 14:07:32 2018 - [info] All SSH connection tests passed successfully.
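Test the MySQL replication status; a final line reading "MySQL Replication Health is OK" means everything is normal, as shown below: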
[root@MHA-manager ~]# masterha_check_repl -conf=/etc/masterha/app1.cnf
Thu May 17 16:44:55 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu May 17 16:44:55 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu May 17 16:44:55 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu May 17 16:44:55 2018 - [info] MHA::MasterMonitor version 0.57.
Thu May 17 16:44:56 2018 - [info] GTID failover mode = 0
Thu May 17 16:44:56 2018 - [info] Dead Servers:
Thu May 17 16:44:56 2018 - [info] Alive Servers:
Thu May 17 16:44:56 2018 - [info] 192.168.8.134(192.168.8.134:3306)
Thu May 17 16:44:56 2018 - [info] 192.168.8.136(192.168.8.136:3306)
Thu May 17 16:44:56 2018 - [info] 192.168.8.139(192.168.8.139:3306)
Thu May 17 16:44:56 2018 - [info] Alive Slaves:
Thu May 17 16:44:56 2018 - [info] Checking replication filtering settings..
Thu May 17 16:44:56 2018 - [info] binlog_do_db= , binlog_ignore_db=
Thu May 17 16:44:56 2018 - [info] Replication filtering check ok.
......//part of the output omitted
Cleaning up test file(s).. done.
Thu May 17 16:45:00 2018 - [info] Slaves settings check done.
Thu May 17 16:45:00 2018 - [info]
192.168.8.134(192.168.8.134:3306) (current master)
 +--192.168.8.136(192.168.8.136:3306)
 +--192.168.8.139(192.168.8.139:3306)
Thu May 17 16:45:00 2018 - [info] Checking replication health on 192.168.8.136..
Thu May 17 16:45:00 2018 - [info] ok.
Thu May 17 16:45:00 2018 - [info] Checking replication health on 192.168.8.139..
Thu May 17 16:45:00 2018 - [info] ok.
Thu May 17 16:45:00 2018 - [info] Checking master_ip_failover_script status:
Thu May 17 16:45:00 2018 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.8.134 --orig_master_ip=192.168.8.134 --orig_master_port=3306
Checking the Status of the script.. OK
Thu May 17 16:45:00 2018 - [info] OK.
Thu May 17 16:45:00 2018 - [warning] shutdown_script is not defined.
Thu May 17 16:45:00 2018 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
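Since the success criterion is a single line of text, the check is easy to script. A minimal sketch, assuming the output format shown above; repl_health_ok is a hypothetical helper that only scans text on standard input.

```shell
# Hypothetical helper: succeeds if the text on stdin contains the final
# success line printed by masterha_check_repl.
repl_health_ok() {
  grep -q 'MySQL Replication Health is OK\.'
}

# Example with a shortened copy of the output shown above.
printf '%s\n' 'Got exit code 0 (Not master dead).' 'MySQL Replication Health is OK.' \
  | repl_health_ok && echo "replication check passed"
```

In practice the input would come from piping the masterha_check_repl command shown above (with 2>&1) into the helper.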
[root@MHA-manager ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
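--remove_dead_master_conf: after a master-slave switchover, the IP of the old master is removed from the configuration file.
--manager_log: where the log is stored.
--ignore_last_failover: by default, if MHA detects consecutive failures less than 8 hours apart, it does not fail over; this restriction avoids a ping-pong effect. This parameter tells MHA to ignore the file left by the previous failover. By default, after a failover MHA creates an app1.failover.complete file in the log directory set above; on the next failover, if that file is found in the directory, the failover is not allowed unless the file was deleted manually after the first one. For convenience, --ignore_last_failover is set here.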
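When --ignore_last_failover is not used, it helps to know whether a marker file is still in the way. The sketch below rests on two assumptions that should be verified against the installed MHA version: that the marker is named <app>.failover.complete inside the manager work directory, and that the 8-hour (480-minute) window is measured from the file's modification time. recent_failover_marker is a hypothetical helper.

```shell
# Hypothetical helper: succeeds if a failover marker file exists in the
# given work directory and was modified less than the given number of
# minutes ago (default 480, i.e. 8 hours).
recent_failover_marker() {
  workdir=$1
  app=$2
  minutes=${3:-480}
  marker="$workdir/$app.failover.complete"
  [ -f "$marker" ] || return 1
  [ -n "$(find "$marker" -mmin "-$minutes")" ]
}

if recent_failover_marker /var/log/masterha/app1 app1; then
  echo "a recent failover marker exists; a new failover would be refused"
else
  echo "no recent failover marker"
fi
```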
[root@MHA-manager ~]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:7763) is running(0:PING_OK), master:192.168.8.134
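The status line carries the address of the current master, which is handy after a failover. A small sketch, assuming the line format shown above; current_master is a hypothetical helper that prints nothing when the manager is not in the PING_OK state.

```shell
# Hypothetical helper: extract the master address from a status line such as
# "app1 (pid:7763) is running(0:PING_OK), master:192.168.8.134".
current_master() {
  sed -n 's/.*is running(0:PING_OK), master:\([^[:space:]]*\).*/\1/p'
}

echo 'app1 (pid:7763) is running(0:PING_OK), master:192.168.8.134' | current_master
```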
[root@MHA-manager ~]# cat /var/log/masterha/app1/manager.log
Thu May 17 16:49:48 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu May 17 16:49:48 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu May 17 16:49:48 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu May 17 16:49:48 2018 - [info] MHA::MasterMonitor version 0.57.
Thu May 17 16:49:49 2018 - [info] GTID failover mode = 0
Thu May 17 16:49:49 2018 - [info] Dead Servers:
Thu May 17 16:49:49 2018 - [info] Alive Servers:
Thu May 17 16:49:49 2018 - [info] 192.168.8.134(192.168.8.134:3306)
Thu May 17 16:49:49 2018 - [info] 192.168.8.136(192.168.8.136:3306)
Thu May 17 16:49:49 2018 - [info] 192.168.8.139(192.168.8.139:3306)
......//part of the output omitted
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Thu May 17 16:49:54 2018 - [info] Slaves settings check done.
Thu May 17 16:49:54 2018 - [info]
192.168.8.134(192.168.8.134:3306) (current master)
 +--192.168.8.136(192.168.8.136:3306)
 +--192.168.8.139(192.168.8.139:3306)
Thu May 17 16:49:54 2018 - [info] Checking master_ip_failover_script status:
Thu May 17 16:49:54 2018 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.8.134 --orig_master_ip=192.168.8.134 --orig_master_port=3306
Checking the Status of the script.. OK
Thu May 17 16:49:54 2018 - [info] OK.
Thu May 17 16:49:54 2018 - [warning] shutdown_script is not defined.
Thu May 17 16:49:54 2018 - [info] Set master ping interval 1 seconds.
Thu May 17 16:49:54 2018 - [info] Set secondary check script: /usr/local/bin/masterha_secondary_check -s 192.168.8.136 -s 192.168.8.139
Thu May 17 16:49:54 2018 - [info] Starting ping health check on 192.168.8.134(192.168.8.134:3306)..
Thu May 17 16:49:54 2018 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
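Check on Mysql1 that the VIP 192.168.8.200 is present; this VIP does not disappear when the MHA service is stopped on the manager node.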
[root@Mysql1 ~]# ifconfig
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.8.134 netmask 255.255.255.0 broadcast 192.168.8.255
inet6 fe80::20c:29ff:feeb:b2c5 prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:eb:b2:c5 txqueuelen 1000 (Ethernet)
RX packets 32494 bytes 19929135 (19.0 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 20439 bytes 3094488 (2.9 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
ens33:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.8.200 netmask 255.255.255.0 broadcast 192.168.8.255
ether 00:0c:29:eb:b2:c5 txqueuelen 1000
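On the master:
pkill mysqld
Then check the status of the slaves; one of them will have been promoted to master.
How the candidate master is chosen:
1. Slaves are normally ranked by position/GTID. When their data differs, the slave closest to the master becomes the candidate master.
2. When the data is identical, the candidate master is chosen in configuration-file order.
3. When a weight is set (candidate_master=1), the weighted slave is forced to be the candidate master.
1) By default, if a slave is more than 100 MB of relay logs behind the master, its weight is ignored.
2) With check_repl_delay=0, the slave is forced to be the candidate master even if it is far behind in the logs.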
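The rules above can be sketched as a small selection function. This is a simplified reading of the rules as listed, not MHA's actual implementation: the input format (one line per slave, in configuration-file order, with host, lag in bytes, candidate_master flag and check_repl_delay flag) and the name pick_new_master are assumptions made for illustration.

```shell
# Hypothetical sketch of the selection rules. Input on stdin, one slave per
# line in configuration-file order:
#   host  lag_bytes  candidate_master  check_repl_delay
pick_new_master() {
  awk '
    BEGIN { limit = 100 * 1024 * 1024 }   # 100 MB of relay logs
    {
      host = $1; lag = $2 + 0
      # Rule 3: the first candidate_master=1 slave wins, unless it is too far
      # behind and check_repl_delay has not been set to 0.
      if (forced == "" && $3 == 1 && ($4 == 0 || lag <= limit)) forced = host
      # Rules 1 and 2: otherwise the least-lagging slave wins, and ties keep
      # the earlier entry, i.e. configuration-file order.
      if (best == "" || lag < best_lag) { best = host; best_lag = lag }
    }
    END { print (forced != "" ? forced : best) }
  '
}

# Example matching app1.cnf above: server2 has candidate_master=1 and
# check_repl_delay=0, so it is chosen. Prints 192.168.8.136.
printf '%s\n' '192.168.8.136 0 1 0' '192.168.8.139 0 0 1' | pick_new_master
```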
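Failure recovery steps:
1) Repair the failed database and start it:
/etc/init.d/mysqld start
2) On the repaired database, set up replication from the new master (master_auto_position=1 only works when GTID-based replication is enabled; without GTID, specify master_log_file and master_log_pos instead):
>change master to master_host='192.168.8.136',master_port=3306,master_auto_position=1,master_user='mha',master_password='manager';
>start slave;
3) Edit the manager configuration file and add the record for the repaired database back:
vi /etc/masterha/app1.cnf
[server1]
hostname=192.168.8.134
port=3306
4) Restart MHA:
nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &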
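Because --remove_dead_master_conf deletes the old master's block during failover, the block has to be put back before the manager is restarted. A hedged sketch of doing that without creating duplicates follows; ensure_server_block is a hypothetical helper, and the example runs against a temporary file rather than the real /etc/masterha/app1.cnf.

```shell
# Hypothetical helper: append a [serverN] block to an MHA configuration file
# unless a block with that exact name is already present.
ensure_server_block() {
  conf=$1
  block=$2
  host=$3
  port=${4:-3306}
  if grep -qxF "[$block]" "$conf"; then
    echo "[$block] already present"
    return 0
  fi
  printf '[%s]\nhostname=%s\nport=%s\n' "$block" "$host" "$port" >> "$conf"
  echo "[$block] added"
}

# Example on a temporary file that lacks [server1], as after a failover.
demo=$(mktemp)
printf '[server2]\nhostname=192.168.8.136\nport=3306\n' > "$demo"
ensure_server_block "$demo" server1 192.168.8.134
cat "$demo"
rm -f "$demo"
```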
Note:
The first time the VIP is configured, it must be created manually on the master:
<192.168.8.136>#ifconfig ens33:1 192.168.8.200/24
On the manager machine:
masterha_stop --conf=/etc/masterha/app1.cnf
nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
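A few more points to note:
dos2unix /usr/local/bin/master_ip_failover (fixes errors caused by incompatible Chinese/English characters in the script)
In addition, master_ip_failover must have execute permission.
rm -rf /var/log/masterha/app1/app1.failover.complete
To put a server into read-only mode, set the read_only parameter to 1 or TRUE. There are two things to note when setting read_only=1: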