High Availability Handbook (3): Configuration

Published: 2019-06-11

On the command line, most configuration is done with crm.

Global Cluster Options

These are cluster-wide settings. The two main ones are:

no-quorum-policy

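Quorum is the minimum number of active nodes Pacemaker needs in order to keep operating. The number is floor(num of nodes / 2) + 1, that is, more than half of the nodes.

What happens when the cluster cannot reach quorum?

ignore: keep running. In a two-node cluster, a single failed node already puts the cluster below quorum, so such a cluster has to be set to ignore.

freeze: resources that are already running keep running, but no new resources can be added.

stop: all resources are stopped.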
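The quorum arithmetic can be checked with a few lines of shell. This is only an illustrative sketch: quorum and has_quorum are hypothetical helper names, not part of Pacemaker or crm.

```shell
# Hypothetical helpers, not part of Pacemaker: they only illustrate the arithmetic.

# quorum N: minimum number of active nodes for an N-node cluster.
# Shell integer division rounds down, so this is floor(N/2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum ACTIVE TOTAL: succeeds if ACTIVE nodes out of TOTAL reach quorum.
has_quorum() {
    [ "$1" -ge "$(quorum "$2")" ]
}

quorum 2    # a two-node cluster needs both nodes
quorum 3
if has_quorum 1 2; then echo "quorate"; else echo "quorum lost"; fi
```

With two nodes the result is 2, which is why losing either node costs the cluster its quorum.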

stonith-enabled

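STONITH stands for Shoot-The-Other-Node-In-The-Head, i.e. killing the other node outright.

When a node's heartbeat stops responding, that does not mean the machine has stopped accessing or writing data. Especially with DRBD, the unresponsive node may well write dirty data, so powering the machine off through a power-management interface such as IPMI is the best way to protect the data.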

To set them:

crm configure property no-quorum-policy=ignore

crm configure property stonith-enabled=false

To view the configuration:

crm configure show

Cluster Resources

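IP addresses and services such as apache and mysql are all cluster resources.

Resources are managed by resource agents.

Resource agents come in several classes, which can be listed with the following command: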

# crm ra classes

lsb

ocf / heartbeat pacemaker redhat

service

stonith

upstart

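LSB stands for Linux Standard Base. These agents are the scripts the operating system provides under /etc/init.d, with the actions start, stop, restart, reload, force-reload, and status. They are OS-dependent, however: each operating system implements them differently.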

root@pacemaker01:/home/openstack# crm ra list lsb

acpid apache2 apparmor apport atd

console-setup corosync corosync-notifyd cron dbus

dns-clean friendly-recovery grub-common halt irqbalance

killprocs kmod logd networking ondemand

openhpid pacemaker pppd-dns procps rc

rc.local rcS reboot resolvconf rsync

rsyslog screen-cleanup sendsigs single ssh

sudo udev umountfs umountnfs.sh umountroot

unattended-upgrades urandom

# ls /etc/init.d/

acpid corosync grub-common networking rc rsync ssh unattended-upgrades

apache2 corosync-notifyd halt ondemand rc.local rsyslog sudo urandom

apparmor cron irqbalance openhpid rcS screen-cleanup udev

apport dbus killprocs pacemaker README sendsigs umountfs

atd dns-clean kmod pppd-dns reboot single umountnfs.sh

console-setup friendly-recovery logd procps resolvconf skeleton umountroot

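OCF stands for Open Cluster Framework.

This kind of resource agent hides the differences between operating systems and provides a standard implementation. The agents live in /usr/lib/ocf/resource.d/provider and support the actions start, stop, status, monitor, and meta-data.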
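To make the action interface concrete, here is a minimal sketch of how an OCF-style agent dispatches on its action argument. It is not taken from any shipped agent: sketch_agent and STATE_FILE are hypothetical names, and the "resource" is just a state file. The numeric return codes follow the usual OCF convention (0 for success, 3 for unimplemented, 7 for not running).

```shell
# Hypothetical agent sketch: the "resource" is only a state file.
OCF_SUCCESS=0
OCF_ERR_UNIMPLEMENTED=3
OCF_NOT_RUNNING=7
STATE_FILE="${TMPDIR:-/tmp}/sketch-agent.$$.state"

sketch_agent() {
    case "$1" in
        start)
            touch "$STATE_FILE"          # "start" the resource
            return $OCF_SUCCESS ;;
        stop)
            rm -f "$STATE_FILE"          # stopping twice is still a success
            return $OCF_SUCCESS ;;
        monitor|status)
            [ -f "$STATE_FILE" ] && return $OCF_SUCCESS
            return $OCF_NOT_RUNNING ;;
        meta-data)
            echo '<resource-agent name="sketch"/>'   # real agents print full XML metadata
            return $OCF_SUCCESS ;;
        *)
            return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}

sketch_agent start
sketch_agent monitor && echo "running"
sketch_agent stop
```

A real agent is a standalone script that receives the action as its first argument; wrapping the dispatch in a function here only keeps the example easy to try in an interactive shell.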

root@pacemaker01:/home/openstack# crm ra list ocf heartbeat

AoEtarget AudibleAlarm CTDB ClusterMon Delay Dummy

EvmsSCC Evmsd Filesystem ICP IPaddr IPaddr2

IPsrcaddr IPv6addr LVM LinuxSCSI MailTo ManageRAID

ManageVE Pure-FTPd Raid1 Route SAPDatabase SAPInstance

SendArp ServeRAID SphinxSearchDaemon Squid Stateful SysInfo

VIPArip VirtualDomain WAS WAS6 WinPopup Xen

Xinetd anything apache asterisk conntrackd db2

dhcpd drbd eDir88 ethmonitor exportfs fio

iSCSILogicalUnit iSCSITarget ids iscsi jboss ldirectord

lxc mysql mysql-proxy named nfsserver nginx

oracle oralsnr pgsql pingd portblock postfix

pound proftpd rsyncd rsyslog scsi2reservation sfex

slapd symlink syslog-ng tomcat varnish vmware

# ls /usr/lib/ocf/resource.d/heartbeat/

anything Delay Filesystem iSCSILogicalUnit ManageVE pingd rsyslog Squid vmware

AoEtarget dhcpd fio iSCSITarget mysql portblock SAPDatabase Stateful WAS

apache drbd ICP jboss mysql-proxy postfix SAPInstance symlink WAS6

asterisk Dummy ids ldirectord named pound scsi2reservation SysInfo WinPopup

AudibleAlarm eDir88 IPaddr LinuxSCSI nfsserver proftpd SendArp syslog-ng Xen

ClusterMon ethmonitor IPaddr2 LVM nginx Pure-FTPd ServeRAID tomcat Xinetd

conntrackd Evmsd IPsrcaddr lxc oracle Raid1 sfex varnish

CTDB EvmsSCC IPv6addr MailTo oralsnr Route slapd VIPArip

db2 exportfs iscsi ManageRAID pgsql rsyncd SphinxSearchDaemon VirtualDomain

As an example, look at IPaddr2, in /usr/lib/ocf/resource.d/heartbeat/IPaddr2:

start calls ip_start

ip_start calls add_interface $OCF_RESKEY_ip $NETMASK $BRDCAST $NIC $IFLABEL

add_interface calls $IP2UTIL -f inet addr add $ipaddr/$netmask brd $broadcast dev $iface

Here $IP2UTIL is simply a variable:

root@pacemaker01:/usr/lib/ocf/lib/heartbeat# ls

apache-conf.sh ocf-binaries ocf-rarun ocf-shellfuncs sapdb-nosha.sh

http-mon.sh ocf-directories ocf-returncodes ora-common.sh sapdb.sh

# grep -r "IP2UTIL" *

ocf-binaries:: ${IP2UTIL:=ip}

It is defined here, with ip as its default value; the command may differ between operating systems.
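The `: ${IP2UTIL:=ip}` line is the standard shell idiom for "assign a default unless a value is already set". The short sketch below demonstrates it; the override path /sbin/ip is only an example value.

```shell
# Same idiom as in ocf-binaries: assign a default only if the variable
# is unset or empty. The ":" builtin does nothing; the expansion does the work.
unset IP2UTIL
: "${IP2UTIL:=ip}"
echo "$IP2UTIL"          # the default was applied

# A value set beforehand is left alone.
IP2UTIL=/sbin/ip
: "${IP2UTIL:=ip}"
echo "$IP2UTIL"          # still the value set above
```

This is how a platform can substitute a different binary: it sets the variable before the library file is sourced.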

Resources come in several types.

The most commonly used is the primitive, i.e. the basic type.

Before configuring a primitive resource, you can look at its help first:

crm ra info ocf:heartbeat:IPaddr2

It lists all the parameters that can be set:

Manages virtual IPv4 addresses (Linux specific version) (ocf:heartbeat:IPaddr2)

Parameters (* denotes required, [] the default):

ip* (string): IPv4 address

The IPv4 address to be configured in dotted quad notation, for example

"192.168.1.1".

cidr_netmask (string): CIDR netmask

broadcast (string): Broadcast address

mac (string): Cluster IP MAC address

Operations' defaults (advisory minimum):

start timeout=20s

stop timeout=20s

status timeout=20s interval=10s

monitor timeout=20s interval=10s

So we can configure the resource like this:

crm configure primitive myIP ocf:heartbeat:IPaddr2 params ip=127.0.0.99 op monitor interval=60s

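The second resource type is the group resource.

Some resources belong together: they either all run on one node, or all run on another node.

In the figure, Web Server is a group resource that contains three child resources: IP Addr, Apache, and Filesystem.

A group has the following properties:

Start and Stop: resources are started in the specified order and stopped in the reverse order.

Dependency: all child resources must run on the same node; if one of them cannot run, none of them runs.

Contents: a group contains at least one resource.

Constraints: constraints include colocation, which specifies that two resources must run on the same machine. If a resource has to run on the same machine as a group, you could colocate it with one of the group's child resources, since by the definition of a group the whole group would then run alongside that resource. It is still better to use the group's name in the colocation rather than the name of a child resource.

Stickiness: a group's stickiness is the sum of the stickiness values of all its active child resources.

Resource monitoring: a group cannot be monitored as a whole; each child resource has to be monitored individually.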
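The start/stop ordering rule for groups can be illustrated with a small shell sketch. The helper names group_start_order and group_stop_order are hypothetical, and the member names are just the three from the Web Server example, in an assumed order.

```shell
# Hypothetical helpers: print the order in which a group's members
# would be started and stopped.

# Start order is simply the order in which the members are listed.
group_start_order() {
    echo "$*"
}

# Stop order is the reverse: prepend each member to the result.
group_stop_order() {
    grp_rev=""
    for grp_member in "$@"; do
        grp_rev="$grp_member${grp_rev:+ $grp_rev}"
    done
    echo "$grp_rev"
}

group_start_order Filesystem IPaddr Apache
group_stop_order  Filesystem IPaddr Apache
```

The second call lists Apache first and Filesystem last, so a service is always stopped before the things it depends on.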

To configure a group resource, first define the primitive resources:

crm configure primitive Public-IP ocf:heartbeat:IPaddr2 params ip=1.2.3.4 id=p.public-ip

crm configure primitive Email lsb:exim params id=p.lsb-exim

Next, create a group:

crm configure group mygroup Public-IP Email

To modify the group:

crm configure modgroup mygroup add p.lsb-exim before p.public-ip

To remove a child resource:

crm configure modgroup mygroup remove p.lsb-exim

Clones

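The purpose of a clone is to deploy a resource active-active, so that it runs on several machines at the same time.

There are three kinds of clone:

The Anonymous Clone is the simplest: the resource runs in several places at once, every copy is exactly the same, and each machine can run only one active copy.

A read-only apache is a good example: because the instances only read, they can work side by side without conflicts.

For example, create an apache resource without adding any constraint: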

crm configure primitive WebSite ocf:heartbeat:apache params configfile=/etc/apache2/apache2.conf statusurl=" op monitor interval=1min

crm configure op_defaults timeout=240s

# crm_mon -1

Last updated: Sat Aug 2 13:24:46 2014

Last change: Sat Aug 2 13:24:38 2014 via cibadmin on pacemaker01

Stack: corosync

Current DC: pacemaker01 (1084777482) - partition with quorum

Version: 1.1.10-42f2063

3 Nodes configured

2 Resources configured

Online: [ pacemaker01 pacemaker02 pacemaker03 ]

ClusterIP (ocf::heartbeat:IPaddr2): Started pacemaker01

WebSite (ocf::heartbeat:apache): Started pacemaker03

At this point WebSite is running on pacemaker03. Run ps aux on every node:

root@pacemaker01:/usr/lib/ocf/resource.d/heartbeat# ps aux | grep apache

root 32560 0.0 0.0 11744 900 pts/0 S+ 13:25 0:00 grep --color=auto apache

root@pacemaker02:/home/openstack# ps aux | grep apache

root 4504 0.0 0.0 11744 900 pts/0 S+ 13:25 0:00 grep --color=auto apache

root@pacemaker03:/home/openstack# ps aux | grep apache

root 4455 0.0 0.1 71300 2564 ? Ss 13:24 0:00 /usr/sbin/apache2 -DSTATUS -f /etc/apache2/apache2.conf

www-data 4456 0.0 0.2 360464 4252 ? Sl 13:24 0:00 /usr/sbin/apache2 -DSTATUS -f /etc/apache2/apache2.conf

www-data 4457 0.0 0.2 557136 4928 ? Sl 13:24 0:00 /usr/sbin/apache2 -DSTATUS -f /etc/apache2/apache2.conf

root 4592 0.0 0.0 11744 900 pts/0 S+ 13:25 0:00 grep --color=auto apache

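Now create an apache clone:

crm configure clone apache-clone WebSite

Many defaults are used here: clone-max defaults to one instance for every node in the cluster, and clone-node-max defaults to at most one instance per node.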
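How the two settings bound the number of running instances can be sketched in shell. clone_instances is a hypothetical helper, and it assumes the simple rule that the instance count is capped both by clone-max and by nodes times clone-node-max.

```shell
# Hypothetical helper: clone_instances NODES [CLONE_MAX] [CLONE_NODE_MAX]
# Prints how many clone instances can run under the two limits.
clone_instances() {
    ci_nodes=$1
    ci_clone_max=${2:-$ci_nodes}      # default: one instance per node in the cluster
    ci_node_max=${3:-1}               # default: at most one instance per node
    ci_cap=$(( ci_nodes * ci_node_max ))
    if [ "$ci_clone_max" -lt "$ci_cap" ]; then
        echo "$ci_clone_max"
    else
        echo "$ci_cap"
    fi
}

clone_instances 3        # all defaults on a three-node cluster
clone_instances 3 2 1    # clone-max=2 clone-node-max=1
```

With the defaults, a three-node cluster runs three instances, one on each node.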

As a result, apache is now started on all three nodes:

root@pacemaker01:/usr/lib/ocf/resource.d/heartbeat# ps aux | grep apache

root 410 0.0 0.0 11744 900 pts/0 S+ 13:27 0:00 grep --color=auto apache

root 32754 0.0 0.1 71300 2560 ? Ss 13:27 0:00 /usr/sbin/apache2 -DSTATUS -f /etc/apache2/apache2.conf

www-data 32755 0.0 0.4 360464 8332 ? Sl 13:27 0:00 /usr/sbin/apache2 -DSTATUS -f /etc/apache2/apache2.conf

www-data 32756 0.0 0.4 491600 9008 ? Sl 13:27 0:00 /usr/sbin/apache2 -DSTATUS -f /etc/apache2/apache2.conf

root@pacemaker02:/home/openstack# ps aux | grep apache

root 4533 0.0 0.1 71300 2564 ? Ss 13:27 0:00 /usr/sbin/apache2 -DSTATUS -f /etc/apache2/apache2.conf

www-data 4534 0.0 0.2 360464 4252 ? Sl 13:27 0:00 /usr/sbin/apache2 -DSTATUS -f /etc/apache2/apache2.conf

www-data 4535 0.0 0.2 491600 4928 ? Sl 13:27 0:00 /usr/sbin/apache2 -DSTATUS -f /etc/apache2/apache2.conf

root 4647 0.0 0.0 11744 900 pts/0 S+ 13:27 0:00 grep --color=auto apache

root@pacemaker03:/home/openstack# ps aux | grep apache

root 4455 0.0 0.1 71300 2564 ? Ss 13:24 0:00 /usr/sbin/apache2 -DSTATUS -f /etc/apache2/apache2.conf

www-data 4456 0.0 0.2 491600 4924 ? Sl 13:24 0:00 /usr/sbin/apache2 -DSTATUS -f /etc/apache2/apache2.conf

www-data 4457 0.0 0.2 622672 4928 ? Sl 13:24 0:00 /usr/sbin/apache2 -DSTATUS -f /etc/apache2/apache2.conf

root 4694 0.0 0.0 11744 900 pts/0 S+ 13:27 0:00 grep --color=auto apache

root@pacemaker01:/usr/lib/ocf/resource.d/heartbeat# crm_mon -1

Last updated: Sat Aug 2 13:27:20 2014

Last change: Sat Aug 2 13:27:14 2014 via cibadmin on pacemaker01

Stack: corosync

Current DC: pacemaker01 (1084777482) - partition with quorum

Version: 1.1.10-42f2063

3 Nodes configured

4 Resources configured

Online: [ pacemaker01 pacemaker02 pacemaker03 ]

ClusterIP (ocf::heartbeat:IPaddr2): Started pacemaker01

Clone Set: apache-clone [WebSite]

Started: [ pacemaker01 pacemaker02 pacemaker03 ]

If we move the IPaddr2 resource across the three nodes, we can see that apache has been started on every one of them:

root@pacemaker01:/usr/lib/ocf/resource.d/heartbeat# curl

<html>

<body>My Test Site - pacemaker01</body>

</html>

root@pacemaker01:/usr/lib/ocf/resource.d/heartbeat# crm resource move ClusterIP pacemaker02

root@pacemaker01:/usr/lib/ocf/resource.d/heartbeat# curl

<html>

<body>My Test Site - pacemaker02</body>

</html>

root@pacemaker01:/usr/lib/ocf/resource.d/heartbeat# crm resource move ClusterIP pacemaker03

root@pacemaker01:/usr/lib/ocf/resource.d/heartbeat# curl

<html>

<body>My Test Site - pacemaker03</body>

</html>

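Of course, we could also create three IPaddr2 resources, group each of them with one of the three apache instances, and put haproxy in front to get load balancing.

Any resource agent can be used as an anonymous clone; nothing special is needed beyond being able to start the resource.

The second kind is Globally Unique Clones.

Here a resource is cloned into several instances, but each instance is different. For example, three apache instances might be started: one for financial news, one for political news, and one for entertainment news.

To support globally unique clones you have to write a suitable resource agent yourself; LSB agents, at the very least, cannot do it.

Copies of a clone are identified by appending a colon and a numerical offset, e.g. apache:2. This number is called the clone id.

The resource agent has to behave differently depending on the clone id.

A resource with globally-unique='true' gains two capabilities:

Because every clone instance is unique, two clone instances can run on the same machine.

The resource agent can feed the clone id into a hash function and thereby implement load balancing.

Among the stock resource agents, the one that implements this kind of clone is IPaddr2.

In IPaddr2 we can see the following code: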

$IPTABLES -I INPUT -d $OCF_RESKEY_ip -i $NIC -j CLUSTERIP \

--new \

--clustermac $IF_MAC \

--total-nodes $IP_INC_GLOBAL \

--local-node $IP_INC_NO \

--hashmode $IP_CIP_HASH

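This uses the CLUSTERIP target of iptables.

Normally an IP address is unique on a network and owned by exactly one machine, so when an ARP request looks for that IP, only one machine replies.

To implement load balancing, iptables lets several machines own the same IP, together with a shared clustermac for that IP. A hash is computed over the source IP and source port: total-nodes is the total number of nodes that own the IP, local-node is this node's index among them, and hashmode is the hashing method, which defaults to sourceip-sourceport. Besides this iptables rule, /proc/net/ipt_CLUSTERIP/VIP_ADDRESS is also created. When an ARP request arrives, the hash determines which node should respond, and that host answers.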
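The node-selection idea can be sketched in shell. This is only an illustration of "hash the source address and port, then take it modulo the number of nodes": responsible_node is a hypothetical helper, and cksum stands in for the hash, which is an assumption made for the example and not what the kernel module uses.

```shell
# Hypothetical helper: responsible_node SRC_IP SRC_PORT TOTAL_NODES
# Prints the 1-based index of the node that would answer this client.
# cksum is only a stand-in hash for illustration.
responsible_node() {
    cip_hash=$(printf '%s:%s' "$1" "$2" | cksum | cut -d' ' -f1)
    echo $(( cip_hash % $3 + 1 ))     # +1 because local-node is counted from 1
}

# Each node would compare the result with its own local-node value
# and handle the traffic only on a match.
responsible_node 192.168.100.50 40001 2
responsible_node 192.168.100.50 40002 2
```

Because the source port is part of the hash input, successive connections from the same client can land on different nodes, while any single connection always maps to the same node.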

First configure a clone-vip:

root@pacemaker01:~# crm configure clone clone-vip ClusterIP meta clone-max='2' clone-node-max='1' globally-unique='true'

root@pacemaker01:~# crm_mon -1

Last updated: Sat Aug 2 18:46:40 2014

Last change: Sat Aug 2 18:46:31 2014 via cibadmin on pacemaker01

Stack: corosync

Current DC: pacemaker01 (1084777482) - partition with quorum

Version: 1.1.10-42f2063

3 Nodes configured

3 Resources configured

Online: [ pacemaker01 pacemaker02 pacemaker03 ]

WebSite (ocf::heartbeat:apache): Started pacemaker02

Clone Set: clone-vip [ClusterIP] (unique)

ClusterIP:0 (ocf::heartbeat:IPaddr2): Started pacemaker01

ClusterIP:1 (ocf::heartbeat:IPaddr2): Started pacemaker03

Unlike the apache clone, two distinct ClusterIP instances are configured, ClusterIP:0 and ClusterIP:1, each carrying its clone id as a suffix.

First look at pacemaker01:

root@pacemaker01:~# ip addr

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

valid_lft forever preferred_lft forever

inet6 ::1/128 scope host

valid_lft forever preferred_lft forever

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000

link/ether 52:54:00:9b:d5:11 brd ff:ff:ff:ff:ff:ff

inet 192.168.100.10/24 brd 192.168.100.255 scope global eth0

valid_lft forever preferred_lft forever

inet 192.168.100.100/24 brd 192.168.100.255 scope global secondary eth0

valid_lft forever preferred_lft forever

inet6 fe80::5054:ff:fe9b:d511/64 scope link

valid_lft forever preferred_lft forever

root@pacemaker01:~# iptables -nvL

Chain INPUT (policy ACCEPT 489 packets, 47504 bytes)

pkts bytes target prot opt in out source destination

0 0 CLUSTERIP all -- eth0 * 0.0.0.0/0 192.168.100.100 CLUSTERIP hashmode=sourceip-sourceport clustermac=31:B3:55:6F:CE:87 total_nodes=2 local_node=1 hash_init=0

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)

pkts bytes target prot opt in out source destination

Chain OUTPUT (policy ACCEPT 708 packets, 87757 bytes)

pkts bytes target prot opt in out source destination

Now look at pacemaker03:

root@pacemaker03:/home/openstack# ip addr

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

valid_lft forever preferred_lft forever

inet6 ::1/128 scope host

valid_lft forever preferred_lft forever

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000

link/ether 52:54:00:9b:d5:33 brd ff:ff:ff:ff:ff:ff

inet 192.168.100.12/24 brd 192.168.100.255 scope global eth0

valid_lft forever preferred_lft forever

inet 192.168.100.100/24 brd 192.168.100.255 scope global secondary eth0

valid_lft forever preferred_lft forever

inet6 fe80::5054:ff:fe9b:d533/64 scope link

valid_lft forever preferred_lft forever

root@pacemaker03:/home/openstack# iptables -nvL

Chain INPUT (policy ACCEPT 610 packets, 78996 bytes)

pkts bytes target prot opt in out source destination

0 0 CLUSTERIP all -- eth0 * 0.0.0.0/0 192.168.100.100 CLUSTERIP hashmode=sourceip-sourceport clustermac=31:B3:55:6F:CE:87 total_nodes=2 local_node=2 hash_init=0

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)

pkts bytes target prot opt in out source destination

Chain OUTPUT (policy ACCEPT 390 packets, 44182 bytes)

pkts bytes target prot opt in out source destination

The difference between the two iptables rule sets is visible: only local_node differs.

root@pacemaker03:/home/openstack# crm configure clone clone-apache WebSite

root@pacemaker03:/home/openstack# crm_mon -1

Last updated: Sat Aug 2 18:49:40 2014

Last change: Sat Aug 2 18:49:33 2014 via cibadmin on pacemaker03

Stack: corosync

Current DC: pacemaker01 (1084777482) - partition with quorum

Version: 1.1.10-42f2063

3 Nodes configured

5 Resources configured

Online: [ pacemaker01 pacemaker02 pacemaker03 ]

Clone Set: clone-vip [ClusterIP] (unique)

ClusterIP:0 (ocf::heartbeat:IPaddr2): Started pacemaker01

ClusterIP:1 (ocf::heartbeat:IPaddr2): Started pacemaker03

Clone Set: clone-apache [WebSite]

Started: [ pacemaker01 pacemaker02 pacemaker03 ]

Now access the apache site: sometimes pacemaker01 answers, and sometimes pacemaker03.

root@pacemaker02:/home/openstack# curl

<html>

<body>My Test Site - pacemaker03</body>

</html>

root@pacemaker02:/home/openstack# curl

<html>

<body>My Test Site - pacemaker03</body>

</html>

root@pacemaker02:/home/openstack# curl

<html>

<body>My Test Site - pacemaker03</body>

</html>

root@pacemaker02:/home/openstack# curl

<html>

<body>My Test Site - pacemaker03</body>

</html>

root@pacemaker02:/home/openstack# curl

<html>

<body>My Test Site - pacemaker01</body>

</html>

root@pacemaker02:/home/openstack# curl

<html>

<body>My Test Site - pacemaker01</body>

</html>

root@pacemaker02:/home/openstack# curl

<html>

<body>My Test Site - pacemaker03</body>

</html>

The last kind of clone is the Stateful Clone.

Each instance has a state, mainly one of two: active or passive. Unlike an ordinary active/passive setup, more than one instance can be active.

master-max ：How many copies of the resource can be promoted to master status; default 1.

To support stateful clones, the resource agent needs the actions promote and demote.

A good example of a stateful resource is mysql.

The mysql resource agent is built on MySQL replication: some MySQL instances are masters and some are slaves.

In the file /usr/lib/ocf/resource.d/heartbeat/mysql:

start) mysql_start

The resource agent's start action calls mysql_start.

mysql_start runs the following command:

${OCF_RESKEY_binary} --defaults-file=$OCF_RESKEY_config \

--pid-file=$OCF_RESKEY_pid \

--socket=$OCF_RESKEY_socket \

--datadir=$OCF_RESKEY_datadir \

--user=$OCF_RESKEY_user $OCF_RESKEY_additional_parameters \

$mysql_extra_params >/dev/null 2>&1 &

rc=$?

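Here OCF_RESKEY_binary_default="/usr/local/bin/mysqld_safe", which is the command that starts the MySQL process.

After the MySQL process has started, the agent waits for a while and then checks ocf_is_ms, i.e. whether it is running in master/slave mode.

If so, it first puts the local MySQL into read-only mode with set_read_only on. Since it does not know whether a master is already running, it starts as a slave first.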

master_host=`echo $OCF_RESKEY_CRM_meta_notify_master_uname|tr -d " "`

if [ "$master_host" -a "$master_host" != ${HOSTNAME} ]; then

ocf_log info "Changing MySQL configuration to replicate from $master_host."

set_master

start_slave

if [ $? -ne 0 ]; then

ocf_log err "Failed to start slave"

return $OCF_ERR_GENERIC

fi

else

ocf_log info "No MySQL master present - clearing replication state"

unset_master

fi

Next, look at OCF_RESKEY_CRM_meta_notify_master_uname, which Pacemaker provides as part of the notify action:

$OCF_RESKEY_CRM_meta_notify_master_uname: node name of the node where the resource currently is in the Master role

$OCF_RESKEY_CRM_meta_notify_promote_uname: node name of the node where the resource currently is being promoted to the Master role (promote notifications only)

$OCF_RESKEY_CRM_meta_notify_demote_uname: node name of the node where the resource currently is being demoted to the Slave role (demote notifications only)

When another MySQL instance has been elected master, set_master is called, which runs:

ocf_run $MYSQL $MYSQL_OPTIONS_REPL \

-e "CHANGE MASTER TO MASTER_HOST='$new_master', \

MASTER_USER='$OCF_RESKEY_replication_user', \

MASTER_PASSWORD='$OCF_RESKEY_replication_passwd' $master_params"

This points the slave at the master.

start_slave() {

ocf_run $MYSQL $MYSQL_OPTIONS_REPL \

-e "START SLAVE"

}

If there is no master yet, unset_master is called:

# Now, stop all slave activity and unset the master host

ocf_run $MYSQL $MYSQL_OPTIONS_REPL \

-e "STOP SLAVE"

if [ $? -gt 0 ]; then

ocf_log err "Error stopping rest slave threads"

exit $OCF_ERR_GENERIC

fi

ocf_run $MYSQL $MYSQL_OPTIONS_REPL \

-e "RESET SLAVE;"

if [ $? -gt 0 ]; then

ocf_log err "Failed to reset slave"

exit $OCF_ERR_GENERIC

fi

If there is currently no master but "$master_host" == ${HOSTNAME}, this node itself has been chosen as master.
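The branch taken at start time can be condensed into a small sketch. replication_action is a hypothetical function that only prints which agent functions would be called; it mirrors the condition in the excerpt and is not a replacement for the agent's code.

```shell
# Hypothetical sketch: replication_action MASTER_HOST LOCAL_HOST
# Prints which step the start logic would take.
replication_action() {
    if [ -n "$1" ] && [ "$1" != "$2" ]; then
        # Another node is master: replicate from it.
        echo "set_master start_slave"
    else
        # No master reported, or this node itself is named as master:
        # clear any old replication state.
        echo "unset_master"
    fi
}

replication_action pacemaker02 pacemaker01   # another node is master
replication_action ""          pacemaker01   # no master yet
replication_action pacemaker01 pacemaker01   # this node is the master
```

Both the "no master" case and the "I am the master" case end in unset_master; only a master on a different node makes the instance start replicating.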

Finally it runs $CRM_MASTER -v 1

CRM_MASTER="${HA_SBIN_DIR}/crm_master -l reboot "

crm_master - A convenience wrapper for crm_attribute Set, update or delete a resource's promotion score

-l, --lifetime=value

Until when should the setting take affect. Valid values: reboot, forever

When this node is the master, the promote action is called on it.

That calls mysql_promote, which first stops the slave, ending the instance's role as a slave:

ocf_run $MYSQL $MYSQL_OPTIONS_REPL \

-e "STOP SLAVE"

# Set Master Info in CIB, cluster level attribute

update_data_master_status

master_info="$(get_local_ip)|$(get_master_status File)|$(get_master_status Position)"

${CRM_ATTR_REPL_INFO} -v "$master_info"

The node is about to become master. update_data_master_status does the following:

update_data_master_status() {

master_status_file="${HA_RSCTMP}/master_status.${OCF_RESOURCE_INSTANCE}"

$MYSQL $MYSQL_OPTIONS_REPL -e "SHOW MASTER STATUS\G" > $master_status_file

}

It saves the master status to a file.

CRM_ATTR_REPL_INFO="${HA_SBIN_DIR}/crm_attribute --type crm_config --name ${INSTANCE_ATTR_NAME}_REPL_INFO -s mysql_replicatio

This writes the master's information into the CIB.

set_read_only off

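The instance now officially becomes master.

$CRM_MASTER -v $((${OCF_RESKEY_max_slave_lag}+1)) gives the current master a higher score, so that when the former master comes back the cluster does not switch back to it.

When the other slaves receive the notification that a new master has appeared, they immediately follow it.

In the mysql_notify function,

the post-promote branch calls unset_master, set_master, and start_slave.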

Resource Templates

If you want to define several resources with similar configuration, you can use resource templates:

crm configure rsc_template BigVM ocf:heartbeat:Xen params allow_mem_management="true" op monitor timeout=60s interval=15s op stop timeout=10m op start timeout=10m

A resource can then be created from the template:

crm configure primitive MyVM1 @BigVM params xmfile="/etc/xen/shared-vm/MyVM1" name="MyVM1"

Parameters from the template can also be overridden.

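Resource Parameters

Resources mostly have the following kinds of parameters.

The first kind is Resource Options (Meta Attributes). They are usually defined with the meta keyword, and inside the resource agent they mostly appear as OCF_RESKEY_CRM_meta_XXX.

They can be managed with crm_resource.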

List the configured resources:

# crm_resource --list

root@pacemaker01:/usr/lib/ocf/resource.d/heartbeat# crm_resource --list

Clone Set: clone-vip [ClusterIP] (unique)

ClusterIP:0 (ocf::heartbeat:IPaddr2): Started

ClusterIP:1 (ocf::heartbeat:IPaddr2): Started

Clone Set: clone-apache [WebSite]

Started: [ pacemaker01 pacemaker02 pacemaker03 ]

List the available OCF agents:

# crm_resource --list-agents ocf

List the available OCF agents from the linux-ha project:

# crm_resource --list-agents ocf:heartbeat

Display the current location of 'myResource':

# crm_resource --resource myResource --locate

root@pacemaker01:/usr/lib/ocf/resource.d/heartbeat# crm_resource --resource clone-apache --locate

resource clone-apache is running on: pacemaker03

resource clone-apache is running on: pacemaker01

resource clone-apache is running on: pacemaker02

root@pacemaker01:/usr/lib/ocf/resource.d/heartbeat# crm_resource --resource clone-vip --locate

resource clone-vip is running on: pacemaker01

resource clone-vip is running on: pacemaker03

Move 'myResource' to another machine:

# crm_resource --resource myResource --move

Move 'myResource' to a specific machine:

# crm_resource --resource myResource --move --node altNode

Allow (but not force) 'myResource' to move back to its original location:

# crm_resource --resource myResource --un-move

Tell the cluster that 'myResource' failed:

# crm_resource --resource myResource --fail

Stop a 'myResource' (and anything that depends on it):

# crm_resource --resource myResource --set-parameter target-role --meta --parameter-value Stopped

Tell the cluster not to manage 'myResource':

The cluster will not attempt to start or stop the resource under any circumstances.

Useful when performing maintenance tasks on a resource.

# crm_resource --resource myResource --set-parameter is-managed --meta --parameter-value false

Erase the operation history of 'myResource' on 'aNode':

The cluster will 'forget' the existing resource state (including any errors) and attempt to recover the resource.

Useful when a resource had failed permanently and has been repaired by an administrator.

# crm_resource --resource myResource --cleanup --node aNode

crm_resource --meta --resource Email --set-parameter priority --property-value 100

crm_resource --meta --resource Email --set-parameter multiple-active --property-value block

The second kind is Instance Attributes (Parameters), usually given with the params keyword. crm ra info IPaddr2 shows all of them. These parameters are passed into the resource agent,

for example as OCF_RESKEY_cidr_netmask.

crm_resource --resource Public-IP --set-parameter ip --property-value 1.2.3.4
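The naming convention for what the agent sees can be sketched in shell. param_env_name and meta_env_name are hypothetical helpers; they assume that instance attributes arrive as OCF_RESKEY_<name> and meta attributes as OCF_RESKEY_CRM_meta_<name> with dashes turned into underscores.

```shell
# Hypothetical helpers that build the environment variable name a resource
# agent would read for a given attribute.

# Instance attribute (params): OCF_RESKEY_<name>
param_env_name() {
    echo "OCF_RESKEY_$1"
}

# Meta attribute (meta): OCF_RESKEY_CRM_meta_<name>, dashes become underscores
# because a dash is not valid in a shell variable name.
meta_env_name() {
    echo "OCF_RESKEY_CRM_meta_$(echo "$1" | tr '-' '_')"
}

param_env_name cidr_netmask
meta_env_name clone-max
```

This is why the mysql agent reads names such as OCF_RESKEY_CRM_meta_notify_master_uname, while IPaddr2 reads OCF_RESKEY_ip.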

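The third kind is Resource Operations, usually given with the op keyword. The action is commonly monitor, start, or stop, and the usual settings are interval and timeout: the resource agent's monitor is called once every interval to check the state, and a start or stop must not take longer than timeout.

requires: the condition under which the operation may run: nothing, quorum, fencing

on-fail: what to do when the resource fails: ignore, stop, restart, fence, standby

role: the role the resource must be in for the operation to run: Stopped, Started, Master. For example:

op monitor interval="300s" role="Stopped" timeout="10s"

op monitor interval="30s" timeout="10s"

means the resource is monitored every 30s while it is running and every 300s while it is stopped.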

Setting Global Defaults for Operations

crm_attribute --type op_defaults --attr-name timeout --attr-value 20s

When Resources Take a Long Time to Start/Stop

There are a number of implicit operations that the cluster will always perform - start, stop and a non-recurring monitor operation (used at startup to check the resource isn't already active). If one of these is taking too long, then you can create an entry for them and simply specify a new value.

Multiple Monitor Operations

To tell the resource agent what kind of check to perform, you need to provide each monitor with a different value for a common parameter. The OCF standard creates a special parameter called OCF_CHECK_LEVEL for this purpose and dictates that it is made available to the resource agent without the normal OCF_RESKEY_ prefix.
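A monitor action that honors OCF_CHECK_LEVEL might look roughly like the sketch below. monitor_sketch is a hypothetical function, and what each level checks is an assumption made for illustration; real agents decide for themselves what each level means.

```shell
# Hypothetical monitor function: the depth of the check depends on
# OCF_CHECK_LEVEL, which is read without the OCF_RESKEY_ prefix.
monitor_sketch() {
    case "${OCF_CHECK_LEVEL:-0}" in
        0)  echo "shallow" ;;    # e.g. is the process there?
        10) echo "medium" ;;     # e.g. does the service answer a request?
        20) echo "deep" ;;       # e.g. run an end-to-end functional check
        *)  echo "unknown check level" >&2
            return 1 ;;
    esac
}

# Run in subshells so the setting does not leak into the current shell.
( unset OCF_CHECK_LEVEL; monitor_sketch )
( OCF_CHECK_LEVEL=20; monitor_sketch )
```

Two monitor operations with different intervals could then pass different OCF_CHECK_LEVEL values, so that the cheap check runs often and the expensive one only occasionally.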