■ PostgreSQL High Availability with Pgpool2 + Watchdog
▶ This post builds PostgreSQL high availability in an Active + Standby layout using Pgpool2 with Watchdog.
▶ pgpool-II version 4.1.4 is used.
▶ The Watchdog package used alongside pgpool2 is version 5.16.
▶ The environment is Ubuntu 22.04 LTS with PostgreSQL 14.5 installed via apt install.
▶ After a failover the old Master is in a failed state, so Auto Failback is intentionally not configured; this prevents the cluster from reverting before the cause of the failure has been identified.
※ Note: with pgpool2 4.1.4, VIP-based service and Master Server Online Recovery were only found to work reliably with a cluster of three or more DB nodes. Using pgpool 4.3 or later is recommended.
- pgpool2 settings differ somewhat between versions; see https://www.pgpool.net for details.
▶ The Master server is the one currently in production; the Slave is a fresh PostgreSQL install whose data was copied from the Master with pg_basebackup.
▶ The failover and failback scripts for pgpool must be written carefully and vary with version and OS environment, so consult multiple references and tune them for your setup.
1. Pgpool and Watchdog basics
- Pgpool is middleware that manages connections and high availability for PostgreSQL.
- Database clients connect to Pgpool's VIP (virtual IP address), and Pgpool routes the connection to the Master DB.
- Pgpool uses Watchdog to check the state of the DB servers.
- Watchdog monitors the DB servers and, when the Master fails, moves the VIP over to the Slave.
2. System configuration
1) Hostnames and IPs
- Master : berasix (192.168.48.129)
- Slave : berasix-2 (192.168.48.130)
- VIP : 192.168.48.121
2) pgpool2 environment
: Config file location - /etc/pgpool2/
3) DB users
: postgres - PostgreSQL administration and Online Recovery
: pgpool - Pgpool health checks; must be newly created
: repluser - replication user, created in the previous post
4) Scripts
4)-1. Failover
- /data/pgpool/script/failover.sh
: the /data/pgpool/script directory must be created and owned by postgres
4)-2. Failback
- /data/pgpool/script/failback.sh
: the /data/pgpool/script directory must be created and owned by postgres
4)-3. Online Recovery
- /data/postgres14-data/main/recovery_1st_stage
- /data/postgres14-data/main/pgpool_remote_start
4)-4. Watchdog escalation : executed at escalation on the new active pgpool
- /data/pgpool/script/escalation.sh
5) Create directories and set ownership (on both Master and Slave servers)
5)-1. Create the failover/failback script directory
root@berasix:~# mkdir -p /data/pgpool/script
root@berasix:~# chown -R postgres.postgres /data/pgpool
5)-2. Create the pgpool2 log directory and configure logging
- Create directory and file
root@berasix:~# mkdir /var/log/pgpool2
root@berasix:~# touch /var/log/pgpool2/pgpool.log
root@berasix:~# chown -R syslog.adm /var/log/pgpool2
- syslog configuration
root@berasix:~# vi /etc/rsyslog.d/50-default.conf
...
local2.* /var/log/pgpool2/pgpool.log
- logrotate configuration
root@berasix:~# vi /etc/logrotate.d/rsyslog
...
/var/log/pgpool2/pgpool.log
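Instead of appending the path to the distribution's rsyslog stanza, rotation can also be defined in a dedicated fragment. A sketch (the file name and thresholds are assumptions; adjust retention to your needs):

```
# /etc/logrotate.d/pgpool2 (hypothetical standalone fragment)
/var/log/pgpool2/pgpool.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
    postrotate
        /usr/lib/rsyslog/rsyslog-rotate
    endscript
}
```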
6) Register the servers in /etc/hosts
root@berasix:/data/postgres14-data# cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 berasix
192.168.48.130 berasix-2
192.168.48.129 berasix
3. Install pgpool2 (on both Master and Slave servers)
- For convenience, installation was done as root.
- Note that on Ubuntu the pgpool2 daemon starts automatically once the apt install completes.
root@berasix-2:/data/postgres14-data# apt install pgpool2
- The installed version is pgpool-II 4.1.4 (karasukiboshi). The latest release is 4.3.3, but this version was used because it is the one in the Ubuntu repository.
4. Install Watchdog (on both Master and Slave servers)
- For convenience, installation was done as root.
root@berasix:~# apt install watchdog
5. Create the pgpool DB role (on both Master and Slave servers)
root@berasix:~# sudo -u postgres psql
could not change directory to "/root": Permission denied
psql (14.5 (Ubuntu 14.5-0ubuntu0.22.04.1))
Type "help" for help.
postgres=# create role pgpool with login;
CREATE ROLE
postgres=# \password pgpool
Enter new password for user "pgpool":
Enter it again:
postgres=# grant pg_monitor to pgpool;
GRANT ROLE
postgres=#
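On the PostgreSQL side, the pgpool health-check and streaming-replication-check users must be allowed to connect from both nodes in pg_hba.conf. A sketch, assuming the 192.168.48.0/24 network and md5 authentication (adjust to your subnet and auth method):

```
# pg_hba.conf (sketch) - on both Master and Slave, then reload PostgreSQL
host    all           pgpool     192.168.48.0/24    md5
host    all           repluser   192.168.48.0/24    md5
host    replication   repluser   192.168.48.0/24    md5
```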
6. Pgpool configuration (pgpool.conf)
1) The pgpool.conf file is identical on the master and slave except for the hostname-related settings, where each node sets its own and the peer's IP as noted in the comments below.
- Back up the original configuration file first.
: root@berasix:~# cp -arp /etc/pgpool2/pgpool.conf /etc/pgpool2/pgpool.conf.org
- vi /etc/pgpool2/pgpool.conf
#------------------------------------------------------------------------------
# CONNECTIONS
#------------------------------------------------------------------------------
listen_addresses = '*' # addresses to accept connections on
port = 5433 # pgpool service port
pcp_listen_addresses = '*'
pcp_port = 9898
# - Backend Connection Settings - # PostgreSQL backend information
backend_hostname0 = '192.168.48.129' # Master(berasix)
backend_port0 = 5432 # Master DB Port
backend_weight0 = 1
backend_data_directory0 = '/data/postgres14-data/main' # Master DB Directory
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = '192.168.48.130' # Slave(berasix-2)
backend_port1 = 5432 # DB Service Port
backend_weight1 = 1
backend_data_directory1 = '/data/postgres14-data/main' # DB Directory
backend_flag1 = 'ALLOW_TO_FAILOVER'
# - Authentication -
enable_pool_hba = on # use pool_hba.conf for client authentication
#------------------------------------------------------------------------------
# LOGS
#------------------------------------------------------------------------------
# - What to log -
log_connections = on # Default off.
log_hostname = on # Default off.
#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------
logdir = '/var/log/postgresql' # location of log files and the pgpool_status file
#------------------------------------------------------------------------------
# LOAD BALANCING MODE
#------------------------------------------------------------------------------
load_balance_mode = on # Default off. Must be on for Active + hot_standby operation
#------------------------------------------------------------------------------
# MASTER/SLAVE MODE
#------------------------------------------------------------------------------
master_slave_mode = on # Default off. Must be on for Active + hot_standby operation
master_slave_sub_mode = 'stream'
# - Streaming -
sr_check_period = 5 # Disabled (0) by default. Must be set (> 0) for Active + hot_standby operation
sr_check_user = 'repluser' # streaming replication user
sr_check_password = 'replica'
sr_check_database = 'postgres'
# - Special commands -
follow_master_command = ''
#follow_master_command = '/etc/pgpool2/follow_master.sh %d %h %p %D %m %H %M %P %r %R' ## sets the new master after failover on clusters of 3+ nodes; not needed with 2 nodes
#------------------------------------------------------------------------------
# HEALTH CHECK GLOBAL PARAMETERS
#------------------------------------------------------------------------------
health_check_period = 5 # Disabled (0) by default.
health_check_timeout = 10
health_check_user = 'pgpool'
health_check_password = 'pgpool-2'
health_check_max_retries = 3 # 0 by default. Tune the health check retry count to your environment
#------------------------------------------------------------------------------
# FAILOVER AND FAILBACK
#------------------------------------------------------------------------------
failover_command = '/data/pgpool/script/failover.sh %d %h %p %D %m %H %M %P %r %R %N %S'
failback_command = '/data/pgpool/script/failback.sh %d %h %p %D %m %H %M %P %r %R %N %S'
auto_failback = off # Default off
#------------------------------------------------------------------------------
# ONLINE RECOVERY
#------------------------------------------------------------------------------
recovery_user = 'postgres'
recovery_password = ''
recovery_1st_stage_command = 'recovery_1st_stage' # used when running the pcp_recovery_node command
#------------------------------------------------------------------------------
# WATCHDOG
#------------------------------------------------------------------------------
# - Enabling -
use_watchdog = on # Default off. Enables watchdog
# - Watchdog communication Settings -
wd_hostname = '192.168.48.129' # this node's watchdog address; on the Slave (berasix-2) set '192.168.48.130'
wd_port = 9000
# - Virtual IP control Setting -
delegate_IP = '192.168.48.121' # VIP for the DB service
#if_up_cmd = '/usr/bin/sudo /sbin/ip addr add $_IP_$/24 dev eth0 label eth0:0'
if_up_cmd = '/usr/bin/sudo /sbin/ip addr add $_IP_$/24 dev ens33 label ens33:0' # adjust to your NIC device name
#if_down_cmd = '/usr/bin/sudo /sbin/ip addr del $_IP_$/24 dev eth0'
if_down_cmd = '/usr/bin/sudo /sbin/ip addr del $_IP_$/24 dev ens33'
arping_cmd = '/usr/bin/sudo /usr/sbin/arping -U $_IP_$ -w 1 -I ens33' # adjust to your NIC device name
# - Behavior on escalation Setting -
wd_escalation_command = '/data/pgpool/script/escalation.sh' # escalation script: removes the VIP from the other pgpool nodes
# -- heartbeat mode --
heartbeat_destination0 = '192.168.48.130' # heartbeat destination: the peer server's IP
other_pgpool_hostname0 = '192.168.48.130' # the peer server's IP
other_pgpool_port0 = 5433 # Default is 5432, but this is the peer's pgpool port, so it must be 5433
other_wd_port0 = 9000
#------------------------------------------------------------------------------
# LOGS
#------------------------------------------------------------------------------
# - Where to log -
log_destination = 'syslog'
syslog_facility = 'local2'
#------------------------------------------------------------------------------
※ Notes
1) arping checks network reachability via ARP requests. It is not installed by default on Ubuntu 22.04, so install it:
root@berasix:/etc/pgpool2# apt install arping
2) The postgres user must be able to run /sbin/ip and /usr/sbin/arping without a password so the VIP can be brought up and down:
root@berasix:/etc/pgpool2# echo 'postgres ALL=NOPASSWD: /sbin/ip, /usr/sbin/arping' >> /etc/sudoers
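Since enable_pool_hba = on above, client authentication at the pgpool layer is controlled by /etc/pgpool2/pool_hba.conf, whose format follows pg_hba.conf. A minimal sketch (the subnet is an assumption):

```
# /etc/pgpool2/pool_hba.conf (sketch)
local   all   all                        trust
host    all   all   192.168.48.0/24      md5
```

Note that md5 entries here generally also require matching user entries in pool_passwd (created with the pg_md5 utility).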
7. Generate ssh keys (on both Master and Slave servers)
- Keys for passwordless ssh access to the remote DB server during failover and failback.
- See the post below for details on ssh key generation.
https://berasix.tistory.com/entry/SSH-%ED%8C%A8%EC%8A%A4%EC%9B%8C%EB%93%9C%EC%97%86%EC%9D%B4-%EB%A1%9C%EA%B7%B8%EC%9D%B8%ED%95%98%EA%B8%B0
1) root
cd ~/.ssh
ssh-keygen -t rsa -f id_rsa_pgpool
ssh-copy-id -i id_rsa_pgpool.pub postgres@192.168.48.129
ssh-copy-id -i id_rsa_pgpool.pub postgres@192.168.48.130
2) postgres
su - postgres
mkdir ~/.ssh
chmod 700 ~/.ssh
cd ~/.ssh
ssh-keygen -t rsa -f id_rsa_pgpool
ssh-copy-id -i id_rsa_pgpool.pub postgres@192.168.48.129
ssh-copy-id -i id_rsa_pgpool.pub postgres@192.168.48.130
8. Create the scripts and copy them (Master server => Slave server)
1) See section 9 for the script contents.
2) The Online Recovery scripts go in the following paths:
$PGDATA/recovery_1st_stage
$PGDATA/pgpool_remote_start
3) Copy the scripts to the same paths on the Slave server.
root@berasix:~# chown -R postgres.postgres /data/pgpool
root@berasix:~# chown postgres.postgres /etc/pgpool2/pgpool.conf
root@berasix:~# scp -p /etc/pgpool2/pgpool.conf postgres@192.168.48.130:/etc/pgpool2/
root@berasix:~# scp -p /data/postgres14-data/main/recovery_1st_stage postgres@192.168.48.130:/data/postgres14-data/main
root@berasix:~# scp -p /data/postgres14-data/main/pgpool_remote_start postgres@192.168.48.130:/data/postgres14-data/main
root@berasix:~# scp -p /data/pgpool/script/failover.sh postgres@192.168.48.130:/data/pgpool/script
root@berasix:~# scp -p /data/pgpool/script/failback.sh postgres@192.168.48.130:/data/pgpool/script
root@berasix:~# scp -p /data/pgpool/script/escalation.sh postgres@192.168.48.130:/data/pgpool/script
※ Note
The recovery_1st_stage script does not support tablespaces. If you use tablespaces, modify the script so that they are handled.
9. pgpool scripts
- These scripts are based on the pgpool example cluster documentation ( https://www.pgpool.net/docs/41/en/html/example-cluster.html ).
1) Failover script
- /data/pgpool/script/failover.sh
#!/bin/bash
# This script is run by failover_command.
set -o xtrace
exec > >(logger -i -p local1.info) 2>&1
# Special values:
# %d = failed node id
# %h = failed node hostname
# %p = failed node port number
# %D = failed node database cluster path
# %m = new master node id
# %H = new master node hostname
# %M = old master node id
# %P = old primary node id
# %r = new master port number
# %R = new master database cluster path
# %N = old primary node hostname
# %S = old primary node port number
# %% = '%' character
FAILED_NODE_ID="$1"
FAILED_NODE_HOST="$2"
FAILED_NODE_PORT="$3"
FAILED_NODE_PGDATA="$4"
NEW_MASTER_NODE_ID="$5"
NEW_MASTER_NODE_HOST="$6"
OLD_MASTER_NODE_ID="$7"
OLD_PRIMARY_NODE_ID="$8"
NEW_MASTER_NODE_PORT="$9"
NEW_MASTER_NODE_PGDATA="${10}"
OLD_PRIMARY_NODE_HOST="${11}"
OLD_PRIMARY_NODE_PORT="${12}"
PGHOME=/usr/lib/postgresql/14
REPL_SLOT_NAME=${FAILED_NODE_HOST//[-.]/_}
POSTGRESQL_STARTUP_USER=postgres
SSH_KEY_FILE=id_rsa_pgpool
SSH_OPTIONS="-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ~/.ssh/${SSH_KEY_FILE}"
logger -i -p local1.info failover.sh: start: failed_node_id=$FAILED_NODE_ID old_primary_node_id=$OLD_PRIMARY_NODE_ID failed_host=$FAILED_NODE_HOST new_master_host=$NEW_MASTER_NODE_HOST
## If there's no master node anymore, skip failover.
if [ $NEW_MASTER_NODE_ID -lt 0 ]; then
logger -i -p local1.info failover.sh: All nodes are down. Skipping failover.
exit 0
fi
## Test passwordless SSH
ssh -T ${SSH_OPTIONS} ${POSTGRESQL_STARTUP_USER}@${NEW_MASTER_NODE_HOST} ls /tmp > /dev/null
if [ $? -ne 0 ]; then
logger -i -p local1.info failover.sh: passwordless SSH to postgres@${NEW_MASTER_NODE_HOST} failed. Please setup passwordless SSH.
exit 1
fi
## If Standby node is down, skip failover.
if [ $OLD_PRIMARY_NODE_ID != "-1" -a $FAILED_NODE_ID != $OLD_PRIMARY_NODE_ID ]; then
logger -i -p local1.info failover.sh: Standby node is down. Skipping failover.
ssh -T ${SSH_OPTIONS} ${POSTGRESQL_STARTUP_USER}@${OLD_PRIMARY_NODE_HOST} "
${PGHOME}/bin/psql -p $OLD_PRIMARY_NODE_PORT -c \"SELECT pg_drop_replication_slot('${REPL_SLOT_NAME}')\"
"
if [ $? -ne 0 ]; then
logger -i -p local1.error failover.sh: drop replication slot "${REPL_SLOT_NAME}" failed
exit 1
fi
exit 0
fi
## Promote Standby node.
logger -i -p local1.info failover.sh: Primary node is down, promote standby node ${NEW_MASTER_NODE_HOST}.
ssh -T ${SSH_OPTIONS} ${POSTGRESQL_STARTUP_USER}@${NEW_MASTER_NODE_HOST} ${PGHOME}/bin/pg_ctl -D ${NEW_MASTER_NODE_PGDATA} -w promote
if [ $? -ne 0 ]; then
logger -i -p local1.error failover.sh: new_master_host=$NEW_MASTER_NODE_HOST promote failed
exit 1
fi
logger -i -p local1.info failover.sh: end: new_master_node_id=$NEW_MASTER_NODE_ID started as the primary node
exit 0
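The `REPL_SLOT_NAME=${FAILED_NODE_HOST//[-.]/_}` assignment above uses bash pattern substitution: replication slot names may only contain lower-case letters, digits, and underscores, so every '-' and '.' in the hostname is replaced with '_'. A standalone illustration with a made-up hostname:

```shell
#!/bin/bash
# bash ${var//pattern/replacement}: replace every '-' or '.' with '_'
host="berasix-2.example.com"   # hypothetical hostname
slot="${host//[-.]/_}"
echo "$slot"                   # prints: berasix_2_example_com
```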
2) Failback script
- /data/pgpool/script/failback.sh
#!/bin/bash
# This script is run after failover_command to synchronize the Standby with the new Primary.
# First try pg_rewind. If pg_rewind failed, use pg_basebackup.
set -o xtrace
exec > >(logger -i -p local1.info) 2>&1
# Special values:
# 1) %d = node id
# 2) %h = hostname
# 3) %p = port number
# 4) %D = database cluster path
# 5) %m = new primary node id
# 6) %H = new primary node hostname
# 7) %M = old master node id
# 8) %P = old primary node id
# 9) %r = new primary port number
# 10) %R = new primary database cluster path
# 11) %N = old primary node hostname
# 12) %S = old primary node port number
# 13) %% = '%' character
NODE_ID="$1"
NODE_HOST="$2"
NODE_PORT="$3"
NODE_PGDATA="$4"
NEW_MASTER_NODE_ID="$5"
NEW_MASTER_NODE_HOST="$6"
OLD_MASTER_NODE_ID="$7"
OLD_PRIMARY_NODE_ID="$8"
NEW_MASTER_NODE_PORT="$9"
NEW_MASTER_NODE_PGDATA="${10}"
PGHOME=/usr/lib/postgresql/14
ARCHIVEDIR=/data/postgres14-data/archive
REPLUSER=repluser
SSH_USER=postgres
PCP_USER=pgpool
PGPOOL_PATH=/usr/sbin
PCP_PORT=9898
REPL_SLOT_NAME=${NODE_HOST//[-.]/_}
POSTGRESQL_STARTUP_USER=postgres
SSH_KEY_FILE=id_rsa_pgpool
SSH_OPTIONS="-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ~/.ssh/${SSH_KEY_FILE}"
logger -i -p local1.info failback.sh: start: Standby node ${NODE_ID}
# Check the connection status of Standby
${PGHOME}/bin/pg_isready -h ${NODE_HOST} -p ${NODE_PORT} > /dev/null 2>&1
if [ $? -ne 0 ]; then
logger -i -p local1.info failback.sh: node_id=${NODE_ID} is not running. skipping failback
exit 0
fi
## Test passwordless SSH
ssh -T ${SSH_OPTIONS} ${POSTGRESQL_STARTUP_USER}@${NEW_MASTER_NODE_HOST} ls /tmp > /dev/null
if [ $? -ne 0 ]; then
logger -i -p local1.info failback.sh: passwordless SSH to postgres@${NEW_MASTER_NODE_HOST} failed. Please setup passwordless SSH.
exit 1
fi
## Get PostgreSQL major version
PGVERSION=`${PGHOME}/bin/initdb -V | awk '{print $3}' | sed 's/\..*//' | sed 's/\([0-9]*\)[a-zA-Z].*/\1/'`
if [ $PGVERSION -ge 12 ]; then
RECOVERYCONF=${NODE_PGDATA}/myrecovery.conf
else
RECOVERYCONF=${NODE_PGDATA}/recovery.conf
fi
# Synchronize Standby with the new Primary.
logger -i -p local1.info failback.sh: pg_rewind for node $NODE_ID
# Create replication slot "${REPL_SLOT_NAME}"
${PGHOME}/bin/psql -h ${NEW_MASTER_NODE_HOST} -p ${NEW_MASTER_NODE_PORT} \
-c "SELECT pg_create_physical_replication_slot('${REPL_SLOT_NAME}');" >/dev/null 2>&1
if [ $? -ne 0 ]; then
logger -i -p local1.error failback.sh: create replication slot \"${REPL_SLOT_NAME}\" failed. You may need to create replication slot manually.
fi
ssh -T ${SSH_OPTIONS} ${POSTGRESQL_STARTUP_USER}@${NODE_HOST} "
set -o errexit
${PGHOME}/bin/pg_ctl -w -m f -D ${NODE_PGDATA} stop
${PGHOME}/bin/pg_rewind -D ${NODE_PGDATA} --source-server=\"user=postgres host=${NEW_MASTER_NODE_HOST} port=${NEW_MASTER_NODE_PORT}\"
rm -rf ${NODE_PGDATA}/pg_replslot/*
cat > ${RECOVERYCONF} << EOT
primary_conninfo = 'host=${NEW_MASTER_NODE_HOST} port=${NEW_MASTER_NODE_PORT} user=${REPLUSER} application_name=${NODE_HOST} passfile=''/var/lib/postgresql/.pgpass'''
recovery_target_timeline = 'latest'
restore_command = 'scp ${SSH_OPTIONS} ${NEW_MASTER_NODE_HOST}:${ARCHIVEDIR}/%f %p'
primary_slot_name = '${REPL_SLOT_NAME}'
EOT
if [ ${PGVERSION} -ge 12 ]; then
sed -i -e \"\\\$ainclude_if_exists = '$(echo ${RECOVERYCONF} | sed -e 's/\//\\\//g')'\" \
-e \"/^include_if_exists = '$(echo ${RECOVERYCONF} | sed -e 's/\//\\\//g')'/d\" ${NODE_PGDATA}/postgresql.conf
touch ${NODE_PGDATA}/standby.signal
else
echo \"standby_mode = 'on'\" >> ${RECOVERYCONF}
fi
${PGHOME}/bin/pg_ctl -l /dev/null -w -D ${NODE_PGDATA} start
"
# If pg_rewind failed, try pg_basebackup
if [ $? -ne 0 ]; then
logger -i -p local1.error failback.sh: end: pg_rewind failed. Try pg_basebackup.
ssh -T ${SSH_OPTIONS} ${POSTGRESQL_STARTUP_USER}@${NODE_HOST} "
set -o errexit
rm -rf ${NODE_PGDATA}
rm -rf ${ARCHIVEDIR}/*
${PGHOME}/bin/pg_basebackup -h ${NEW_MASTER_NODE_HOST} -U $REPLUSER -p ${NEW_MASTER_NODE_PORT} -D ${NODE_PGDATA} -X stream
cat > ${RECOVERYCONF} << EOT
primary_conninfo = 'host=${NEW_MASTER_NODE_HOST} port=${NEW_MASTER_NODE_PORT} user=${REPLUSER} application_name=${NODE_HOST} passfile=''/var/lib/postgresql/.pgpass'''
recovery_target_timeline = 'latest'
restore_command = 'scp ${SSH_OPTIONS} ${NEW_MASTER_NODE_HOST}:${ARCHIVEDIR}/%f %p'
primary_slot_name = '${REPL_SLOT_NAME}'
EOT
if [ ${PGVERSION} -ge 12 ]; then
sed -i -e \"\\\$ainclude_if_exists = '$(echo ${RECOVERYCONF} | sed -e 's/\//\\\//g')'\" \
-e \"/^include_if_exists = '$(echo ${RECOVERYCONF} | sed -e 's/\//\\\//g')'/d\" ${NODE_PGDATA}/postgresql.conf
touch ${NODE_PGDATA}/standby.signal
else
echo \"standby_mode = 'on'\" >> ${RECOVERYCONF}
fi
"
if [ $? -ne 0 ]; then
# drop replication slot
ssh -T ${SSH_OPTIONS} ${POSTGRESQL_STARTUP_USER}@${NEW_MASTER_NODE_HOST} "
${PGHOME}/bin/psql -p ${NEW_MASTER_NODE_PORT} -c \"SELECT pg_drop_replication_slot('${REPL_SLOT_NAME}')\"
"
logger -i -p local1.error failback.sh: end: pg_basebackup failed
exit 1
fi
# start Standby node on ${NODE_HOST}
ssh -T ${SSH_OPTIONS} ${POSTGRESQL_STARTUP_USER}@${NODE_HOST} $PGHOME/bin/pg_ctl -l /dev/null -w -D ${NODE_PGDATA} start
fi
# If start Standby successfully, attach this node
if [ $? -eq 0 ]; then
# Run pcp_attach_node to attach Standby node to Pgpool-II.
${PGPOOL_PATH}/pcp_attach_node -w -h localhost -U $PCP_USER -p ${PCP_PORT} -n ${NODE_ID}
if [ $? -ne 0 ]; then
logger -i -p local1.error failback.sh: end: pcp_attach_node failed
exit 1
fi
else
# If start Standby failed, drop replication slot "${REPL_SLOT_NAME}"
${PGHOME}/bin/psql -h ${NEW_MASTER_NODE_HOST} -p ${NEW_MASTER_NODE_PORT} \
-c "SELECT pg_drop_replication_slot('${REPL_SLOT_NAME}');" >/dev/null 2>&1
if [ $? -ne 0 ]; then
logger -i -p local1.error failback.sh: drop replication slot \"${REPL_SLOT_NAME}\" failed. You may need to drop replication slot manually.
fi
logger -i -p local1.error failback.sh: end: failback failed
exit 1
fi
logger -i -p local1.info failback.sh: end: failback complete
exit 0
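The PGVERSION line in the script above derives the PostgreSQL major version by parsing `initdb -V` output with awk and sed (the trailing sed handles pre-release tags such as '15beta1'). Fed a sample version string, the pipeline behaves like this:

```shell
#!/bin/bash
# Sample of what `initdb -V` prints; the real script captures it directly.
version_line="initdb (PostgreSQL) 14.5"
# field 3 -> "14.5"; strip from the first dot -> "14"; strip any letter suffix.
PGVERSION=$(echo "$version_line" | awk '{print $3}' | sed 's/\..*//' | sed 's/\([0-9]*\)[a-zA-Z].*/\1/')
echo "$PGVERSION"   # prints: 14
```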
3) VIP escalation script
- /data/pgpool/script/escalation.sh
#!/bin/bash
# This script is run by wd_escalation_command to bring down the virtual IP on other pgpool nodes
# before bringing up the virtual IP on the new active pgpool node.
set -o xtrace
#PGPOOLS=(server1 server2)
PGPOOLS=(berasix berasix-2) # entries must match each node's $HOSTNAME so the self-skip check below works
VIP=192.168.48.121
DEVICE=ens33
USER=postgres
for pgpool in "${PGPOOLS[@]}"; do
[ "$HOSTNAME" = "$pgpool" ] && continue
ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null ${USER}@$pgpool -i ~/.ssh/id_rsa_pgpool "
/usr/bin/sudo /sbin/ip addr del $VIP/24 dev $DEVICE
"
done
exit 0
4) Master Server Online Recovery (standby mode) script
- /data/postgres14-data/main/recovery_1st_stage
#!/bin/bash
# This script is executed by "recovery_1st_stage" to recover a Standby node.
set -o xtrace
exec > >(logger -i -p local1.info) 2>&1
PRIMARY_NODE_PGDATA="$1"
DEST_NODE_HOST="$2"
DEST_NODE_PGDATA="$3"
PRIMARY_NODE_PORT="$4"
DEST_NODE_ID="$5"
DEST_NODE_PORT="$6"
PRIMARY_NODE_HOST=$(hostname)
PGHOME=/usr/lib/postgresql/14
ARCHIVEDIR=/data/postgres14-data/archive
POSTGRESQL_CONF=/etc/postgresql/14/main
REPLUSER=repluser
REPL_SLOT_NAME=${DEST_NODE_HOST//[-.]/_}
POSTGRESQL_STARTUP_USER=postgres
SSH_KEY_FILE=id_rsa_pgpool
SSH_OPTIONS="-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ~/.ssh/${SSH_KEY_FILE}"
logger -i -p local1.info recovery_1st_stage: start: pg_basebackup for Standby node $DEST_NODE_ID
## Test passwordless SSH
ssh -T ${SSH_OPTIONS} ${POSTGRESQL_STARTUP_USER}@${DEST_NODE_HOST} ls /tmp > /dev/null
if [ $? -ne 0 ]; then
logger -i -p local1.info recovery_1st_stage: passwordless SSH to postgres@${DEST_NODE_HOST} failed. Please setup passwordless SSH.
exit 1
fi
## Get PostgreSQL major version
PGVERSION=`${PGHOME}/bin/initdb -V | awk '{print $3}' | sed 's/\..*//' | sed 's/\([0-9]*\)[a-zA-Z].*/\1/'`
if [ $PGVERSION -ge 12 ]; then
RECOVERYCONF=${DEST_NODE_PGDATA}/myrecovery.conf
else
RECOVERYCONF=${DEST_NODE_PGDATA}/recovery.conf
fi
## Create replication slot "${REPL_SLOT_NAME}"
${PGHOME}/bin/psql -p ${PRIMARY_NODE_PORT} << EOQ
SELECT pg_create_physical_replication_slot('${REPL_SLOT_NAME}');
EOQ
## Execute pg_basebackup to recovery Standby node
ssh -T ${SSH_OPTIONS} ${POSTGRESQL_STARTUP_USER}@$DEST_NODE_HOST "
set -o errexit
rm -rf $DEST_NODE_PGDATA
rm -rf $ARCHIVEDIR/*
${PGHOME}/bin/pg_basebackup -h $PRIMARY_NODE_HOST -U $REPLUSER -p $PRIMARY_NODE_PORT -D $DEST_NODE_PGDATA -X stream
cat > ${RECOVERYCONF} << EOT
primary_conninfo = 'host=${PRIMARY_NODE_HOST} port=${PRIMARY_NODE_PORT} user=${REPLUSER} application_name=${DEST_NODE_HOST} passfile=''/var/lib/postgresql/.pgpass'''
recovery_target_timeline = 'latest'
restore_command = 'scp ${SSH_OPTIONS} ${PRIMARY_NODE_HOST}:${ARCHIVEDIR}/%f %p'
primary_slot_name = '${REPL_SLOT_NAME}'
EOT
if [ ${PGVERSION} -ge 12 ]; then
sed -i -e \"\\\$ainclude_if_exists = '$(echo ${RECOVERYCONF} | sed -e 's/\//\\\//g')'\" \
-e \"/^include_if_exists = '$(echo ${RECOVERYCONF} | sed -e 's/\//\\\//g')'/d\" ${POSTGRESQL_CONF}/postgresql.conf
touch ${DEST_NODE_PGDATA}/standby.signal
else
echo \"standby_mode = 'on'\" >> ${RECOVERYCONF}
fi
sed -i \"s/#*port = .*/port = ${DEST_NODE_PORT}/\" ${POSTGRESQL_CONF}/postgresql.conf
"
if [ $? -ne 0 ]; then
${PGHOME}/bin/psql -p ${PRIMARY_NODE_PORT} << EOQ
SELECT pg_drop_replication_slot('${REPL_SLOT_NAME}');
EOQ
logger -i -p local1.error recovery_1st_stage: end: pg_basebackup failed. online recovery failed
exit 1
fi
logger -i -p local1.info recovery_1st_stage: end: recovery_1st_stage complete
exit 0
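The heredoc in recovery_1st_stage is expanded on the node running the script before the rendered text is written to ${RECOVERYCONF} on the standby over ssh. A local sketch of the rendering with this cluster's sample values (passfile and restore_command lines omitted for brevity):

```shell
#!/bin/bash
# Render the myrecovery.conf heredoc locally with sample values.
PRIMARY_NODE_HOST=berasix
PRIMARY_NODE_PORT=5432
REPLUSER=repluser
DEST_NODE_HOST=berasix-2
REPL_SLOT_NAME=${DEST_NODE_HOST//[-.]/_}   # '-' becomes '_' -> berasix_2
cat << EOT
primary_conninfo = 'host=${PRIMARY_NODE_HOST} port=${PRIMARY_NODE_PORT} user=${REPLUSER} application_name=${DEST_NODE_HOST}'
recovery_target_timeline = 'latest'
primary_slot_name = '${REPL_SLOT_NAME}'
EOT
```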
5) Startup script run after Online Recovery completes
- /data/postgres14-data/main/pgpool_remote_start
#!/bin/bash
# This script is run after recovery_1st_stage to start Standby node.
set -o xtrace
exec > >(logger -i -p local1.info) 2>&1
DEST_NODE_HOST="$1"
DEST_NODE_PGDATA="$2"
PGHOME=/usr/lib/postgresql/14
POSTGRESQL_CONF=/etc/postgresql/14/main
POSTGRESQL_STARTUP_USER=postgres
SSH_KEY_FILE=id_rsa_pgpool
SSH_OPTIONS="-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ~/.ssh/${SSH_KEY_FILE}"
logger -i -p local1.info pgpool_remote_start: start: remote start Standby node $DEST_NODE_HOST
## Test passwordless SSH
ssh -T ${SSH_OPTIONS} ${POSTGRESQL_STARTUP_USER}@${DEST_NODE_HOST} ls /tmp > /dev/null
if [ $? -ne 0 ]; then
logger -i -p local1.info pgpool_remote_start: passwordless SSH to postgres@${DEST_NODE_HOST} failed. Please setup passwordless SSH.
exit 1
fi
## Start Standby node
ssh -T ${SSH_OPTIONS} ${POSTGRESQL_STARTUP_USER}@${DEST_NODE_HOST} "
$PGHOME/bin/pg_ctl -l /data/pgpool/pgctl_log -w -D $POSTGRESQL_CONF start
"
if [ $? -ne 0 ]; then
logger -i -p local1.error pgpool_remote_start: $DEST_NODE_HOST PostgreSQL start failed.
exit 1
fi
logger -i -p local1.info pgpool_remote_start: end: $DEST_NODE_HOST PostgreSQL started successfully.
exit 0
※ References
https://www.pgpool.net/docs/41/en/html/example-cluster.html
https://www.pgpool.net/docs/42/en/html/example-cluster.html
https://www.pgpool.net/docs/41/en/html/runtime-watchdog-config.html