原文地址是 PostgreSQL and the logical replication ,主要介绍了 PostgreSQL 10 的逻辑复制功能,并使用 Docker 搭建环境演示了该功能的简单使用。本文算是中文翻译版本,所有命令都是重新执行过的,有些地方可能略做补充说明。
本文我将简单测试 PostgreSQL 10 新的复制模式——逻辑复制(Logical Replication)。其实官方给出了详细的说明,请查看 Logical Replication。
逻辑复制 – Logical Replication
在 PostgreSQL 10 之前,我们只能复制整个 Cluster,即所有的数据库表。通常情况下,这种复制模式被人们所接受;但在一些场景下却不是那么的高效,比如我只想在备库上读某个表,为什么要把主库的所有表都复制过来呢。
逻辑复制解决了这一痛点,允许我们只复制特定的表。它的操作原则如下:
- 在主库选择需要复制的表,并创建 publication
- 在备库创建 subscription
注意:数据库模版和 DDL 命令无法被复制,必须手动在备库执行相应的操作以保持一致性。更多有关逻辑复制的限制请查阅 Restrictions。
通过 Docker 搭建环境
接下来,我们将使用 Docker 演示逻辑复制的基本功能。
- 使用 Docker 建立演示环境
$ docker network create lab
26c093e0c39d5e5d02133f69acc2c00f4bc47868f4008e1efb3f35c3c2356978
$ docker container run --detach --name pgmaster --network lab -e POSTGRES_PASSWORD=password postgres:10.1-alpine
Unable to find image 'postgres:10.1-alpine' locally
10.1-alpine: Pulling from library/postgres
605ce1bd3f31: Pull complete
8d2abe692710: Pull complete
a18eca080102: Pull complete
91cfdfa52a99: Pull complete
452cbf05af6d: Pull complete
4a256b03e5bd: Pull complete
379c8298095d: Pull complete
12207b3814c9: Pull complete
373a2b3fdfa1: Pull complete
Digest: sha256:d6150452877361e26d0fd178d278af8f80d59c75cbb5c2413cfb4fdb5ed4f750
Status: Downloaded newer image for postgres:10.1-alpine
2932a6d81218c7c0548a0f005d230ba46c3f23d40571d40fdb97c8c7ce354cce
$ docker container run --detach --name pgslave --network lab -e POSTGRES_PASSWORD=password postgres:10.1-alpine
ad0c4c67710757922904bfcf471f9a8bb6351e5fca3aa37b50805349b4bfcbfd
主库设置
更改 WAL 级别并重启 container:逻辑复制需要 “logical” 级别
$ docker container exec -it pgmaster /bin/sh
/ # psql -U postgres
psql (10.1)
Type "help" for help.
postgres=# ALTER SYSTEM SET wal_level = 'logical'; ALTER SYSTEM
postgres=# \q
/ # exit
$ docker container restart pgmaster
pgmaster
创建数据库和表
$ docker container exec -it pgmaster /bin/sh
/ # psql -U postgres
psql (10.1)
Type "help" for help.
postgres=# create database pea;
CREATE DATABASE
postgres=# \c pea
You are now connected to database "pea" as user "postgres".
pea=# create table hashes(id serial, value char(33), primary key(value));
CREATE TABLE
pea=#
注意:创建的表必须设置主键或唯一性索引,否则需要定义 REPLICA IDENTITY
才能实现逻辑复制。
向表里插入一些数据
pea=# insert into hashes (select generate_series(1,1000),md5(random()::text)) ;
INSERT 0 1000
创建用于逻辑复制的用户
pea=# CREATE ROLE replicate WITH LOGIN PASSWORD 'Azerty' REPLICATION ;
CREATE ROLE
为该表创建 publication
pea=# CREATE PUBLICATION pubhash FOR TABLE hashes ;
CREATE PUBLICATION
设置逻辑复制用户对该表的权限,至少有读权限才能做逻辑复制
pea=# GRANT SELECT ON hashes TO replicate;
GRANT
开启主库的日志记录,打开 logging_collector 开关并设置日志文件位置和目录等,保存后重启备库。
$ docker container exec -it pgmaster /bin/sh
/ # vi /var/lib/postgresql/data/postgresql.conf
/ # exit
$ docker container restart pgmaster
注:可使用 “docker logs -f pgmaster” 查看 PostgreSQL 的日志输出,这样的话就不需要设置 logging_collector 了。
至此,主库的设置完毕,下面来看备库的设置。
备库设置
首先,按照主库的设置方法,打开备库的日志记录功能。
创建数据库
$ docker container exec -it pgslave /bin/sh
/ # psql -U postgres
psql (10.1)
Type "help" for help.
postgres=# create database pea_repl;
CREATE DATABASE
postgres=# \c pea_repl
You are now connected to database "pea_repl" as user "postgres".
pea_repl=#
创建一个 subscription
pea_repl=# CREATE SUBSCRIPTION subhash CONNECTION 'host=pgmaster dbname=pea user=replicate password=Azerty' PUBLICATION pubhash;
ERROR: relation "public.hashes" does not exist
pea_repl=#
注意这里 PostgreSQL 报错了,提示 hashes 这个表不存在,这是逻辑复制的 “缺点”:需要手动创建要复制的表。
pea_repl=# create table hashes (id serial, value char(33), primary key(value));
CREATE TABLE
再次创建 subscription
pea_repl=# CREATE SUBSCRIPTION subhash CONNECTION 'host=pgmaster dbname=pea user=replicate password=Azerty' PUBLICATION pubhash;
NOTICE: created replication slot "subhash" on publisher
CREATE SUBSCRIPTION
pea_repl=#
现在,逻辑复制已经生效,hashes 表的内容已经被复制到了备库。
pea_repl=# select count(*) from hashes;
count
-------
1000
(1 row)
逻辑复制功能已经正常工作了。
如何查看日志
逻辑复制已经生效,查看主库和备库的日志。
主库日志
$ docker container exec -it pgmaster /bin/sh
/ # tail -f /var/lib/postgresql/data/log/postgresql-2019-02-28_012727.log |tee
2019-02-28 01:32:41.096 UTC [49] LOG: logical decoding found consistent point at 0/161F308
2019-02-28 01:32:41.096 UTC [49] DETAIL: There are no running transactions.
2019-02-28 01:32:41.105 UTC [50] LOG: starting logical decoding for slot "subhash"
2019-02-28 01:32:41.105 UTC [50] DETAIL: streaming transactions committing after 0/161F340, reading WAL from 0/161F308
2019-02-28 01:32:41.106 UTC [50] LOG: logical decoding found consistent point at 0/161F308
2019-02-28 01:32:41.106 UTC [50] DETAIL: There are no running transactions.
2019-02-28 01:32:41.130 UTC [51] LOG: logical decoding found consistent point at 0/161F340
2019-02-28 01:32:41.130 UTC [51] DETAIL: There are no running transactions.
备库日志
$ docker container exec -it pgslave /bin/sh
/ # tail -f /var/lib/postgresql/data/log/postgresql-2019-02-28_012828.log |tee
2019-02-28 01:28:28.136 UTC [19] LOG: database system was shut down at 2019-02-28 01:28:27 UTC
2019-02-28 01:28:28.142 UTC [1] LOG: database system is ready to accept connections
2019-02-28 01:32:28.294 UTC [36] ERROR: relation "public.hashes" does not exist
2019-02-28 01:32:28.294 UTC [36] STATEMENT: CREATE SUBSCRIPTION subhash CONNECTION 'host=pgmaster dbname=pea user=replicate password=Azerty' PUBLICATION pubhash ;
2019-02-28 01:32:41.103 UTC [38] LOG: logical replication apply worker for subscription "subhash" has started
2019-02-28 01:32:41.107 UTC [39] LOG: logical replication table synchronization worker for subscription "subhash", table "hashes" has started
2019-02-28 01:32:41.139 UTC [39] LOG: logical replication table synchronization worker for subscription "subhash", table "hashes" has finished
其他测试
为 hashes 表添加一列,然后删除所有数据
/ # psql -U postgres
psql (10.1)
Type "help" for help.
postgres=# \c pea
You are now connected to database "pea" as user "postgres".
pea=# alter table hashes add column gold boolean default false ;
ALTER TABLE
pea=# delete from hashes ;
DELETE 1000
pea=#
这时,我们会在备库收到错误信息。
$ docker container exec -it pgslave /bin/sh
/ # tail -f /var/lib/postgresql/data/log/postgresql-2019-02-28_012828.log |tee
2019-02-28 01:46:18.339 UTC [38] ERROR: logical replication target relation "public.hashes" is missing some replicated columns
2019-02-28 01:46:18.340 UTC [1] LOG: worker process: logical replication worker for subscription 16394 (PID 38) exited with exit code 1
2019-02-28 01:46:18.341 UTC [71] LOG: logical replication apply worker for subscription "subhash" has started
2019-02-28 01:46:18.354 UTC [71] ERROR: logical replication target relation "public.hashes" is missing some replicated columns
2019-02-28 01:46:18.354 UTC [1] LOG: worker process: logical replication worker for subscription 16394 (PID 71) exited with exit code 1
同时,主库也会提示逻辑复制出错的信息。
$ docker container exec -it pgmaster /bin/sh
/ # tail -f /var/lib/postgresql/data/log/postgresql-2019-02-28_012727.log |tee
2019-02-28 01:50:19.310 UTC [138] LOG: logical decoding found consistent point at 0/161F420
2019-02-28 01:50:19.310 UTC [138] DETAIL: There are no running transactions.
2019-02-28 01:50:19.321 UTC [138] LOG: could not send data to client: Connection reset by peer
2019-02-28 01:50:19.321 UTC [138] CONTEXT: slot "subhash", output plugin "pgoutput", in the change callback, associated LSN 0/166F778
2019-02-28 01:50:24.330 UTC [139] LOG: starting logical decoding for slot "subhash"
2019-02-28 01:50:24.330 UTC [139] DETAIL: streaming transactions committing after 0/1661690, reading WAL from 0/161F420
2019-02-28 01:50:24.330 UTC [139] LOG: logical decoding found consistent point at 0/161F420
2019-02-28 01:50:24.330 UTC [139] DETAIL: There are no running transactions.
2019-02-28 01:50:24.350 UTC [139] LOG: could not send data to client: Connection reset by peer
2019-02-28 01:50:24.350 UTC [139] CONTEXT: slot "subhash", output plugin "pgoutput", in the change callback, associated LSN 0/1668950
2019-02-28 01:50:29.360 UTC [143] LOG: starting logical decoding for slot "subhash"
2019-02-28 01:50:29.360 UTC [143] DETAIL: streaming transactions committing after 0/1661690, reading WAL from 0/161F420
2019-02-28 01:50:29.360 UTC [143] LOG: logical decoding found consistent point at 0/161F420
2019-02-28 01:50:29.360 UTC [143] DETAIL: There are no running transactions.
2019-02-28 01:50:29.371 UTC [143] LOG: could not send data to client: Connection reset by peer
2019-02-28 01:50:29.371 UTC [143] CONTEXT: slot "subhash", output plugin "pgoutput", in the change callback, associated LSN 0/166E158
2019-02-28 01:50:34.381 UTC [144] LOG: starting logical decoding for slot "subhash"
2019-02-28 01:50:34.381 UTC [144] DETAIL: streaming transactions committing after 0/1661690, reading WAL from 0/161F420
2019-02-28 01:50:34.381 UTC [144] LOG: logical decoding found consistent point at 0/161F420
2019-02-28 01:50:34.381 UTC [144] DETAIL: There are no running transactions.
2019-02-28 01:50:34.391 UTC [144] LOG: could not send data to client: Connection reset by peer
2019-02-28 01:50:34.391 UTC [144] CONTEXT: slot "subhash", output plugin "pgoutput", in the change callback, associated LSN 0/166A5E8
解决这个问题就需要将备库的数据库表修改成与主库表的样子,修改之后逻辑复制就恢复正常了。
$ docker container exec -it pgslave /bin/sh
/ # psql -U postgres
psql (10.1)
Type "help" for help.
postgres=# \c pea_repl
You are now connected to database "pea_repl" as user "postgres".
pea_repl=# alter table hashes add column gold boolean default false ;
ALTER TABLE
pea_repl=# \q
/ # tail -f /var/lib/postgresql/data/log/postgresql-2019-02-28_012828.log |tee
2019-02-28 01:53:20.053 UTC [169] LOG: logical replication apply worker for subscription "subhash" has started
2019-02-28 01:53:20.067 UTC [169] ERROR: logical replication target relation "public.hashes" is missing some replicated columns
2019-02-28 01:53:20.067 UTC [1] LOG: worker process: logical replication worker for subscription 16394 (PID 169) exited with exit code 1
2019-02-28 01:53:25.073 UTC [173] LOG: logical replication apply worker for subscription "subhash" has started
2019-02-28 01:53:25.088 UTC [173] ERROR: logical replication target relation "public.hashes" is missing some replicated columns
2019-02-28 01:53:25.088 UTC [1] LOG: worker process: logical replication worker for subscription 16394 (PID 173) exited with exit code 1
2019-02-28 01:53:30.090 UTC [175] LOG: logical replication apply worker for subscription "subhash" has started
2019-02-28 01:53:30.107 UTC [175] ERROR: logical replication target relation "public.hashes" is missing some replicated columns
2019-02-28 01:53:30.107 UTC [1] LOG: worker process: logical replication worker for subscription 16394 (PID 175) exited with exit code 1
2019-02-28 01:53:35.114 UTC [176] LOG: logical replication apply worker for subscription "subhash" has started
添加新表并填充数据,更新 publication
$ docker container exec -it pgmaster /bin/sh
/ # psql -U postgres pea
psql (10.1)
Type "help" for help.
pea=# create table hash2hash (id serial, value char(33), primary key(value) ) ;
CREATE TABLE
pea=# grant select on hash2hash to replicate ;
GRANT
pea=# insert into hash2hash (select generate_series(1,1000),md5(md5(random()::text))) ;
INSERT 0 1000
pea=# alter publication pubhash add table hash2hash ;
ALTER PUBLICATION
pea=#
同时,在备库创建 hash2hash 表,刷新 subscription,查看日志确认逻辑复制已经生效。
$ docker container exec -it pgslave /bin/sh
/ # psql -U postgres pea_repl
psql (10.1)
Type "help" for help.
pea_repl=# create table hash2hash (id serial, value char(33), primary key(value) ) ;
CREATE TABLE
pea_repl=# alter subscription subhash refresh publication ;
ALTER SUBSCRIPTION
pea_repl=# \q
/ # tail -f /var/lib/postgresql/data/log/postgresql-2019-02-28_012828.log |tee
2019-02-28 01:53:20.067 UTC [1] LOG: worker process: logical replication worker for subscription 16394 (PID 169) exited with exit code 1
2019-02-28 01:53:25.073 UTC [173] LOG: logical replication apply worker for subscription "subhash" has started
2019-02-28 01:53:25.088 UTC [173] ERROR: logical replication target relation "public.hashes" is missing some replicated columns
2019-02-28 01:53:25.088 UTC [1] LOG: worker process: logical replication worker for subscription 16394 (PID 173) exited with exit code 1
2019-02-28 01:53:30.090 UTC [175] LOG: logical replication apply worker for subscription "subhash" has started
2019-02-28 01:53:30.107 UTC [175] ERROR: logical replication target relation "public.hashes" is missing some replicated columns
2019-02-28 01:53:30.107 UTC [1] LOG: worker process: logical replication worker for subscription 16394 (PID 175) exited with exit code 1
2019-02-28 01:53:35.114 UTC [176] LOG: logical replication apply worker for subscription "subhash" has started
2019-02-28 02:16:01.200 UTC [226] LOG: logical replication table synchronization worker for subscription "subhash", table "hash2hash" has started
2019-02-28 02:16:01.228 UTC [226] LOG: logical replication table synchronization worker for subscription "subhash", table "hash2hash" has finished
同时,可以在备库上查到新表的数据。
$ docker container exec -it pgslave /bin/sh
/ # psql -U postgres pea_repl
psql (10.1)
Type "help" for help.
pea_repl=# select count(*) from hash2hash ;
count
-------
1000
(1 row)
pea_repl=#
在备库停止逻辑复制
停止逻辑复制后,之前复制到备库的数据不会丢失。
$ docker container exec -it pgslave /bin/sh
/ # psql -U postgres pea_repl
psql (10.1)
Type "help" for help.
pea_repl=# alter subscription subhash disable ;
ALTER SUBSCRIPTION
pea_repl=# drop subscription subhash ;
NOTICE: dropped replication slot "subhash" on publisher
DROP SUBSCRIPTION
pea_repl=#
pea_repl=# select count(*) from hash2hash ;
count
-------
1000
(1 row)
pea_repl=#
分类:PostgreSQL