Replication from Postgres
Doltgres has the capability to replicate from a running Postgres server via Postgres's logical replication feature. This is useful when you want to continue serving production traffic with your existing Postgres server, but want to add auditing and diff capability via a live DoltgreSQL backup.
Configuring the Postgres primary server
Logical replication must be enabled on your primary Postgres server via the postgresql.conf
file, followed by a server restart. Your config file may be in a different location from the one below.
After enabling logical replication, create a publication for the tables you want replicated to dolt, as well as a replication slot for the DoltgreSQL replica to connect to. From the psql
shell, run the following commands as an admin user:
This setup only needs to be run once. Change the value of the exported environment variables to change which database and tables are exported to Doltgres. The publication name and the slot name do not need to be the same, but are set to the same value above to make them easier to identify to administrators.
You can also replicate all tables to Doltgres with the alternate syntax:
Or to replicate more than one table, separate them with commas like so:
Postgres also allows replicating only certain rows by specifying a WHERE
clause in the CREATE PUBLICATION
command. See the Postgres documentation on this subject for more detail.
Configuring the Doltgres replica
CREATE TABLE
and other DDL statements are not replicated from the primary. Every table being replicated must exist on the replica before replication begins, so create them with CREATE TABLE
statements before starting the server in replication mode. Notes on replicated schemas:
Column types do not need to exactly match the primary, but they must be compatible (e.g.
int8
andint4
).The replicated table can have extra columns as long as they can contain
NULL
.It's possible to replicate only a subset of the columns from a table. See the Postgres documentation on this subject for more detail.
To run your Doltgres server in replica mode, include the following stanza in the config.yaml
provided to the server on the command line:
The server will connect to postgres and log information for every replication message it receives.
To automatically create a Dolt commit for every replicated transaction, set the dolt_transaction_commit
parameter in the behavior
stanza to true. This will let you see diffs between any two replicated transactions on the primary.
In every other respect, a replication server is a normal Doltgres server, which means you can make other connections to it and run other queries (including queries that may interfere with replication).
Importing initial data into DoltgreSQL
By default, Doltgres will begin replicating from WAL position 0/0
(the very beginning of table history). For some primary servers, this location in the WAL may no longer be available because Postgres periodically compresses old WAL data. In this case, it's necessary to start replication after seeding the replica with a dump, via a multi-step process:
Create a snapshot of the data to be replicated on the primary with a tool like
pg_dump
. Make sure writes to the primary are disabled during the dump process.Record the current WAL location on the primary with
SELECT pg_current_wal_lsn();
Re-enable writes to the primary.
Import the dump to the replica.
psql < pg_dump
can accomplish this on a running server.Write the WAL location recorded in step 2 to the file
./.doltcfg/pg_wal_location
Restart the Doltgres replica server with replication enabled as above.
Doltgres should now receive updates from the primary beginning at the point just after the snapshot was taken.
Last updated