Friday, November 28, 2008

postgresql replication using slony-I

As most postgresql users must be knowing, postgresql does not provide any inbuilt replication solution. There are lots of 3rd party replication products available for postgresql. Slony is one of them. Slony is a trigger based replication solution, that is it used triggers to push data to the slave. Slony is supposed to be one of the most stable replication solutions for postgresql.

You can download slony from www.slony.info. There are two major versions of slony - slony-I & slony-II. Slony-I is a simple master-slave replication solution. Whereas slony-II is a advanced multi-master replication solution. We will go ahead with simple master-slave replication solution. So we will download Slony-I. The latest version available is Slony-I 1.2.15. Slony-I 2.0 is in RC and should be soon released. But we will go with a stable release - 1.2.15.

Postgresql version being used is 8.3.3. To install slony, simply untar the downloaded file and run
./configure --with-pgconfigdir=<path to pg_config>
make
sudo make install


I have used two machines for setting up replication. Installed postgresql and slony-I on both of them.

master server ip : 172.16.3.211
slave server ip : 172.16.3.173


We will be working with the superuser postgres which is used to start and stop the postgresql server.

Quick steps

  • Define environment variables on master & slave. The main purpose is to make our task easier. Lets create an env.sh file containing all the definitions.

    #!/bin/sh

    REPLICATIONUSER=postgres
    CLUSTERNAME=replcluster
    MASTERDBNAME=repltestdb
    SLAVEDBNAME=repltestdb
    MASTERHOST=172.16.3.211
    SLAVEHOST=172.16.3.173
    MASTERPORT=5432
    SLAVEPORT=5432
    MASTERDBA=postgres
    SLAVEDBA=postgres
    PSQL=/usr/local/pgsql/bin/psql
    CREATEDB=/usr/local/pgsql/bin/createdb
    CREATELANG=/usr/local/pgsql/bin/createlang
    CREATEUSER=/usr/local/pgsql/bin/createuser
    PGDUMP=/usr/local/pgsql/bin/pg_dump

    export REPLICATIONUSER CLUSTERNAME MASTERDBNAME SLAVEDBNAME MASTERHOST SLAVEHOST PSQL CREATEDB CREATELANG CREATEUSER PGDUMP MASTERPORT SLAVEPORT MASTERDBA SLAVEDBA


    As you can see here, my postgresql is installed in /usr/local/pgsql. I have defined the IP addresses & ports of master and slave servers. I have used the superuser postgres for replication. And i have defined the master and slave databases to be used for replication. You can replicate between databases with different names on master and slave - just change the names in all the scripts.

  • Create database on master & slave
    On master run
    /usr/local/pgsql/bin/createdb -O $REPLICATIONUSER -h $MASTERHOST -p $MASTERPORT $MASTERDBNAME
    On slave run
    /usr/local/pgsql/bin/createdb -O $REPLICATIONUSER -h $SLAVEHOST -p $SLAVEPORT $SLAVEDBNAME

  • Since slony-I depends on triggers for replication, you will need to install the plsql procedural language on master to generate and run triggers & stored procedures for pushing data to slave.
    /usr/local/pgsql/bin/createlang -h $MASTERHOST -p $MASTERPORT plpgsql $MASTERDBNAME

  • Put some tables in the $MASTERDBNAME on master, which you want to replicate. And port the tables to slave. It has to be done manually.

    Dump the tables on master
    /usr/local/pgsql/bin/pg_dump -s -U $MASTERDBA -h $MASTERHOST -p $MASTERPORT $MASTERDBNAME > replmaster.sql

    Import the tables on slave
    /usr/local/pgsql/bin/psql -U $SLAVEDBA -h $SLAVEHOST -p $SLAVEPORT $SLAVEDBNAME < replmaster.sql

  • And now configure the databases for replication. When you install Slony-I, it puts two binaries slonik and slon in the pgsql/bin directory. Slonik is the tool which is used for creating configuration tables, stored procedures and triggers. All we need to do is create a configuration file to pass it to the slonik tool. Here i am assuming that there are two tables which need to be replicated - parent & child.

    vim replconfig.cnf
    # define the namespace the replication system uses in our example it is
    # replcluster
    cluster name = replcluster;
    # admin conninfo's are used by slonik to connect to the nodes one for each
    # node on each side of the cluster, the syntax is that of PQconnectdb in
    # the C-API
    node 1 admin conninfo = 'dbname=repltestdb host=172.16.3.211 port=5432 user=postgres';
    node 2 admin conninfo = 'dbname=repltestdb host=172.16.3.173 port=5432 user=postgres';
    # init the first node. Its id MUST be 1. This creates the schema
    # _$CLUSTERNAME containing all replication system specific database
    # objects.
    init cluster ( id=1, comment = 'Master Node');
    # Add unique keys to table that do not have one.
    # This command adds a bigint column named "_Slony-I_$CLUSTERNAME_rowID" to the table which will have a default value of nextval('_$CLUSTERNAME.s1_rowid_seq') and have UNIQUE & NOT NULL constraints applied on it.
    # table add key (node id = 1, fully qualified name = 'table_name');
    # Slony-I organizes tables into sets. The smallest unit a node can
    # subscribe is a set.
    # you need to have a set add table() for each table you wish to replicate
    create set (id=1, origin=1, comment='parent child table')
    set add table (set id=1, origin=1, id=1, fully qualified name = 'public.parent', comment='parent table');
    set add table (set id=1, origin=1, id=2, fully qualified name = 'public.child', comment='child table');
    # Create the second node (the slave) tell the 2 nodes how to connect to
    # each other and how they should listen for events.
    store node (id=2, comment = 'Slave node');
    store path (server = 1, client = 2, conninfo='dbname=repltestdb host=172.16.3.211 port=5432 user=postgres');
    store path (server = 2, client = 1, conninfo='dbname=repltestdb host=172.16.3.173 port=5432 user=postgres');
    store listen (origin=1, provider = 1, receiver =2);
    store listen (origin=2, provider = 2, receiver =1);


    Pass the config file to slonik for creating required triggers & config tables.

    /usr/local/pgsql/bin/slonik replconfig.cnf

  • Lets start the replication daemons on master & slave

    On master run
    /usr/local/pgsql/bin/slon $CLUSTERNAME "dbname=$MASTERDBNAME user=$MASTERDBA host=$MASTERHOST port=$MASTERPORT" > slon.log &

    On slave run
    /usr/local/pgsql/bin/slon $CLUSTERNAME "dbname=$SLAVEDBNAME user=$SLAVEDBA host=$SLAVEHOST port=$SLAVEPORT" > slon.log &

    Check out the output in slon.log files

  • Now everything is setup and from the slon.log files on master and slave you can see that both the servers are trying to sync with each other. But still replication is not on way. To start replication we need to make the slave subscribe to the master. Here is the required config file for doing this

    startrepl.cnf
    # This defines which namespace the replication system uses
    cluster name = replcluster;
    # connection info for slonik to connect to master & slave
    node 1 admin conninfo = 'dbname=repltestdb host=172.16.3.211 port=5432 user=postgres';
    node 2 admin conninfo = 'dbname=repltestdb host=172.16.3.173 port=5432 user=postgres';
    # Node 2 subscribes set 1
    subscribe set ( id = 1, provider = 1, receiver = 2, forward = no);


    Passing this file to slonik will do the trick and replication would start happening.

    /usr/local/pgsql/bin/slonik startrepl.cnf



Now simply make some inserts, updates and deletes on the master and check out whether they are happening on the slave as well. Officially, since replication is on full swing all changes in master tables should be replicated on the slave.

Please note that new tables & changes to table structures wont be replicated automatically. So whenever a new table is created or an existing table is altered the changes has to be manually propagated to slave and the scripts need to be run to make appropriate changes in the triggers and config tables.

Another important thing to note is that postgresql on master and slave should be able to communicate with both the ip addresses. For this add the ip addresses in the pgsql/data/pg_hba.conf.

For the able replication i had added the lines
host all all 172.16.3.211/32 trust
host all all 172.16.3.173/32 trust

to the pg_hba.conf file in both master & slave.

6 comments:

Serge Chevarie-Pelletier said...

create set (id=1, origin=1, comment='parent child table')

This lines seems to be missing a ;

thanks,

Serge

Anonymous said...

Do the second and third script have to be run on the slave as well. I tried to run the second script in the slave and it gave an error saying no password supplied.
Please advice..

gamegeek said...

You need to run the env.sh script before running any script. This ensures that the environment variables which include "password" are set.

Also check the lines that need to be added to pg_hba.conf for enabling access to master and slave - at the end of the blog post.

Unknown said...

Hi Jayant,

I downloaded the Slony-I, but it fails to configure, due to missing pg_config and I can't find this file anywhere in my system. I use Fedora Linux. Thanks for your advice. Michael

gamegeek said...

@michael: that is strange. If you have installed postgresql properly, you should be able to find pg_config in /usr/local/pgsql/bin - that is if you have compiled and installed using source

sunil said...

Hi Jayanth

Thats a great useful stuff which you posted, thank you for it.