Further Testing
Querying the recovery process
If you have a live database that processes a lot of data, it may
take some time for the replication server to “catch up”. There are a few ways to see where it is in the process.
From the slave database server, log in as root and run the
following command:
> ps -AF | grep post
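On a replication server, the line to look for in that output is the startup process, whose title typically names the WAL file currently being replayed. Here is a minimal Python sketch for picking that file name out of the output. The helper name `recovering_wal_file` and the sample line are hypothetical, and the exact process title is an assumption that may differ between PostgreSQL versions.

```python
import re

# Matches the 24-hex-digit WAL file name that follows the word "recovering"
# in the startup process title (assumed format; may vary by version).
RECOVERING = re.compile(r"recovering\s+([0-9A-Fa-f]{24})")


def recovering_wal_file(ps_output):
    """Return the WAL file name reported by the startup process, or None."""
    for line in ps_output.splitlines():
        match = RECOVERING.search(line)
        if match:
            return match.group(1).upper()
    return None


# Hypothetical line of `ps -AF` output, shortened for illustration.
sample = "postgres  1234  1  postgres: startup process   recovering 0000000100000E8C00000098"
print(recovering_wal_file(sample))
```

If nothing in the output matches, the function returns None, which would suggest the server is not replaying WAL files (or that the title format differs on your version).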
Here are two more ways to query the database to see what WAL
file it is working on.
From the Master database you can run the following command:
> psql -c "SELECT pg_current_xlog_location()"
Which returns something like E8C/987975C0.
So how do you read this?
If you look in the pg_xlog folder on the main server, you will
see files named like this:
0000000100000E8C00000095
0000000100000E8C00000096
0000000100000E8C00000097
0000000100000E8C00000098
0000000100000E8C00000099
0000000100000E8C0000009A
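Now let's deconstruct E8C/987975C0:
E8C/987975C0
0000000100000E8C00000098
The E8C before the slash matches the middle group of the file name (00000E8C), and the first two hex digits after the slash (98) match the last group (00000098). The rest (7975C0) is a HEX offset
within the file (where within the file it currently is).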
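The mapping from a location to a file name can be sketched in a few lines of Python. This is only an illustration: the function name `wal_file_name` is hypothetical, and it assumes the default 16 MB WAL file size and timeline 1 (the leading 00000001 in the file names above).

```python
def wal_file_name(location, timeline=1):
    """Map a location such as 'E8C/987975C0' to (WAL file name, offset in file).

    Assumes the default 16 MB WAL file size; timeline defaults to 1.
    """
    log_id_hex, offset_hex = location.split("/")
    log_id = int(log_id_hex, 16)
    offset = int(offset_hex, 16)
    segment = offset >> 24        # which 16 MB file within this log id
    within = offset & 0xFFFFFF    # byte offset inside that file
    name = "%08X%08X%08X" % (timeline, log_id, segment)
    return name, within


name, within = wal_file_name("E8C/987975C0")
print(name, "%X" % within)
```

For the location above this gives the file 0000000100000E8C00000098 and the offset 7975C0, matching the manual breakdown.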
Now from the Slave database you can run the following command:
> psql -c "SELECT pg_last_xlog_replay_location()"
Which returns a location in the same format, showing how far the slave has replayed.
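Comparing the value from the master with the value from the slave tells you whether the slave has caught up. The two strings should not be compared as text, because the hex parts can have different lengths. A minimal Python sketch follows; the function names and the sample values are hypothetical, chosen only for illustration.

```python
def parse_location(location):
    """Split a location such as 'E8C/987975C0' into integers (log id, offset)."""
    log_id_hex, offset_hex = location.split("/")
    return int(log_id_hex, 16), int(offset_hex, 16)


def has_caught_up(master_location, replay_location):
    """True once the slave has replayed at least up to the master's location."""
    return parse_location(replay_location) >= parse_location(master_location)


# Hypothetical values: the slave is still one WAL file behind the master.
print(has_caught_up("E8C/987975C0", "E8C/977975C0"))
```

Because the master keeps writing while you run the two queries, a result of False right after heavy inserts is expected. What matters is that the slave's value keeps advancing.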
Primary stop/start
As a further test, I stopped the primary database and
started it back up after a few minutes. (I also stopped and
restarted my Python program.)
> sudo /etc/init.d/postgresql stop
> sleep 120
> sudo /etc/init.d/postgresql start
> ./insertDB.py
The replication database had no problem with this.
Replication stop/start
From the replication server I stopped and restarted the postgres
server.
> sudo /etc/init.d/postgresql stop
> sleep 240
> sudo /etc/init.d/postgresql start
No problem with this either.
Test Delete from Replication
The Replication server only has read access. Let’s try to
delete anyway.
From the database on the replication server, try to run this:
> \c nand
> delete from data;
Change the Replication server to a primary server
Warning: do not do this on a live system! I only did this to
make sure it works.
The recovery.conf file defined /home/postgres/failover as the
trigger_file. Create that file and see if the server stops being a
replication server.
Run the following commands:
> sudo mkdir -p /home/postgres
> touch /home/postgres/failover
Now I log into the database on the replication server and
run the following commands:
> \c nand
> select count(*) from data;
I see that it is no longer getting fed from the primary database
server.
And I can run the following command:
> delete from data;
And it can now write to the database.
Also, when this occurs, recovery.conf is renamed to
recovery.done.
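That rename gives a quick way to check from a script which state a server is in. The following Python sketch is only an illustration: the function name `standby_state` is hypothetical, and you have to pass in your own data directory, since its location depends on the installation.

```python
import os


def standby_state(data_dir):
    """Guess a server's role from the recovery file in its data directory."""
    if os.path.exists(os.path.join(data_dir, "recovery.conf")):
        return "replication"        # still set up as a replication server
    if os.path.exists(os.path.join(data_dir, "recovery.done")):
        return "promoted"           # trigger file was seen, now a primary
    return "no recovery file"       # neither file is present
```

Calling `standby_state()` with the data directory of the replication server should return "replication" before the trigger file is created and "promoted" afterwards.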
Now that the Replication server has become a Primary server, you
can’t easily switch it back to a Replication server, which is how it should
be!