Wednesday, January 22, 2014

Automate Backup Process and Store on AWS S3

  Automate Backup Process and Store on AWS S3

Hello Guys,

This is in fact the problem with most of the system administrator that they want to automate the backup process and want to store there on a reliable and highly available location.
 Location from which data can be retrieve at very high speed if required in case of server crash and production copy lost and they also want to store the data in GBs.

In My case I have around i have to daily backup 100GBs of data from mysql and store it .
so I have used a trick and use a cheap storage but highly reliable.
I come to know about Amazon S3 storage server its very cost effective and highly available through out the world.

Then Second problem i face is interfacing the two systems to work together to solve my problem like automated backup should be done and it should be stored in a particular folder and after certain time it should delete the older one like backup rotation.

SO i have to write a script which create a lazy backup and store it on s3 at the same time it should delete the backup which i have created 7 days back as its not much use full for me.

So I have use s3cmd as a interface between my backup script and Amazone S3.
Installation is simeple no rocket science involved.As i am using ubuntu 12.04

sudo apt-get install s3cmd

i have created amazon account and created credentials etc.

after thet I have write a script to take a backup of all postgres databases.

#!/bin/sh
#### BEGIN CONFIGURATION ####
# set dates for backup rotation
NOWDATE=`date +%Y-%m-%d`
LASTDATE=$(date +%Y-%m-%d --date='1 week ago')
# set backup directory variables
SRCDIR='/tmp/s3backups'
DESTDIR='path/to/s3folder'
BUCKET='s3bucket'
# database access details
HOST='127.0.0.1'
PORT='5432'
USER='user'
#### END CONFIGURATION ####
# make the temp directory if it doesn't exist
mkdir -p $SRCDIR
# dump each database to its own sql file
DBLIST=`psql -l -h$HOST -p$PORT -U$USER \
| awk '{print $1}' | grep -v "+" | grep -v "Name" | \
grep -v "List" | grep -v "(" | grep -v "template" | \
grep -v "postgres" | grep -v "root" | grep -v "|" | grep -v "|"`
# get list of databases
for DB in ${DBLIST}
do
pg_dump -h$HOST -p$PORT -U$USER $DB -f $SRCDIR/$DB.sql
done
# tar all the databases into $NOWDATE-backups.tar.gz
cd $SRCDIR
tar -czPf $NOWDATE-backup.tar.gz *.sql
# upload backup to s3
/usr/bin/s3cmd put $SRCDIR/$NOWDATE-backup.tar.gz s3://$BUCKET/$DESTDIR/
# delete old backups from s3
/usr/bin/s3cmd del --recursive s3://$BUCKET/$DESTDIR/$LASTDATE-backup.tar.gz
# remove all files in our source directory
cd
rm -f $SRCDIR/*

Sanme way I have script for mysql too.

#!/bin/sh
#### BEGIN CONFIGURATION ####
# set dates for backup rotation
NOWDATE=`date +%Y-%m-%d`
LASTDATE=$(date +%Y-%m-%d --date='1 week ago')
# set backup directory variables
SRCDIR='/tmp/s3backups'
DESTDIR='path/to/s3folder'
BUCKET='s3bucket'
# database access details
HOST='127.0.0.1'
PORT='3306'
USER='user'
PASS='pass'
#### END CONFIGURATION ####
# make the temp directory if it doesn't exist
mkdir -p $SRCDIR
# repair, optimize, and dump each database to its own sql file
for DB in $(mysql -h$HOST -P$PORT -u$USER -p$PASS -BNe 'show databases' | grep -Ev 'mysql|information_schema|performance_schema')
do
mysqldump -h$HOST -P$PORT -u$USER -p$PASS --quote-names --create-options --force $DB > $SRCDIR/$DB.sql
mysqlcheck -h$HOST -P$PORT -u$USER -p$PASS --auto-repair --optimize $DB
done
# tar all the databases into $NOWDATE-backups.tar.gz
cd $SRCDIR
tar -czPf $NOWDATE-backup.tar.gz *.sql
# upload backup to s3
/usr/bin/s3cmd put $SRCDIR/$NOWDATE-backup.tar.gz s3://$BUCKET/$DESTDIR/
# delete old backups from s3
/usr/bin/s3cmd del --recursive s3://$BUCKET/$DESTDIR/$LASTDATE-backup.tar.gz
# remove all files in our source directory
cd
rm -f $SRCDIR/*

#Same script we can use with variations hope you know all it Enjoy..