206.2. Backup operations
Weight: 3
Description: Candidates should be able to use system tools to back up important system data.
Key Knowledge Areas:
Knowledge about directories that have to be included in backups
Awareness of network backup solutions such as Amanda, Bacula, Bareos and BackupPC
Knowledge of the benefits and drawbacks of tapes, CDR, disk or other backup media
Perform partial and manual backups.
Verify the integrity of backup files.
Partially or fully restore backups.
Terms and Utilities:
/bin/sh
dd
tar
/dev/st* and /dev/nst*
mt
rsync
Why do we need to Backup?
We have already talked about RAID and LVM. Building a RAID group or creating LVM volumes adds some reliability and safety, but neither is considered a backup solution. We might still lose data and experience failures for different reasons:
Power outage
Hardware failure (mobo, cpu, ram, hard disks, ...)
Human misconfiguration
...
To tell the truth, the last reason is the most common and the most dangerous one. To protect ourselves we need to back up, we have to back up, we are forced to back up.
What to backup?
Not all directories and files need to be backed up, especially when backup space is limited. Under the Linux Filesystem Hierarchy Standard (FHS), directories and files have different priorities for backing up.
How to backup?
On any platform there are always some native tools and third-party programs for backing up:
All three packages have Linux, Windows, and macOS versions. Now let's spend some time on the traditional native tools for backing up.
tape
Using tapes for backups is somewhat outdated, but tapes are still used because they are cheap and offer huge capacity, although they are very slow. If we had the chance to administer a system with a tape device attached, we would see these device files:
/dev/st0
/dev/nst0
The /dev/nst0 device is the non-rewinding tape device, whereas /dev/st0 is the rewinding tape device. Both device files refer to the same piece of hardware but behave differently: /dev/st0 automatically rewinds the tape when the device is closed, so every new write starts at the beginning of the tape, while /dev/nst0 leaves the tape where it is, which lets us write several archives one after another. With /dev/nst0 we position and rewind the tape ourselves using the mt command. The device we choose depends on our goal.
Understanding tape file marks and block size
Each tape can store multiple tape backup files. Tape backup files are created using cpio, tar, dd, and so on. A program opens the tape device, writes data to it, and closes it, so we can store several backups (tape files) on one physical tape. Between each tape file is a “tape file mark”, which indicates where one tape file ends and the next begins on the physical tape. We use the mt command to position the tape (wind forward, rewind, and write marks).
mt command
The mt command controls the operation of the tape drive, such as reporting its status, seeking through the files on a tape, or writing tape control marks to the tape.
mt offers several tape positioning operations (rewinding, spacing forward or backward over file marks, moving to the end of the recorded data) and many other options.
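A minimal sketch of the most common positioning operations follows. It assumes the mt-st implementation of mt and a drive at /dev/nst0. Both are assumptions, so adjust them to your system. The guard keeps the script harmless on a machine without a tape drive.

```shell
# Assumption: the drive is reachable as the non-rewinding device /dev/nst0.
TAPE="${TAPE:-/dev/nst0}"

if [ -c "$TAPE" ] && command -v mt >/dev/null 2>&1; then
    tape_state=present
    mt -f "$TAPE" status     # show drive status and current position
    mt -f "$TAPE" rewind     # go back to the beginning of the tape
    mt -f "$TAPE" fsf 2      # space forward over 2 file marks (start of tape file #2)
    mt -f "$TAPE" bsf 1      # space backward over 1 file mark
    mt -f "$TAPE" offline    # rewind and, if the drive supports it, unload the tape
else
    tape_state=absent
    echo "no usable tape drive at $TAPE, nothing to do"
fi
```

All of these use the non-rewinding device on purpose: with /dev/st0 the tape would rewind again as soon as mt closes the device, undoing the positioning.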
How is data stored on a tape drive?
Data is stored sequentially, one tape archive after another, in this case using tar. The first tape archive starts at the physical beginning of the tape (tar #0). The next is tar #1, and so on.
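The layout described above can be sketched as two small functions. The device name, the directories being archived, and the CONFIRM variable are illustrative assumptions. The functions only run when a tape drive exists and CONFIRM=yes is set, because writing overwrites whatever is on the tape.

```shell
# Assumption: non-rewinding tape device; override with TAPE=/dev/nstN.
TAPE="${TAPE:-/dev/nst0}"

write_two_archives() {
    mt -f "$TAPE" rewind
    tar -cf "$TAPE" /etc     # becomes tape file #0
    tar -cf "$TAPE" /home    # becomes tape file #1, written right after #0
}

list_second_archive() {
    mt -f "$TAPE" rewind
    mt -f "$TAPE" fsf 1      # skip over archive #0
    tar -tf "$TAPE"          # list the contents of archive #1
}

# Hypothetical safety switch: run only when explicitly confirmed.
if [ -c "$TAPE" ] && [ "${CONFIRM:-no}" = "yes" ]; then
    write_two_archives && list_second_archive
fi
```

Because the non-rewinding device is used, the tape stays positioned after the first archive's file mark, so the second tar run appends instead of overwriting.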
tar
We usually use tar (tape archiver) to create tar files on disk, but in fact it was designed to archive files on a tape device.
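As a minimal sketch, the following creates an archive from a throwaway sample tree. The directory and file names are illustrative, chosen to match the myfile and dir3 names used in this section.

```shell
# Build a small sample tree in a temporary directory (names are illustrative).
work=$(mktemp -d)
mkdir -p "$work/data/dir1" "$work/data/dir2" "$work/data/dir3"
echo "hello" > "$work/data/myfile"
echo "notes" > "$work/data/dir3/notes.txt"

# -c create, -v verbose, -f archive file.
# -C changes into $work first, so the archive stores relative paths (data/...).
tar -cvf "$work/backup.tar" -C "$work" data
```

Storing relative paths makes it possible to restore the archive into any directory later.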
Oops, let's delete some files in order to restore them from our backup:
List the files inside the tar file with the -tvf switches:
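A short sketch of listing an archive, again on an illustrative sample tree built in a temporary directory:

```shell
# Illustrative sample archive.
work=$(mktemp -d)
mkdir -p "$work/data/dir3"
echo "hello" > "$work/data/myfile"
tar -cf "$work/backup.tar" -C "$work" data

# -t list, -v long format (mode, owner, size, date, name), -f archive file.
tar -tvf "$work/backup.tar"

# Without -v only the member names are printed, which is handy in scripts.
listing=$(tar -tf "$work/backup.tar")
echo "$listing"
```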
We need to restore myfile and dir3 from our backup:
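A partial restore can be sketched like this. The sample tree is illustrative. The member names passed to tar must match the names shown by tar -t.

```shell
# Illustrative sample tree and backup.
work=$(mktemp -d)
mkdir -p "$work/data/dir1" "$work/data/dir3"
echo "hello" > "$work/data/myfile"
echo "notes" > "$work/data/dir3/notes.txt"
tar -cf "$work/backup.tar" -C "$work" data

# Simulate the accident.
rm -rf "$work/data/myfile" "$work/data/dir3"

# -x extract; only the named members are restored, dir1 is left untouched.
tar -xvf "$work/backup.tar" -C "$work" data/myfile data/dir3
```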
Let's combine the useful tar switches as a quick review:
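The following sketch walks through the main switches on an illustrative sample tree. It assumes GNU tar, since the -d (--compare) integrity check is a GNU tar option.

```shell
# Illustrative sample tree.
work=$(mktemp -d)
mkdir -p "$work/data" "$work/restore"
echo "hello" > "$work/data/myfile"

tar -czf "$work/backup.tar.gz" -C "$work" data        # -c create, -z gzip compression
tar -tzf "$work/backup.tar.gz"                        # -t list the members
tar -xzpf "$work/backup.tar.gz" -C "$work/restore"    # -x extract, -p keep permissions

# Integrity check: -d compares the archive with the files on disk
# and exits non-zero if any member differs.
if tar -dzf "$work/backup.tar.gz" -C "$work"; then
    verify=ok
else
    verify=changed
fi
echo "verify: $verify"
```

Replacing -z with -j selects bzip2 compression instead of gzip.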
rsync
Rsync (Remote Sync) is one of the most commonly used commands for copying and synchronizing files and directories, both remotely and locally, on Linux systems. With rsync we can copy and synchronize data across directories, disks, and networks, and perform backups and mirroring between two Linux machines.
Some advantages and features of Rsync:
It efficiently copies and sync files to or from a remote system.
Supports copying links, devices, owners, groups and permissions.
It is usually faster than scp (Secure Copy) for repeated transfers. Why? Because rsync uses a remote-update protocol that transfers only the differences between two sets of files. The first time, it copies the whole content of a file or directory from source to destination; from then on, it copies only the changed blocks and bytes.
Rsync can consume less bandwidth because it can compress data on the sending end and decompress it on the receiving end (the -z option).
We might need to install rsync first, using yum install rsync on Red Hat based systems or apt install rsync on Debian based systems.
The basic syntax of rsync is rsync options source destination, where the options control what is copied and how.
Enough introduction, let's see rsync in action:
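A minimal local sketch, using illustrative directories in a temporary location. The comments describe the commonly used options.

```shell
# Illustrative source and destination directories.
work=$(mktemp -d)
mkdir -p "$work/src/docs" "$work/dst"
echo "one" > "$work/src/a.txt"
echo "two" > "$work/src/docs/b.txt"

# -a archive mode: recursive, keeps permissions, times, and symlinks
#    (and owner/group where permitted)
# -v verbose, -h human-readable sizes
# The trailing slash on src/ means "the contents of src", not the directory itself.
rsync -avh "$work/src/" "$work/dst/"
```

Running the same command a second time transfers nothing, because rsync skips files whose size and modification time are unchanged.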
To copy a directory from the local server to a remote server:
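A sketch of pushing a directory to a remote host. The host and user are the ones used elsewhere in this section. The push_dir helper and the mydirectory argument are illustrative. The actual call is left commented out because it needs a reachable host.

```shell
# Assumption: remote login as used in the other examples of this section.
REMOTE="${REMOTE:-root@192.168.10.151}"

push_dir() {
    # $1 = local directory, $2 = remote target path
    # -a archive mode, -v verbose, -z compress data in transit
    rsync -avz "$1" "${REMOTE}:$2"
}

# Usage:
# push_dir mydirectory /home/
```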
And vice versa, to copy/sync a remote directory to the local machine:
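The reverse direction only swaps the arguments. As before, the pull_dir helper and the paths in the usage line are illustrative, and the call is commented out because it needs a reachable host.

```shell
# Assumption: remote login as used in the other examples of this section.
REMOTE="${REMOTE:-root@192.168.10.151}"

pull_dir() {
    # $1 = remote directory, $2 = local target path
    rsync -avz "${REMOTE}:$1" "$2"
}

# Usage:
# pull_dir /home/mydirectory /tmp/restore/
```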
rsync over ssh
Most of the time, rsync is run on top of ssh. In the rare case where someone has bothered to set up an rsync daemon, that uses port 873:
Using the SSH protocol to transfer our data ensures that it travels over a secured, encrypted connection, so that nobody can read it while it is in transit over the network.
When rsync connects to a remote host we have to authenticate as the remote user (for example with the user/root password). Over SSH the login is sent encrypted, so our password is safe. Use the -e option to state explicitly that rsync should run over ssh:
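A sketch of the -e option. The SSH_PORT variable and the push_over_ssh helper are hypothetical. The quoting shows how extra options can be passed to ssh itself.

```shell
# Assumptions: remote login from this section; the port is only an example.
REMOTE="${REMOTE:-root@192.168.10.151}"
SSH_PORT="${SSH_PORT:-22}"

push_over_ssh() {
    # -e sets the remote shell that rsync uses for the connection.
    rsync -avz -e "ssh -p $SSH_PORT" "$1" "${REMOTE}:$2"
}

# Usage:
# push_over_ssh mydirectory /home/
```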
Other useful rsync commands:
Show Progress While Transferring Data with rsync:
rsync -azve ssh --progress mydirectory root@192.168.10.151:/home/
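Include and exclude:
rsync -azve ssh --include 'D*' --exclude '*' mydirectory/ root@192.168.10.151:/home/mydirectory/
This includes only the files and directories whose names start with 'D' and excludes all others. Note the trailing slash on the source: without it, the directory name mydirectory itself is tested against the rules and excluded by '*'.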
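The filter behaviour can be tried locally first. The file names below are illustrative.

```shell
# Illustrative source tree.
work=$(mktemp -d)
mkdir -p "$work/src/Docs" "$work/dst"
echo "d" > "$work/src/Data.txt"
echo "o" > "$work/src/other.txt"
echo "n" > "$work/src/Docs/notes.txt"

# Rules are checked in order and the first match wins:
# names starting with D are included, everything else is excluded.
rsync -av --include 'D*' --exclude '*' "$work/src/" "$work/dst/"
```

Data.txt is copied and the Docs directory is created, but it stays empty, because notes.txt does not start with D and is caught by the exclude rule.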
delete option
If a file or directory does not exist at the source but already exists at the destination, we might want to delete it at the target while syncing.
We can use the --delete option to delete files that are not present in the source directory (the last argument, here the current directory, is the destination):
rsync -azv --delete root@192.168.10.151:/home/mydirectory .
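Because --delete removes data, it is worth trying on scratch directories first. A minimal local sketch with illustrative file names:

```shell
# Illustrative directories: stale.txt exists only at the destination.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"
echo "keep" > "$work/src/keep.txt"
echo "stale" > "$work/dst/stale.txt"

# After this run the destination mirrors the source, so stale.txt is removed.
rsync -av --delete "$work/src/" "$work/dst/"
```

Adding -n (--dry-run) to the command shows what would be deleted without changing anything.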
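By default, when copying over a network, rsync transfers only the changed blocks and bytes of a file (for purely local copies, whole-file transfer is already the default). If we explicitly want to transfer whole files, we use the -W (--whole-file) option: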
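A small local sketch of the option with illustrative file names. Since local copies already use whole-file transfer, -W here only shows the syntax. It makes a real difference on network transfers.

```shell
# Illustrative directories and a first sync.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"
echo "version 1" > "$work/src/big.dat"
rsync -a "$work/src/" "$work/dst/"

# Change the file (the size differs, so rsync notices the change).
echo "version 2 with more data" > "$work/src/big.dat"

# -W (--whole-file) skips the delta algorithm and sends each changed file in full.
rsync -avW "$work/src/" "$work/dst/"
```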
dd
The name dd is often expanded as “data duplicator”, and the command is used for copying and converting data. It is a very powerful low-level Linux utility. We should be very careful while working with it, because a mistake such as swapping the source and the target causes data loss and turns dd into a “data destroyer”. That is why it is recommended not to use dd on a production machine until we are familiar with it.
It can be used for cloning volumes and filesystems, writing images to disks, and even erasing drives. The syntax of the dd command is: dd if=<source file name> of=<target file name> [options]
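A safe way to get familiar with dd is to practise on ordinary files. In this sketch a 1 MiB image file stands in for a real device such as a disk. The file names are illustrative.

```shell
work=$(mktemp -d)

# Create a 1 MiB sample "disk" filled with random data.
dd if=/dev/urandom of="$work/disk.img" bs=1024 count=1024 2>/dev/null

# Clone it: if= is the source, of= is the target, bs= is the block size per read/write.
dd if="$work/disk.img" of="$work/clone.img" bs=64k 2>/dev/null

# Verify the copy byte for byte.
if cmp -s "$work/disk.img" "$work/clone.img"; then
    dd_result=identical
else
    dd_result=different
fi
echo "clone is $dd_result"
```

The same if=/of= pattern applies to real block devices, where checking both names before pressing Enter matters most.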
dd command can be pretty dangerous, watch out when using it.