Debian Clusters for Education and Research: The Missing Manual

Cloning Worker Nodes with Rsync

This is the second page of a two-part tutorial on cloning worker nodes with rsync. The full tutorial includes

Rsync

Rsync is an elegant tool for copying files. It compares the source and destination file systems and copies only the differences, which makes it faster than a flat copy when some of the files already exist on the destination system.

One difference between a data dump (dd) and rsync is that rsync won't create the partition tables or boot loader on the destination drives, so those need to be set up manually. Unlike Udpcast, where the udp-sender broadcasts its image out to the network and all of the other nodes receive the image at the same time, a data dump essentially works point-to-point. The other main difference is that the rsync runs all occur on one machine, with the sending and receiving hard drives hooked up to it. Multiple rsync runs can be going at the same time, but each one is a separate process, and having too many run at once will bog down the system.

Setup

The first step is to take the hard drives out of the worker nodes (unless they're already out), and put them all into another machine. This machine can be the machine holding the hard drive that's already set up, but it doesn't need to be. Next, the operating system of the machine that's going to be cloned shouldn't be running while this operation takes place. The easiest way to get around this is to use a bootable CD, like Ubuntu's live CD or Knoppix. After all of the drives are hooked up, turn the machine on and boot from the CD.

Next, you'll need to become root. If you're using an Ubuntu CD, do this by issuing sudo su -. Otherwise, become root as you would normally. Then run

fdisk -l

If you don't run it as root, it will exit without listing any drives. What you want to see is something like this:

root@ubuntu:~# fdisk -l

Disk /dev/hda: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

Disk /dev/hdc: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hdc1   *           1        9483    76172166   83  Linux
/dev/hdc2            9484        9729     1975995   82  Linux swap / Solaris

fdisk -l shows all of the hard drives currently plugged into the system. In the example above, I have two hard drives plugged in: /dev/hda and /dev/hdc. The hd part refers to them being IDE drives; SATA drives will show up as sd-something.
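With several drives attached, a listing like the one above gets long. As a rough sketch, the listing can be summarized as one line per drive with its partition count, so blank drives (0 partitions) stand out from a drive that's already set up. The helper name count_partitions is my own invention, and it assumes the output format shown above.

```shell
# count_partitions: read "fdisk -l" output on stdin and print each drive
# followed by the number of partitions listed under it.
# Assumes the format shown above: a "Disk /dev/xxx: ..." header per drive,
# then one line per partition starting with /dev/.
count_partitions() {
    awk '
        /^Disk \/dev\// { disk = $2; sub(/:$/, "", disk); count[disk] = 0; order[++n] = disk }
        /^\/dev\//      { if (disk != "") count[disk]++ }
        END             { for (i = 1; i <= n; i++) print order[i], count[order[i]] }
    '
}

# Usage (as root):
#   fdisk -l | count_partitions
```

For the listing above this prints "/dev/hda 0" and "/dev/hdc 2". A drive with partitions isn't guaranteed to be the master, so mount it and look at the files if you aren't sure.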

Next, we need to figure out which hard drive is the one that's already set up. If the other drives don't have an operating system installed yet, you'll be able to see the difference in the fdisk output (that's the case in the example above). If you can't tell the difference using fdisk, you can mount the drives one at a time and look at the files on each until you find the master. To mount a drive, first make sure the mount point directory exists

mkdir -p /mnt

and then mount one of the hard drives

mount /dev/<your hard drive>1 /mnt

The 1 is important. Rather than mounting an entire hard drive (including the swap space), you want to mount only the partition that contains a filesystem. For instance, mounting the partitioned drive from my example above would be

mount /dev/hdc1 /mnt

After you're done looking at the files, unmount the drive with

umount /mnt

Creating Partitions

Each worker node first needs to have its partitions set up the same way as the master image has them set up. If you need to see the master drive's partitions to remind you, run fdisk -l and look for the correct hard drive.

One at a time, for each worker node drive to be imaged, run

fdisk /dev/<drive to be imaged>

This will put you into an interactive mode with fdisk, a partition table editor. You'll be using these commands:

  • n - create a new partition
  • d - delete partition (if you mess up)
  • p - print the status of partition tables
  • a - make a partition active
  • w - write out the changes

At a minimum, you should have two partitions: the filesystem partition and the swap partition. Here is a transcript showing how to create one filesystem partition and one swap partition, like I did for my 80 GB drive.

Sometimes, on Ubuntu (and possibly other live CDs), fdisk will finish with an error like the following:

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.

This is fine; as it says, rebooting will take care of the problem. You'll have to reboot before you can create the file system and swap space.
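To check that a new drive's partitions really match the master, it can help to recompute the Blocks column that fdisk -l prints from the cylinder ranges. This is only a sketch using the geometry from the listing in the Setup section (255 heads and 63 sectors per track, so 16065 sectors per cylinder). The helper name partition_blocks is made up, and the first-track adjustment is my assumption about how these old-style partition tables are laid out.

```shell
# Geometry from the fdisk listing: 255 heads * 63 sectors/track = 16065
# sectors per cylinder, 512 bytes per sector, so 1 block (1 KiB) = 2 sectors.
SECTORS_PER_CYL=16065
SECTORS_PER_TRACK=63

# partition_blocks START_CYL END_CYL : print the size in 1 KiB blocks
partition_blocks() {
    start=$1
    end=$2
    sectors=$(( (end - start + 1) * SECTORS_PER_CYL ))
    # Assumption: a partition starting at cylinder 1 skips the first track,
    # which is reserved for the partition table.
    if [ "$start" -eq 1 ]; then
        sectors=$(( sectors - SECTORS_PER_TRACK ))
    fi
    echo $(( sectors / 2 ))
}

partition_blocks 1 9483      # /dev/hdc1, prints 76172166
partition_blocks 9484 9729   # /dev/hdc2, prints 1975995
```

Both results match the Blocks column for /dev/hdc1 and /dev/hdc2, so a new drive given the same start and end cylinders should show the same numbers.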

Creating the File System and Swap

With the partitions set up, they're now ready to have the file system and swap created on them. Again, make sure you're running these commands on the correct drives! Creating a file system on your master image will wipe out all of your current files! Only run these commands on the new hard drives.

For each worker node, create a file system on each partition that will hold an ext3 file system. (If you've forgotten which ones those were, run fdisk -l again.) Do this with

mkfs.ext3 /dev/<drive><file system partition #>

For instance, mine from the transcript example would be mkfs.ext3 /dev/hda1. This sets up the file system so we'll be able to copy files over to it. (It's like formatting a drive in Windows.)

Next, for each worker node, create the swap space on the swap space partitions. Do this with

mkswap /dev/<drive><swap partition #>

Mine from the example is mkswap /dev/hda2.
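Since these commands are destructive, one cautious approach is to print them first and read them over before running anything. The sketch below does that for a list of drives and skips the master. The function name format_plan and the drive /dev/hdd are hypothetical, and it assumes partition 1 is the file system and partition 2 is swap, as in my example.

```shell
# format_plan MASTER DRIVE... : print (do not run) the mkfs.ext3 and mkswap
# commands for every listed drive except the master.
# Assumption: partition 1 is the ext3 file system and partition 2 is swap;
# adjust the numbers if your layout differs.
format_plan() {
    master=$1
    shift
    for drive in "$@"; do
        if [ "$drive" = "$master" ]; then
            echo "# skipping master image $drive"
            continue
        fi
        echo "mkfs.ext3 ${drive}1"
        echo "mkswap ${drive}2"
    done
}

# Master is /dev/hdc; the list passed in happens to include it by mistake.
format_plan /dev/hdc /dev/hda /dev/hdc /dev/hdd
```

This prints the two commands for /dev/hda and /dev/hdd and a comment line for /dev/hdc. Once the list looks right, run the printed commands as root.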

Running Rsync

Now, we're finally ready to copy the files over. First, create a new directory for each drive, including the master image. I like to make the master image's directory slightly different so it's easy to tell them apart. Here's an example for three drives, where the c drive happens to be the master:

mkdir /tmp/mnta /tmp/mntc-master /tmp/mntd

Then, one at a time, mount the file system partition for each of the drives. (If you've forgotten which partition it is, run fdisk -l.)

mount /dev/<your drive><file system partition #> /tmp/<directory name>

Now you'll need to run rsync once for each drive to be imaged. You can do this in different terminal windows at the same time. To copy the files from the master image to a new drive, use

rsync -plarv --progress /tmp/<master location>/ /tmp/<new drive location>/

The trailing slashes are very important. A trailing slash on the source tells rsync to copy the contents of that directory; without it, rsync would create a copy of the directory itself inside the destination.
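Because a missing slash is an easy mistake to make, here is a small sketch that builds one rsync command per new drive and adds the trailing slashes if they were left off. It only prints the commands. The names with_slash and rsync_plan are my own, and the directories are the ones from the mkdir example above.

```shell
# with_slash PATH : print PATH ending in a trailing slash, adding one if missing
with_slash() {
    echo "${1%/}/"
}

# rsync_plan MASTER_DIR TARGET_DIR... : print one rsync command per target
rsync_plan() {
    master=$(with_slash "$1")
    shift
    for target in "$@"; do
        echo "rsync -plarv --progress $master $(with_slash "$target")"
    done
}

rsync_plan /tmp/mntc-master /tmp/mnta /tmp/mntd/
```

Each printed line can then be pasted into its own terminal window, as described above.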

Installing GRUB

All the files have now been copied over, which is excellent. There's one last step necessary in order to boot off the new drives: they need a boot loader installed. This is fairly easy to do. To enter the interactive mode with GRUB, a common Linux boot loader, run

grub

and then, for each new hard drive, run

root (hdX,0)
setup (hdX)

where X is the number for that drive. (For instance, hda/sda would be 0, hdb/sdb would be 1, and so on.)
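The letter-to-number rule can be written out as a small helper, sketched below. The name grub_disk is hypothetical. It simply applies the rule above and assumes the drive letters have no gaps; GRUB itself numbers drives in the order the BIOS reports them, so if you have, say, only hda and hdc attached, check the result before running setup.

```shell
# grub_disk DEVICE : print the GRUB name for a drive such as /dev/hda or sdb,
# using the letter rule above (a -> 0, b -> 1, and so on).
# Assumption: no gaps in the drive letters; GRUB's real numbering follows
# the BIOS drive order.
grub_disk() {
    name=${1##*/}                      # strip a leading /dev/
    case $name in
        [hs]d[a-z]) ;;                 # accept whole drives hdX and sdX only
        *) echo "unrecognized drive: $1" >&2; return 1 ;;
    esac
    letter=${name#[hs]d}               # keep just the drive letter
    code=$(printf '%d' "'$letter")     # character code of the letter
    echo "(hd$(( code - 97 )))"        # 97 is the character code for "a"
}

grub_disk /dev/hda   # prints (hd0)
grub_disk sdb        # prints (hd1)
```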

Done!

Once GRUB is installed on each new drive, you're done. Congratulations, you've just made a copy of your master image hard drive.
