

Setting up a software RAID with sgdisk and mdadm

I wanted to set up a RAID0 (striped array) on two HDDs to serve as a cache for Duplicity backups. I also wanted to use GPT and only command-line tools: sgdisk(8) for partitioning, and mdadm(8) for creating the software RAID. (Until now, I have usually just used GParted, a GUI partitioning tool.)

All of this was done on Red Hat Enterprise Linux 6.5.

So, I have two (spinning-disk) HDDs, each about 931 GiB, which show up as
  • /dev/sda
  • /dev/sdb
First, zap any partitioning information they may have. (In all the examples below, "X" should be replaced by "a" or "b".)

# sgdisk -Z /dev/sdX

Next, partition them. The partitions have to be of type 0xFD00 "Linux RAID". You can do "sgdisk -L" to see a list of all available types. These type codes are not the same as the type codes used by fdisk(8).

The RAID partitions will be 512 GiB each (the "G" suffix in sgdisk means GiB), leaving the rest of each drive for other uses.

# sgdisk -n 0:0:+512G -c 0:"cache" -t 0:0xFD00 /dev/sdX
# sgdisk -n 0:0:0 -c 0:"misc" /dev/sdX

The leading "0" in the argument to "-n" is shorthand for the first available partition number, and the "0" given to "-c" and "-t" refers to that same newly created partition. In this case, the first line creates partition "1" and the second line creates partition "2". (N.B. the number is picked automatically; "1" and "2" do not need to be given on the command line.)

In the second line, note that "-n 0:0:0" uses the defaults of starting at the first unallocated sector and ending at the last allocatable sector on the drive, thereby using up the rest of the HDD for the "misc" partition 2. Leaving out the type specification, "-t", gives the default 0x8300 "Linux filesystem".
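
Since the same commands are run on both drives, they can be generated in a loop. The following is only a sketch: print_partition_cmds is a made-up helper name, and it prints the commands rather than running them, so that they can be reviewed first.

```shell
# Print (do not run) the sgdisk commands shown above for each disk given.
# print_partition_cmds is a hypothetical helper, not part of sgdisk.
print_partition_cmds() {
    for disk in "$@"; do
        echo "sgdisk -Z $disk"
        echo "sgdisk -n 0:0:+512G -c 0:cache -t 0:0xFD00 $disk"
        echo "sgdisk -n 0:0:0 -c 0:misc $disk"
    done
}

print_partition_cmds /dev/sda /dev/sdb
```

Once the printed commands look right, the output could be piped to sh as root. Zapping a disk is destructive, so the review step is worth keeping.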

Print out the partition info to check:

# sgdisk -p /dev/sdX
Disk /dev/sdX: 1953525168 sectors, 931.5 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): xxxxxxxxxxxxxxx-xxx-xxxxxxxxxxxxxxxx
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 1953525134
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048      1073743871   512.0 GiB   FD00  cache
   2      1073743872      1953525134   419.5 GiB   8300  misc

And we see that it has done what we expected.
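
As a sanity check, the end sector of partition 1 can be worked out by hand from the numbers in the listing. This is just shell arithmetic, assuming the 512-byte logical sectors and the 2048-sector start reported above:

```shell
# Check the end sector of partition 1 as reported by "sgdisk -p".
sector_bytes=512                                        # logical sector size
start=2048                                              # first aligned sector
sectors=$(( 512 * 1024 * 1024 * 1024 / sector_bytes ))  # 512 GiB in sectors
end=$(( start + sectors - 1 ))
echo "partition 1: $sectors sectors, ending at sector $end"
# prints: partition 1: 1073741824 sectors, ending at sector 1073743871
```

The result agrees with the "End (sector)" column for partition 1.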

Next, we create the RAID0 from /dev/sda1 and /dev/sdb1:

# mdadm -v -C /dev/md/mycomputer:cache -l stripe -n 2 /dev/sda1 /dev/sdb1

This creates a device /dev/md127 with a symbolic link for the readable name:

lrwxrwxrwx 1 root root 8 2017-10-18 18:11:31 -0400 /dev/md/mycomputer:cache -> ../md127

In mdadm v3.3.4 in RHEL6, I found that the name given to the "-C/--create" option would always end up being "mycomputer:something", where "something" was determined by what you actually give it. This name comes up after reboot.

The "-v" is for verbose output, "-l" is for the RAID level (which can be specified by integer, or string), "-n" is the number of devices, and the positional arguments are a list of the devices to be used.

Also, the integer N in /dev/mdN is determined by the system. It seems to start with 127.

For instance, doing

# mdadm -C /dev/md0

after rebooting gave this:

lrwxrwxrwx 1 root root 8 2017-10-18 18:11:31 -0400 /dev/md/mycomputer:0 -> ../md127 

And doing

# mdadm -C /dev/md/cache

gave this:

lrwxrwxrwx 1 root root 8 2017-10-18 18:11:31 -0400 /dev/md/mycomputer:cache -> ../md127

I wised up on my third time through, and named it what it was going to pick, anyway.
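
Because the number in /dev/mdN is picked by the system, one way to find the node behind the readable name is to resolve the symbolic link. A small sketch, where md_node is a made-up helper name and "readlink -f" is assumed to be the GNU coreutils version:

```shell
# md_node prints the device node that a /dev/md/<name> symlink points to.
# (md_node is a hypothetical helper; readlink -f resolves the link fully.)
md_node() {
    readlink -f "$1"
}

# On the machine described here, this would be expected to print /dev/md127:
#   md_node /dev/md/mycomputer:cache
```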

The RAID needs to be "assembled" and activated at boot time. This is not done by default. To do this, a file /etc/mdadm.conf must be created. (Other distros may have a different location for this file.)

Assuming there is no such file, start by using mdadm(8) to output the array specification to the file:

# mdadm -Ds /dev/md/mycomputer:cache > /etc/mdadm.conf
# cat /etc/mdadm.conf
ARRAY /dev/md/mycomputer:cache metadata=1.2 name=mycomputer:cache UUID=xxxxxxxx

Very important: this UUID will not be the same as the UUID of the filesystem we will create later.

Add DEVICE and AUTO lines to /etc/mdadm.conf (and a MAILADDR line, not shown here, if you want mdadm to mail alerts), resulting in:

DEVICE /dev/sda1 /dev/sdb1
AUTO +all

ARRAY /dev/md/mycomputer:cache metadata=1.2 name=mycomputer:cache UUID=xxxxxxxx

I did the next bit in single-user mode as I wanted this to be mounted as /var/cache, which is also used by several other things. Also, since it gets tiresome writing out the whole device name, I used the short name /dev/md127.

# telinit 1
# mkfs.ext4 /dev/md127

Next, I mounted the device in a temporary location to transfer the existing contents:

# mkdir /mnt/tmpmnt
# mount /dev/md127 /mnt/tmpmnt
# cd /var/cache
# tar cf - . | ( cd /mnt/tmpmnt && tar xvf - )
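
The tar pipe can be tried out safely on throwaway directories first. This is a sketch only: src and dst are temporary stand-ins for /var/cache and /mnt/tmpmnt, and the file names are invented for the demonstration. It archives "." instead of "*" so that dotfiles are copied too, and uses "&&" so that nothing is extracted if a cd fails.

```shell
# Throwaway directories standing in for /var/cache and /mnt/tmpmnt.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir "$src/yum"
echo data > "$src/yum/file"
echo hidden > "$src/.dotfile"

# Archive "." (not "*") so dotfiles are included; "&&" stops the pipe
# from extracting into the wrong place if the cd fails.
( cd "$src" && tar cf - . ) | ( cd "$dst" && tar xpf - )

# diff -r prints nothing and exits 0 when the two trees match.
diff -r "$src" "$dst" && echo "copy verified"
```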

Get the UUID of this new filesystem for use in /etc/fstab:

# blkid /dev/md127

And create an entry in /etc/fstab:

UUID=yyyyyyyyy-yyyyyyyy-yyyyyyyyyy-yyyyyyyyy  /var/cache  ext4   defaults    0 2
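
To avoid copying the UUID by hand, the fstab line can be built from the blkid output. A minimal sketch: fstab_line is a made-up helper, and the UUID below is a placeholder rather than a real one.

```shell
# fstab_line is a hypothetical helper: it formats an ext4 fstab entry for
# the given filesystem UUID and mount point.
fstab_line() {
    printf 'UUID=%s  %s  ext4  defaults  0 2\n' "$1" "$2"
}

# On the real system the UUID would come from blkid (run as root):
#   uuid=$(blkid -s UUID -o value /dev/md127)
uuid="yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy"   # placeholder, not a real UUID
fstab_line "$uuid" /var/cache
```

The printed line can be checked by eye and then appended to /etc/fstab.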

And reboot!


Software RAID, encrypted volumes, and mount-at-boot

This is a brief HOWTO on manually setting up an encrypted RAID-1 on Fedora 13, via the command line. I used the command line because I discovered that using Palimpsest hid some details, and left me with an encrypted volume that did not mount at boot.

Along the way, I also decided to use the GUID Partition Table (GPT) rather than the more common Master Boot Record (MBR) partition scheme. Palimpsest supports GPT directly. However, to use a command line utility to handle GPT, you can use parted, or you can install the gdisk package available in Fedora. gdisk's interface behaves a lot like fdisk.

First, some preliminaries. My machine was installed with Fedora 13 on a single disk in the usual way. Once everything was installed, I obtained a pair of 500GB hard drives. My aim was to use those as an encrypted software RAID-1 volume for /home. The two drives were /dev/sdb and /dev/sdc. I also wanted to increase the amount of swap available, and have it striped across the three drives.


I used parted to create GPT partition tables, and then a (roughly) 2GB partition for swap and the remainder for the /home RAID.
    myhost> sudo parted /dev/sdb
    GNU Parted 2.1
    Using /dev/sdb
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) mklabel gpt
    Warning: The existing disk label on /dev/sdb will be destroyed and all data on this disk will be lost. Do you want to continue?
    Yes/No? Yes 
    (parted) p                                                                
    Model: ATA WDC WD5000AAKS-0 (scsi)
    Disk /dev/sdb: 500GB
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    Number  Start  End  Size  File system  Name  Flags

Then, we create the partitions. Warning: the in-app help for mkpart in parted is wrong. The actual syntax is: mkpart PART-NAME [PART-TYPE] START END. Or, you can just type mkpart and you will be prompted for each option. I will give both types of usage below. The abbreviation -1cyl (minus one) stands for the last cylinder of the disk.
    (parted) mkpart swap-sdb1 linux-swap 0cyl 256cyl
    (parted) p
    Model: ATA WDC WD5000AAKS-0 (scsi)
    Disk /dev/sdb: 500GB
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    Number  Start   End     Size    File system     Name       Flags
     1      1049kB  2106MB  2104MB  linux-swap(v1)  swap-sdb1

    (parted) mkpart
    Partition name?  []? raid-home-sdb2                                       
    File system type?  [ext2]? ext4                                           
    Start? 256cyl            
    End? -1cyl
    (parted) unit cyl
    (parted) p                                                                
    Model: ATA WDC WD5000AAKS-0 (scsi)
    Disk /dev/sdb: 500GB
    Sector size (logical/physical): 512B/512B
    BIOS cylinder,head,sector geometry: 60801,255,63.  Each cylinder is 8225kB.
    Partition Table: gpt 
    Number  Start   End       Size      File system     Name            Flags
     1      0cyl    255cyl    255cyl    linux-swap(v1)  swap-sdb1
     2      255cyl  60801cyl  60545cyl                  raid-home-sdb2
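
The cylinder units can be cross-checked against the byte units in the earlier listing. This is plain shell arithmetic, assuming the 255-head, 63-sector geometry and 512-byte sectors that parted reports above:

```shell
# Bytes per cylinder from the reported geometry: 255 heads,
# 63 sectors per track, 512 bytes per sector.
cyl_bytes=$(( 255 * 63 * 512 ))
# End of the swap partition, which was created as 0cyl to 256cyl.
swap_end_bytes=$(( 256 * cyl_bytes ))
echo "cylinder: $cyl_bytes bytes, 256 cylinders: $swap_end_bytes bytes"
# prints: cylinder: 8225280 bytes, 256 cylinders: 2105671680 bytes
```

That is roughly 8225kB per cylinder and 2106MB for the end of the swap partition, matching what parted printed.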

Then, do the same to /dev/sdc by selecting the device, and running the same commands above with sdc in place of sdb:
    (parted) select /dev/sdc
    Using /dev/sdc

Now, see what you have created:
    myhost> sudo blkid
    /dev/sda1: UUID="4e41d2b8-a62c-4228-a8ec-bd14d7585fda" TYPE="ext4" 
    /dev/sda2: UUID="dad0fa7b-47b9-4c52-a5ad-9dc0264484a9" TYPE="crypto_LUKS" 
    /dev/sda3: UUID="e5d865a1-9cee-458b-911b-773bf1a429bf" TYPE="swap" 
    /dev/mapper/luks-dad0fa7b-47b9-4c52-a5ad-9dc0264484a9: UUID="e10ca302-1932-4ff3-9b49-45a275e3c4f2" TYPE="ext4" 
    /dev/sdc1: UUID="f9f04c2d-c85a-42c3-a368-9a93103f7751" TYPE="swap" 
    /dev/sdb1: UUID="f67ca37b-de6b-4b51-a4db-9dd4ee4c3665" TYPE="swap" 
    /dev/sdb2: UUID="70747600-ea8e-c722-8645-b548b88b1a63"
    /dev/sdc2: UUID="de283446-102b-a904-6249-2fefd5a801bf"


Add the two new swap partitions to /etc/fstab by adding these two lines. Make the priorities (option pri) equal to the priority for the swap partition which is already in the file:
    UUID=f67ca37b-de6b-4b51-a4db-9dd4ee4c3665 swap                    swap    pri=1
    UUID=f9f04c2d-c85a-42c3-a368-9a93103f7751 swap                    swap    pri=1

Then, turn on the new swap partitions by doing: sudo swapon -a. Check that the swap partitions are active:
    myhost> sudo swapon -s
    Filename    Type  Size Used Priority
    /dev/sda3                               partition 2096120 0 1
    /dev/sdb1                               partition 2046968 0 1
    /dev/sdc1                               partition 2055160 0 1
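
The kernel only spreads pages round-robin across swap areas that have the same priority, so it is worth confirming that the last column is identical on every line. A small sketch, where same_priority is a made-up helper and the sample input is copied from the output above:

```shell
# same_priority reads "swapon -s" style output on stdin and succeeds only
# if every swap area has the same priority (the last column).
# It is a hypothetical helper written for this example.
same_priority() {
    awk 'NR > 1 { seen[$NF] = 1 }
         END { n = 0; for (p in seen) n++; exit (n == 1 ? 0 : 1) }'
}

# Sample input copied from the output above; on a live system use:
#   swapon -s | same_priority
same_priority <<'EOF' && echo "all swap areas share one priority"
Filename    Type  Size Used Priority
/dev/sda3   partition 2096120 0 1
/dev/sdb1   partition 2046968 0 1
/dev/sdc1   partition 2055160 0 1
EOF
```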

Encrypted RAID for /home

The order of operations is this:
  1. Create RAID
  2. Set up encryption on RAID
  3. Open encrypted device
  4. Create ext4 filesystem
  5. Mount
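
The five steps can be sketched as a dry-run script that only prints each command. Several things here are assumptions: the run helper and the mapper name luks-home are invented for illustration (the walkthrough below names the mapping after the UUID instead), cipher options are left out, and the mkfs.ext4 line is a guess at step 4, which is listed here but not spelled out as a command in this post.

```shell
# With DRY_RUN=1 each command is only printed; set it to 0 to execute.
DRY_RUN=1
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

name=luks-home   # hypothetical mapper name, chosen for this sketch

run mdadm --create /dev/md0 --level=mirror --raid-devices=2 /dev/sdb2 /dev/sdc2  # 1. create RAID
run cryptsetup --verify-passphrase luksFormat /dev/md0                           # 2. set up encryption
run cryptsetup luksOpen /dev/md0 "$name"                                         # 3. open encrypted device
run mkfs.ext4 "/dev/mapper/$name"                                                # 4. create filesystem (assumed command)
run mount "/dev/mapper/$name" /home                                              # 5. mount
```

The ordering matters: the filesystem goes on the opened /dev/mapper device, not on /dev/md0 itself, so that everything written to it is encrypted.
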
To create the RAID-1 (mirrored) device:
    myhost> sudo mdadm --create /dev/md0 --level=mirror --raid-devices=2 /dev/sdb2 /dev/sdc2
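Next, encrypt the volume. Once that is done, use blkid to find the UUID for /dev/md0 (it is the UUID of the new LUKS header, so it only appears after formatting), which you will need for the next step. The command to encrypt is:
    myhost> sudo cryptsetup --verbose --verify-passphrase --cipher aes-cbc-plain luksFormat /dev/md0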

    This will overwrite data on /dev/md0 irrevocably.
    Are you sure? (Type uppercase yes): YES
    Enter LUKS passphrase: 
    Verify passphrase: 
    Command successful.
Next, open the encrypted device by supplying the passphrase. You will also give it a name, which is the string "luks-" followed by the UUID for the /dev/md0 device (71559f74-fb59-439f-9219-8f529b4fc535 in this example). Then, have a look in /dev/mapper to see the decrypted device.
    myhost> sudo cryptsetup luksOpen /dev/md0 luks-71559f74-fb59-439f-9219-8f529b4fc535
    Enter passphrase for /dev/md0: 
    myhost> sudo ls -l /dev/mapper
    total 0
    crw-rw---- 1 root root 10, 62 Jul 19 11:07 control
    lrwxrwxrwx 1 root root      7 Jul 19 13:49 luks-71559f74-fb59-439f-9219-8f529b4fc535 -> ../dm-1
    lrwxrwxrwx 1 root root      7 Jul 19 11:07 luks-dad0fa7b-47b9-4c52-a5ad-9dc0264484a9 -> ../dm-0
So that this volume is unlocked at boot (and can then be mounted), add the following line to the file /etc/crypttab:
    luks-71559f74-fb59-439f-9219-8f529b4fc535 UUID=71559f74-fb59-439f-9219-8f529b4fc535 none
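
Since the mapping name and the UUID= field have to carry the same string, the crypttab line can be generated from the UUID rather than typed twice. A minimal sketch, where crypttab_line is a made-up helper and the UUID is the example one used above:

```shell
# crypttab_line is a hypothetical helper: it prints a crypttab entry whose
# mapping name is "luks-" followed by the LUKS UUID passed as $1.
crypttab_line() {
    printf 'luks-%s UUID=%s none\n' "$1" "$1"
}

# On the real system the UUID can be read from the LUKS header (as root):
#   uuid=$(cryptsetup luksUUID /dev/md0)
uuid=71559f74-fb59-439f-9219-8f529b4fc535   # example UUID from this post
crypttab_line "$uuid"
```
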
Now, reboot. You will be prompted for a password to access the encrypted drive you created. When it comes back, you may find that the RAID device is no longer /dev/md0 (use the blkid command), but /dev/md127. That does not matter since we will always refer to volumes by their UUIDs which do not change even if the device mapping has changed.

Before you do the last step, you will have to move your home directory to a different volume. I created /tmphome, copied my files over to it, and modified the /etc/passwd file.
    myhost> sudo mkdir /tmphome
    myhost> sudo mv /home/myname /tmphome
    myhost> sudo vipw     # edit and change your home directory to /tmphome/myname
Then, log out and back in.

Now, edit /etc/fstab and add an entry for this new device:
    /dev/mapper/luks-71559f74-fb59-439f-9219-8f529b4fc535 /home                   ext4    defaults        1 1
And mount it: sudo mount -a

Finally, move your home directory back to /home, and change /etc/passwd back. You may want to reboot to see if it all works as you expect.