Recovering from disaster


I am still a Linux administrator in the making. :slight_smile:

I have an AlmaLinux server up and running with my disks in RAID 1.

I have some questions around disaster recovery.

  1. If a disk fails, how easy is it to simply get back up and running again from the mirror drive? Is it hot-swappable, and how would I go about it? Is there any documentation on this?

  2. If the motherboard blows, how easy is it to recover from this type of disaster from your RAID set up?

NOTE: I am aware that I obviously have to have backups of my data, which I do, but this question is purely about getting back up and running from either a disk failure or a motherboard failure.

Eager to hear your thoughts :smiley:

In RAID 1 (mirroring) you have one logical drive, and everything you write to the logical drive is written to both physical drives. The content of the two drives is thus identical.

If one of the physical drives breaks, then you still have the same logical drive with the same content (on the remaining physical drive). One would replace the broken disk, and the RAID would resync the content from the healthy drive in the background.

A drive is hot-swappable if the hardware supports hot-swap. However, some disk failures do propagate up to the OS, which can panic; a restart might then be necessary.
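A quick way to see whether an md mirror is degraded is `/proc/mdstat`. Here is a minimal sketch run against a made-up sample of that file's content (the device names and sizes are assumptions; on a real system you would just `cat /proc/mdstat`):

```shell
# Hypothetical /proc/mdstat content for a degraded RAID 1 with one failed
# member. On a live system: cat /proc/mdstat
sample='md0 : active raid1 sdb1[1] sda1[0](F)
      1048512 blocks [2/1] [_U]'

# "(F)" marks a failed member; "[2/1]" and "[_U]" mean only one of the
# two mirror halves is active.
if echo "$sample" | grep -q '(F)'; then
  echo "degraded: a member has failed"
fi
echo "$sample" | grep -oE '\[[0-9]+/[0-9]+\]'   # prints [2/1]
```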

With a reboot, we get to the big question: what kind of RAID 1?

  • Hardware RAID has a dedicated RAID controller and does all the RAID operations within the controller. From the OS point of view it “just has a disk”. The mirrors are complete.
  • Software RAID is all in software, and the CPU does all the work. Since the OS sees the physical drives, it is usually the individual partitions that are mirrored. If legacy (BIOS) boot is still in use, booting starts from sector 0, which is not mirrored. The boot sectors should thus be written so that booting from the second drive is possible.
  • Fakeraid is effectively software RAID: the chipset has some tools, but the OS and CPU still do all the work.
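For the legacy-boot case above, the usual precaution is to install the bootloader on both physical drives so the box can boot from either half of the mirror alone. An illustrative sketch only, assuming a BIOS system whose drives are /dev/sda and /dev/sdb (check lsblk for your real device names):

```shell
# Illustrative only -- run on the real system, not copy-paste:
# put a boot sector on each mirror half so either drive can boot the box.
grub2-install /dev/sda
grub2-install /dev/sdb
```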

That gets us to the motherboard. With hardware and software RAID, all the metadata about the RAID array lives on the disks, in the hardware controller, or in software. The mobo does not matter.

With fakeraid, the chip on the new mobo must be compatible with the old one. Furthermore, during boot the bootloader loads the kernel and the initrd image. The image contains the drivers the kernel needs to initialize essential hardware, e.g. to mount the root filesystem. If the new mobo has different components, the image probably lacks drivers for them. The rescue kernel’s image has all drivers, so it can be used to fix a non-booting regular kernel.
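To inspect what the current initramfs carries, and to rebuild a generic image before moving disks to different hardware, something like the following could be used (a sketch assuming a dracut-based distribution such as AlmaLinux; illustrative only):

```shell
# List the kernel modules packed into the running kernel's initramfs:
lsinitrd /boot/initramfs-$(uname -r).img | grep '\.ko'

# Rebuild a generic (non host-only) image that carries all drivers,
# the way the rescue image does:
dracut --no-hostonly --force
```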


Thank you so much for that informative reply :smiley:

The RAID on my server was built through the AlmaLinux installation process, so it’s a software RAID and not linked to the hardware at all.

Does this matter for disaster recovery? If a drive fails, will it just be a matter of shutting down the server, removing the bad drive, inserting a new one, and booting up again?

I don’t care too much for rebooting. I just want to make sure I can recover from a disaster.

You can test that:

  1. Shut down
  2. Remove one drive
  3. Boot
  4. Shut down
  5. Put drive back
  6. Boot
  7. Let mirror rebuild completely
  8. Shut down
  9. Remove the other drive
  10. Boot

If either of the degraded boots fails, then you need to adjust your setup.
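If pulling drives is not an option, mdadm can rehearse most of this in software. Note that this genuinely degrades the array while it runs, so it is not risk-free either. A sketch with assumed names (/dev/md127 and member /dev/sda3; substitute your own from /proc/mdstat):

```shell
# Mark one member failed, remove it, re-add it, and watch the resync.
# Illustrative only -- the array is genuinely degraded until the rebuild ends.
mdadm /dev/md127 --fail   /dev/sda3
mdadm /dev/md127 --remove /dev/sda3
mdadm /dev/md127 --add    /dev/sda3
cat /proc/mdstat    # repeat until the rebuild reaches 100%
```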

:rofl: :rofl:

As much as I would love to test it, I can’t. I still have live data running on the server and don’t want to bork it unnecessarily.

So I assume there is no way to test this other than doing what you said? I was hoping someone could give me a clearer answer on how the software RAID in AlmaLinux behaves in the event of a disaster.

Thanks for your help though :smiley:

Replacing a Failed Mirror Disk in a Software RAID Array (mdadm) – The Geek Diary gives most of the necessary steps for creating and repairing a mirror, with a link to a page with more detail at the bottom.

Thank you very much! This is extremely helpful, although the first link assumes that the drive is still available but you suspect it is failing; in that scenario you first have to mark the disk as failed. This is a great example.

So what I take away from this example is that it should be fairly simple.

I think the first steps probably wouldn’t be necessary if the drive dies completely, as the OS would presumably mark it as failed, right? And in that situation it would just be a matter of replacing the dead drive? Or not?

I was hoping it would be easier, just removing the failing drive and popping in the new one, but it seems there are more steps to this.

You will have to replace the failed disk, copy the partition table from the working disk to the new disk, and add the new disk to the array. I made a video about this a few weeks ago.
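The steps just described might look something like this sketch; the device names are assumptions (/dev/sda the surviving disk, /dev/sdb the new blank one, /dev/md127 the array), and which tool copies the table depends on MBR versus GPT:

```shell
# Copy the partition table from the surviving disk to the new one:
sfdisk -d /dev/sda | sfdisk /dev/sdb
# For GPT, sgdisk can replicate the table and then randomize the GUIDs:
#   sgdisk -R /dev/sdb /dev/sda && sgdisk -G /dev/sdb

# Add the matching new partition back into the degraded array:
mdadm /dev/md127 --add /dev/sdb3
cat /proc/mdstat      # watch the resync progress
```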


That depends upon how the RAID was built; if the RAID is built from spindles you won’t have a partition table. We probably need to see the output of

mdadm --detail --scan --verbose

and possibly that from pvs, vgs and lvs (though be careful about sensitive information on a public forum). In my own case the output from the first command shows:

ARRAY /dev/md/0 level=raid5 num-devices=3 metadata=1.2 name=tamar.home:0 UUID=<a very long hexadecimal string>

which shows that md0 is a RAID of spindles, with no partition table at all. Then, looking at the output of pvs, it is clear that the whole RAIDset is treated as an unpartitioned disk and given over to LVM:

  PV         VG               Fmt  Attr PSize   PFree  
  /dev/md0   <a VG>           lvm2 a--   <1.82t      0 
  /dev/sdi4  <a VG>           lvm2 a--  833.36g 347.29g
  /dev/sdj2  <a different VG> lvm2 a--   <1.82t   1.75t

Great video, sir! I watched it from start to finish, and it was incredibly helpful.

Thankfully my disks are SSD drives. :grin::+1:t2:

I covered all of that in the video, except for the part about spindles. Using the AlmaLinux installer, partition tables were created on the disks. The last part of the video also mentions “documenting” your config with mdadm.

Sorry Bluesteam, but in this context SSD drives are “spindles”. The usage comes from big systems with external RAIDs where “disk” is thoroughly ambiguous.

Agreed on documenting. All the diskful systems I’ve managed for years have a daily cron job that dumps the following into a file called (for historical reasons) /df-h. It’s the only actual file that I expect to see at the top level. I dump out:

df -hl
swapon -s
fdisk -l
mdadm --detail --scan --verbose

It takes about 10 KiB on larger systems, less on small ones. If I need to rebuild from bare metal it gives me enough information to structure the disks so that the L0 dumps can be applied, and since it is refreshed daily it is guaranteed to be caught by every backup.
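As a sketch, the daily job could be a small script like this one (here writing to /tmp/df-h so it can be tried without root; the real job described above lands in /df-h, and would be dropped into /etc/cron.daily/):

```shell
#!/bin/sh
# Daily configuration dump, as described above. Commands that are missing
# or that need root are skipped quietly instead of failing the whole job.
OUT=/tmp/df-h    # the real job writes to /df-h
{
  date
  df -hl
  swapon -s                        2>/dev/null || true
  fdisk -l                         2>/dev/null || true
  mdadm --detail --scan --verbose  2>/dev/null || true
} > "$OUT"
```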


Hmmm. Then I guess that negates the process of recovering from a failed disk using the step-by-step guide from @joebeasley. That’s disappointing. I was almost sure I could use that process if a disk fails. :thinking:

I haven’t yet watched his video; I responded to his posting. The critical thing is to understand exactly how your system is put together and not to assume anything. The mantra of all professional system managers and sysadmins is: back up everything and record the configuration. If your system is built on partitions then you need the partition tables before you can recover; if it is not, then you need to know that configuration. Ignorance isn’t bliss, it is a long caffeine-fueled night!

What’s interesting here is that if I execute lvs, vgs or pvs, nothing is returned at all. Am I supposed to see something when executing those commands as they are shown in your list?

Do you use LVM? If the commands succeed but produce no output, it suggests that you have no LVM volumes.

If you’re not certain, look at the output from df: if you see devices like /dev/mapper/... in column 1, you use LVM. If there are no such devices, then have a look in /dev/mapper and see what’s there. If you’re using LVM you should see a link for each logical volume, something like almalinux_myhost-root -> ../dm-0. There may also be a character special file called control.
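That check can be sketched against a hypothetical line of df output (the mapper name and numbers are made up for illustration):

```shell
# One line of df output as it might look on an LVM system:
sample='/dev/mapper/almalinux_myhost-root  36648468 10723776 25924692 30% /'

# A device under /dev/mapper in column 1 means LVM (or another
# device-mapper user, such as LUKS) is in play:
if echo "$sample" | grep -q '^/dev/mapper/'; then
  echo "LVM in use"
fi
```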

FYI: lvs lists the logical volumes, vgs the volume groups, and pvs the physical* volumes. The mdadm command looks for and lists the md arrays that it finds.

*NB “physical” according to the software. An external hardware RAID for instance is a single “physical disk” to LVM, even if it was a 10 spindle RAIDset.


Thanks Martin :slight_smile:

Here is the output of my df

[root@###1 ~]# df
Filesystem     1K-blocks     Used Available Use% Mounted on
devtmpfs         7997180        0   7997180   0% /dev
tmpfs            8015556        0   8015556   0% /dev/shm
tmpfs            8015556   763264   7252292  10% /run
tmpfs            8015556        0   8015556   0% /sys/fs/cgroup
/dev/md127      36648468 10723776  25924692  30% /
/dev/md124     957146496 58069176 899077320   7% /home
/dev/md126       1040264   298596    741668  29% /boot
/dev/md123        613120     5908    607212   1% /boot/efi
/dev/loop0       1536488     2816   1453956   1% /tmp
tmpfs            1603108        0   1603108   0% /run/user/0
[root@### ~]#

Right, forget LVM, you don’t appear to be using it. Forget also the tmpfs and devtmpfs filesystems; they don’t exist on physical storage, and for that matter /tmp is probably virtual-only as well.

So, you have four RAIDsets: 123, 124, 126 and 127. These have been built as in the video that Joe put up. Follow his instructions; they fit your particular case to a “T”. As both Joe and I have said, though, make sure this is all documented where it can be found in the event of a crashed system.

@Joe: I had a quick skim through the video; it looks good for this config and is well worth recommending.

As an aside, don’t end up like a PC I once had to deal with (about 15 years ago): “a recovery CD is not supplied, instructions are online” - great! How do you read them with a crashed PC? :rage: