Software RAID Recovery on Linux

Notes from an investigation and repair of a degraded disk in a Linux software RAID (md) setup.

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/md1              487M  320M  142M  70% /
/dev/md0               99M   16M   79M  17% /boot
tmpfs                 2.0G     0  2.0G   0% /dev/shm
/dev/md2               15G  8.8G  4.8G  65% /home
/dev/md4              7.8G  2.0G  5.4G  27% /usr
/dev/md5              259G   20G  227G   8% /var
/dev/sdc1             2.8T  1.9T  847G  70% /mnt/backup

# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4] 
md0 : active raid1 sda1[0]
      104320 blocks [2/1] [U_]
      
md2 : active raid1 sda2[0]
      15358016 blocks [2/1] [U_]
      
md3 : active raid1 sdb3[1] sda3[0]
      8393856 blocks [2/2] [UU]
      
md4 : active raid1 sda5[0]
      8393856 blocks [2/1] [U_]
      
md5 : active raid1 sda7[0]
      279803968 blocks [2/1] [U_]
      
md1 : active raid1 sda6[0]
      513984 blocks [2/1] [U_]
      
unused devices: <none>
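
In this output, [2/1] and [U_] mean the mirror expects two members but only one is active; the underscore marks the missing slot (only md3 is healthy, shown as [UU]). A quick way to list just the degraded arrays is a one-liner like the following (a sketch that assumes two-member RAID1 arrays, so a degraded array always shows [U_]):

# awk '/^md/ {dev=$1} /\[U_\]/ {print dev}' /proc/mdstat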

# grep md: /var/log/dmesg
md:  adding sdb1 ...
md: md0 already running, cannot run sdb1
md: export_rdev(sdb1)
md: ... autorun DONE.
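
These dmesg lines show that the kernel found sdb1 at boot and recognized its md superblock, but md0 had already been assembled (degraded) from sda1 alone, so sdb1 was dropped instead of joined. Before re-adding a partition it is worth confirming it still carries a superblock whose UUID matches the running array; one way to check:

# mdadm --examine /dev/sdb1 | grep UUID
# mdadm --detail /dev/md0 | grep UUID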

# cat /etc/mdadm.conf
DEVICE partitions
MAILADDR root
ARRAY /dev/md1 super-minor=1
ARRAY /dev/md0 super-minor=0
ARRAY /dev/md2 super-minor=2
ARRAY /dev/md4 super-minor=4
ARRAY /dev/md5 super-minor=5
ARRAY /dev/md3 super-minor=3
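
These ARRAY lines identify each array only by its preferred minor number. mdadm can print equivalent ARRAY lines keyed by UUID, which is generally more robust; the output can be reviewed and merged into /etc/mdadm.conf (a suggestion, not a step from the original repair):

# mdadm --detail --scan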

# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Mon Feb 19 07:54:06 2007
     Raid Level : raid1
     Array Size : 104320 (101.89 MiB 106.82 MB)
    Device Size : 104320 (101.89 MiB 106.82 MB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Wed Feb  6 17:09:50 2008
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : 9f809745:8bf08609:14ed382c:d0ecabc6
         Events : 0.7840

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       0        0        1      removed

# mdadm --manage /dev/md5 --re-add /dev/sdb7
mdadm: re-added /dev/sdb7
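
The --re-add succeeds here because /dev/sdb7 still carries a superblock belonging to this array; note that without a write-intent bitmap the member is rebuilt in full anyway, as the recovery estimate below shows. If mdadm refuses a --re-add (for example because the superblock no longer matches), the fallback, after double-checking that the partition really belongs to the array, is a plain --add, which also performs a full resync:

# mdadm --manage /dev/md5 --add /dev/sdb7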

# mdadm --detail /dev/md5
/dev/md5:
        Version : 00.90.03
  Creation Time : Mon Feb 19 08:43:35 2007
     Raid Level : raid1
     Array Size : 279803968 (266.84 GiB 286.52 GB)
    Device Size : 279803968 (266.84 GiB 286.52 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 5
    Persistence : Superblock is persistent

    Update Time : Wed Feb 27 08:30:55 2008
          State : clean, degraded, recovering
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

 Rebuild Status : 0% complete

           UUID : ad6bc290:5fe289ce:8c051c7c:5f972730
         Events : 0.38424642

    Number   Major   Minor   RaidDevice State
       0       8        7        0      active sync   /dev/sda7
       2       8       23        1      spare rebuilding   /dev/sdb7

# cat /proc/mdstat
...
md5 : active raid1 sdb7[2] sda7[0]
      279803968 blocks [2/1] [U_]
      [>....................]  recovery =  2.1% (6104128/279803968) finish=94.1min speed=48467K/sec
...      
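
The rebuild rate is bounded by the kernel's md speed limits (KiB/sec per device). If the machine can spare the I/O, raising the minimum can shorten the recovery window; the value below is only an example:

# sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max
# sysctl -w dev.raid.speed_limit_min=50000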

Now do the same for the other degraded arrays, re-adding the matching /dev/sdb partition to each one.

# mdadm --manage /dev/md0 --re-add /dev/sdb1
# mdadm --manage /dev/md1 --re-add /dev/sdb6
# mdadm --manage /dev/md2 --re-add /dev/sdb2
# mdadm --manage /dev/md4 --re-add /dev/sdb5
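
Because all of these arrays live on the same two physical disks, the kernel rebuilds them one at a time; arrays waiting their turn should show resync=DELAYED in /proc/mdstat. One way to confirm each member was accepted back (a quick check, not from the original notes):

# for md in /dev/md0 /dev/md1 /dev/md2 /dev/md4; do echo $md; mdadm --detail $md | grep 'State :'; done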

A very handy command for displaying the progress of the rebuilds is:

# watch -d -n 10 cat /proc/mdstat
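
Since /etc/mdadm.conf already sets MAILADDR root, it is also worth making sure the mdadm monitor daemon is running so the next failed member generates a mail instead of being found by accident. Many distributions start it from an init script, but it can be launched by hand:

# mdadm --monitor --scan --daemonise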

Related

I wonder if clearing the disks would decrease the resync time. See Linux P2V for an example.
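
A related option for cutting re-add time after a transient failure (though not a from-scratch rebuild of a replaced disk) is a write-intent bitmap, which lets md resync only the regions written while a member was missing, assuming the kernel and superblock format support it:

# mdadm --grow --bitmap=internal /dev/md5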