I did not know why my NAS marked one of its Basic volumes as 'Crashed', and I could not write files to it. At first I thought it would be an easy task, so I logged into the system directly over SSH, unmounted the filesystem, and ran 'e2fsck' on it. It found some errors and fixed them; after that I could mount the volume again and, of course, write to it without any problem.
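That first round of commands looked roughly like this (a minimal sketch, not the exact session; the device names /dev/md3 and /volume2 are taken from the output further below):

# unmount the crashed volume
umount /volume2

# check and repair the filesystem, answering yes to all repair prompts
e2fsck -f -y /dev/md3

# remount it; writes worked again, until the next reboot
mount /dev/md3 /volume2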
It was fixed, right? Yes, until I rebooted the NAS.
After the reboot, the volume was still marked as 'Crashed' and the filesystem became read-only again. This time I wanted to fix it permanently.
When I tried to assemble the RAID array, I got the messages below:
dsm> mdadm -Av /dev/md3 /dev/sdd3
mdadm: looking for devices for /dev/md3
mdadm: /dev/sdd3 is identified as a member of /dev/md3, slot 0.
mdadm: device 0 in /dev/md3 has wrong state in superblock, but /dev/sdd3 seems ok
mdadm: added /dev/sdd3 to /dev/md3 as 0
mdadm: /dev/md3 has been started with 1 drive.
dsm> e2fsck /dev/md3
e2fsck 1.42.6 (21-Sep-2012)
1.42.6-5644: is cleanly umounted, 809/91193344 files, 359249805/364756736 blocks
I checked the RAID array and the disk partition, but could not find any issue:
dsm> mdadm -D /dev/md3
/dev/md3:
        Version : 1.2
  Creation Time : Thu Feb  4 15:03:34 2016
     Raid Level : raid1
     Array Size : 1459026944 (1391.44 GiB 1494.04 GB)
  Used Dev Size : 1459026944 (1391.44 GiB 1494.04 GB)
   Raid Devices : 1
  Total Devices : 1
    Persistence : Superblock is persistent

    Update Time : Sat Dec 30 16:13:47 2017
          State : clean
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : Gen8:3
           UUID : d0796c7b:c9b0ab70:211e65c0:843891e2
         Events : 40

    Number   Major   Minor   RaidDevice State
       0       8       51        0      active sync   /dev/sdd3
dsm> mdadm -E /dev/sdd3
/dev/sdd3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : d0796c7b:c9b0ab70:211e65c0:843891e2
           Name : Gen8:3
  Creation Time : Thu Feb  4 15:03:34 2016
     Raid Level : raid1
   Raid Devices : 1

 Avail Dev Size : 2918053888 (1391.44 GiB 1494.04 GB)
     Array Size : 2918053888 (1391.44 GiB 1494.04 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : a9eafb18:84f3974f:42751688:7136418d

    Update Time : Sat Dec 30 16:59:19 2017
       Checksum : faa9b8c3 - correct
         Events : 40

    Device Role : Active device 0
    Array State : A ('A' == active, '.' == missing)
Yet for some unknown reason, /proc/mdstat always reported the wrong state:
dsm> cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md3 : active raid1 sdd3[0](E)
      1459026944 blocks super 1.2 [1/1] [E]

md2 : active raid1 sdc3[0]
      17175149551 blocks super 1.2 [1/1] [U]

md1 : active raid1 sdc2[0] sdd2[1]
      2097088 blocks [12/2] [UU__________]

md0 : active raid1 sdc1[0] sdd1[1]
      2490176 blocks [12/2] [UU__________]

unused devices: <none>
There was an '[E]' flag, which I guessed meant 'Error'. So how could I clear it?
I searched and read lots of posts, and the one below saved me:
recovering-a-raid-array-in-e-state-on-a-synology-nas
I had found similar solutions before reaching this post, but I did not dare to try them because I had not copied out all the data. The post above gave me more confidence: recreating a single-drive RAID1 array with the same metadata version, level, and UUID only rewrites the md superblock, so as long as the data offset stays the same, the filesystem on it is left untouched. So I ran the commands below:
dsm> mdadm -Cf /dev/md3 -e1.2 -n1 -l1 /dev/sdd3 -ud0796c7b:c9b0ab70:211e65c0:843891e2
mdadm: /dev/sdd3 appears to be part of a raid array:
    level=raid1 devices=1 ctime=Thu Feb  4 15:03:34 2016
Continue creating array? y
mdadm: array /dev/md3 started.
dsm> cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md3 : active raid1 sdd3[0]
      1459026944 blocks super 1.2 [1/1] [U]

md2 : active raid1 sdc3[0]
      17175149551 blocks super 1.2 [1/1] [U]

md1 : active raid1 sdc2[0] sdd2[1]
      2097088 blocks [12/2] [UU__________]

md0 : active raid1 sdc1[0] sdd1[1]
      2490176 blocks [12/2] [UU__________]

unused devices: <none>
dsm> e2fsck -pvf -C0 /dev/md3
         809 inodes used (0.00%, out of 91193344)
           5 non-contiguous files (0.6%)
           1 non-contiguous directory (0.1%)
             # of inodes with ind/dind/tind blocks: 0/0/0
             Extent depth histogram: 613/84/104
   359249805 blocks used (98.49%, out of 364756736)
           0 bad blocks
         180 large files

         673 regular files
         127 directories
           0 character device files
           0 block device files
           0 fifos
           0 links
           0 symbolic links (0 fast symbolic links)
           0 sockets
------------
         800 files
dsm> cat /etc/fstab
none /proc proc defaults 0 0
/dev/root / ext4 defaults 1 1
/dev/md3 /volume2 ext4 usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0,synoacl 0 0
/dev/md2 /volume1 ext4 usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0,synoacl 0 0
dsm> mount /dev/md3
dsm> df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root       2.3G  676M  1.6G  31% /
/tmp            2.0G  120K  2.0G   1% /tmp
/run            2.0G  2.5M  2.0G   1% /run
/dev/shm        2.0G     0  2.0G   0% /dev/shm
none            4.0K     0  4.0K   0% /sys/fs/cgroup
/dev/bus/usb    2.0G  4.0K  2.0G   1% /proc/bus/usb
/dev/md2         16T   14T  2.5T  85% /volume1
/dev/md3        1.4T  1.4T   21G  99% /volume2
dsm> ls /volume2
@eaDir  @tmp  aquota.group  aquota.user  downloads  lost+found  synoquota.db
After that I checked the volume state again, and the annoying 'Crashed' had become 'Normal'.
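If you run into the same problem, a quick way to confirm the fix really sticks is to reboot once more and check the md state plus a test write (a minimal sketch; adjust the device and mount point to your own setup):

# the (E) flag should be gone from the affected array
grep -A 1 md3 /proc/mdstat

# the volume should be mounted read-write
mount | grep volume2

# and a test write should succeed
touch /volume2/.write-test && rm /volume2/.write-test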