On 10/18/22 09:35, serge@0x0000.su wrote:
> I have raid1 volume (one of two on PC) with 2 disks.
>
> # disklabel sd5
> # /dev/rsd5c:
> type: SCSI
> disk: SCSI disk
> label: SR RAID 1
> duid: 7a03a84165b3d165
> flags:
> bytes/sector: 512
> sectors/track: 63
> tracks/cylinder: 255
> sectors/cylinder: 16065
> cylinders: 243201
> total sectors: 3907028640
> boundstart: 0
> boundend: 3907028640
> drivedata: 0
>
> 16 partitions:
> # size offset fstype [fsize bsize cpg]
> a: 3907028608 0 4.2BSD 8192 65536 52270 # /home/vmail
> c: 3907028640 0 unused
>
>
> Recently I got an error in dmesg
>
> mail# dmesg | grep retry
> sd5: retrying read on block 767483392
>
> (This happened during copying process)
>
> and system marked volume as degraded
>
> mail# bioctl sd5
> Volume Status Size Device
> softraid0 1 Degraded 2000398663680 sd5 RAID1
> 0 Online 2000398663680 1:0.0 noencl <sd2a>
> 1 Offline 2000398663680 1:1.0 noencl <sd3a>
>
> I tried to reread this sector (and a couple around) with dd to make sure
> the sector is unreadable:
>
> mail# dd if=/dev/rsd3c of=/dev/null bs=512 count=16 skip=767483384
> 16+0 records in
> 16+0 records out
> 8192 bytes transferred in 0.025 secs (316536 bytes/sec)
> mail# dd if=/dev/rsd5c of=/dev/null bs=512 count=16 skip=767483384
> 16+0 records in
> 16+0 records out
> 8192 bytes transferred in 0.050 secs (161303 bytes/sec)
>
> but error did not appeared.
> Are there any methods to check if sector is bad (preferably on the fly)?
> If this is not a disk error (im going to replace cables just in case)
> should i just get disk back online with
> bioctl -R /dev/sd3a sd5
> ?

You made some assumptions about the math that the disk uses vs. the math
dd uses, and I'm not sure I agree with them. I'd suggest doing a dd read
of the entire disk (rsd3c), rather than trying to read just the one
sector. Remember, there's an offset between the sectors of sd5 (the
softraid drive) and sd2 & sd3 where sd5 lives. So I'd kinda expect your
sd3 check to pass because you missed the bad spot, and I'd expect your
sd5 check to pass because the bad drive is locked out of the array and
no longer a problem.
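
To illustrate the offset arithmetic, here is a rough sketch of how a block number on sd5 might map to a sector on rsd3c. Both offsets are assumptions and not from this thread: PART_OFFSET stands for wherever sd3a starts according to "disklabel sd3" (64 is only a placeholder), and SR_DATA_OFFSET is what I believe the softraid metadata area occupies (check sys/dev/softraidvar.h). It also assumes the block in dmesg is relative to the volume; if it is already relative to the chunk, set SR_DATA_OFFSET to 0. The script only prints a dd command, it does not run it.

```shell
#!/bin/sh
# Sketch: translate a softraid volume block into a whole-disk sector.
VOL_BLOCK=767483392   # block reported by dmesg for sd5
PART_OFFSET=64        # ASSUMPTION: start of sd3a, take it from "disklabel sd3"
SR_DATA_OFFSET=528    # ASSUMPTION: softraid data offset in 512-byte sectors
COUNT=16              # number of sectors to read around the suspect spot

# Sector on the raw disk (rsd3c) that should hold the suspect block.
DISK_SECTOR=$((VOL_BLOCK + PART_OFFSET + SR_DATA_OFFSET))

# Start half a window earlier so the suspect sector sits in the middle.
SKIP=$((DISK_SECTOR - COUNT / 2))

echo "suspect sector on rsd3c: $DISK_SECTOR"
echo "dd if=/dev/rsd3c of=/dev/null bs=512 count=$COUNT skip=$SKIP"
```

With these placeholder offsets the suspect sector sits several hundred sectors past the 16-sector window read in the original test, which is why a full read of rsd3c is the safer check.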

If you are a cheap ******* or the machine is in another country, you might
want to try dd'ing zeros and 0xff's over the entire disk before putting it
back in the array. That sometimes triggers a discovery of a bad spot and
locks it out and replaces it with a spare. I've had some success with
this process, actually, though it's a bad idea. :)
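
A minimal sketch of the pattern-fill idea, assuming a hypothetical helper named fill_pattern (my name, not a real tool). It destroys whatever is on the target, so the demo below writes to a scratch file only. Pointing it at the raw device of the failed disk, with a block count that covers the whole disk, is left to whoever is sure they have the right device.

```shell
#!/bin/sh
# fill_pattern TARGET OCTAL_BYTE BLOCKS  (hypothetical helper)
# Overwrites TARGET with BLOCKS blocks of 64 KiB, every byte set to
# OCTAL_BYTE (000 for zeros, 377 for 0xff). DESTRUCTIVE.
fill_pattern() {
    target=$1; byte=$2; blocks=$3
    # The first dd bounds the amount of data, tr turns the zero bytes into
    # the requested pattern, the second dd writes until the pipe is empty.
    dd if=/dev/zero bs=65536 count="$blocks" 2>/dev/null |
        LC_ALL=C tr '\000' "\\$byte" |
        dd of="$target" bs=65536 2>/dev/null
}

# Demo on a scratch file rather than a real disk.
scratch=$(mktemp)
fill_pattern "$scratch" 000 4    # pass 1: zeros
fill_pattern "$scratch" 377 4    # pass 2: 0xff
od -An -tx1 -N 8 "$scratch"      # show the first bytes of the last pass
rm -f "$scratch"
```

Whether a write pass really makes the drive remap a weak sector depends on the drive firmware, so compare the drive's reallocated-sector count before and after rather than trusting a clean run.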

Nick.