Re: sas is fine for me
I believe they both had controller failures. I no longer have the logs from the first failure in 2023, but I recall the events saying the disk was no longer responding.
The most recent failure was last month. The first error was:
pd 14 port a0 on 3:0:1: cmdstat:0x1a (TE_PATHSICK -- Path is sick), scsistat:0x02 (Check condition), snskey:0x0b (Aborted command), asc/ascq:0x47/0x82 (Vendor-specific ASC/ASCQ code), info:0x0, cmd_spec:0x0, sns_spec:0x0, host:0x6, abort:0, CDB:28001441CED000000800 (Read10), blk:0x1441ced0, blkcnt 0x8, fru_cd:0x0, LUN:0, LUN_WWN:0000000000000000 after 0.001s, toterr:95, deverr:39
Then, 2 hours later:
pd 14 port a0 on 3:0:1: cmdstat:0x1e (TE_UNITATT -- Unit attention error), scsistat:0x02 (Check condition), snskey:0x06 (Unit attention), asc/ascq:0x29/0x2 (Scsi bus reset occurred), info:0x0, cmd_spec:0x0, sns_spec:0x0, host:0x6, abort:0, CDB:8800000000019977E8C0000000200000 (Read16), blk:0x19977e8c0, blkcnt 0x20, fru_cd:0x0, LUN:0, LUN_WWN:0000000000000000 after 0.000s, toterr:96, deverr:40
A few seconds later:
hw_disk:5001173100A6357C pd 14 port a0 on 3:0:1: cmdstat:0x1a (TE_PATHSICK -- Path is sick), scsistat:0x02 (Check condition), snskey:0x06 (Unit attention), asc/ascq:0x29/0x7 (I_t nexus loss occurred), info:0x27174e00, cmd_spec:0x0, sns_spec:0x0, host:0x6, abort:0, CDB:8800000000019974DCC0000000200000 (Read16), blk:0x19974dcc0, blkcnt 0x20, fru_cd:0x0, LUN:0, LUN_WWN:0000000000000000 after 0.002s, toterr:97, deverr:41
About 90 minutes later the RAID rebuild was complete and the drive was disabled.
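For anyone who wants to cross-check the blk and blkcnt fields in those events against the raw CDB bytes, here is a minimal Python sketch. It assumes the standard READ(10) and READ(16) CDB layouts (opcode 0x28 with a 4-byte LBA and 2-byte length, opcode 0x88 with an 8-byte LBA and 4-byte length); the function name is my own and has nothing to do with the array's tooling.

```python
def decode_read_cdb(cdb_hex):
    """Decode a READ(10) or READ(16) CDB hex string into (name, lba, block_count)."""
    cdb = bytes.fromhex(cdb_hex)
    opcode = cdb[0]
    if opcode == 0x28 and len(cdb) == 10:
        # READ(10): LBA in bytes 2-5, transfer length in bytes 7-8 (big-endian)
        lba = int.from_bytes(cdb[2:6], "big")
        count = int.from_bytes(cdb[7:9], "big")
        return "Read10", lba, count
    if opcode == 0x88 and len(cdb) == 16:
        # READ(16): LBA in bytes 2-9, transfer length in bytes 10-13 (big-endian)
        lba = int.from_bytes(cdb[2:10], "big")
        count = int.from_bytes(cdb[10:14], "big")
        return "Read16", lba, count
    raise ValueError("not a READ(10)/READ(16) CDB: " + cdb_hex)


# CDBs copied from the first two events above
for cdb_hex in ("28001441CED000000800", "8800000000019977E8C0000000200000"):
    name, lba, count = decode_read_cdb(cdb_hex)
    print(name, hex(lba), hex(count))
```

Both decode to the same blk and blkcnt values the array printed, so at least that part of the event line is a straight restatement of the CDB and not extra drive state.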
No indication that the drive went into read-only mode (unless that is buried in the hex codes somewhere).
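On the hex codes: as far as I know, a write-protected SCSI device normally answers writes with sense key 0x07 (Data protect) and ASC 0x27 (Write protected). The rough Python sketch below pulls snskey and asc/ascq out of an event line and checks for that pair. The assumptions are significant: the regexes are based only on the three events quoted above, the helper names are mine, and I don't know whether this drive model or the array firmware would report a read-only state this way. All three logged commands are also reads, so a write-protect condition might never show up in them.

```python
import re

# Fragments of the three events quoted above (only the sense fields are needed)
EVENTS = [
    "snskey:0x0b (Aborted command), asc/ascq:0x47/0x82",
    "snskey:0x06 (Unit attention), asc/ascq:0x29/0x2",
    "snskey:0x06 (Unit attention), asc/ascq:0x29/0x7",
]


def parse_sense(line):
    """Return (sense_key, asc, ascq) as integers from one event line."""
    key = re.search(r"snskey:0x([0-9a-fA-F]+)", line)
    codes = re.search(r"asc/ascq:0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)", line)
    if key is None or codes is None:
        raise ValueError("no sense data found in line")
    return int(key.group(1), 16), int(codes.group(1), 16), int(codes.group(2), 16)


def looks_write_protected(line):
    """True if the sense data is Data protect (0x07) with ASC 0x27."""
    key, asc, _ascq = parse_sense(line)
    return key == 0x07 and asc == 0x27


for event in EVENTS:
    print(parse_sense(event), looks_write_protected(event))
```

None of the three events match, which is consistent with what I said, but it only rules out that one standard way of reporting it.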
None of the drives report less than 84% wear life remaining; it still blows my mind how reliable it has been. Zero read or write errors reported on any disk (except the ones that failed, but those stats were reset when the drives were replaced).