RFI?
Just last week, I had reason to read a manufacturer number off of a part in a pedestal server... needed a flashlight... oh hey, cell phone! I touched nothing in the case and got what I needed, but when I emerged from the server room/storage closet (small-ish business), I was greeted by cries of despair: the main local repository of most of our company's work had suffered, um, Total Inability To Support (insert something funny here; I'm not funny, and also American, so help me out)... (a wee bit of exaggeration there).
A reboot and everything was working again. Didn't think too much of it at the time. Fast forward to Monday: I go investigating another, unrelated issue and notice that the RAID was missing two disks... Seems "something" had caused one of the disk controllers in the box to glitch out and lock up, resulting in a soft lockup of the system, and the software RAID6 coming back up with two disks missing... survivable, etc. The disks appeared to be fine, and it was clear the controller crap-out was the cause. Anyhow, I re-add the disks (all in perfect health), and while it's merrily rebuilding, the controller flakes out again... this time very noticeably, given the wall of crap on the console. After the reboot, the RAID6 is stuck in a silly state where two of the disks seem to have become... independent... of the other four, and unfortunately, those two were part of the formerly working array...
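For the curious, the re-add step was just the usual mdadm dance; a rough sketch below, with /dev/md0 and the sdX names as placeholders (ours obviously differed):

```
# See what members the array currently thinks it has
cat /proc/mdstat
mdadm --detail /dev/md0

# Re-add the two dropped members; md rebuilds them from the surviving disks + parity
mdadm --manage /dev/md0 --re-add /dev/sdc1
mdadm --manage /dev/md0 --re-add /dev/sdd1

# Watch the rebuild grind along
watch cat /proc/mdstat
```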
Fast forward to today: we've cloned the disks, made backups, forcibly reassembled the RAID, fsck'd the filesystems, etc., and the thing is back chugging along with a new disk controller, now with only two disks on each controller.
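In case anyone ends up in the same boat, the recovery was roughly the following. Again, device names and the image path are hypothetical, this assumes the filesystem sits directly on the md device, and you clone everything before touching the array:

```
# Clone each member first so a mistake isn't fatal
dd if=/dev/sdc of=/backup/sdc.img bs=1M conv=noerror,sync status=progress

# Stop the half-assembled array, then force-assemble from the members
# (mdadm picks the ones with the most recent event counts)
mdadm --stop /dev/md0
mdadm --assemble --force /dev/md0 /dev/sd[cdefgh]1

# Check the filesystem before trusting anything on it
fsck -f /dev/md0
```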
Could've been a lot worse; we became acutely aware that our backup target had filled up completely (I had been trying to keep on top of that, but non-IT workload had become a bit excessive), and the last full backup had happened sometime in mid-December. Lessons learned. We're getting fresh NAS hardware, expanding the offsite backup target, and revising our policies/procedures...
Root cause? Dunno; a flaky Marvell onboard SATA controller (which had worked perfectly for several years prior), and possibly RFI from the cell phone... has anyone here ever encountered that sort of thing?
edit: Actually, my father had suggested, in passing, the possibility of some sort of photosensitivity issue, à la the Raspberry Pis that could be crashed/rebooted with a camera flash. Having operated the thing for many years and shone plenty of flashlights into it, I think that's unlikely... still an interesting thought.