Some stuff is just funny to me. You see, today I went ahead and updated my server remotely (since I'm not home atm for some time, ssh sure is nice, ey).
So I updated Ubuntu on my homeserver just fine, new kernel and all, and went ahead and rebooted. That normally takes some time with the homeserver.
When it was up again I ssh'd in, and since I've made it a habit to check whether any systemd services failed after a reboot, I did just that. There was actually one: docker. It could not mount its storage system.
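For the curious, that post-reboot check is basically just this (docker.service being the unit that failed in my case):

```bash
# list any units that ended up in a failed state after the reboot
systemctl --failed

# then look at the failed one in detail, plus its log lines from this boot
systemctl status docker.service
journalctl -u docker.service -b
```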
Digging around on the internet a bit suggested that some kernel module might not have been loaded properly. Looking through things I found that the kernel
5.15.0-100-generic
was loaded (via uname -r), but under /boot I only had
5.15.0-105-generic
and 5.15.0-92-generic
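The mismatch is easy to see with the two commands in question (output shortened to the relevant bits):

```bash
# the kernel that is actually running
uname -r
# -> 5.15.0-100-generic

# the kernels that are actually installed under /boot
ls /boot/vmlinuz-*
# -> /boot/vmlinuz-5.15.0-105-generic
# -> /boot/vmlinuz-5.15.0-92-generic
```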
How does that work xD
Looking around more I found that one disk had been dropped from the RAID1 that holds the system disks (both SSDs).
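A degraded array is easy to spot in /proc/mdstat (the array name md0 below is just a placeholder, the real name can differ):

```bash
# overview of all md arrays; a degraded RAID1 shows one missing member,
# e.g. "[U_]" instead of "[UU]"
cat /proc/mdstat

# more detail on a specific array, including which device is missing
sudo mdadm --detail /dev/md0
```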
So I checked that disk and …. it seems fine?
Then why would Linux throw it out of the md RAID?
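To figure that out, roughly these are the places I'd look (sdf being the dropped disk; whether older boots show up in the journal depends on persistent journaling being enabled):

```bash
# SMART health of the dropped disk; in my case it looked fine
sudo smartctl -a /dev/sdf

# kernel messages from the previous boot, filtered for the disk and md events
sudo journalctl -k -b -1 | grep -iE 'sdf|raid1|md:'
```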
I am utterly confused, but that would explain how I could boot a kernel that does not exist on the other disk.
Basically I only have software RAID, so both disks have their own independent bootloader installed.
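On a BIOS/MBR setup that boils down to installing GRUB onto both RAID members, roughly like this (with UEFI it would be two separate ESPs instead; device names as above):

```bash
# install GRUB onto both RAID1 members so either disk can boot on its own
sudo grub-install /dev/sdf
sudo grub-install /dev/sdg
sudo update-grub

# on Ubuntu/Debian this can also be configured interactively:
# sudo dpkg-reconfigure grub-pc
```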
It seems to me the disk dropped out some time before today's update, and the update only landed on one disk?
The disk that dropped out sits in a lower SATA slot, so the dropped disk is /dev/sdf, while the one still going is /dev/sdg.
I suppose the system mostly uses the /dev/sdf bootloader and somehow found that -100 kernel on there and booted it? I am utterly confused how this can even work…
Since the kernel modules on disk don't match the running kernel, Linux gets a tad confused, I suppose. I am kinda surprised how usable the server still is in that state, though.
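The confusion is simply that the running kernel has no matching module directory on the root filesystem anymore, so module loads fail. A quick way to see it (overlay is just an example here, being the module docker's overlay2 storage driver wants):

```bash
# the running kernel ...
uname -r
# -> 5.15.0-100-generic

# ... versus the module trees that actually exist on the root fs
ls /lib/modules/
# -> 5.15.0-105-generic  5.15.0-92-generic

# so loading a module for the running kernel fails, e.g. with something like
# "FATAL: Module overlay not found in directory /lib/modules/5.15.0-100-generic"
sudo modprobe overlay
```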
I mean, most stuff like Sharkey, Nextcloud and Postgres runs only on that RAID5 array of spinning rust I have.
Right now I've added the dropped-out disk back in and it's now resyncing…
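The re-add itself is just mdadm again (md0 and the partition number are placeholders for whatever the array and member actually are):

```bash
# add the dropped member back into the array; mdadm will start a resync
sudo mdadm --manage /dev/md0 --add /dev/sdf2

# and then watch the rebuild crawl along
watch cat /proc/mdstat
```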