Thursday, October 24, 2019

MYSQL/BTRFS/NVME failure

It is a very bad idea to run a database (especially a production one with lots of I/O) on BTRFS, because the filesystem can be forced read-only at any time:
Oct 24 12:30:22 db02 kernel: BTRFS: error (device nvme0n1) in btrfs_run_delayed_refs:2936: errno=-28 No space left
Oct 24 12:30:22 db02 kernel: BTRFS info (device nvme0n1): forced readonly
And then you find out that you need to run a rebalance. You try, and discover that the rebalance cannot run because, you guessed it, there is no space left. The usual advice is to delete a couple of snapshots. You delete them, start the rebalance, and now the whole filesystem is completely stuck.
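Since the switch to read-only comes without warning, it helps to at least catch it in the kernel log as soon as it happens. Below is a minimal sketch of such a check, assuming the message format is the same as in the log lines above; the function name forced_readonly_devices and the regular expression are my own illustration, not part of any BTRFS tool.

```python
import re
import sys

# Assumption: the kernel message looks like the one quoted above, e.g.
# "BTRFS info (device nvme0n1): forced readonly"
FORCED_RO = re.compile(r"BTRFS\b.*\(device (?P<dev>[^)]+)\): forced readonly")


def forced_readonly_devices(lines):
    """Return device names reported as forced readonly, in order of first appearance."""
    devices = []
    for line in lines:
        match = FORCED_RO.search(line)
        if match and match.group("dev") not in devices:
            devices.append(match.group("dev"))
    return devices


if __name__ == "__main__":
    # Read kernel log lines from stdin and report every affected device.
    for dev in forced_readonly_devices(sys.stdin):
        print(f"BTRFS forced readonly on {dev}")
```

Saved under a hypothetical name such as check_btrfs_ro.py, it could be fed from the kernel log with something like: journalctl -k | python3 check_btrfs_ro.py. This only detects the problem after the fact; it does not prevent it.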

If you need an HA MySQL database with snapshots, go with a MySQL/LVM/DRBD setup instead. See this link for insight: https://rarforge.com/w/index.php/2_Node_Cluster:_Dual_Primary_DRBD_%2B_CLVM_%2B_KVM_%2B_Live_Migrations