author Martin Wilck <mwilck@arcor.de> 2013-07-30 23:18:33 +0200
committer NeilBrown <neilb@suse.de> 2013-07-31 13:00:46 +1000
commit 6ca1e6eccb2c6661ec111a455bcc2f3f5593cb06
tree 7b9872615725c2d76261a4dee6b51467a0616faf /managemon.c
parent 30b83120ede68eb28f28118e3af4ff9c1de91fa0
mdmon: manage_member: fix race condition during slow meta data writes
In order to track kernel state changes, the monitor needs to notice changes in sysfs. If the changes are transient, and the monitor is busy writing metadata, it can happen that the changes are missed. This will cause the metadata to be inconsistent with the real state of the array.

I can reproduce this in a test scenario with a DDF container and two subarrays, where I set a disk to "failed" and then add a global hot-spare. On a typical MD test setup with loop devices, I can reliably reproduce a failure where the metadata show degraded members although the kernel finished the recovery successfully.

This patch fixes this problem by applying two changes. First, when a metadata update is queued, wait until it is certain that the monitor actually applied the metadata (the loop is actually needed to avoid failures completely in my test case). Second, after triggering the recovery, set prev_action of the changed array to "recover", in case the monitor misses the transient "recover" state.

Signed-off-by: Martin Wilck <mwilck@arcor.de>
Signed-off-by: NeilBrown <neilb@suse.de>
Diffstat (limited to 'managemon.c')
-rw-r--r-- managemon.c 8
1 files changed, 7 insertions, 1 deletions
diff --git a/managemon.c b/managemon.c
index a6551081..40c863f1 100644
--- a/managemon.c
+++ b/managemon.c
@@ -535,8 +535,14 @@ static void manage_member(struct mdstat_ent *mdstat,
}
queue_metadata_update(updates);
updates = NULL;
+ while (update_queue_pending || update_queue) {
+ check_update_queue(container);
+ usleep(15*1000);
+ }
replace_array(container, a, newa);
- sysfs_set_str(&a->info, NULL, "sync_action", "recover");
+ if (sysfs_set_str(&a->info, NULL, "sync_action", "recover")
+ == 0)
+ newa->prev_action = recover;
dprintf("%s: recovery started on %s\n", __func__,
a->info.sys_name);
out:
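
The two changes in this hunk (wait until the update queue has drained, then seed the previous action only if the sysfs write succeeded) can be illustrated in isolation. The following is a minimal, single-threaded sketch and not mdadm code: every `sim_*` name and the `ACTION_*` enum are hypothetical stand-ins. In mdmon the monitor runs in its own thread and the manager sleeps 15 ms between checks; here the monitor step is called inline so the control flow is easy to follow.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT mdadm's real data structures. */
enum sync_action { ACTION_IDLE, ACTION_RECOVER };

struct sim_monitor {
	int queued;	/* updates handed to the monitor, not yet applied */
	int applied;	/* updates the monitor has written out */
};

struct sim_array {
	enum sync_action prev_action;	/* last action the monitor believes it saw */
};

/* One monitor step: apply at most one queued metadata update. */
static void sim_monitor_step(struct sim_monitor *m)
{
	if (m->queued > 0) {
		m->queued--;
		m->applied++;
	}
}

/*
 * Poll until no update is outstanding. In the patch the manager sleeps
 * between checks while the monitor thread makes progress; in this sketch
 * the monitor step is simply invoked inline.
 */
static void sim_wait_for_drain(struct sim_monitor *m)
{
	while (m->queued > 0)
		sim_monitor_step(m);
}

/* Stand-in for the sysfs write: returns 0 on success, -1 on failure. */
static int sim_request_recover(bool write_ok)
{
	return write_ok ? 0 : -1;
}

/*
 * Drain first, then trigger recovery. The previous action is seeded only
 * when the write succeeded, so a missed transient "recover" state still
 * leaves the monitor with a consistent view.
 */
static void sim_start_recovery(struct sim_monitor *m, struct sim_array *a,
			       bool write_ok)
{
	sim_wait_for_drain(m);
	if (sim_request_recover(write_ok) == 0)
		a->prev_action = ACTION_RECOVER;
}
```

Calling `sim_start_recovery()` with three queued updates leaves `queued` at 0 and `applied` at 3 before the recovery request is issued. If the simulated write fails, `prev_action` is left untouched, mirroring the `== 0` check added by the patch.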