--- linux-2.4.20~patches.common_jfs-locking-fix/fs/jfs/jfs_logmgr.c
+++ linux-2.4.20/fs/jfs/jfs_logmgr.c
@@ -1824,6 +1824,7 @@
 static int lbmRead(struct jfs_log * log, int pn, struct lbuf ** bpp)
 {
 	struct lbuf *bp;
+	unsigned long flags;
 
 	/*
 	 * allocate a log buffer
@@ -1842,7 +1845,11 @@
 	generic_make_request(READ, &bp->l_bh);
 	run_task_queue(&tq_disk);
 
+	LCACHE_LOCK(flags);		/* disable+lock */
+
-	wait_event(bp->l_ioevent, (bp->l_flag != lbmREAD));
+	LCACHE_SLEEP_COND(bp->l_ioevent, (bp->l_flag != lbmREAD), flags);
+
+	LCACHE_UNLOCK(flags);		/* unlock+enable */
 
 	return 0;
 }

-----------------------------------------------------------------------

This is a secondary fix for the issue found in bug 23873.  The primary
fix is called jfs-blk-atomic-fix and was this:

fs/jfs/jfs_logmgr.c: static int lbmRead()

 	set_bit(BH_Req, &bp->l_bh.b_state);
 	bp->l_bh.b_rdev = bp->l_bh.b_dev;
 	bp->l_bh.b_rsector = bp->l_blkno << (log->l2bsize - 9);
+	bh_elv_seq(&bp->l_bh) = 0;
 	generic_make_request(READ, &bp->l_bh);
 	run_task_queue(&tq_disk);
...
 	set_bit(BH_Req, &bp->l_bh.b_state);
 	bp->l_bh.b_rdev = bp->l_bh.b_dev;
 	bp->l_bh.b_rsector = bp->l_blkno << (bp->l_log->l2bsize - 9);
+	bh_elv_seq(&bp->l_bh) = 0;
 	generic_make_request(WRITE, &bp->l_bh);

-----------------------------------------------------------------------

History of this patch:

In SuSE bugzilla 23873, Chris Mason wrote in comment #16:

| What kind of locking protects lbmRead from racing against the
| end_io routine?
|
| It looks like checks for (l_flag != lbmREAD) can race and lead
| to a missed wakeup.

In comment #19, he wrote:

| I've just done some jfs testing on vanilla 2.4.21-pre4.
| I had one proc doing this:
| while(true); do mount /dev/jfs_disk /data ; touch /data/foo ; umount /data
| ;done
|
| And other doing dbench 50 on a reiserfs disk. It resulted in the oops I'm
| attaching, this is an ia32 SMP system w/1.5 GB of ram and the tests were done
| on scsi. It took about 5 minutes before the oops.
After some discussion, he finally made clear what he meant:

| Dave, the race isn't in the structure of wait_event usage,
| it's the non-atomic nature of the test. Without locking
| or memory barriers, the waiting process might sleep after
| a different CPU has already cleared the read flag and done a
| wakeup. If the timing is just right, the wakeup will be missed.

So the lock guards against the non-atomicity of this test in the code:

	bp->l_flag != lbmREAD

It was later agreed that using bitops (set_bit, test_bit, etc.) would be
the best solution for this.

Subsequently, the initial patch turned out to take the lock too early:
it was taken before the call to run_task_queue(), which could schedule,
and this caused lockups on mount on SMP machines.  The fix was to move
the taking of the lock below the call to run_task_queue(), described in
SuSE bugzilla #26139:

-------------------------------------------------------------------
Tue Apr  8 20:49:59 CEST 2003 - bk@suse.de

- JFS deadlock fix: can't hold LCACHE_LOCK while doing I/O [#26139]
-------------------------------------------------------------------

On 2003-03-13, Andrea took the old version of the wrong patch provided
in bug 23873

	http://bugzilla.suse.de/attachment.cgi?id=3772&action=view

into kernels/v2.4/2.4.21pre5aa1/9985_blk-atomic-aa6-jfs-1

It had to be split into a blk-atomic part (which is jfs-blk-atomic-fix
as quoted above) and this patch.  Therefore, I dropped
patches.common/9985_blk-atomic-aa6-jfs-1 and re-added
patches.common/jfs-blk-atomic-fix and patches.common/jfs-locking-fix
(this file), which contain the correct fix in split form -> they are
two different fixes in the same area, not one fix.

Don't delete this patch as long as the resulting code still contains the
call to run_task_queue(&tq_disk) within these lines!
+	LCACHE_LOCK(flags);		/* disable+lock */
+
-	wait_event(bp->l_ioevent, (bp->l_flag != lbmREAD));
+	LCACHE_SLEEP_COND(bp->l_ioevent, (bp->l_flag != lbmREAD), flags);
+
+	LCACHE_UNLOCK(flags);		/* unlock+enable */

Bernhard