From: Andrea Arcangeli <andrea@suse.de>

I think I understand why some ext2 fs corruption still happens even after
the last i_size fix.

What I believe happened is that the writepages layer got a page that was
not fully uptodate (and therefore with bh mapped on top of it), and then,
right before unlocking the page and entering writeback mode, it freed all
the bh.  Without bh, a not-uptodate page triggers a full readpage from
disk, which overwrites the pagecache before the multi-bio gets submitted,
generating fs corruption.

I believe the patch below, against kernel CVS, should fix it (untested).

The testcases developed by Kurt showed the pagecache being overwritten with
on-disk data at block offsets, and Chris was also wondering about races
between wait_on_page_writeback and readpage.  The fix below explains
everything we've seen: not-fully-uptodate pages must always have bh on
them, and the patch enforces just that invariant, so it should fix our
pagecache-overwritten-by-disk-data problem.
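
To make the race described above easier to follow, here is a toy userspace
model of it.  This is only a sketch, not kernel code: struct toy_page and
the toy_* functions are hypothetical names invented for illustration, each
block holds a single byte, and dropping the bh is assumed to always
succeed.  The "patched" flag stands for the added PageUptodate() check.

```c
#include <assert.h>
#include <stdbool.h>

#define BLOCKS_PER_PAGE 4

/* Toy stand-in for a pagecache page; NOT the kernel's struct page. */
struct toy_page {
	bool uptodate;				/* models PageUptodate(page) */
	bool has_buffers;			/* models page_has_buffers(page) */
	bool bh_uptodate[BLOCKS_PER_PAGE];	/* models buffer_uptodate(bh) */
	char data[BLOCKS_PER_PAGE];		/* one byte per block */
};

/* Models the end of the writepages path: decide whether to drop the bh. */
void toy_writepage_tail(struct toy_page *page, bool over_limit, bool patched)
{
	assert(page->has_buffers);
	if (over_limit && (!patched || page->uptodate))
		page->has_buffers = false;	/* assumed to always succeed */
}

/* Models a readpage that runs before the write has reached the disk. */
void toy_readpage(struct toy_page *page, const char *disk)
{
	for (int i = 0; i < BLOCKS_PER_PAGE; i++) {
		/* With bh present, blocks whose bh is uptodate are left alone. */
		if (page->has_buffers && page->bh_uptodate[i])
			continue;
		page->data[i] = disk[i];	/* refill the block from disk */
	}
	page->uptodate = true;
}

/* Returns true if the newly written block was replaced by old disk data. */
bool toy_race_corrupts(bool patched)
{
	const char disk[BLOCKS_PER_PAGE] = { 'o', 'o', 'o', 'o' };
	struct toy_page page = { .uptodate = false, .has_buffers = true };

	page.data[1] = 'N';		/* new data written into block 1 only */
	page.bh_uptodate[1] = true;	/* only that block's bh is uptodate */

	toy_writepage_tail(&page, true, patched);
	toy_readpage(&page, disk);	/* disk still holds the old contents */

	return page.data[1] != 'N';
}
```

In this model toy_race_corrupts(false) reports the overwrite, because the
page has lost its bh and every block is refilled from disk, while
toy_race_corrupts(true) keeps the bh on the not-uptodate page and the new
block survives the readpage.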

Signed-off-by: Andrea Arcangeli <andrea@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 25-akpm/fs/mpage.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletion(-)

diff -puN fs/mpage.c~writepages-drops-bh-on-not-uptodate-page fs/mpage.c
--- 25/fs/mpage.c~writepages-drops-bh-on-not-uptodate-page	2004-07-27 22:09:19.949673280 -0700
+++ 25-akpm/fs/mpage.c	2004-07-27 22:09:19.954672520 -0700
@@ -553,7 +553,12 @@ alloc_new:
 			bh = bh->b_this_page;
 		} while (bh != head);
 
-		if (buffer_heads_over_limit)
+		/*
+		 * We cannot drop the bh if the page is not uptodate, or a
+		 * concurrent readpage would fail to serialize with the bh
+		 * and would read from disk before our write reaches the platter.
+		 */
+		if (buffer_heads_over_limit && PageUptodate(page))
 			try_to_free_buffers(page);
 	}
 
_