From: Andrea Arcangeli

I think I understand why some ext2 fs corruption still happens even after
the last i_size fix.

What I believe happened is that the writepages layer got a page that was not
fully uptodate (and therefore had bh mapped on top of it), and then, right
before unlocking the page and entering writeback mode, it freed all the bh.
Without bh, a not-uptodate page will trigger a full readpage from disk, which
overwrites the pagecache before the multi-bio gets submitted, generating fs
corruption.

I believe the patch below should fix it (untested), against kernel CVS.

The testcases developed by Kurt showed the pagecache being overwritten with
on-disk data at block offsets, and Chris was also wondering about races
between wait_on_page_writeback and readpage. The fix below explains
everything we've seen: not-fully-uptodate pages must always have bh on them,
and the patch enforces exactly that invariant, so it should fix our
pagecache-overwritten-by-disk-data problem.

Signed-off-by: Andrea Arcangeli
Signed-off-by: Andrew Morton
---

 25-akpm/fs/mpage.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletion(-)

diff -puN fs/mpage.c~writepages-drops-bh-on-not-uptodate-page fs/mpage.c
--- 25/fs/mpage.c~writepages-drops-bh-on-not-uptodate-page	2004-07-27 22:09:19.949673280 -0700
+++ 25-akpm/fs/mpage.c	2004-07-27 22:09:19.954672520 -0700
@@ -553,7 +553,12 @@ alloc_new:
 			bh = bh->b_this_page;
 		} while (bh != head);
 
-		if (buffer_heads_over_limit)
+		/*
+		 * we cannot drop the bh if the page is not uptodate
+		 * or a concurrent readpage would fail to serialize with the bh
+		 * and it would read from disk before we reach the platter.
+		 */
+		if (buffer_heads_over_limit && PageUptodate(page))
 			try_to_free_buffers(page);
 	}
 
_