With data=ordered it is often the case that a quick write-and-truncate will
leave large numbers of pages on the page LRU with no ->mapping and with
attached buffers, because ext3 was not ready to let the pages go at the time
of truncation.

These pages are trivially reclaimable, but their seeming absence confuses the
VM overcommit accounting (they count neither as "free" nor as pagecache), and
they make the /proc/meminfo stats look odd.

So what we do here is to try to strip the buffers from these pages as the
buffers exit the journal commit.


 25-akpm/fs/jbd/commit.c |   51 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 49 insertions(+), 2 deletions(-)

diff -puN fs/jbd/commit.c~ext3-truncate-ordered-pages fs/jbd/commit.c
--- 25/fs/jbd/commit.c~ext3-truncate-ordered-pages	Thu Apr 3 14:38:48 2003
+++ 25-akpm/fs/jbd/commit.c	Thu Apr 3 14:38:48 2003
@@ -18,6 +18,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 
 extern spinlock_t journal_datalist_lock;
@@ -36,6 +38,49 @@ static void journal_end_buffer_io_sync(s
 }
 
 /*
+ * When an ext3-ordered file is truncated, it is possible that many pages are
+ * not successfully freed, because they are attached to a committing transaction.
+ * After the transaction commits, these pages are left on the LRU, with no
+ * ->mapping, and with attached buffers.  These pages are trivially reclaimable
+ * by the VM, but their apparent absence upsets the VM accounting, and it makes
+ * the numbers in /proc/meminfo look odd.
+ *
+ * So here, we have a buffer which has just come off the forget list.  Look to
+ * see if we can strip all buffers from the backing page.
+ *
+ * Called under lock_journal(), and possibly under journal_datalist_lock.  The
+ * caller provided us with a ref against the buffer, and we drop that here.
+ */
+static void release_buffer_page(struct buffer_head *bh)
+{
+	struct page *page;
+
+	if (buffer_dirty(bh))
+		goto nope;
+	if (atomic_read(&bh->b_count) != 1)
+		goto nope;
+	page = bh->b_page;
+	if (!page)
+		goto nope;
+	if (page->mapping)
+		goto nope;
+
+	/* OK, it's a truncated page */
+	if (TestSetPageLocked(page))
+		goto nope;
+
+	page_cache_get(page);
+	__brelse(bh);
+	try_to_free_buffers(page);
+	unlock_page(page);
+	page_cache_release(page);
+	return;
+
+nope:
+	__brelse(bh);
+}
+
+/*
  * journal_commit_transaction
  *
  * The primary function for committing a transaction to the log.  This
@@ -213,7 +258,7 @@ write_out_data_locked:
 				__journal_unfile_buffer(jh);
 				jh->b_transaction = NULL;
 				__journal_remove_journal_head(bh);
-				__brelse(bh);
+				release_buffer_page(bh);
 			}
 		}
 		if (bufs == ARRAY_SIZE(wbuf)) {
@@ -691,7 +736,9 @@ skip_commit: /* The journal should be un
 			__journal_unfile_buffer(jh);
 			jh->b_transaction = 0;
 			__journal_remove_journal_head(bh);
-			__brelse(bh);
+			spin_unlock(&journal_datalist_lock);
+			release_buffer_page(bh);
+			continue;
 		}
 		spin_unlock(&journal_datalist_lock);
 	}

_
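
For readers who want to see the bail-out conditions of release_buffer_page()
in isolation, here is a minimal user-space sketch of the same decision
sequence. It is not kernel code and is not part of the patch: the struct
layouts (mock_page, mock_bh), the plain "locked" flag standing in for
TestSetPageLocked(), and the can_strip_buffers() helper are all hypothetical
names invented for illustration. The sketch only answers "would the patch try
to strip the buffers?"; it does not model reference counting or
try_to_free_buffers() itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page (illustration only). */
struct mock_page {
	void *mapping;	/* NULL once the page has been truncated */
	bool locked;	/* models the page lock bit */
};

/* Hypothetical stand-in for struct buffer_head (illustration only). */
struct mock_bh {
	bool dirty;			/* models buffer_dirty(bh) */
	int b_count;			/* reference count; the caller holds one */
	struct mock_page *b_page;	/* backing page, may be NULL */
};

/*
 * Returns true only when every check in the patch's release_buffer_page()
 * would pass: the buffer is clean, the caller holds the only reference,
 * the backing page exists and has no ->mapping (it was truncated), and
 * the page lock is not already held by someone else.
 */
bool can_strip_buffers(const struct mock_bh *bh)
{
	const struct mock_page *page;

	assert(bh != NULL);

	if (bh->dirty)
		return false;
	if (bh->b_count != 1)
		return false;
	page = bh->b_page;
	if (page == NULL)
		return false;
	if (page->mapping != NULL)
		return false;
	if (page->locked)	/* a trylock would fail here */
		return false;
	return true;
}
```

A clean buffer with a single reference on an unlocked, mapping-less page is
the only combination for which the helper returns true; any other state takes
the equivalent of the "nope" path, where the patch just drops the caller's
reference with __brelse().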