From: Trond Myklebust

NFSv2/v3/v4: When pdflush() is trying to free up memory by calling our
writepages() method, throttle all writes to that mountpoint.

NFSv2/v3/v4: Make the struct nfs_page allocator use GFP_KERNEL rather than
GFP_NOFS.

Cheers,
  Trond

---

 25-akpm/fs/nfs/pagelist.c |    2 -
 25-akpm/fs/nfs/write.c    |   83 ++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 70 insertions(+), 15 deletions(-)

diff -puN fs/nfs/pagelist.c~nfs-04-congestion fs/nfs/pagelist.c
--- 25/fs/nfs/pagelist.c~nfs-04-congestion	2004-03-14 15:12:34.967414616 -0800
+++ 25-akpm/fs/nfs/pagelist.c	2004-03-14 15:12:34.970414160 -0800
@@ -32,7 +32,7 @@ static inline struct nfs_page *
 nfs_page_alloc(void)
 {
 	struct nfs_page	*p;
-	p = kmem_cache_alloc(nfs_page_cachep, SLAB_NOFS);
+	p = kmem_cache_alloc(nfs_page_cachep, SLAB_KERNEL);
 	if (p) {
 		memset(p, 0, sizeof(*p));
 		INIT_LIST_HEAD(&p->wb_list);
diff -puN fs/nfs/write.c~nfs-04-congestion fs/nfs/write.c
--- 25/fs/nfs/write.c~nfs-04-congestion	2004-03-14 15:12:34.968414464 -0800
+++ 25-akpm/fs/nfs/write.c	2004-03-14 15:12:34.972413856 -0800
@@ -76,11 +76,15 @@ static struct nfs_page * nfs_update_requ
 					    unsigned int, unsigned int);
 static void nfs_writeback_done_partial(struct nfs_write_data *, int);
 static void nfs_writeback_done_full(struct nfs_write_data *, int);
+static int nfs_wait_on_write_congestion(struct address_space *, int);
+static int nfs_wait_on_requests(struct inode *, unsigned long, unsigned int);
 
 static kmem_cache_t *nfs_wdata_cachep;
 static mempool_t *nfs_wdata_mempool;
 static mempool_t *nfs_commit_mempool;
 
+static DECLARE_WAIT_QUEUE_HEAD(nfs_write_congestion);
+
 static __inline__ struct nfs_write_data *nfs_writedata_alloc(void)
 {
 	struct nfs_write_data	*p;
@@ -259,8 +263,7 @@ static int nfs_writepage_async(struct fi
 /*
  * Write an mmapped page to the server.
  */
-int
-nfs_writepage(struct page *page, struct writeback_control *wbc)
+int nfs_writepage(struct page *page, struct writeback_control *wbc)
 {
 	struct inode *inode = page->mapping->host;
 	unsigned long end_index;
@@ -298,8 +301,11 @@ do_it:
 	lock_kernel();
 	if (!IS_SYNC(inode) && inode_referenced) {
 		err = nfs_writepage_async(NULL, inode, page, 0, offset);
-		if (err >= 0)
+		if (err >= 0) {
 			err = 0;
+			if (wbc->for_reclaim)
+				err = WRITEPAGE_ACTIVATE;
+		}
 	} else {
 		err = nfs_writepage_sync(NULL, inode, page, 0, offset); 
 		if (err == offset)
@@ -307,32 +313,46 @@ do_it:
 	}
 	unlock_kernel();
 out:
-	unlock_page(page);
+	if (err != WRITEPAGE_ACTIVATE)
+		unlock_page(page);
 	if (inode_referenced)
 		iput(inode);
 	return err; 
 }
 
-int
-nfs_writepages(struct address_space *mapping, struct writeback_control *wbc)
+/*
+ * Note: causes nfs_update_request() to block on the assumption
+ * 	 that the writeback is generated due to memory pressure.
+ */
+int nfs_writepages(struct address_space *mapping, struct writeback_control *wbc)
 {
+	struct backing_dev_info *bdi = mapping->backing_dev_info;
 	struct inode *inode = mapping->host;
-	int is_sync = !wbc->nonblocking;
 	int err;
 
 	err = generic_writepages(mapping, wbc);
 	if (err)
-		goto out;
+		return err;
+	while (test_and_set_bit(BDI_write_congested, &bdi->state) != 0) {
+		if (wbc->nonblocking)
+			return 0;
+		nfs_wait_on_write_congestion(mapping, 0);
+	}
 	err = nfs_flush_inode(inode, 0, 0, 0);
 	if (err < 0)
 		goto out;
-	if (wbc->sync_mode == WB_SYNC_HOLD)
-		goto out;
-	if (is_sync && wbc->sync_mode == WB_SYNC_ALL) {
-		err = nfs_wb_all(inode);
-	} else
-		nfs_commit_inode(inode, 0, 0, 0);
+	wbc->nr_to_write -= err;
+	if (!wbc->nonblocking && wbc->sync_mode == WB_SYNC_ALL) {
+		err = nfs_wait_on_requests(inode, 0, 0);
+		if (err < 0)
+			goto out;
+	}
+	err = nfs_commit_inode(inode, 0, 0, 0);
+	if (err > 0)
+		wbc->nr_to_write -= err;
 out:
+	clear_bit(BDI_write_congested, &bdi->state);
+	wake_up_all(&nfs_write_congestion);
 	return err;
 }
 
@@ -544,6 +564,38 @@ nfs_scan_commit(struct inode *inode, str
 }
 #endif
 
+static int nfs_wait_on_write_congestion(struct address_space *mapping, int intr)
+{
+	struct backing_dev_info *bdi = mapping->backing_dev_info;
+	DEFINE_WAIT(wait);
+	int ret = 0;
+
+	might_sleep();
+
+	if (!bdi_write_congested(bdi))
+		return 0;
+	if (intr) {
+		struct rpc_clnt *clnt = NFS_CLIENT(mapping->host);
+		sigset_t oldset;
+
+		rpc_clnt_sigmask(clnt, &oldset);
+		prepare_to_wait(&nfs_write_congestion, &wait, TASK_INTERRUPTIBLE);
+		if (bdi_write_congested(bdi)) {
+			if (signalled())
+				ret = -ERESTARTSYS;
+			else
+				schedule();
+		}
+		rpc_clnt_sigunmask(clnt, &oldset);
+	} else {
+		prepare_to_wait(&nfs_write_congestion, &wait, TASK_UNINTERRUPTIBLE);
+		if (bdi_write_congested(bdi))
+			schedule();
+	}
+	finish_wait(&nfs_write_congestion, &wait);
+	return ret;
+}
+
 
 /*
  * Try to update any existing write request, or create one if there is none.
@@ -556,11 +608,14 @@ static struct nfs_page *
 nfs_update_request(struct file* file, struct inode *inode, struct page *page,
 		   unsigned int offset, unsigned int bytes)
 {
+	struct nfs_server *server = NFS_SERVER(inode);
 	struct nfs_page		*req, *new = NULL;
 	unsigned long		rqend, end;
 
 	end = offset + bytes;
 
+	if (nfs_wait_on_write_congestion(page->mapping, server->flags & NFS_MOUNT_INTR))
+		return ERR_PTR(-ERESTARTSYS);
 	for (;;) {
 		/* Loop over all inode entries and see if we find
 		 * A request for the page we wish to update

_
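
For readers who want to experiment with the throttling idea outside the kernel, here is a rough userspace sketch of the gate that nfs_writepages() and nfs_wait_on_write_congestion() form in this patch. It is an illustration only, not kernel code. The names struct congestion_gate, gate_init(), gate_test_and_set(), gate_wait(), gate_clear() and flush_model() are hypothetical and do not appear in the patch. A pthread mutex and condition variable are assumed as stand-ins for the BDI_write_congested bit and the nfs_write_congestion wait queue. One deliberate difference: gate_wait() loops until the flag is clear, whereas the kernel helper calls schedule() at most once and leaves any retry to its caller.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the per-mount congestion state. */
struct congestion_gate {
	pthread_mutex_t lock;
	pthread_cond_t waiters;	/* plays the role of nfs_write_congestion */
	bool congested;		/* plays the role of BDI_write_congested */
};

static void gate_init(struct congestion_gate *g)
{
	pthread_mutex_init(&g->lock, NULL);
	pthread_cond_init(&g->waiters, NULL);
	g->congested = false;
}

/* Set the flag and return its previous value, like test_and_set_bit(). */
static bool gate_test_and_set(struct congestion_gate *g)
{
	bool was_set;

	pthread_mutex_lock(&g->lock);
	was_set = g->congested;
	g->congested = true;
	pthread_mutex_unlock(&g->lock);
	return was_set;
}

/* Sleep until the flag is clear (loops, unlike the kernel helper). */
static void gate_wait(struct congestion_gate *g)
{
	pthread_mutex_lock(&g->lock);
	while (g->congested)
		pthread_cond_wait(&g->waiters, &g->lock);
	pthread_mutex_unlock(&g->lock);
}

/* Clear the flag and wake every waiter, like clear_bit() + wake_up_all(). */
static void gate_clear(struct congestion_gate *g)
{
	pthread_mutex_lock(&g->lock);
	assert(g->congested);	/* only the current flusher may clear */
	g->congested = false;
	pthread_cond_broadcast(&g->waiters);
	pthread_mutex_unlock(&g->lock);
}

/*
 * Models the shape of nfs_writepages(): returns 0 if a nonblocking caller
 * backed off because another flusher holds the gate, 1 if this caller held
 * the gate and "flushed".
 */
static int flush_model(struct congestion_gate *g, bool nonblocking)
{
	while (gate_test_and_set(g)) {
		if (nonblocking)
			return 0;
		gate_wait(g);
	}
	/* ... the real code flushes and commits the inode here ... */
	gate_clear(g);
	return 1;
}
```

In this model an ordinary writer would call gate_wait() before queueing new data, which corresponds to the nfs_wait_on_write_congestion() call added at the top of nfs_update_request().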