From: Rik Van Riel It turns out that the original swap token implementation, by Song Jiang, only enforced the swap token while the task holding the token is handling a page fault. This patch approximates that, without adding an additional flag to the mm_struct, by checking whether the mm->mmap_sem is held for reading, like the page fault code does. This patch has the effect of automatically, and gradually, disabling the enforcement of the swap token when there is little or no paging going on, and "turning up" the intensity of the swap token code the more the task holding the token is thrashing. Thanks to Song Jiang for pointing out this aspect of the token based thrashing control concept. The new code shows a slight degradation over the old swap token code, but still a big win over running without the swap token. 2.6.12+ swap token disabled $ for i in `seq 10` ; do /usr/bin/time ./qsbench -n 30000000 -p 3 ; done 101.74user 23.13system 8:26.91elapsed 24%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (38597major+430315minor)pagefaults 0swaps 101.98user 24.91system 8:03.06elapsed 26%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (33939major+430457minor)pagefaults 0swaps 101.93user 22.12system 7:34.90elapsed 27%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (33166major+421267minor)pagefaults 0swaps 101.82user 22.38system 8:31.40elapsed 24%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (39338major+433262minor)pagefaults 0swaps 2.6.12+ swap token enabled, timeout 300 seconds $ for i in `seq 4` ; do /usr/bin/time ./qsbench -n 30000000 -p 3 ; done 102.58user 16.08system 3:41.44elapsed 53%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (19707major+285786minor)pagefaults 0swaps 102.07user 19.56system 4:00.64elapsed 50%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (19012major+299259minor)pagefaults 0swaps 102.64user 18.25system 4:07.31elapsed 48%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (21990major+304831minor)pagefaults 0swaps 101.39user 19.41system 5:15.81elapsed 38%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (24850major+323321minor)pagefaults 0swaps 2.6.12+ with new swap token code, timeout 300 seconds $ for i in `seq 4` ; do /usr/bin/time ./qsbench -n 30000000 -p 3 ; done 101.87user 24.66system 5:53.20elapsed 35%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (26848major+363497minor)pagefaults 0swaps 102.83user 19.95system 4:17.25elapsed 47%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (19946major+305722minor)pagefaults 0swaps 102.09user 19.46system 5:12.57elapsed 38%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (25461major+334994minor)pagefaults 0swaps 101.67user 20.61system 4:52.97elapsed 41%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (22190major+329508minor)pagefaults 0swaps Signed-off-by: Rik Van Riel Signed-off-by: Andrew Morton --- mm/rmap.c | 6 +++++- mm/thrash.c | 2 +- 2 files changed, 6 insertions(+), 2 deletions(-) diff -puN mm/rmap.c~swaptoken-tuning mm/rmap.c --- 25/mm/rmap.c~swaptoken-tuning Tue Jun 28 15:44:30 2005 +++ 25-akpm/mm/rmap.c Tue Jun 28 15:44:30 2005 @@ -301,7 +301,11 @@ static int page_referenced_one(struct pa if (ptep_clear_flush_young(vma, address, pte)) referenced++; - if (mm != current->mm && !ignore_token && has_swap_token(mm)) + /* Pretend the page is referenced if the task has the + swap token and is in the middle of a page fault. */ + if (mm != current->mm && !ignore_token && + has_swap_token(mm) && + sem_is_read_locked(&mm->mmap_sem)) referenced++; (*mapcount)--; diff -puN mm/thrash.c~swaptoken-tuning mm/thrash.c --- 25/mm/thrash.c~swaptoken-tuning Tue Jun 28 15:44:30 2005 +++ 25-akpm/mm/thrash.c Tue Jun 28 15:44:30 2005 @@ -19,7 +19,7 @@ static unsigned long swap_token_check; struct mm_struct * swap_token_mm = &init_mm; #define SWAP_TOKEN_CHECK_INTERVAL (HZ * 2) -#define SWAP_TOKEN_TIMEOUT 0 +#define SWAP_TOKEN_TIMEOUT 300 /* * Currently disabled; Needs further code to work at HZ * 300. */ _