From: Nick Piggin

This patch makes some minor tweaks to the scheduler.  Well tested.  Makes
a bit of difference.  Especially lower write_expire, which seems to be
good for gcc in contest loads.

Lowered antic_expire a bit.  This should help all regressions.  It _may_
be a little too low for old computers whose processes have high thinktimes
and disks have high seek times.  We really want the driver to be able to
pass up a guesstimate of the average seek time of its device.

Anyway.

Nick

Here are some results.  I'll do some for deadline in a sec.

Cat kernel tree during seq read
mm	15.07
np	14.50

Cat kernel tree during seq write
mm	19.13
np	16.33

ls -l kernel tree during seq read
mm	9.10
np	9.39

ls -l kernel tree during seq write
mm	5.45
np	5.49

Contest - bottom figure is +np

no_load:
Kernel         [runs]   Time    CPU%    Loads   LCPU%   Ratio
2.5.65-mm4          1      68   97.1      0.0     0.0    1.00
2.5.65-mm4          1      69   95.7      0.0     0.0    1.00

io_load:
Kernel         [runs]   Time    CPU%    Loads   LCPU%   Ratio
2.5.65-mm4          1     108   63.9     89.8    23.1    1.59
2.5.65-mm4          1     100   70.0     81.2    22.0    1.45

read_load:
Kernel         [runs]   Time    CPU%    Loads   LCPU%   Ratio
2.5.65-mm4          1     106   67.0     15.9     7.5    1.56
2.5.65-mm4          1     100   72.0     15.6     8.0    1.45

list_load:
Kernel         [runs]   Time    CPU%    Loads   LCPU%   Ratio
2.5.65-mm4          1      97   70.1      7.0    20.6    1.43
2.5.65-mm4          1      94   72.3      6.0    19.1    1.36

tiobench was nothing special.  Even with large numbers of seq writers, the
lower write_expire hasn't made a noticeable impact.
OraSim (np should average slightly better)
mm - 138, 136
np - 129, 140

nickbench - mm

Bench 2 - 2 threads, streaming reader & writer
	IO Rate: 40.63 MB/s
	Reads per write: 1.80

Bench 3 - 2 threads, streaming readers
	IO Rate: 47.08 MB/s
	Reads per read: 1.15

Bench 4 - 2 threads, streaming writers
	IO Rate: 39.98 MB/s
	Writes per write: 1.82

Bench 5 - 1 thread, read then write each block of 1 file
	Read then write: 22.64 MB/s
	CPU time per byte: 5130.859375 us/B

Bench 6 - 4 threads, streaming readers
	IO Rate: 45.01 MB/s
	Greatest reads per read unfairness between 4 readers: 0.97

nickbench - np

Bench 2 - 2 threads, streaming reader & writer
	IO Rate: 42.19 MB/s
	Reads per write: 2.08

Bench 3 - 2 threads, streaming readers
	IO Rate: 45.56 MB/s
	Reads per read: 1.00

Bench 4 - 2 threads, streaming writers
	IO Rate: 40.64 MB/s
	Writes per write: 1.68

Bench 5 - 1 thread, read then write each block of 1 file
	Read then write: 22.65 MB/s
	CPU time per byte: 5238.281250 us/B

Bench 6 - 4 threads, streaming readers
	IO Rate: 44.89 MB/s
	Greatest reads per read unfairness between 4 readers: 0.95


 drivers/block/as-iosched.c |   19 +++++++++----------
 1 files changed, 9 insertions(+), 10 deletions(-)

diff -puN drivers/block/as-iosched.c~as-minor-tweaks drivers/block/as-iosched.c
--- 25/drivers/block/as-iosched.c~as-minor-tweaks	2003-03-24 21:28:31.000000000 -0800
+++ 25-akpm/drivers/block/as-iosched.c	2003-03-24 21:28:31.000000000 -0800
@@ -64,7 +64,7 @@ static unsigned long read_expire = HZ /
  * ditto for writes, these limits are not hard, even
  * if the disk is capable of satisfying them.
  */
-static unsigned long write_expire = HZ / 2;
+static unsigned long write_expire = HZ / 5;
 
 /*
  * read_batch_expire describes how long we will allow a stream of reads to
@@ -79,9 +79,9 @@ static unsigned long read_batch_expire =
 static unsigned long write_batch_expire = HZ / 20;
 
 /*
- * max time we may wait to anticipate a read
+ * max time we may wait to anticipate a read (default around 6ms)
  */
-static unsigned long antic_expire = HZ / 100;
+static unsigned long antic_expire = ((HZ / 150) ? HZ / 150 : 1);
 
 /*
  * This is the per-process anticipatory I/O scheduler state. It is refcounted
@@ -170,7 +170,7 @@ enum anticipation_states {
 	ANTIC_WAIT_NEXT,	/* Currently anticipating a request vs
 				   last read (which has completed) */
 	ANTIC_FINISHED,		/* Anticipating but have found a candidate
-				   or timed out */
+				 * or timed out */
 };
 
 /*
@@ -633,9 +633,8 @@ static void as_antic_waitnext(struct as_
 		&& ad->antic_status != ANTIC_WAIT_REQ);
 
 	timeout = ad->antic_start + ad->antic_expire;
-#if 0	/* TODO unif me. This should be fixed. */
 	timeout = min(timeout, ad->current_batch_expires);
-#endif
+
 	mod_timer(&ad->antic_timer, timeout);
 
 	ad->antic_status = ANTIC_WAIT_NEXT;
@@ -710,7 +709,7 @@ static int as_close_req(struct as_data *
 
 	if (delay <= 1)
 		delta = 32;
-	else if (delay <= 20 && delay <= ad->antic_expire / 2)
+	else if (delay <= 20 && delay <= ad->antic_expire)
 		delta = 32 << (delay-1);
 	else
 		return 1;
@@ -771,7 +770,7 @@ static int as_can_break_anticipation(str
 		return 1;
 	}
 
-	if (aic && aic->ttime_mean > max(HZ/200, 1)) {
+	if (aic && aic->ttime_mean > ad->antic_expire) {
 		ant_stats.big_thinktime++;
 		return 1;
 	}
@@ -1147,7 +1146,7 @@ static int as_dispatch_request(struct as
 		ant_stats.expired_write_batches++;
 	}
 
-	if (!(reads && writes && as_batch_expired(ad))) {
+	if (!(reads && writes && as_batch_expired(ad)) ) {
 		/*
 		 * batch is still running or no reads or no writes
 		 */
@@ -1194,7 +1193,7 @@ static int as_dispatch_request(struct as
 	}
 
 	/*
-	 * there are either no reads or the last batch was a read
+	 * the last batch was a read
 	 */
 	if (writes) {
_