From: "Darrick J. Wong" I was running a random system call generator against mainline the other day and got this bug report about AIO in dmesg: kernel BUG at fs/aio.c:1249! Each ioctx structure has a "users" field that acts as a reference counter for the ioctx, and a "dead" flag that seems to indicate that the ioctx isn't associated with any particular list of IO requests. The problem, then, lies in aio.c:1247. The io_destroy function checks the (old) value of the dead flag--if it's false (i.e. the ioctx is alive), then the function calls put_ioctx to decrease the reference count on the assumption that the ioctx is no longer associated with any requests. Later, it calls put_ioctx again, on the assumption that someone called lookup_ioctx to perform some operation at some point. This BUG is caused by the reference counts being off. The testcase that I provided looks for a chunk of user memory that's read-only and passes that to the sys_io_setup syscall. sys_io_setup checks that the pointer is readable, creates the ioctx and then tries to write the ioctx handle back to userland. This is where the problems start to surface. Since the pointer points to a non-writable region of memory, the write fails. The syscall handler then destroys the ioctx. The dead flag is zero, so io_destroy calls put_ioctx...but wait! Nobody ever put the ioctx into a request list. The ioctx is alive but not in a list, yet the io_destroy code assumes that being alive implies being in a request list somewhere. Hence, calling put_ioctx is bogus; the reference count becomes 0, and the ioctx is freed. Worse yet, put_ioctx is called again (on a freed pointer!) to clear up the lookup_ioctx that never happened. put_ioctx sees that the reference count has become negative and BUGs. The patch that I've provided calls aio_cancel_all before calling io_destroy in this failure case. aio_cancel_all sets ioctx->dead = 1 and cancels all requests (there shouldn't be any in this case) in progress. Since the dead flag is 1, io_destroy calls put_ioctx once to zero the reference count and free the ioctx, and thus the BUG condition doesn't get triggered. The userland program receives an error code instead of a segfault. Signed-off-by: Darrick Wong Signed-off-by: Andrew Morton --- 25-akpm/fs/aio.c | 1 + 1 files changed, 1 insertion(+) diff -puN fs/aio.c~bug-in-io_destroy-fs-aioc1248 fs/aio.c --- 25/fs/aio.c~bug-in-io_destroy-fs-aioc1248 2005-01-22 18:58:06.591523592 -0800 +++ 25-akpm/fs/aio.c 2005-01-22 18:58:06.597522680 -0800 @@ -1285,6 +1285,7 @@ asmlinkage long sys_io_setup(unsigned nr if (!ret) return 0; + aio_cancel_all(ioctx); io_destroy(ioctx); } _