kernel/git/torvalds/linux.git - Linux kernel source tree

Age	Commit message (Collapse)	Author	Files	Lines
2006-01-09	powerpc: unbreak iSeries compilation again	Paul Mackerras	2	-1/+5
	We don't set CONFIG_PPC_MULTIPLATFORM on iSeries (yet). Avoid compiling in the prom_init stuff on iSeries. Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] powerpc: Don't use KERNELBASE in add_memory()	Michael Ellerman	1	-1/+1
	In add_memory() we should be using __va() to get a virtual address. Spotted by Mike Kravetz. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	powerpc: set CONFIG_PPC_OF=y always for ARCH=powerpc	Paul Mackerras	3	-7/+5
	The CONFIG_PPC_OF symbol is used to mean that the firmware device tree access functions are available. Since we always have a device tree with ARCH=powerpc, make CONFIG_PPC_OF always Y for ARCH=powerpc. This fixes some compile errors reported by Kumar Gala, but in a different way to his patch. This also makes prom_parse.o be compiled only if CONFIG_PPC_OF so that non-OF ARCH=ppc platforms will compile. Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] powerpc: DABR exceptions should report the address not the PC	Anton Blanchard	1	-3/+4
	When taking a DABR exception we were reporting the PC. It makes more sense to report the address that caused the exception, and the gdb guys would like it that way. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] ppc64: POWER5+ oprofile support	Anton Blanchard	1	-2/+2
	POWER5+ adds new PMU groups and as such needs to be treated differently by oprofile userspace. Change it to report itself as power5+. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] ppc64: Fix oprofile when compiled as a module	Anton Blanchard	3	-82/+59
	My recent changes to oprofile broke it when built as a module. Fix it by using an enum instead of a function pointer. This way we still retain the oprofile configuration in the cputable. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] 4/5 powerpc: Add cpufreq support for all desktop G5	Benjamin Herrenschmidt	1	-47/+449
	This patch adds cpufreq support for all desktop "tower" G5 models. The only G5 models still lacking cpufreq support at this point are the Xserve and possibly the new iMac iSight (not tested). I'll have those added soon. That patch uses the new platform functions interpreter to implement frequency and voltage switching on most models. Note that in order to find the low frequency value, I had to hack something that might now work properly on all models, so if the frequency value reported when running low speed looks bogus to you, please report it to me. (Appart from a bogus reported value, things should work fine). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] 3/5 powerpc: Add platform functions interpreter	Benjamin Herrenschmidt	10	-42/+2028
	This is the platform function interpreter itself along with the backends for UniN/U3/U4, mac-io, GPIOs and i2c. It adds the ability to execute those do-platform-* scripts in the device-tree (at least for most devices for which a backend is provided). This should replace the clock spreading hacks properly. It might also have an impact on all sort of machines since some of the scripts marked "at init" will now be executed on boot (or some other on sleep/wakeup), those will possibly do things that the kernel didn't do at all, like setting some values into some i2c devices (changing thermal sensor calibration or conversion rate) etc... Thus regression testing is MUCH welcome. Also loook for errors in dmesg. That's also why I've left rather verbose debugging enabled in this version of the patch. (I do expect some Windtunnel G4s to show some errors as they have an i2c clock chip on the PMU bus that uses some primitives that the i2c backend doesn't implement yet. I really need users that have one of those machine to come back to me so we can get that done right, though the errors themselves should be harmless, I suspect the machine might not run at full speed). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] 2/5 powerpc: Rework PowerMac i2c part 2	Benjamin Herrenschmidt	12	-1364/+563
	This is the continuation of the previous patch. This one removes the old PowerMac i2c drivers (i2c-keywest and i2c-pmac-smu) and replaces them both with a single stub driver that uses the new PowerMac low i2c layer. Now that i2c-keywest is gone, the low-i2c code is extended to support interrupt driver transfers. All i2c busses now appear as platform devices. Compatibility with existing drivers should be maintained as the i2c bus names have been kept identical, except for the SMU bus but in that later case, all users has been fixed. With that patch added, matching a device node to an i2c_adapter becomes trivial. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] 1/5 powerpc: Rework PowerMac i2c part 1	Benjamin Herrenschmidt	11	-667/+859
	This is the first part of a rework of the PowerMac i2c code. It completely reworks the "low_i2c" layer. It is now more flexible, supports KeyWest, SMU and PMU i2c busses, and provides functions to match device nodes to i2c busses and adapters. This patch also extends & fix some bugs in the SMU driver related to i2c support and removes the clock spreading hacks from the pmac feature code rather than adapting them to the new API since they'll be replaced by the platform function code completely in patch 3/5 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] ppc64: fix time syscall	Anton Blanchard	2	-29/+1
	ppc64 has its own version of sys_time. It looks pretty scary, touching a whole bunch of variables without any locking or memory ordering. In fact, a recent bugreport has shown it can actually go backwards. Time to remove it and just use the generic sys_time, which is implemented on top of do_gettimeofday. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] ppc32: Put cache flush routines back into .relocate_code section	Paul Janzen	1	-2/+4
	In 2.6.14, we had the following definition of _GLOBAL() in include/asm-ppc/processor.h: #define _GLOBAL(n)\ .stabs __stringify(n:F-1),N_FUN,0,0,n;\ .globl n;\ n: In 2.6.15, as part of the great powerpc merge, we moved this definition to include/asm-powerpc/ppc_asm.h, where it appears (to 32-bit code) as: #define _GLOBAL(n) \ .text; \ .stabs __stringify(n:F-1),N_FUN,0,0,n;\ .globl n; \ n: Mostly, this is fine. However, we also have the following, in arch/ppc/boot/common/util.S: .section ".relocate_code","xa" [...] _GLOBAL(flush_instruction_cache) [...] _GLOBAL(flush_data_cache) [...] The addition of the .text section definition in the definition of _GLOBAL overrides the .relocate_code section definition. As a result, these two functions don't end up in .relocate_code, so they don't get relocated correctly, and the boot fails. There's another suspicious-looking usage at kernel/swsusp.S:37 that someone should look into. I did not exhaustively search the source tree, though. The following is the minimal patch that fixes the immediate problem. I could easily be convinced that the _GLOBAL definition should be modified to remove the ".text;" line either instead of, or in addition to, this fix. Signed-off-by: Paul Janzen <pcj@linux.sez.to> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] spufs: set irq affinity for running threads	Arnd Bergmann	5	-16/+41
	For far, all SPU triggered interrupts always end up on the first SMT thread, which is a bad solution. This patch implements setting the affinity to the CPU that was running last when entering execution on an SPU. This should result in a significant reduction in IPI calls and better cache locality for SPE thread specific data. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] spufs: fix allocation on 64k pages	Arnd Bergmann	1	-3/+1
	The size of the local store is architecture defined and independent from the page size, so it should not be defined in terms of pages in the first place. This mistake broke a few places when building for 64kb pages. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] spufs: fix sparse warnings	Arnd Bergmann	4	-6/+5
	One local variable is missing an __iomem modifier, in another place, we pass a completely unused argument with a missing __user modifier. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] spufs: abstract priv1 register access.	Arnd Bergmann	6	-135/+233
	In a hypervisor based setup, direct access to the first priviledged register space can typically not be allowed to the kernel and has to be implemented through hypervisor calls. As suggested by Masato Noguchi, let's abstract the register access trough a number of function calls. Since there is currently no public specification of actual hypervisor calls to implement this, I only provide a place that makes it easier to hook into. Cc: Masato Noguchi <Masato.Noguchi@jp.sony.com> Cc: Geoff Levand <geoff.levand@am.sony.com> Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] spufs: move spu_run call to its own file	Arnd Bergmann	4	-153/+160
	The logic for sys_spu_run keeps growing and it does not really belong into file.c any more since we moved away from using regular file operations to our own syscall. No functional change in here. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] spufs: clean up use of bitops	Arnd Bergmann	5	-17/+14
	checking bits manually might not be synchonized with the use of set_bit/clear_bit. Make sure we always use the correct bitops by removing the unnecessary identifiers. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] spufs: fix spufs_fill_dir error path	Arnd Bergmann	1	-35/+38
	If creating one entry failed in spufs_fill_dir, we never cleaned up the freshly created entries. Fix this by calling the cleanup function on error. Noticed by Al Viro. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] spufs: dont leak directories in failed spu_create	Arnd Bergmann	1	-17/+37
	If get_unused_fd failed in sys_spu_create, we never cleaned up the created directory. Fix that by restructuring the error path. Noticed by Al Viro. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] spufs fix spu_acquire_runnable error path	Arnd Bergmann	1	-2/+2
	When spu_activate fails in spu_acquire_runnable, the state must still be SPU_STATE_SAVED, we were incorrectly setting it to SPU_STATE_RUNNABLE. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] spufs: serialize sys_spu_run per spu	Arnd Bergmann	3	-5/+12
	During an earlier cleanup, we lost the serialization of multiple spu_run calls performed on the same spu_context. In order to get this back, introduce a mutex in the spu_context that is held inside of spu_run. Noticed by Al Viro. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] spufs: check for proper file pointer in sys_spu_run	Arnd Bergmann	3	-3/+5
	Only checking for SPUFS_MAGIC is not reliable, because it might not be unique in theory. Worse than that, we accidentally allow spu_run to be performed on any file in spufs, not just those returned from spu_create as intended. Noticed by Al Viro. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] spufs: dont hold root->isem in spu_forget	Arnd Bergmann	1	-10/+10
	spu_forget will do mmput on the DMA address space, which can lead to lots of other stuff getting triggered. We better not hold a semaphore here that we might need in the process. Noticed by Al Viro. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] spufs: fix locking in spu_acquire_runnable	Arnd Bergmann	1	-4/+6
	We need to check for validity of owner under down_write, down_read is not enough. Noticed by Al Viro. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] cell: enable pause(0) in cpu_idle	Arnd Bergmann	12	-14/+335
	This patch enables support for pause(0) power management state for the Cell Broadband Processor, which is import for power efficient operation. The pervasive infrastructure will in the future enable us to introduce more functionality specific to the Cell's pervasive unit. From: Maximino Aguilar <maguilar@us.ibm.com> Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] powerpc: fixing compile issue with !CONFIG_PCI in legacy_serial.c	Kumar Gala	1	-1/+11
	Only build in support for ISA and PCI cases if we have enabled CONFIG_ISA and CONFIG_PCI. Additionally, isa_bridge is a global so we shouldn't use it a parameter name since it gets redefined to NULL when !CONFIG_PCI. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] ppc32: cpm_uart: fix xchar sending	Aristeu Sergio Rozanski Filho	1	-1/+1
	while using SCC as uart and as serial console at same time I got this: [ 138.214258] Oops: kernel access of bad area, sig: 11 [#1] [ 138.218832] PREEMPT [ 138.221021] NIP: C0105C48 LR: C0105E60 SP: C03D5D10 REGS: c03d5c60 TRAP: 0300 Not tainted [ 138.229280] MSR: 00009032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11 [ 138.234713] DAR: 00000000, DSISR: C0000000 [ 138.238745] TASK = c0349420[693] 'sh' THREAD: c03d4000 [ 138.243754] Last syscall: 6 [ 138.246402] GPR00: FEFFFFFF C03D5D10 C0349420 C01FB094 00000011 00000000 C1ECFBBC C01F24B0 [ 138.254602] GPR08: FF002820 00000000 FF0028C0 00000000 19133615 A0CBCD5E 02000300 00000000 [ 138.262804] GPR16: 00000000 01FF9E4C 00000000 7FA9A770 00000000 00000000 1003E2A8 00000000 [ 138.271003] GPR24: 100562F4 7F9B6EF4 C0210000 C02A5338 C01FB094 00000000 C01FB094 C1F14574 [ 138.279376] NIP [c0105c48] cpm_uart_tx_pump+0x4c/0x22c [ 138.284419] LR [c0105e60] cpm_uart_start_tx+0x38/0xb0 [ 138.289361] Call trace: [ 138.291762] [c0105e60] cpm_uart_start_tx+0x38/0xb0 [ 138.296547] [c010277c] uart_send_xchar+0x88/0x118 [ 138.301244] [c01029a0] uart_unthrottle+0x6c/0x138 [ 138.305942] [c00ece10] check_unthrottle+0x60/0x64 [ 138.310641] [c00ecec4] reset_buffer_flags+0xb0/0x138 [ 138.315595] [c00ecf64] n_tty_flush_buffer+0x18/0x78 [ 138.320465] [c00e81b0] tty_ldisc_flush+0x64/0x7c [ 138.325078] [c010410c] uart_close+0xf0/0x2c8 [ 138.329348] [c00e9c48] release_dev+0x724/0x8d4 [ 138.333790] [c00e9e18] tty_release+0x20/0x3c [ 138.338061] [c006e544] __fput+0x178/0x1e0 [ 138.342076] [c006c43c] filp_close+0x54/0xac [ 138.346261] [c0002d90] ret_from_syscall+0x0/0x44 [ 138.352386] note: sh[693] exited with preempt_count 2 a easy way to reproduce it is log into the system using ssh and do: cat >/dev/ttyCPM0 then, switch to minicom and write some stuff on it back to ssh, a control C produce the oops this happens because uart_close calls uart_shutdown which frees xmit.buf, currently used by xchar sending in cpm_uart_tx_pump(), which seems wrong. the attached patch fixes the oops and also fixes xchar sending. Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] Small fix in eeh definitions when CONFIG_EEH not enabled	Haren Myneni	1	-0/+3
	Undefined symbols (eeh_add_device_tree_early and eeh_remove_bus_device) when EEH is not enabled. This small patch will fix this. Acked-by: Linas Vepstas <linas@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] powerpc: Fix search for the main interrupt controller	Haren Myneni	1	-6/+9
	At present, we are not looking at all interrupt controller nodes in the device tree even though the proper node was not found. This is causing the system panic. The attached patch will scan all nodes until it finds the proper interrupt controller type. Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] powerpc: Call find_legacy_serial_ports() if we enable CONFIG_SERIAL_8250	Kumar Gala	3	-4/+3
	In setup_arch and setup_system call find_legacy_serial_ports() if we build in support for 8250 serial ports instead of basing it on PPC_MULTIPLATFORM. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] powerpc: added a udbg_progress	Kumar Gala	3	-7/+8
	Added a common udbg_progress for use by ppc_md.progress() Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] powerpc: Add the ability to handle SOC ports in legacy_serial	Kumar Gala	1	-10/+52
	Add the ability to configure and initialize legacy 8250 serials ports on an SOC bus. Also, fixed an issue that we would not configure any serial ports if "linux,stdout-path" was not found. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] powerpc: Loosen udbg_probe_uart_speed sanity checking	Kumar Gala	1	-1/+1
	The checking of the baudrate in udbg_probe_uart_speed was too tight and would cause reporting back of the default baud rate in cases where the computed speed was valid. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-08	Merge branch 'for-linus' of ↵	Linus Torvalds	12	-163/+250
	git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
2006-01-09	[PATCH] powerpc: don't add memory to empty node/zone	Mike Kravetz	1	-5/+14
	The system will oops if an attempt is made to add memory to an empty node/zone. This patch prevents adding memory to an empty node. The code to dynamically add a node/zone is non-trivial. This patch is temporary and will be removed when the ability to dynamically add a node/zone is complete. Signed-off-by: Mike Kravetz <kravetz@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-09	[PATCH] powerpc: fix two build warnings	Arnd Bergmann	2	-2/+1
	Building the arch/powerpc tree currently gives me two warnings with gcc-4.0: arch/powerpc/mm/imalloc.c: In function '__im_get_area': arch/powerpc/mm/imalloc.c:225: warning: 'tmp' may be used uninitialized in this function arch/powerpc/mm/hugetlbpage.c: In function 'hugetlb_get_unmapped_area': arch/powerpc/mm/hugetlbpage.c:608: warning: unused variable 'vma' both fixes are trivial. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-08	[PATCH] Make vm86 support optional	Matt Mackall	7	-3/+37
	This adds an option to remove vm86 support under CONFIG_EMBEDDED. Saves about 5k. This version eliminates most of the #ifdefs of the previous version and instead uses function stubs in vm86.h. Also, release_vm86_irqs is moved from asm-i386/irq.h to a more appropriate home in vm86.h so that the stubs can live together. $ size vmlinux-baseline vmlinux-novm86 text data bss dec hex filename 2920821 523232 190652 3634705 377611 vmlinux-baseline 2916268 523100 190492 3629860 376324 vmlinux-novm86 Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] tiny: Configure ELF core dump support	Matt Mackall	2	-2/+8
	configurable support for ELF core dumps text data bss dec hex filename 3330172 529036 190556 4049764 3dcb64 vmlinux-baseline 3325552 528912 190556 4045020 3db8dc vmlinux-no-elf add/remove: 0/8 grow/shrink: 0/0 up/down: 0/-4424 (-4424) function old new delta fill_note 32 - -32 maydump 58 - -58 dump_seek 67 - -67 writenote 180 - -180 elf_dump_thread_status 274 - -274 fill_psinfo 308 - -308 fill_prstatus 466 - -466 elf_core_dump 3039 - -3039 Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] tiny: Make *[ug]id16 support optional	Matt Mackall	19	-65/+28
	Configurable 16-bit UID and friends support This allows turning off the legacy 16 bit UID interfaces on embedded platforms. text data bss dec hex filename 3330172 529036 190556 4049764 3dcb64 vmlinux-baseline 3328268 529040 190556 4047864 3dc3f8 vmlinux From: Adrian Bunk <bunk@stusta.de> UID16 was accidentially disabled for !EMBEDDED. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] tiny: Make x86 doublefault handling optional	Matt Mackall	3	-1/+13
	This adds configurable support for doublefault reporting on x86 add/remove: 0/3 grow/shrink: 0/1 up/down: 0/-13048 (-13048) function old new delta cpu_init 846 786 -60 doublefault_fn 188 - -188 doublefault_stack 4096 - -4096 doublefault_tss 8704 - -8704 Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] tiny: Trim non-IPX builds	Matt Mackall	1	-3/+1
	trivial: drop unused 802.3 code if we compile without IPX (originally from http://wohnheim.fh-wedel.de/~joern/software/kernel/je/25/) Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Arnaldo Carvalho de Melo <acme@conectiva.com.br> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] tiny: Uninline some fslocks.c functions	Matt Mackall	1	-4/+3
	uninline some file locking functions add/remove: 3/0 grow/shrink: 0/15 up/down: 256/-1525 (-1269) function old new delta locks_free_lock - 134 +134 posix_same_owner - 69 +69 __locks_delete_block - 53 +53 posix_locks_conflict 126 108 -18 locks_remove_posix 266 237 -29 locks_wake_up_blocks 121 87 -34 locks_block_on_timeout 83 47 -36 locks_insert_block 157 120 -37 locks_delete_block 62 23 -39 posix_unblock_lock 104 59 -45 posix_locks_deadlock 162 100 -62 locks_delete_lock 228 119 -109 sys_flock 338 217 -121 __break_lease 600 474 -126 lease_init 252 122 -130 fcntl_setlk64 793 649 -144 fcntl_setlk 793 649 -144 __posix_lock_file 1477 1026 -451 Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] tiny: Uninline some inode.c functions	Matt Mackall	1	-2/+2
	uninline a couple inode.c functions add/remove: 2/0 grow/shrink: 0/5 up/down: 256/-428 (-172) function old new delta ifind - 136 +136 ifind_fast - 120 +120 ilookup5_nowait 131 80 -51 ilookup 158 71 -87 ilookup5 171 80 -91 iget_locked 190 95 -95 iget5_locked 240 136 -104 Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] tiny: Uninline some open.c functions	Matt Mackall	1	-3/+3
	uninline some open.c functions add/remove: 3/0 grow/shrink: 0/6 up/down: 679/-1166 (-487) function old new delta do_sys_truncate - 336 +336 do_sys_ftruncate - 317 +317 __put_unused_fd - 26 +26 put_unused_fd 57 49 -8 sys_close 150 119 -31 sys_ftruncate64 260 26 -234 sys_ftruncate 272 24 -248 sys_truncate 339 25 -314 sys_truncate64 336 5 -331 Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] tiny: Add bloat-o-meter to scripts	Matt Mackall	1	-0/+58
	This is a rewrite of Andi Kleen's bloat-o-meter with sorting and reporting of gainers/decliners. Sample output: add/remove: 0/8 grow/shrink: 2/0 up/down: 88/-4424 (-4336) function old new delta __copy_to_user_ll 59 103 +44 __copy_from_user_ll 59 103 +44 fill_note 32 - -32 maydump 58 - -58 dump_seek 67 - -67 writenote 180 - -180 elf_dump_thread_status 274 - -274 fill_psinfo 308 - -308 fill_prstatus 466 - -466 elf_core_dump 3039 - -3039 The summary line says: no functions added, 8 removed two functions grew, none shrunk we gained 88 bytes and lost 4424 (or -4336 net) This work was sponsored in part by CE Linux Forum Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] vr41xx: section tags fix	Jean Delvare	1	-2/+2
	module_init functions must be tagged __init rather than __devinit; likewise, module_exit functions must be tagged __exit rather than __devexit. Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Yoichi Yuasa <yuasa@hh.iij4u.or.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cciss: avoid defining useless MAJOR_NR macro	Christoph Hellwig	2	-3/+1
	This sneaked in with one of the updates. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] drivers/isdn/hardware/eicon/os_4bri.c: correct the xdiLoadFile() ↵	Adrian Bunk	1	-1/+2
	signature It's not good if caller and callee disagree regarding the type of the arguments. In this case, this could cause problems on 64bit architectures. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Armin Schindler <armin@melware.de> Cc: Karsten Keil <kkeil@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] drivers/isdn/: add missing #includes	Adrian Bunk	3	-0/+4
	Every file should #include the headers containing the prototypes for its global functions. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Armin Schindler <armin@melware.de> Cc: Karsten Keil <kkeil@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] simplify k_getrusage()	Oleg Nesterov	1	-18/+11
	Factor out common code for different RUSAGE_xxx cases. Don't take ->sighand->siglock in RUSAGE_SELF case, suggested by Ravikiran G Thirumalai <kiran@scalex86.org>. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] i810_audio: request_irq() fix	Andrew Morton	1	-10/+8
	Request the IRQ after having set everything up. Otherwise a shared interrupt at the right time can kill the machine. Found this with David Woodhouse <dwmw2@infradead.org>'s debug-shared-irqs.patch. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] kconf: Check for eof from input stream.	Ben Collins	1	-2/+16
	Signed-off-by: Ben Collins <bcollins@ubuntu.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] fix workqueue oops during cpu offline	Nathan Lynch	1	-6/+10
	Use first_cpu(cpu_possible_map) for the single-thread workqueue case. We used to hardcode 0, but that broke on systems where !cpu_possible(0) when workqueue_struct->cpu_workqueue_struct was changed from a static array to alloc_percpu. Commit id bce61dd49d6ba7799be2de17c772e4c701558f14 ("Fix hardcoded cpu=0 in workqueue for per_cpu_ptr() calls") fixed that for Ben's funky sparc64 system, but it regressed my Power5. Offlining cpu 0 oopses upon the next call to queue_work for a single-thread workqueue, because now we try to manipulate per_cpu_ptr(wq->cpu_wq, 1), which is uninitialized. So we need to establish an unchanging "slot" for single-thread workqueues which will have a valid percpu allocation. Since alloc_percpu keys off of cpu_possible_map, which must not change after initialization, make this slot == first_cpu(cpu_possible_map). Signed-off-by: Nathan Lynch <ntl@pobox.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] drivers/block: Use ARRAY_SIZE macro	Tobias Klauser	7	-15/+14
	Use ARRAY_SIZE macro instead of sizeof(x)/sizeof(x[0]) and remove a duplicate of ARRAY_SIZE. Some trailing whitespaces are also removed. drivers/block/acsi* has been left out as it's marked BROKEN. Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch> Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] remove semicolons from save_flags()	Andrew Morton	1	-1/+1
	Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Make apm buildable without legacy pm	Dave Jones	2	-1/+5
	APM doesn't _need_ the PM_LEGACY junk, so remove it's dependancy from Kconfig, and ifdef the junk in the code. Whilst the ifdefs are ugly, when the legacy stuff gets ripped out so will the ifdefs. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] i4l: __attribute__((packed)) for the CAPI message structs	Jan Blunck	1	-44/+44
	The CAPI message structs itself should be packed and not the location of single fields in the structure. Signed-off-by: Jan Blunck <jblunck@suse.de> Acked-by: Karsten Keil <kkeil@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] afs: remove unnecessary __attribute__((packed))	Jan Blunck	1	-3/+1
	Remove the unnecessary __attribute__((packed)) since the enum itself is packed and not the location of it in the structure. Signed-off-by: Jan Blunck <jblunck@suse.de> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Eliminate __attribute__ ((packed)) warnings for gcc-4.1	Jan Blunck	15	-341/+341
	Since version 4.1 the gcc is warning about ignored attributes. This patch is using the equivalent attribute on the struct instead of on each of the structure or union members. GCC Manual: "Specifying Attributes of Types packed This attribute, attached to struct or union type definition, specifies that each member of the structure or union is placed to minimize the memory required. When attached to an enum definition, it indicates that the smallest integral type should be used. Specifying this attribute for struct and union types is equivalent to specifying the packed attribute on each of the structure or union members." Signed-off-by: Jan Blunck <jblunck@suse.de> Cc: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] parport: bring back an unused phase for ppdev ioctl	Marko Kohtala	1	-1/+3
	Earlier fix removed unused phase, but that changed the values for other phases. Since these are exposed to userspace through ppdev, it is safer not to change them. Restore the unused phase value. Signed-off-by: Marko Kohtala <marko.kohtala@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] parport_pc: arm build fix	Andrew Morton	1	-3/+3
	free_dma() isn't implemented on ARM unless HAS_DMA is set. Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Add a section about inlining to Documentation/CodingStyle	Arjan van de Ven	1	-3/+31
	Adds a bit of text to Documentation/Codingstyle to state that inlining everything "just because" is a bad idea Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] hw_random: 82801AB PCI Bridge support	Eric Van Buggenhaut	1	-0/+1
	pci.lst lists device 80862430 as another RNG # grep 80862430 /lib/discover/pci.lst 80862430 bridge i810_rng 82801AB PCI Bridge but it's not listed in rng_pci_tbl[] Cc: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] fix gcc4.1 build failure on xconfig	Dave Jones	1	-3/+3
	scripts/kconfig/qconf.h:25: error: extra qualification âConfigSettings::â on member âreadSizesâ scripts/kconfig/qconf.h:26: error: extra qualification âConfigSettings::â on member âwriteSizesâ scripts/kconfig/qconf.h:127: error: extra qualification âConfigList::â on member âupdateMenuListâ Signed-off-by: Dave Jones <davej@redhat.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] v9fs: handle kthread_create failure, minor bugfixes	Latchesar Ionkov	9	-71/+58
	- remove unnecessary -ENOMEM assignments - return correct value when buf_check_size for second time in a buffer - handle failures when create_workqueue and kthread_create are called - use kzalloc instead of kmalloc/memset 0 - v9fs_str_copy and v9fs_str_compare were buggy, were used only in one place, correct the logic and move it to the place it is used. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@ericvh.myip.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] v9fs: zero copy implementation	Latchesar Ionkov	18	-1047/+1083
	Performance enhancement reducing the number of copies in the data and stat paths. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@ericvh.myip.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-09	powerpc: Fix some #ifndef __KERNEL__ that should be #ifdef	Paul Mackerras	3	-3/+3
	Grrr.... Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-01-08	[PATCH] v9fs: fix fid management in v9fs_create	Latchesar Ionkov	1	-3/+3
	v9fs_create doesn't manage correctly the fids when it is called to create a directory.. The fid created by the create 9P call (newfid) and the one created by walking to already created file (wfidno) are not used consistently. This patch cleans up the usage of newfid and wfidno. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@ericvh.myip.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] v9fs: new multiplexer implementation	Latchesar Ionkov	16	-562/+1172
	New multiplexer implementation. Decreases the number of kernel threads required. Better handling when the user process receives a signal. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@ericvh.myip.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] v9fs: fix fd_close	Eric Van Hensbergen	1	-2/+2
	If a 9pfs server crashes, v9fs_fd_close() is called. Subsequently, in cleaning up by performing a umount() on the FS that was provided by this server v9fs_fd_close() is called again, and uses the old, freed valus of trans->priv. This patch ensures that trans->priv can be freed only once, otherwise this function bails early. Signed-off-by: Michal Ostrowski <mostrows@watson.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Split out screen_info from tty.h	Brian Gerst	5	-112/+80
	This makes it possible for boot code to use screen_info without dragging in all of tty.h. Signed-off-by: Brian Gerst <bgerst@didntduck.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] CREDITS update: Eugene Surovegin	Eugene Surovegin	1	-1/+1
	Add EMAC to my CREDITS entry. Signed-off-by: Eugene Surovegin <ebs@ebshome.net Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] SubmittingPatches: diffstat options	Randy Dunlap	1	-0/+3
	Add desired 'diffstat' options to use for kernel patches. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] UFS: inode->i_sem is not released in error path	Evgeniy Polyakov	1	-1/+3
	Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] fs/smbfs/proc.c: fix data corruption in smb_proc_setattr_unix()	Maciej W. Rozycki	1	-1/+1
	This patch fixes a data corruption in smb_proc_setattr_unix() (smb_filetype_from_mode() returns an u32, and there are only four bytes reserved for it in data. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] PTRACE_SYSEMU is only for i386 and clashes with other ptrace codes ↵	Paolo 'Blaisorblade' Giarrusso	2	-2/+3
	of other archs PTRACE_SYSEMU{,_SINGLESTEP} is actually arch specific, for now, and the current allocated number clashes with a ptrace code of frv, i.e. PTRACE_GETFDPIC. I should have submitted this much earlier, anyway we get no breakage for this. CC: Daniel Jacobowitz <dan@debian.org> Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] kernel/module.c: remove redundant spinlock in resolve_symbol()	Ashutosh Naik	1	-2/+0
	Remove the redundant spinlock in the function resolve_symbol() as we are not altering the module list, and we already hold the semaphore. Signed-off-by: Ashutosh Naik <ashutosh.naik@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] shrink struct page	Andrew Morton	1	-19/+22
	Reduce the size of the pageframe for NR_CPUS>4, CONFIG_PREEMPT back to the minimal size by unionising both ->private and ->mapping with the pagetable lock. It uses an anonymous struct and hence requires gcc-3.x. Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] aio: reorder kiocb structure elements to make sync iocb setup faster	Benjamin LaHaise	1	-5/+7
	Reorder members of the kiocb structure to make sync kiocb setup faster. By setting the elements sequentially, the write combining buffers on the CPU are able to combine the writes into a single burst, which results in fewer cache cycles being consumed, freeing them up for other code. This results in a 10-20KB/s[] increase on the bw_unix part of LMbench on my test system. The improvement varies based on what other patches are in the system, as there are a number of bottlenecks, so this number is not absolutely accurate. Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] modules: mark TAINT_FORCED_RMMOD correctly	Akinobu Mita	1	-5/+5
	Currently TAINT_FORCED_RMMOD is totally unused. Because it is marked as TAINT_FORCED_MODULE instead when user forced a module unload. This patch marks it correctly Signed-off-by: Akinobu Mita <mita@miraclelinux.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] modules: prevent overriding of symbols	Ashutosh Naik	1	-0/+39
	Ensure that an exported symbol does not already exist in the kernel or in some other module's exported symbol table. This is done by checking the symbol tables for the exported symbol at the time of loading the module. Currently this is done after the relocation of the symbol. Signed-off-by: Ashutosh Naik <ashutosh.naik@gmail.com> Signed-off-by: Anand Krishnan <anandhkrishnan@yahoo.co.in> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] sonypi: Enable ACPI events for Sony laptop hotkeys	Ben Collins	1	-0/+45
	Signed-off-by: Ben Collins <bcollins@ubuntu.com> Cc: "Brown, Len" <len.brown@intel.com> Cc: linux-acpi@vger.kernel.org Cc: Stelian Pop <stelian@popies.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Sonypi: convert to the new platform device interface	Dmitry Torokhov	1	-152/+206
	Do not use platform_device_register_simple() as it is going away, implement ->probe() and -remove() functions so manual binding and unbinding will work with this driver. Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Cc: Stelian Pop <stelian@popies.net> Cc: Mattia Dongili <malattia@linux.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] fs/proc/: function prototypes belong in header files	Adrian Bunk	4	-2/+9
	Function prototypes belong into header files. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] /dev/mem: validate mmap requests	Bjorn Helgaas	3	-51/+124
	Add a hook so architectures can validate /dev/mem mmap requests. This is analogous to validation we already perform in the read/write paths. The identity mapping scheme used on ia64 requires that each 16MB or 64MB granule be accessed with exactly one attribute (write-back or uncacheable). This avoids "attribute aliasing", which can cause a machine check. Sample problem scenario: - Machine supports VGA, so it has uncacheable (UC) MMIO at 640K-768K - efi_memmap_init() discards any write-back (WB) memory in the first granule - Application (e.g., "hwinfo") mmaps /dev/mem, offset 0 - hwinfo receives UC mapping (the default, since memmap says "no WB here") - Machine check abort (on chipsets that don't support UC access to WB memory, e.g., sx1000) In the scenario above, the only choices are - Use WB for hwinfo mmap. Can't do this because it causes attribute aliasing with the UC mapping for the VGA MMIO space. - Use UC for hwinfo mmap. Can't do this because the chipset may not support UC for that region. - Disallow the hwinfo mmap with -EINVAL. That's what this patch does. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] /dev/mem __HAVE_PHYS_MEM_ACCESS_PROT tidy-up	Bjorn Helgaas	1	-9/+14
	Tidy up __HAVE_PHYS_MEM_ACCESS_PROT usage to make mmap_mem() easier to read. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] remove gcc-2 checks	Andrew Morton	32	-194/+37
	Remove various things which were checking for gcc-1.x and gcc-2.x compilers. From: Adrian Bunk <bunk@stusta.de> Some documentation updates and removes some code paths for gcc < 3.2. Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Abandon gcc-2.95.x	Andrew Morton	3	-37/+1
	There's one scsi driver which doesn't compile due to weird __VA_ARGS__ tricks and the rather useful scsi/sd.c is currently getting an ICE. None of the new SAS code compiles, due to extensive use of anonymous unions. The V4L guys are very good at exploiting the gcc-2.95.x macro expansion bug (_why_ does each driver need to implement its own debug macros?) and various people keep on sneaking in anonymous unions, which are rather nice. Plus anonymous unions are rather useful. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] copy_process: error path cleanup	Oleg Nesterov	1	-6/+2
	This patch moves 'fork_out:' under 'bad_fork_free:', and removes now unneeded 'if (retval)' check. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] fs/udf/balloc.c: "extern inline" -> "static inline"	Adrian Bunk	1	-1/+1
	"extern inline" doesn't make much sense. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] block/stat.txt	Andy Isaacson	1	-0/+82
	I couldn't find any docs explaining the contents of /sys/block/<dev>/stat, so I wrote up the following. I'm not completely sure it's accurate - Jens, could you give a yea or nay on this? In particular, the counts of read/write IOs and read/write sectors are incremented in different places - it looks like they both increment as the request is being finished, but I'm not completely sure of that. Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] setpgid: should not accept ptraced childs	Oleg Nesterov	1	-1/+1
	sys_setpgid() allows to change ->pgrp of ptraced childs. 'man setpgid' does not tell anything about that, so I consider this behaviour is a bug. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Oren Laadan <orenl@cs.columbia.edu> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] setpgid: should work for sub-threads	Oren Laadan	2	-10/+8
	setsid() does not work unless the calling process is a thread_group_leader(). 'man setpgid' does not tell anything about that, so I consider this behaviour is a bug. Signed-off-by: Oren Laadan <orenl@cs.columbia.edu> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] setpgid: should work for sub-threads	Oleg Nesterov	1	-5/+6
	setpgid(0, pgid) or setpgid(forked_child_pid, pgid) does not work unless the calling process is a thread_group_leader(). 'man setpgid' does not tell anything about that, so I consider this behaviour is a bug. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Oren Laadan <orenl@cs.columbia.edu> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] fork: fix race in setting child's pgrp and tty	Oren Laadan	1	-6/+3
	In fork, child should recopy parent's pgrp/tty after it has tasklist_lock. Otherwise following a setpgid() on the parent, after copy_signal(), the child will own a stale pgrp (which may be reused); (eg. if copy_mm() sleeps a long while due to memory pressure). Similar issue for the tty. Signed-off-by: Oren Laadan <orenl@cs.columbia.edu> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cciss: adds MSI and MSI-X support	Mike Miller	3	-13/+84
	This creates a new function, cciss_interrupt_mode called from cciss_pci_init. This function determines what type of interrupt vector to use, i.e., MSI, MSI-X, or IO-APIC. One noticeable difference is changing the interrupt field of the controller struct to an array of 4 unsigned ints. The Smart Array HW is capable of generating 4 distinct interrupts depending on the transport method in use during operation. These are: #define DOORBELL_INT 0 Used to notify the contoller of configuration updates. We only use this feature when in polling mode. #define PERF_MODE_INT 0 Used when the controller is in Performant Mode. #define SIMPLE_MODE_INT 2 Used when the controller is in Simple Mode (current Linux implementation). #define MEMQ_INT_MODE 3 Not used. When using IO-APIC interrupts these 4 lines are OR'ed together so when any one fires an interrupt an is generated. In MSI or MSI-X mode this hardware OR'ing is ignored. We must register for our interrupt depending on what mode the controller is running. For Linux we use SIMPLE_MODE_INT exclusively at this time. Please consider this for inclusion. Signed-off-by: Mike Miller <mike.miller@hp.com> Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] tpmdd: remove global event log	Kylene Jo Hall	1	-30/+62
	Remove global event log in the tpm bios event measurement log code that would have caused problems when the code was run concurrently. A log is now allocated and attached to the seq file upon open and destroyed appropriately. Signed-off-by: Kylene Jo Hall <kjhall@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Don't attempt to power off if power off is not implemented	Eric W. Biederman	4	-0/+17
	The problem. It is expected that /sbin/halt -p works exactly like /sbin/halt, when the kernel does not implement power off functionality. The kernel can do a lot of work in the reboot notifiers and in device_shutdown before we even get to machine_power_off. Some of that shutdown is not safe if you are leaving the power on, and it definitely gets in the way of using sysrq or pressing ctrl-alt-del. Since the shutdown happens in generic code there is no way to fix this in architecture specific code :( Some machines are kernel oopsing today because of this. The simple solution is to turn LINUX_REBOOT_CMD_POWER_OFF into LINUX_REBOOT_CMD_HALT if power_off functionality is not implemented. This has the unfortunate side effect of disabling the power off functionality on architectures that leave pm_power_off to null and still implement something in machine_power_off. And it will break the build on some architectures that don't have a pm_power_off variable at all. On both counts I say tough. For architectures like alpha that don't implement the pm_power_off variable pm_power_off is declared in linux/pm.h and it is a generic part of our power management code, and all architectures should implement it. For architectures like parisc that have a default power off method in machine_power_off if pm_power_off is not implemented or fails. It is easy enough to set the pm_power_off variable. And nothing bad happens there, the machines just stop powering off. The current semantics are impossible without a flag at the top level so we can avoid the problem code if a power off is not implemented. pm_power_off is as good a flag as any with the bonus that it works without modification on at least x86, x86_64, powerpc, and ppc today. Andrew can you pick this up and put this in the mm tree. Kernels that don't compile or don't power off seem saner than kernels that oops or panic. Until we get the arch specific patches for the problem architectures this probably isn't smart to push into the stable kernel. Unfortunately I don't have the time at the moment to walk through every architecture and make them work. And even if I did I couldn't test it :( From: Hirokazu Takata <takata@linux-m32r.org> Add pm_power_off() for build fix of arch/m32r/kernel/process.c. From: Miklos Szeredi <miklos@szeredi.hu> UML build fix Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Hayato Fujiwara <fujiwara@linux-m32r.org> Signed-off-by: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] fadvise: return ESPIPE on FIFO/pipe	Valentine Barshak	1	-0/+5
	The patch makes posix_fadvise return ESPIPE on FIFO/pipe in order to be fully POSIX-compliant. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] update to the initramfs docs	Rob Landley	1	-1/+71
	Based on questions people have asked me. Repeatedly. Signed-off-by: Rob Landley <rob@landley.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Extend RCU torture module to test tickless idle CPU	Srivatsa Vaddagiri	1	-5/+91
	This patch forces RCU torture threads off various CPUs in the system allowing them to become idle and go tickless. Meant to test support for such tickless idle CPU in RCU. Signed-off-by: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Add tainting for proprietary helper modules	Dave Jones	1	-0/+5
	Kernels that have had Windows drivers loaded into them are undebuggable. I've wasted a number of hours chasing bugs filed in Fedora bugzilla only to find out much later that the user had used such 'helpers', and their problems were unreproducable without them loaded. Acked-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] remove unused blkp field in percpu_data	Eric Dumazet	1	-1/+0
	I found that blkp field was not used in kernel tree. As most of the times NR_CPUS is a power of two and kmalloc() memory blocks too, this extra field basically doubles the memory space allocated in __alloc_percpu() to store the 'struct percpu_data' (for example, if NR_CPUS=8 on i386, kmalloc(4*8+4) returns a 64 bytes block instead of a 32 bytes block after this patch) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] fs: remove s_old_blocksize from struct super_block	Pekka Enberg	2	-3/+1
	This patch inlines the single user of struct super_block field s_old_blocksize and removes the field. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Documentation: Small applying-patches.txt update	Jesper Juhl	1	-11/+18
	Minor update to Documentation/applying-patches.txt Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] drivers/mfd: header included twice	Nicolas Kaiser	1	-1/+0
	linux/delay.h included twice Signed-off-by: Nicolas Kaiser <nikai@nikai.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Fix handling of ELF segments with zero filesize	David Gibson	1	-3/+9
	mmap() returns -EINVAL if given a zero length, and thus elf_map() in binfmt_elf.c does likewise if it attempts to map a (page-aligned) ELF segment with zero filesize. Such a situation never arises with the default linker scripts, but there's nothing inherently wrong with zero-filesize (but non-zero memsize) ELF segments. Custom linker scripts can generate them, and the kernel should be able to map them; this patch makes it so. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] drivers/connector/cn_proc.c typos	David S. Miller	1	-2/+2
	The parameter to put_cpu_var() is unreferenced by the implementation, and the compiler doesn't try to comprehend comments, so this wouldn't cause any problem, but if bugged me enough to post a fix :-) Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] shrink dentry struct	Eric Dumazet	15	-48/+54
	Some long time ago, dentry struct was carefully tuned so that on 32 bits UP, sizeof(struct dentry) was exactly 128, ie a power of 2, and a multiple of memory cache lines. Then RCU was added and dentry struct enlarged by two pointers, with nice results for SMP, but not so good on UP, because breaking the above tuning (128 + 8 = 136 bytes) This patch reverts this unwanted side effect, by using an union (d_u), where d_rcu and d_child are placed so that these two fields can share their memory needs. At the time d_free() is called (and d_rcu is really used), d_child is known to be empty and not touched by the dentry freeing. Lockless lookups only access d_name, d_parent, d_lock, d_op, d_flags (so the previous content of d_child is not needed if said dentry was unhashed but still accessed by a CPU because of RCU constraints) As dentry cache easily contains millions of entries, a size reduction is worth the extra complexity of the ugly C union. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Maneesh Soni <maneesh@in.ibm.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: Ian Kent <raven@themaw.net> Cc: Paul Jackson <pj@sgi.com> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@epoch.ncsc.mil> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] nfsroot: do not silently stop parsing on an unknown option	Jorn Dreyer	1	-1/+3
	It would be helpful if the kernel did not silently stop parsing nfs options, but instead warned about any he does not recognize. The attached patch adds one printk to do just that. It took me a couple of hours to find my configuration mistake. Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] sigio: cleanup, don't take tasklist twice	Oleg Nesterov	1	-3/+3
	The only user of send_sigio_to_task() already holds tasklist_lock, so it is better not to send the signal via send_group_sig_info() (which takes tasklist recursively) but use group_send_sig_info(). The same change in send_sigurg()->send_sigurg_to_task(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] remove unneeded sig->curr_target recalculation	Oleg Nesterov	1	-2/+0
	This patch removes unneeded sig->curr_target recalculation under 'if (atomic_dec_and_test(&sig->count))' in __exit_signal(). When sig->count == 0 the signal can't be sent to this task and next_thread(tsk) == tsk anyway. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] MAINTAINERS: line duplication	Nicolas Kaiser	1	-1/+0
	uniq -d MAINTAINERS Signed-off-by: Nicolas Kaiser <nikai@nikai.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] ext3: use sbi instead of EXT3_SB() in resize code.	Glauber de Oliveira Costa	1	-5/+5
	There are places in the resize code in which EXT3_SB() macro is used after an statement like sbi = EXT3_SB(sb) is done. Inside the same function, both sbi and EXT3_SB() are used to reference the super block Altough it is not wrong, keeping it coherent increases legibility, IMHO. Signed-off-by: Glauber de Oliveira Costa <glommer@br.ibm.com> Cc: "Stephen C. Tweedie" <sct@redhat.com> Cc: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] ext3: remove trailing newlines from ext3_warning() calls	Glauber de Oliveira Costa	3	-15/+15
	Remove the trailing newlines in calls to ext3_warning(). This function already adds a trailing newline to the end of messages. Signed-off-by: Glauber de Oliveira Costa <glommer@br.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] oprofile: Use vmalloc_node() in alloc_cpu_buffers()	Eric Dumazet	1	-1/+2
	Make oprofile alloc_cpu_buffers() function NUMA aware, allocating each CPU local buffer in its memory node if possible. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Cc: Philippe Elie <phil.el@wanadoo.fr> Cc: John Levon <levon@movementarian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] ext3: external journal device as a mount option	Johann Lombardi	2	-10/+49
	The patch below adds a new mount option to allow the external journal device to be specified. The syntax is as follows: # mount -t ext3 -o journal_dev=0x0820 ... where 0x0820 means major=8 and minor=32. Signed-off-by: Johann Lombardi <johann.lombardi@bull.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] shared mounts: cleanup	Miklos Szeredi	4	-4/+5
	Small cleanups in shared mounts code. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Ram Pai <linuxram@us.ibm.com> Cc: <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] pivot_root: add comment	Neil Brown	1	-0/+4
	Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Updated CPU hotplug documentation	Ashok Raj	1	-0/+357
	Thanks to Nathan Lynch for the review and comments. Thanks to Joel Schopp for the pointer to add user space scipts. Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com> Signed-off-by: Joel Schopp <jschopp@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] tpm: add bios measurement log	Kylene Jo Hall	5	-0/+531
	According to the TCG specifications measurements or hashes of the BIOS code and data are extended into TPM PCRS and a log is kept in an ACPI table of these extensions for later validation if desired. This patch exports the values in the ACPI table through a security-fs seq_file. Signed-off-by: Seiji Munetoh <munetoh@jp.ibm.com> Signed-off-by: Stefan Berger <stefanb@us.ibm.com> Signed-off-by: Reiner Sailer <sailer@us.ibm.com> Signed-off-by: Kylene Hall <kjhall@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] little do_group_exit() cleanup	Oleg Nesterov	1	-1/+0
	zap_other_threads() sets SIGNAL_GROUP_EXIT at the very start, do_group_exit() doesn't need to do it. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] do_coredump() should reset group_stop_count earlier	Oleg Nesterov	1	-1/+1
	__group_complete_signal() sets ->group_stop_count in sig_kernel_coredump() path and marks the target thread as ->group_exit_task. So any thread except group_exit_task will go to handle_group_stop()->finish_stop(). However, when group_exit_task actually starts do_coredump(), it sets SIGNAL_GROUP_EXIT, but does not reset ->group_stop_count while killing other threads. If we have not yet stopped threads in the same thread group, they all will spin in kernel mode until group_exit_task sends them SIGKILL, because ->group_stop_count > 0 means: recalc_sigpending_tsk() never clears TIF_SIGPENDING get_signal_to_deliver() goes to handle_group_stop() handle_group_stop() returns when SIGNAL_GROUP_EXIT set syscall_exit/resume_userspace notice TIF_SIGPENDING, call get_signal_to_deliver() again. So we are wasting cpu cycles, and if one of these threads is rt_task() this may be a serious problem. NOTE: do_coredump() holds ->mmap_sem, so not stopped threads can't escape coredumping after clearing ->group_stop_count. See also this thread: http://marc.theaimsgroup.com/?t=112739139900002 Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] kill_proc_info_as_uid: don't use hardcoded constants	Oleg Nesterov	1	-2/+1
	Use symbolic names instead of hardcoded constants. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Harald Welte <laforge@gnumonks.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] fix possible PAGE_CACHE_SHIFT overflows	Andrew Morton	8	-17/+17
	We've had two instances recently of overflows when doing 64_bit_value = (32_bit_value << PAGE_CACHE_SHIFT) I did a tree-wide grep of `<<.*PAGE_CACHE_SHIFT' and this is the result. - afs_rxfs_fetch_descriptor.offset is of type off_t, which seems broken. - jfs and jffs are limited to 4GB anyway. - reiserfs map_block_for_writepage() takes an unsigned long for the block - it should take sector_t. (It'll fail for huge filesystems with blocksize<PAGE_CACHE_SIZE) - cramfs_read() needs to use sector_t (I think cramsfs is busted on large filesystems anyway) - affs is limited in file size anyway. - I generally didn't fix 32-bit overflows in directory operations. - arm's __flush_dcache_page() is peculiar. What if the page lies beyond 4G? - gss_wrap_req_priv() needs checking (snd_buf->page_base) Cc: Oleg Drokin <green@linuxhacker.ru> Cc: David Howells <dhowells@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: <reiserfs-dev@namesys.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: <linux-fsdevel@vger.kernel.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Unchecked alloc_percpu() return in __create_workqueue()	Ben Collins	1	-0/+5
	__create_workqueue() not checking return of alloc_percpu() NULL dereference was possible. Signed-off-by: Ben Collins <bcollins@ubuntu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] nbd: remove duplicate assignment	taneli.vahakangas@netsonic.fi	1	-1/+0
	<stuartm@connecttech.com> Sent by Paul Clements <paul.clements@steeleye.com>, who needs to read Documentation/SubmittingPatches.. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Add block_device_operations.getgeo block device method	Christoph Hellwig	30	-456/+340
	HDIO_GETGEO is implemented in most block drivers, and all of them have to duplicate the code to copy the structure to userspace, as well as getting the start sector. This patch moves that to common code [1] and adds a ->getgeo method to fill out the raw kernel hd_geometry structure. For many drivers this means ->ioctl can go away now. [1] the s390 block drivers are odd in this respect. xpram sets ->start to 4 always which seems more than odd, and the dasd driver shifts the start offset around, probably because of it's non-standard sector size. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@suse.de> Cc: <mike.miller@hp.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo Giarrusso <blaisorblade@yahoo.it> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] docs: updated some code docs	Xose Vazquez Perez	3	-28/+63
	Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] sigaction should clear all signals on SIG_IGN, not just < 32	George Anzinger	2	-2/+49
	While rooting aroung in the signal code trying to understand how to fix the SIG_IGN ploy (set sig handler to SIG_IGN and flood system with high speed repeating timers) I came across what, I think, is a problem in sigaction() in that when processing a SIG_IGN request it flushes signals from 1 to SIGRTMIN and leaves the rest. Attempt to fix this. Signed-off-by: George Anzinger <george@mvista.com> Cc: Roland McGrath <roland@redhat.com> Cc: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] keys: Permit running process to instantiate keys	David Howells	13	-245/+378
	Make it possible for a running process (such as gssapid) to be able to instantiate a key, as was requested by Trond Myklebust for NFS4. The patch makes the following changes: (1) A new, optional key type method has been added. This permits a key type to intercept requests at the point /sbin/request-key is about to be spawned and do something else with them - passing them over the rpc_pipefs files or netlink sockets for instance. The uninstantiated key, the authorisation key and the intended operation name are passed to the method. (2) The callout_info is no longer passed as an argument to /sbin/request-key to prevent unauthorised viewing of this data using ps or by looking in /proc/pid/cmdline. This means that the old /sbin/request-key program will not work with the patched kernel as it will expect to see an extra argument that is no longer there. A revised keyutils package will be made available tomorrow. (3) The callout_info is now attached to the authorisation key. Reading this key will retrieve the information. (4) A new field has been added to the task_struct. This holds the authorisation key currently active for a thread. Searches now look here for the caller's set of keys rather than looking for an auth key in the lowest level of the session keyring. This permits a thread to be servicing multiple requests at once and to switch between them. Note that this is per-thread, not per-process, and so is usable in multithreaded programs. The setting of this field is inherited across fork and exec. (5) A new keyctl function (KEYCTL_ASSUME_AUTHORITY) has been added that permits a thread to assume the authority to deal with an uninstantiated key. Assumption is only permitted if the authorisation key associated with the uninstantiated key is somewhere in the thread's keyrings. This function can also clear the assumption. (6) A new magic key specifier has been added to refer to the currently assumed authorisation key (KEY_SPEC_REQKEY_AUTH_KEY). (7) Instantiation will only proceed if the appropriate authorisation key is assumed first. The assumed authorisation key is discarded if instantiation is successful. (8) key_validate() is moved from the file of request_key functions to the file of permissions functions. (9) The documentation is updated. From: <Valdis.Kletnieks@vt.edu> Build fix. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Alexander Zangerl <az@bond.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] keys: Discard duplicate keys from a keyring on link	David Howells	2	-23/+68
	Cause any links within a keyring to keys that match a key to be linked into that keyring to be discarded as a link to the new key is added. The match is contingent on the type and description strings being the same. This permits requests, adds and searches to displace negative, expired, revoked and dead keys easily. After some discussion it was concluded that duplicate valid keys should probably be discarded also as they would otherwise hide the new key. Since request_key() is intended to be the primary method by which keys are added to a keyring, duplicate valid keys wouldn't be an issue there as that function would return an existing match in preference to creating a new key. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Alexander Zangerl <az@bond.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] keys: Permit key expiry time to be set	David Howells	5	-1/+63
	Add a new keyctl function that allows the expiry time to be set on a key or removed from a key, provided the caller has attribute modification access. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Alexander Zangerl <az@bond.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] kmsg_write: don't return printk return value	Guillaume Chazarain	1	-1/+4
	kmsg_write returns with printk, so some programs may be confused by a successful write() with a return value different than the buffer length. # /bin/echo something > /dev/kmsg /bin/echo: write error: Inappropriate ioctl for device The drawbacks is that the printk return value can no more be quickly checked from userspace. Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] printk return value: fix it	Guillaume Chazarain	1	-3/+3
	What's the true meaning of the printk return value? Should it include the priority prefix length of 3? and what about the timing information? In both cases it was broken: strace -e write echo 1 > /dev/kmsg => write(1, "1\n", 2) = 5 strace -e write echo "<1>1" > /dev/kmsg => write(1, "<1>1\n", 5) = 8 The returned length was "length of input string + 3", I made it "length of string output to the log buffer". Note that I couldn't find any printk caller in the kernel interested by its return value besides kmsg_write. Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Acked-By: Tim Bird <tim.bird@am.sony.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Fix overflow tests for compat_sys_fcntl64 locking	NeilBrown	1	-4/+18
	When making an fctl locking call through compat_sys_fcntl64 (i.e. a 32bit app on a 64bit kernel), the syscall can return a locking range that is in conflict with the queried lock. If some aspect of this range does not fit in the 32bit structure, something needs to be done. The current code is wrong in several respects: - It returns data to userspace even if no conflict was found i.e. it should check l_type for F_UNLCK - It returns -EOVERFLOW too agressively. A lock range covering the last possible byte of the file (start = COMPAT_OFF_T_MAX, len = 1) should be possible, but is rejected with the current test. - A extra-long 'len' should not be a problem. If only that part of the conflicting lock that would be visible to the 32bit app needs to be reported to the 32bit app anyway. This patch addresses those three issues and adds a comment to (hopefully) record it for posterity. Note: this patch mainly affects test-cases. Real applications rarely is ever see the problems. This patch has been tested (LSB test suite), and works. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@debian.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Fix some problems with truncate and mtime semantics.	NeilBrown	5	-23/+17
	SUS requires that when truncating a file to the size that it currently is: truncate and ftruncate should NOT modify ctime or mtime O_TRUNC SHOULD modify ctime and mtime. Currently mtime and ctime are always modified on most local filesystems (side effect of ->truncate) or never modified (on NFS). With this patch: ATTR_CTIME\|ATTR_MTIME are sent with ATTR_SIZE precisely when an update of these times is required whether size changes or not (via a new argument to do_truncate). This allows NFS to do the right thing for O_TRUNC. inode_setattr nolonger forces ATTR_MTIME\|ATTR_CTIME when the ATTR_SIZE sets the size to it's current value. This allows local filesystems to do the right thing for f?truncate. Also, the logic in inode_setattr is changed a bit so there are two return points. One returns the error from vmtruncate if it failed, the other returns 0 (there can be no other failure). Finally, if vmtruncate succeeds, and ATTR_SIZE is the only change requested, we now fall-through and mark_inode_dirty. If a filesystem did not have a ->truncate function, then vmtruncate will have changed i_size, without marking the inode as 'dirty', and I think this is wrong. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Permit multiple inclusion of linux/pagevec.h	David Howells	1	-0/+5
	Make it possible to include linux/pagevec.h multiple times without incurring errors due to duplicate definitions. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] vgacon: Workaround for resize bug in some chipsets	Antonino A. Daplas	1	-7/+21
	Reported from Redhat Bugzilla Bug 170450 "I updated to the development kernel and now during boot only the top of the text is visable. For example the monitor screen the is the lines and I can only see text in the asterisk area.
2006-01-08	[PATCH] vgacon: fix doublescan mode	Samuel Thibault	1	-1/+7
	When doublescan mode is in use, scanlines must be doubled. Thanks to Jason Dravet <dravet@hotmail.com> for reporting and testing. Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] udf: remove bogus inode == NULL check in inode_bmap	Christoph Hellwig	1	-5/+0
	inode can never be NULL when calling this function. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] use ptrace_get_task_struct in various places	Christoph Hellwig	12	-241/+105
	The ptrace_get_task_struct() helper that I added as part of the ptrace consolidation is useful in variety of places that currently opencode it. Switch them to the common helpers. Add a ptrace_traceme() helper that needs to be explicitly called, and simplify the ptrace_get_task_struct() interface. We don't need the request argument now, and we return the task_struct directly, using ERR_PTR() for error returns. It's a bit more code in the callers, but we have two sane routines that do one thing well now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] relayfs: Documentation cleanup, remove obsolete info	Tom Zanussi	1	-23/+34
	librelay and relay-app.h have been retired - update Documentation to reflect that. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] relayfs: cleanup, change relayfs_file_* to relay_file_*	Tom Zanussi	3	-46/+50
	This patch renames relayfs_file_operations to relay_file_operations, and the file operations themselves from relayfs_XXX to relay_file_XXX, to make it more clear that they refer to relay files. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] relayfs: add Documentation on global relay buffers	Tom Zanussi	1	-1/+20
	Documentation update for creating global buffers. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] relayfs: add support for global relay buffers	Tom Zanussi	2	-11/+32
	This patch adds the optional is_global outparam to the create_buf_file() callback. This can be used by clients to create a single global relayfs buffer instead of the default per-cpu buffers. This was suggested as being useful for certain debugging applications where it's more convenient to be able to get all the data from a single channel without having to go to the bother of dealing with per-cpu files. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] relayfs: add Documentation on relay files in other filesystems	Tom Zanussi	1	-0/+27
	Documentation update for creating relay files in other filesystems. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] relayfs: add support for relay files in other filesystems	Tom Zanussi	3	-3/+63
	This patch adds a couple of callback functions that allow a client to hook into relay_open()/close() and supply the files that will be used to represent the channel buffers; the default implementation if no callbacks are defined is to create the files in relayfs. This is to support the creation and use of relay files in other filesystems such as debugfs, as implied by the fact that relayfs_file_operations are exported. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] relayfs: add Documention for non-relay files	Tom Zanussi	1	-0/+23
	Documentation update for non-relay files. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] relayfs: remove unused alloc/destroy_inode()	Tom Zanussi	2	-59/+1
	Since we're no longer using relayfs_inode_info, remove relayfs_alloc_inode() and relayfs_destroy_inode() along with the relayfs inode cache. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] relayfs: use generic_ip for private data	Tom Zanussi	1	-8/+9
	Use inode->u.generic_ip instead of relayfs_inode_info to store pointer to user data. Clients using relayfs_file_create() to create their own files would probably more expect their data to be stored in generic_ip; we also intend in the next set of patches to get rid of relayfs-specific stuff in the file operations, so we might as well do it here. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] relayfs: add relayfs_remove_file()	Tom Zanussi	2	-0/+13
	This patch adds and exports relayfs_remove_file(), for API symmetry (with relayfs_create_file()). Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] relayfs: export relayfs_create_file() with fileops param	Tom Zanussi	4	-21/+34
	This patch adds a mandatory fileops param to relayfs_create_file() and exports that function so that clients can use it to create files defined by their own set of file operations, in relayfs. The purpose is to allow relayfs applications to create their own set of 'control' files alongside their relay files in relayfs rather than having to create them in /proc or debugfs for instance. relayfs_create_file() is also used by relay_open_buf() to create the relay files for a channel. In this case, a pointer to relayfs_file_operations is passed in, along with a pointer to the buffer associated with the file. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] relayfs: decouple buffer creation from inode creation	Tom Zanussi	4	-26/+19
	The patch series implementa or fixes 3 things that were specifically requested or suggested by relayfs users: - support for non-relay files (patches 1-6) Currently, the relayfs API only supports the creation of directories (relayfs_create_dir()) and relay files (relay_open()). These patches adds support for non-relay files (relayfs_create_file()). This is so relayfs applications can create 'control files' in relayfs itself rather than in /proc or via a netlink channel, as is currently done in the relay-app examples. Basically what this amounts to is exporting relayfs_create_file() with an additional file_ops param that clients can use to supply file operations for their own special-purpose files in relayfs. - make exported relay file ops useful (patches 7-8) The relayfs relay_file_operations have always been exported, the intent being to make it possible to create relay files in other filesystems such as debugfs. The problem, though, is that currently the file operations are too tightly coupled to relayfs to actually be used for this purpose. This patch fixes that by adding a couple of callback functions that allow a client to hook into relay_open()/close() and supply the files that will be used to represent the channel buffers; the default implementation if no callbacks are defined is to create the files in relayfs. - add an option to create global relay buffer (patches 9-10) The file creation callback also supplies an optional param, is_global, that can be used by clients to create a single global relayfs buffer instead of the default per-cpu buffers. This was suggested as being useful for certain debugging applications where it's more convenient to be able to get all the data from a single channel without having to go to the bother of dealing with per-cpu files. - cleanup, some renaming and Documentation updates (patches 11-12) There were several comments that the use of netlink in the example code was non-intuitive and in fact the whole relay-app business was needlessly confusing. Based on that feedback, the example code has been completely converted over to relayfs control files as supported by this patch, and have also been made completely self-contained. The converted examples along with a couple of new examples that demonstrate using exported relay files can be found in relay-apps tarball: http://prdownloads.sourceforge.net/relayfs/relay-apps-0.9.tar.gz?download This patch: Separate buffer create/destroy from inode create/destroy. We want to be able to associate other data and not just relay buffers with inodes. Buffer create/destroy is moved out of inode.c and into relayfs core code. Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] ipc: expand shm_flags	Andrew Morton	1	-12/+10
	Unobfsucate this struct member Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] ELF: symbol table type additions	Jan Beulich	1	-0/+2
	Needed for the Novell kernel debugger and perhaps some per-cpu data on x86_64 in the future. Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] rcu file: use atomic primitives	Nick Piggin	9	-291/+48
	Use atomic_inc_not_zero for rcu files instead of special case rcuref. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] atomic: dec_and_lock use atomic primitives	Nick Piggin	1	-43/+6
	Convert atomic_dec_and_lock to use new atomic primitives. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] pktcdvd: Use bd_claim to get exclusive access	Peter Osterlund	1	-3/+9
	Use bd_claim() when opening the cdrom device to prevent user space programs such as cdrecord, hald and kded from interfering with the burning process. Signed-off-by: Peter Osterlund <petero2@telia.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] kernel/: small cleanups	Adrian Bunk	4	-2/+5
	This patch contains the following cleanups: - make needlessly global functions static - every file should include the headers containing the prototypes for it's global functions Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] drivers/isdn/: "extern inline" -> "static inline"	Adrian Bunk	3	-5/+4
	"extern inline" -> "static inline" Since there's no pullphone() function this patch removes the dead prototype. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Karsten Keil <kkeil@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] move rtc_interrupt() prototype to rtc.h	Adrian Bunk	5	-9/+10
	This patch moves the rtc_interrupt() prototype to rtc.h and removes the prototypes from C files. It also renames static rtc_interrupt() functions in arch/arm/mach-integrator/time.c and arch/sh64/kernel/time.c to avoid compile problems. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Paul Gortmaker <p_gortmaker@yahoo.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Fix and add EXPORT_SYMBOL(filemap_write_and_wait)	OGAWA Hirofumi	16	-61/+48
	This patch add EXPORT_SYMBOL(filemap_write_and_wait) and use it. See mm/filemap.c: And changes the filemap_write_and_wait() and filemap_write_and_wait_range(). Current filemap_write_and_wait() doesn't wait if filemap_fdatawrite() returns error. However, even if filemap_fdatawrite() returned an error, it may have submitted the partially data pages to the device. (e.g. in the case of -ENOSPC) <quotation> Andrew Morton writes, If filemap_fdatawrite() returns an error, this might be due to some I/O problem: dead disk, unplugged cable, etc. Given the generally crappy quality of the kernel's handling of such exceptions, there's a good chance that the filemap_fdatawait() will get stuck in D state forever. </quotation> So, this patch doesn't wait if filemap_fdatawrite() returns the -EIO. Trond, could you please review the nfs part? Especially I'm not sure, nfs must use the "filemap_fdatawrite(inode->i_mapping) == 0", or not. Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] fat: support a truncate() for expanding size (generic_cont_expand)	OGAWA Hirofumi	3	-18/+76
	This patch changes generic_cont_expand(), in order to share the code with fatfs. - Use vmtruncate() if ->prepare_write() returns a error. Even if ->prepare_write() returns an error, it may already have added some blocks. So, this truncates blocks outside of ->i_size by vmtruncate(). - Add generic_cont_expand_simple(). The generic_cont_expand_simple() assumes that ->prepare_write() can handle the block boundary. With this, we don't need to care the extra byte. And for expanding a file size by truncate(), fatfs uses the added generic_cont_expand_simple(). Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] export/change sync_page_range/_nolock()	OGAWA Hirofumi	2	-5/+7
	This exports/changes the sync_page_range/_nolock(). The fatfs needs sync_page_range/_nolock() for expanding truncate, and changes "size_t count" to "loff_t count". Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] fat: support ->direct_IO()	OGAWA Hirofumi	4	-16/+89
	This patch add to support of ->direct_IO() for mostly read. The user of this seems to want to use for streaming read. So, current direct I/O has limitation, it can only overwrite. (For write operation, mainly we need to handle the hole etc..) Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] fat: s/EXPORT_SYMBOL/EXPORT_SYMBOL_GPL/	OGAWA Hirofumi	5	-17/+17
	All EXPORT_SYMBOL of fatfs is only for vfat/msdos. _GPL would be proper. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] fat: add the read/writepages()	OGAWA Hirofumi	1	-1/+16
	Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] fat: use sb_find_get_block() instead of sb_getblk()	OGAWA Hirofumi	1	-2/+2
	We don't need to allocate buffer for checking the buffer is uptodate. This use sb_find_get_block() instead, and if it returns NULL it's not uptodate. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] fat: move fat_clusters_flush() to write_super()	OGAWA Hirofumi	3	-6/+14
	It is overkill to update the FS_INFO whenever modifying prev_free/free_clusters, because those are just a hint. So, this patch uses ->write_super() for updating FS_INFO instead. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] IRQ type flags	Russell King	14	-61/+80
	Some ARM platforms have the ability to program the interrupt controller to detect various interrupt edges and/or levels. For some platforms, this is critical to setup correctly, particularly those which the setting is dependent on the device. Currently, ARM drivers do (eg) the following: err = request_irq(irq, ...); set_irq_type(irq, IRQT_RISING); However, if the interrupt has previously been programmed to be level sensitive (for whatever reason) then this will cause an interrupt storm. Hence, if we combine set_irq_type() with request_irq(), we can then safely set the type prior to unmasking the interrupt. The unfortunate problem is that in order to support this, these flags need to be visible outside of the ARM architecture - drivers such as smc91x need these flags and they're cross-architecture. Finally, the SA_TRIGGER_* flag passed to request_irq() should reflect the property that the device would like. The IRQ controller code should do its best to select the most appropriate supported mode. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] new driver synclink_gt	Paul Fulghum	4	-1/+4518
	New character device driver for the SyncLink GT and SyncLink AC families of synchronous and asynchronous serial adapters Signed-off-by: Paul Fulghum <paulkf@microgate.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] fix more missing includes	Tim Schmielau	19	-0/+22
	Include fixes for 2.6.14-git11. Should allow to remove sched.h from module.h on i386, x86_64, arm, ia64, ppc, ppc64, and s390. Probably more to come since I haven't yet checked the other archs. Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: skip rcu check if task is in root cpuset	Paul Jackson	1	-4/+9
	For systems that aren't using cpusets, but have them CONFIG_CPUSET enabled in their kernel (eventually this may be most distribution kernels), this patch removes even the minimal rcu_read_lock() from the memory page allocation path. Actually, it removes that rcu call for any task that is in the root cpuset (top_cpuset), which on systems not actively using cpusets, is all tasks. We don't need the rcu check for tasks in the top_cpuset, because the top_cpuset is statically allocated, so at no risk of being freed out from underneath us. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: mark number_of_cpusets read_mostly	Paul Jackson	1	-1/+1
	Mark cpuset global 'number_of_cpusets' as __read_mostly. This global is accessed everytime a zone is considered in the zonelist loops beneath __alloc_pages, looking for a free memory page. If number_of_cpusets is just one, then we can short circuit the mems_allowed check. Since this global is read alot on a hot path, and written rarely, it is an excellent candidate for __read_mostly. Thanks to Christoph Lameter for the suggestion. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: use rcu directly optimization	Paul Jackson	1	-10/+30
	Optimize the cpuset impact on page allocation, the most performance critical cpuset hook in the kernel. On each page allocation, the cpuset hook needs to check for a possible change in the current tasks cpuset. It can now handle the common case, of no change, without taking any spinlock or semaphore, thanks to RCU. Convert a spinlock on the current task to an rcu_read_lock(), saving approximately a memory barrier and an atomic op, depending on architecture. This is done by adding rcu_assign_pointer() and synchronize_rcu() calls to the write side of the task->cpuset pointer, in cpuset.c:attach_task(), to delay freeing up a detached cpuset until after any critical sections referencing that pointer. Thanks to Andi Kleen, Nick Piggin and Eric Dumazet for ideas. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: remove test for null cpuset from alloc code path	Paul Jackson	3	-6/+19
	Remove a couple of more lines of code from the cpuset hooks in the page allocation code path. There was a check for a NULL cpuset pointer in the routine cpuset_update_task_memory_state() that was only needed during system boot, after the memory subsystem was initialized, before the cpuset subsystem was initialized, to catch a NULL task->cpuset pointer. Add a cpuset_init_early() routine, just before the mem_init() call in init/main.c, that sets up just enough of the init tasks cpuset structure to render cpuset_update_task_memory_state() calls harmless. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: migrate all tasks in cpuset at once	Paul Jackson	1	-13/+16
	Given the mechanism in the previous patch to handle rebinding the per-vma mempolicies of all tasks in a cpuset that changes its memory placement, it is now easier to handle the page migration requirements of such tasks at the same time. The previous code didn't actually attempt to migrate the pages of the tasks in a cpuset whose memory placement changed until the next time each such task tried to allocate memory. This was undesirable, as users invoking memory page migration exected to happen when the placement changed, not some unspecified time later when the task needed more memory. It is now trivial to handle the page migration at the same time as the per-vma rebinding is done. The routine cpuset.c:update_nodemask(), which handles changing a cpusets memory placement ('mems') now checks for the special case of being asked to write a placement that is the same as before. It was harmless enough before to just recompute everything again, even though nothing had changed. But page migration is a heavy weight operation - moving pages about. So now it is worth avoiding that if asked to move a cpuset to its current location. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: rebind vma mempolicies fix	Paul Jackson	3	-0/+137
	Fix more of longstanding bug in cpuset/mempolicy interaction. NUMA mempolicies (mm/mempolicy.c) are constrained by the current tasks cpuset to just the Memory Nodes allowed by that cpuset. The kernel maintains internal state for each mempolicy, tracking what nodes are used for the MPOL_INTERLEAVE, MPOL_BIND or MPOL_PREFERRED policies. When a tasks cpuset memory placement changes, whether because the cpuset changed, or because the task was attached to a different cpuset, then the tasks mempolicies have to be rebound to the new cpuset placement, so as to preserve the cpuset-relative numbering of the nodes in that policy. An earlier fix handled such mempolicy rebinding for mempolicies attached to a task. This fix rebinds mempolicies attached to vma's (address ranges in a tasks address space.) Due to the need to hold the task->mm->mmap_sem semaphore while updating vma's, the rebinding of vma mempolicies has to be done when the cpuset memory placement is changed, at which time mmap_sem can be safely acquired. The tasks mempolicy is rebound later, when the task next attempts to allocate memory and notices that its task->cpuset_mems_generation is out-of-date with its cpusets mems_generation. Because walking the tasklist to find all tasks attached to a changing cpuset requires holding tasklist_lock, a spinlock, one cannot update the vma's of the affected tasks while doing the tasklist scan. In general, one cannot acquire a semaphore (which can sleep) while already holding a spinlock (such as tasklist_lock). So a list of mm references has to be built up during the tasklist scan, then the tasklist lock dropped, then for each mm, its mmap_sem acquired, and the vma's in that mm rebound. Once the tasklist lock is dropped, affected tasks may fork new tasks, before their mm's are rebound. A kernel global 'cpuset_being_rebound' is set to point to the cpuset being rebound (there can only be one; cpuset modifications are done under a global 'manage_sem' semaphore), and the mpol_copy code that is used to copy a tasks mempolicies during fork catches such forking tasks, and ensures their children are also rebound. When a task is moved to a different cpuset, it is easier, as there is only one task involved. It's mm->vma's are scanned, using the same mpol_rebind_policy() as used above. It may happen that both the mpol_copy hook and the update done via the tasklist scan update the same mm twice. This is ok, as the mempolicies of each vma in an mm keep track of what mems_allowed they are relative to, and safely no-op a second request to rebind to the same nodes. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: number_of_cpusets optimization	Paul Jackson	2	-2/+20
	Easy little optimization hack to avoid actually having to call cpuset_zone_allowed() and check mems_allowed, in the main page allocation routine, __alloc_pages(). This saves several CPU cycles per page allocation on systems not using cpusets. A counter is updated each time a cpuset is created or removed, and whenever there is only one cpuset in the system, it must be the root cpuset, which contains all CPUs and all Memory Nodes. In that case, when the counter is one, all allocations are allowed. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: numa_policy_rebind cleanup	Paul Jackson	3	-15/+30
	Cleanup, reorganize and make more robust the mempolicy.c code to rebind mempolicies relative to the containing cpuset after a tasks memory placement changes. The real motivator for this cleanup patch is to lay more groundwork for the upcoming patch to correctly rebind NUMA mempolicies that are attached to vma's after the containing cpuset memory placement changes. NUMA mempolicies are constrained by the cpuset their task is a member of. When either (1) a task is moved to a different cpuset, or (2) the 'mems' mems_allowed of a cpuset is changed, then the NUMA mempolicies have embedded node numbers (for MPOL_BIND, MPOL_INTERLEAVE and MPOL_PREFERRED) that need to be recalculated, relative to their new cpuset placement. The old code used an unreliable method of determining what was the old mems_allowed constraining the mempolicy. It just looked at the tasks mems_allowed value. This sort of worked with the present code, that just rebinds the -task- mempolicy, and leaves any -vma- mempolicies broken, referring to the old nodes. But in an upcoming patch, the vma mempolicies will be rebound as well. Then the order in which the various task and vma mempolicies are updated will no longer be deterministic, and one can no longer count on the task->mems_allowed holding the old value for as long as needed. It's not even clear if the current code was guaranteed to work reliably for task mempolicies. So I added a mems_allowed field to each mempolicy, stating exactly what mems_allowed the policy is relative to, and updated synchronously and reliably anytime that the mempolicy is rebound. Also removed a useless wrapper routine, numa_policy_rebind(), and had its caller, cpuset_update_task_memory_state(), call directly to the rewritten policy_rebind() routine, and made that rebind routine extern instead of static, and added a "mpol_" prefix to its name, making it mpol_rebind_policy(). Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: implement cpuset_mems_allowed	Paul Jackson	3	-7/+33
	Provide a cpuset_mems_allowed() method, which the sys_migrate_pages() code needed, to obtain the mems_allowed vector of a cpuset, and replaced the workaround in sys_migrate_pages() to call this new method. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: combine refresh_mems and update_mems	Paul Jackson	3	-61/+48
	The important code paths through alloc_pages_current() and alloc_page_vma(), by which most kernel page allocations go, both called cpuset_update_current_mems_allowed(), which in turn called refresh_mems(). -Both- of these latter two routines did a tasklock, got the tasks cpuset pointer, and checked for out of date cpuset->mems_generation. That was a silly duplication of code and waste of CPU cycles on an important code path. Consolidated those two routines into a single routine, called cpuset_update_task_memory_state(), since it updates more than just mems_allowed. Changed all callers of either routine to call the new consolidated routine. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: fork hook fix	Paul Jackson	2	-5/+5
	Fix obscure, never seen in real life, cpuset fork race. The cpuset_fork() call in fork.c was setting up the correct task->cpuset pointer after the tasklist_lock was dropped, which briefly exposed the newly forked process with an unsafe (copied from parent without locks or usage counter increment) cpuset pointer. In theory, that exposed cpuset pointer could have been pointing at a cpuset that was already freed and removed, and in theory another task that had been sitting on the tasklist_lock waiting to scan the task list could have raced down the entire tasklist, found our new child at the far end, and dereferenced that bogus cpuset pointer. To fix, setup up the correct cpuset pointer in the new child by calling cpuset_fork() before the new task is linked into the tasklist, and with that, add a fork failure case, to dereference that cpuset, if the fork fails along the way, after cpuset_fork() was called. Had to remove a BUG_ON() from cpuset_exit(), because it was no longer valid - the call to cpuset_exit() from a failed fork would not have PF_EXITING set. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: update_nodemask code reformat	Paul Jackson	1	-10/+15
	Restructure code layout of the kernel/cpuset.c update_nodemask() routine, removing embedded returns and nested if's in favor of goto completion labels. This is being done in anticipation of adding more logic to this routine, which will favor the goto style structure. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: minor spacing initializer fixes	Paul Jackson	1	-6/+3
	Four trivial cpuset fixes: remove extra spaces, remove useless initializers, mark one __read_mostly. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: remove marker_pid documentation	Paul Jackson	1	-46/+4
	Remove documentation for the cpuset 'marker_pid' feature, that was in the patch "cpuset: change marker for relative numbering" That patch was previously pulled from *-mm at my (pj) request. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: document additional features	Paul Jackson	1	-25/+153
	Document the additional cpuset features: notify_on_release marker_pid memory_pressure memory_pressure_enabled Rearrange and improve formatting of existing documentation for cpu_exclusive and mem_exclusive features. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: memory pressure meter	Paul Jackson	3	-2/+203
	Provide a simple per-cpuset metric of memory pressure, tracking the -rate- that the tasks in a cpuset call try_to_free_pages(), the synchronous (direct) memory reclaim code. This enables batch managers monitoring jobs running in dedicated cpusets to efficiently detect what level of memory pressure that job is causing. This is useful both on tightly managed systems running a wide mix of submitted jobs, which may choose to terminate or reprioritize jobs that are trying to use more memory than allowed on the nodes assigned them, and with tightly coupled, long running, massively parallel scientific computing jobs that will dramatically fail to meet required performance goals if they start to use more memory than allowed to them. This patch just provides a very economical way for the batch manager to monitor a cpuset for signs of memory pressure. It's up to the batch manager or other user code to decide what to do about it and take action. ==> Unless this feature is enabled by writing "1" to the special file /dev/cpuset/memory_pressure_enabled, the hook in the rebalance code of __alloc_pages() for this metric reduces to simply noticing that the cpuset_memory_pressure_enabled flag is zero. So only systems that enable this feature will compute the metric. Why a per-cpuset, running average: Because this meter is per-cpuset, rather than per-task or mm, the system load imposed by a batch scheduler monitoring this metric is sharply reduced on large systems, because a scan of the tasklist can be avoided on each set of queries. Because this meter is a running average, instead of an accumulating counter, a batch scheduler can detect memory pressure with a single read, instead of having to read and accumulate results for a period of time. Because this meter is per-cpuset rather than per-task or mm, the batch scheduler can obtain the key information, memory pressure in a cpuset, with a single read, rather than having to query and accumulate results over all the (dynamically changing) set of tasks in the cpuset. A per-cpuset simple digital filter (requires a spinlock and 3 words of data per-cpuset) is kept, and updated by any task attached to that cpuset, if it enters the synchronous (direct) page reclaim code. A per-cpuset file provides an integer number representing the recent (half-life of 10 seconds) rate of direct page reclaims caused by the tasks in the cpuset, in units of reclaims attempted per second, times 1000. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: mempolicy one more nodemask conversion	Paul Jackson	3	-15/+5
	Finish converting mm/mempolicy.c from bitmaps to nodemasks. The previous conversion had left one routine using bitmaps, since it involved a corresponding change to kernel/cpuset.c Fix that interface by replacing with a simple macro that calls nodes_subset(), or if !CONFIG_CPUSET, returns (1). Signed-off-by: Paul Jackson <pj@sgi.com> Cc: Christoph Lameter <christoph@lameter.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] cpuset: better bitmap remap defaults	Paul Jackson	1	-45/+44
	Fix the default behaviour for the remap operators in bitmap, cpumask and nodemask. As previously submitted, the pair of masks <A, B> defined a map of the positions of the set bits in A to the corresponding bits in B. This is still true. The issue is how to map the other positions, corresponding to the unset (0) bits in A. As previously submitted, they were all mapped to the first set bit position in B, a constant map. When I tried to code per-vma mempolicy rebinding using these remap operators, I realized this was wrong. This patch changes the default to map all the unset bit positions in A to the same positions in B, the identity map. For example, if A has bits 4-7 set, and B has bits 9-12 set, then the map defined by the pair <A, B> maps each bit position in the first 32 bits as follows: 0 ==> 0 ... 3 ==> 3 4 ==> 9 ... 7 ==> 12 8 ==> 8 9 ==> 9 ... 31 ==> 31 This now corresponds to the typical behaviour desired when migrating pages and policies from one cpuset to another. The pages on nodes within the original cpuset, and the references in memory policies to nodes within the original cpuset, are migrated to the corresponding cpuset-relative nodes in the destination cpuset. Other pages and node references are left untouched. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] slob: introduce the SLOB allocator	Matt Mackall	5	-1/+440
	configurable replacement for slab allocator This adds a CONFIG_SLAB option under CONFIG_EMBEDDED. When CONFIG_SLAB is disabled, the kernel falls back to using the 'SLOB' allocator. SLOB is a traditional K&R/UNIX allocator with a SLAB emulation layer, similar to the original Linux kmalloc allocator that SLAB replaced. It's signicantly smaller code and is more memory efficient. But like all similar allocators, it scales poorly and suffers from fragmentation more than SLAB, so it's only appropriate for small systems. It's been tested extensively in the Linux-tiny tree. I've also stress-tested it with make -j 8 compiles on a 3G SMP+PREEMPT box (not recommended). Here's a comparison for otherwise identical builds, showing SLOB saving nearly half a megabyte of RAM: $ size vmlinux* text data bss dec hex filename 3336372 529360 190812 4056544 3de5e0 vmlinux-slab 3323208 527948 190684 4041840 3dac70 vmlinux-slob $ size mm/{slab,slob}.o text data bss dec hex filename 13221 752 48 14021 36c5 mm/slab.o 1896 52 8 1956 7a4 mm/slob.o /proc/meminfo: SLAB SLOB delta MemTotal: 27964 kB 27980 kB +16 kB MemFree: 24596 kB 25092 kB +496 kB Buffers: 36 kB 36 kB 0 kB Cached: 1188 kB 1188 kB 0 kB SwapCached: 0 kB 0 kB 0 kB Active: 608 kB 600 kB -8 kB Inactive: 808 kB 812 kB +4 kB HighTotal: 0 kB 0 kB 0 kB HighFree: 0 kB 0 kB 0 kB LowTotal: 27964 kB 27980 kB +16 kB LowFree: 24596 kB 25092 kB +496 kB SwapTotal: 0 kB 0 kB 0 kB SwapFree: 0 kB 0 kB 0 kB Dirty: 4 kB 12 kB +8 kB Writeback: 0 kB 0 kB 0 kB Mapped: 560 kB 556 kB -4 kB Slab: 1756 kB 0 kB -1756 kB CommitLimit: 13980 kB 13988 kB +8 kB Committed_AS: 4208 kB 4208 kB 0 kB PageTables: 28 kB 28 kB 0 kB VmallocTotal: 1007312 kB 1007312 kB 0 kB VmallocUsed: 48 kB 48 kB 0 kB VmallocChunk: 1007264 kB 1007264 kB 0 kB (this work has been sponsored in part by CELF) From: Ingo Molnar <mingo@elte.hu> Fix 32-bitness bugs in mm/slob.c. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] slob: introduce mm/util.c for shared functions	Matt Mackall	3	-38/+40
	Add mm/util.c for functions common between SLAB and SLOB. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] DEBUG_SLAB depends on SLAB	Ingo Molnar	1	-1/+1
	Make DEBUG_SLAB depend on SLAB. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] radix-tree: reduce tree height upon partial truncation	Nick Piggin	1	-10/+36
	Shrink the height of a radix tree when it is partially truncated - we only do shrinkage of full truncation at present. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] radix tree: early termination of tag clearing	Nick Piggin	1	-17/+23
	Correctly determine the tags to be cleared in radix_tree_delete() so we don't keep moving up the tree clearing tags that we don't need to. For example, if a tag is simply not set in the deleted item, nor anywhere up the tree, radix_tree_delete() would attempt to clear it up the entire height of the tree. Also, tag_set() was made conditional so as not to dirty too many cachelines high up in the radix tree. Instead, put this logic into radix_tree_tag_set(). Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] radix tree: code consolidation	Nick Piggin	1	-31/+26
	Introduce helper any_tag_set() rather than repeat the same code sequence 4 times. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] remove get_task_struct_rcu()	Paul E. McKenney	1	-12/+0
	The latest set of signal-RCU patches does not use get_task_struct_rcu(). Attached is a patch that removes it. Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08	[PATCH] Simpler signal-exit concurrency handling	Paul E. McKenney	1	-5/+6
	Some simplification in checking signal delivery against concurrent exit. Instead of using get_task_struct_rcu(), which increments the task_struct reference count, check the reference count after acquiring sighand lock. Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>