aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86_64
AgeCommit message (Collapse)AuthorFilesLines
2006-03-14Revert "[PATCH] x86-64: Fix up handling of non canonical user RIPs"Linus Torvalds1-11/+18
This reverts commit c33d4568aca9028a22857f94f5e0850012b6444b. Andrew Clayton and Hugh Dickins report that it's broken for them and causes strange page table and slab corruption, and spontaneous reboots. Let's get it right next time. Cc: Andrew Clayton <andrew@rootshell.co.uk> Cc: Hugh Dickins <hugh@veritas.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-12[PATCH] x86-64: Fix up handling of non canonical user RIPsAndi Kleen1-18/+11
EM64T CPUs have somewhat weird error reporting for non canonical RIPs in SYSRET. We can't handle any exceptions there because the exception handler would end up running on the user stack which is unsafe. To avoid problems any code that might end up with a user touched pt_regs should return using int_ret_from_syscall. int_ret_from_syscall ends up using IRET, which allows safe exceptions. Cc: Ernie Petrides <petrides@redhat.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-08[PATCH] fix kexec asmMichael Matz1-1/+1
While testing kexec and kdump we hit problems where the new kernel would freeze or instantly reboot. The easiest way to trigger it was to kexec a kernel compiled for CONFIG_M586 on an athlon cpu. Compiling for CONFIG_MK7 instead would work fine. The patch fixes a few problems with the kexec inline asm. Signed-off-by: Chris Mason <mason@suse.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-27Revert "[PATCH] x86_64: Only do the clustered systems have unsynchronized ↵Linus Torvalds1-8/+1
TSC assumption on IBM systems" This reverts commit 13a229abc25640813f1480c0478dfc6bdbc1c19e. Quoth Andi: "After some consideration and feedback from various people it turns out this wasn't that good an idea. It has some problems and needs more work. Since it was only an optimization anyways it's best to just back it out again for now." Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26Make Kprobes depend on modulesLinus Torvalds1-0/+1
Commit 9ec4b1f356b3bad928ae8e2aa9caebfa737d52df made kprobes not compile without module support, so just make that clear in the Kconfig file. Also, since it's marked EXPERIMENTAL, make that dependency explicit too. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26[PATCH] fix build on x86_64 with !CONFIG_HOTPLUG_CPUBrian Magnuson1-1/+1
The commit e2c0388866dc12bef56b178b958f9b778fe6c687 added setup_additional_cpus to setup.c but this is only defined if CONFIG_HOTPLUG_CPU is set. This patch changes the #ifdef to reflect that. Signed-off-by: Brian Magnuson <magnuson@rcn.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26[PATCH] x86_64: Better ATI timer fixAndi Kleen1-17/+29
The previous experiment for using apicmaintimer on ATI systems didn't work out very well. In particular laptops with C2/C3 support often don't let it tick during idle, which makes it useless. There were also some other bugs that made the apicmaintimer often not used at all. I tried some other experiments - running timer over RTC and some other things but they didn't really work well neither. I rechecked the specs now and it turns out this simple change is actually enough to avoid the double ticks on the ATI systems. We just turn off IRQ 0 in the 8254 and only route it directly using the IO-APIC. I tested it on a few ATI systems and it worked there. In fact it worked on all chipsets (NVidia, Intel, AMD, ATI) I tried it on. According to the ACPI spec routing should always work through the IO-APIC so I think it's the correct thing to do anyways (and most of the old gunk in check_timer should be thrown away for x86-64). But for 2.6.16 it's best to do a fairly minimal change: - Use the known to be working everywhere-but-ATI IRQ0 both over 8254 and IO-APIC setup everywhere - Except on ATI disable IRQ0 in the 8254 - Remove the code to select apicmaintimer on ATI chipsets - Add some boot options to allow to override this (just paranoia) In 2.6.17 I hope to switch the default over to this for everybody. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26[PATCH] x86_64: Move the SMP time selection earlierAndi Kleen2-13/+11
SMP time selection originally ran after all CPUs were brought up because it needed to know the number of CPUs to decide if it needs an MP safe timer or not. This is not needed anymore because we know present CPUs early. This fixes a couple of problems: - apicmaintimer didn't always work because it relied on state that was set up time_init_gtod too late. - The output for the used timer in early kernel log was misleading because time_init_gtod could actually change it later. Now always print the final timer choice Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26[PATCH] x86_64: Fix the additional_cpus=.. optionAndi Kleen2-1/+7
It didn't set up the CPU possible map early enough, so the option didn't actually work. Noticed by Heiko Carstens Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26[PATCH] x86_64: Fix NMI watchdog on x460Chris McDermott1-1/+1
[description from AK] Old check for the IO-APIC watchdog during the timer check was wrong - it obviously should only drop into this if the IO-APIC watchdog is used. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26[PATCH] x86-64/i386: Use common X86_PM_TIMER option and make it EMBEDDEDAndi Kleen1-15/+0
This makes x86-64 use the common X86_PM_TIMER Kconfig entry in drivers/acpi And since PM timer is needed for correct timing on a lot of systems now (e.g. AMD dual cores) and we often get bug reports from people who forgot to set it make it depend on CONFIG_EMBEDDED. x86-64 had this change before and it's a good thing. I also fixed the description slightly to make this more clear. Cc: len.brown@intel.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26[PATCH] x86_64: Only do the clustered systems have unsynchronized TSC ↵Andi Kleen1-1/+8
assumption on IBM systems Big Unisys systems have multiple clusters too, but they have an synchronized TSC. I'm using the SMBIOS to check for vendor == IBM. Cc: Chris McDermott <lcm@us.ibm.com> Cc: "Protasevich, Natalie" <Natalie.Protasevich@unisys.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-26[PATCH] x86_64: no_iommu removal in pci-gart.cJon Mason2-18/+6
In previous versions of pci-gart.c, no_iommu was used to determine if IOMMU was disabled in the GART DMA mapping functions. This changed in 2.6.16 and now gart_xxx() functions are only called if gart is enabled. Therefore, uses of no_iommu in the GART code are no longer necessary and can be removed. Also, it removes double deceleration of no_iommu and force_iommu in pci.h and proto.h, by removing the deceleration in pci.h. Lastly, end_pfn off by one error. Tested (along with patch 1/2) on dual opteron with gart enabled, iommu=soft, and iommu=off. Signed-off-by: Jon Mason <jdmason@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-25[PATCH] x86-64: react to new topology.c locationDave Jones1-1/+1
Commit 9c869edac591977314323a4eaad5f7633fca684f moved the i386 topology.c file. That change broke x86-64 compiles, as it uses the same file. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-20[PATCH] x86_64: Don't set CONFIG_DEBUG_INFO in defconfigAndi Kleen1-3/+3
Undo setting of CONFIG_DEBUG_INFO in the previous defconfig update. It will make every build much slower and need more disk space and isn't a good default. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17[PATCH] Remove KERN_INFO from middle of printk lineTim Hockin1-2/+2
Don't print KERN_INFO in the middle of a printk line. printk(KERN_INFO "OEM ID: %s ",str); is just above this. This is already fixed up in i386 copy. Signed-off-by: Martin J. Bligh <mbligh@google.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17[PATCH] x86_64: Always pass full number of nodes to NUMA hash computationAndi Kleen2-2/+2
Previously the numa hash code would be confused by holes in the node space and stop early. This is the first part of the fix for the non boot issue with empty nodes on Opterons. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17[PATCH] x86_64: Relax SRAT covers all memory check a bitAndi Kleen1-1/+2
Code was refusing good SRATs because about 12K got lost somewhere. Allow less than 1MB of difference before rejecting it. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17[PATCH] x86_64: Resolve the RIP of an early exception using kallsymsAndi Kleen1-0/+7
But do it after everything else to risk less from recursive crashes. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17[PATCH] x86_64: Disable tsc when apicpmtimer is activeAndi Kleen2-2/+2
Otherwise it has no effect anyways. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17[PATCH] x86_64: Don't enable ATI apicmaintimer workaround when the machine ↵Andi Kleen1-0/+16
has C2 or C3 Many laptops have problems with ticking the local APIC timer in C2/C3. The code added earlier to use it by default on ATI didn't really work for them. Don't enable it when the system supports C2/C3. This doesn't fix the problem fully, but at least it's not worse than before. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17[PATCH] x86_64: Don't call do_exit with interrupts disabled after IRET exceptionAndi Kleen1-0/+1
This caused a sigreturn with bad argument on a preemptible kernel to complain with Debug: sleeping function called from invalid context at /home/lsrc/quilt/linux/include/linux/rwsem.h:43 in_atomic():0, irqs_disabled():1 Call Trace: {__might_sleep+190} {profile_task_exit+21} {__do_exit+34} {do_wait+0} Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17[PATCH] x86_64: make touch_nmi_watchdog() not touch impossible cpus' private ↵Jan Beulich1-8/+11
data Along with that, also suppress the memory touching altogether when the watchdog is not running, to eliminate needless crosstalk. Plus ad a call to it to make things consistent (one could also consider removing the call in enable_timer_nmi_watchdog()). Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-17[PATCH] x86_64: Update defconfigAndi Kleen1-9/+33
... and enable 1394 by default. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-15[PATCH] x86_64: early initialization of cpu_to_nodeDaniel Yeisley1-1/+1
The early initialization of cpu_to_node code as it is now only updates the cpu_to_node array, and does not update cpu_pda()->nodemember. This will cause numa_node_id() to return 0 on systems where CPU 0 is not on Node 0. This leads to a kernel panic in slab.c. I've tested the patch below on a 16 processor x86_64 ES7000-600 server, and no longer see the panic I saw with the original 2.6.16-rc3. Signed-off-by: Dan Yeisley <dan.yeisley@unisys.com> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-12[PATCH] x86_64: GART DMA merging fixAndi Kleen1-4/+2
Don't touch the non DMA members in the sg list in dma_map_sg in the IOMMU Some drivers (in particular ST) ran into problems because they reused the sg lists after passing them to pci_map_sg(). The merging procedure in the K8 GART IOMMU corrupted the state. This patch changes it to only touch the dma* entries during merging, but not the other fields. Approach suggested by Dave Miller. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-12[PATCH] arch/x86_64/kernel/traps.c PTRACE_SINGLESTEP oopsJohn Blackwood1-1/+17
We found a problem with x86_64 kernels with preemption enabled, where having multiple tasks doing ptrace singlesteps around the same time will cause the system to 'oops'. The problem seems that a task can get preempted out of the do_debug() processing while it is running on the DEBUG_STACK stack. If another task on that same cpu then enters do_debug() and uses the same per-cpu DEBUG_STACK stack, the previous preempted tasks's stack contents can be corrupted, and the system will oops when the preempted task is context switched back in again. The typical oops looks like the following: Unable to handle kernel paging request at ffffffffffffffae RIP: <ffffffff805452a1>{thread_return+34} PGD 103027 PUD 102429067 PMD 0 Oops: 0002 [1] PREEMPT SMP CPU 0 Modules linked in: Pid: 3786, comm: ssdd Not tainted 2.6.15.2 #1 RIP: 0010:[<ffffffff805452a1>] <ffffffff805452a1>{thread_return+34} RSP: 0018:ffffffff80824058 EFLAGS: 000136c2 RAX: ffff81017e12cea0 RBX: 0000000000000000 RCX: 00000000c0000100 RDX: 0000000000000000 RSI: ffff8100f7856e20 RDI: ffff81017e12cea0 RBP: 0000000000000046 R08: ffff8100f68a6000 R09: 0000000000000000 R10: 0000000000000000 R11: ffff81017e12cea0 R12: ffff81000c2d53e8 R13: ffff81017f5b3be8 R14: ffff81000c0036e0 R15: 000001056cbfc899 FS: 00002aaaaaad9b00(0000) GS:ffffffff80883800(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffffffffffffffae CR3: 00000000f6fcf000 CR4: 00000000000006e0 Process ssdd (pid: 3786, threadinfo ffff8100f68a6000, task ffff8100f7856e20) Stack: ffffffff808240d8 ffffffff8012a84a ffff8100055f6c00 0000000000000020 0000000000000001 ffff81000c0036e0 ffffffff808240b8 0000000000000000 0000000000000000 0000000000000000 Call Trace: <#DB> <ffffffff8012a84a>{try_to_wake_up+985} <ffffffff8012c0d3>{kick_process+87} <ffffffff8013b262>{signal_wake_up+48} <ffffffff8013b5ce>{specific_send_sig_info+179} <ffffffff80546abc>{_spin_unlock_irqrestore+27} <ffffffff8013b67c>{force_sig_info+159} <ffffffff801103a0>{do_debug+289} <ffffffff80110278>{sync_regs+103} <ffffffff8010ed9a>{paranoid_userspace+35} Unable to handle kernel paging request at 00007fffffb7d000 RIP: <ffffffff8010f2e4>{show_trace+465} PGD f6f25067 PUD f6fcc067 PMD f6957067 PTE 0 Oops: 0000 [2] PREEMPT SMP This patch disables preemptions for the task upon entry to do_debug(), before interrupts are reenabled, and then disables preemption before exiting do_debug(), after disabling interrupts. I've noticed that the task can be preempted either at the end of an interrupt, or on the call to force_sig_info() on the spin_unlock_irqrestore() processing. It might be better to attempt to code a fix in entry.S around the code that calls do_debug(). Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-11[PATCH] x86-64: Fix HPET timer on x460Chris McDermott2-4/+10
[description from AK] The IBM Summit 3 chipset doesn't implement the HPET timer replacement option. Since the current Linux code relies on it use a mixed mode with both PIT for the interrupt and HPET counters for the time keeping. That was already implemented, but didn't work properly because it was still using the last interrupt offset in HPET. This resulted in x460 not booting. Fix this up by using the free running HPET counter. Shouldn't affect any other machine because they either use full HPET mode or no HPET at all. TBD needs a similar 32bit fix. Signed-off-by: Andi Kleen <ak@suse.de> Cc: Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com> Cc: Bob Picco <bob.picco@hp.com> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-11[PATCH] fstatat64 supportUlrich Drepper2-1/+23
The *at patches introduced fstatat and, due to inusfficient research, I used the newfstat functions generally as the guideline. The result is that on 32-bit platforms we don't have all the information needed to implement fstatat64. This patch modifies the code to pass up 64-bit information if __ARCH_WANT_STAT64 is defined. I renamed the syscall entry point to make this clear. Other archs will continue to use the existing code. On x86-64 the compat code is implemented using a new sys32_ function. this is what is done for the other stat syscalls as well. This patch might break some other archs (those which define __ARCH_WANT_STAT64 and which already wired up the syscall). Yet others might need changes to accomodate the compatibility mode. I really don't want to do that work because all this stat handling is a mess (more so in glibc, but the kernel is also affected). It should be done by the arch maintainers. I'll provide some stand-alone test shortly. Those who are eager could compile glibc and run 'make check' (no installation needed). The patch below has been tested on x86 and x86-64. Signed-off-by: Ulrich Drepper <drepper@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-08[PATCH] x86-64: Add sys_unshareAndi Kleen1-0/+3
Add unshare syscall for x86-64 ppoll/pselect are not ready yet, but add reservations. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-07[PATCH] arch/x86_64/pci/mmconfig.c NULL noise removalAl Viro1-1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-02-07[PATCH] amd64 time.c __iomem annotationsAl Viro1-1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-02-07[PATCH] drive_info removal outside of arch/i386Al Viro2-7/+0
drive_info is used only by hd.c and that happens under #ifdef __i386__. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-02-07[PATCH] x86_64: Fix the node cpumask of a cpu going downRavikiran G Thirumalai1-0/+3
Currently, x86_64 and ia64 arches do not clear the corresponding bits in the node's cpumask when a cpu goes down or cpu bring up is cancelled. This is buggy since there are pieces of common code where the cpumask is checked in the cpu down code path to decide on things (like in the slab down path). PPC does the right thing, but x86_64 and ia64 don't (This was the reason Sonny hit upon a slab bug during cpu offline on ppc and could not reproduce on other arches). This patch fixes it for x86_64. I won't attempt ia64 as I cannot test it. Credit for spotting this should go to Alok. (akpm: this was applied, then reverted. But it's OK now because we now use for_each_cpu() in the right places). Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05Revert "[PATCH] x86_64: Fix the node cpumask of a cpu going down"Linus Torvalds1-3/+0
This reverts commit 10f4dc8b27ac42f930ac55adb8c521264dc997f8. Quoth Andi Kleen: "Kiran decided that it makes the problem worse than it was before. Fixing it fully requires more work which is too much for 2.6.16. So please revert that commit for now." Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: IOMMU printk cleanupJon Mason4-10/+11
This patch contains a printk reorder to remove the current problem of displaying "PCI-DMA: Disabling IOMMU." and then "PCI-DMA: using GART IOMMU" 20 lines later in dmesg. It also constains a printk reorder in swiotlb to state swiotlb enablement prior to describing the location of the bounce buffers, and a printk reorder to state gart enablement prior to describing the aperature. Also constains a whitespace cleanup in arch/x86_64/kernel/setup.c Tested (along with patch 2/2) on dual opteron with gart enabled, iommu=soft, and iommu=off. Signed-off-by: Jon Mason <jdmason@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Let impossible CPUs point to reference per cpu dataAndi Kleen1-3/+5
Hack for 2.6.16. In 2.6.17 all code that uses NR_CPUs should be audited and changed to only touch possible CPUs. Don't mark the reference per cpu data init data (so it stays around after boot) and point all impossible CPUs to it. This way they reference some valid - although shared memory. Usually this is only initialization like INIT_LIST_HEADs and there won't be races because these CPUs never run. Still somewhat hackish. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] i386/x86-64: Don't ack the APIC for bad interrupts when the APIC is ↵Andi Kleen1-0/+20
not enabled It's bad juju to touch the APIC when it hasn't been enabled. I also moved ack_bad_irq for x86-64 out of line following i386. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: minor odering correction to dump_pagetable()Jan Beulich1-2/+1
Checking of the validity of pointers should be consistently done before dereferencing the pointer. Signed-Off-By: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: small fix for CFI annotationsJan Beulich1-0/+4
Conditionalize two unwind directives to match other similarly conditional code. Signed-Off-By: Jan Beulich <jbeulich@novell.com> Cc: Jim Houston <jim.houston@ccur.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Calibrate APIC timer using PM timerAndi Kleen2-6/+47
On some broken motherboards (at least one NForce3 based AMD64 laptop) the PIT timer runs at a incorrect frequency. This patch adds a new option "apicpmtimer" that allows to use the APIC timer and calibrate it using the PMTimer. It requires the earlier patch that allows to run the main timer from the APIC. Specifying apicpmtimer implies apicmaintimer. The option defaults to off for now. I tested it on a few systems and the resulting APIC timer frequencies were usually a bit off, but always <1%, which should be tolerable. TBD figure out heuristic to enable this automatically on the affected systems TBD perhaps do it on all NForce3s or using DMI? Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Don't allow kprobes on __switch_toAndi Kleen1-1/+3
kprobes cannot deal with the funny calling conventions when it runs on a different stack when it returns. If someone wants to instrument context switch they can add a probe to schedule() instead. Cc: jkenisto@us.ibm.com, prasanna@in.ibm.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: align per-cpu section to configured cache bytesZach Brown1-1/+1
Align the start of the per-cpu section to the configured number of bytes in a cache line. This stops a BUG_ON() from triggering in load_module() when DEFINE_PER_CPU() is used in a module and the section isn't cacheline-aligned. Rusty also found this and sent a patch in a while ago (http://lkml.org/lkml/2004/10/19/17), I don't know what came of that. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: When allocation of merged SG lists fails in the IOMMU don't ↵Kevin VanMaren1-3/+6
merge [ AK: I redid Kevin's fix to be simpler, but the idea and original analysis of the problem is from Kevin] This avoid allocation failures on some SATA systems like Nvidia CK8 when the IOMMU gets fragmented. Modern SATA devices have quite large queues (128 entries) and the FS with ext2/3 is good enough now that it often passes whole 128 page sg lists down to the driver. These require 512K of continuous free space in the IOMMU aperture to map when merged. When the IOMMU is fragmented this could lead to spurious IO errors due to failing mappings. Short term fix is to just try to map the SG list again unmerged page by page - this way fragmentation doesn't matter anymore. The code for that was already there, but it just wasn't enabled for the merge case. According to Kevin at least the Nvidia device doesn't seem to benefit from merging much anyways, so the only slowdown is from trying to do an unnecessary merge attempt. Kevin plans to implement better fragmentation avoidance in the future, but that wouldn't be 2.6.16 material. TBD: should add some statistic counters to count how often that really happens. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Fix zero mcfg entry workaround on x86-64Andi Kleen1-1/+1
I broke this earlier when moving the patch from i386 to x86-64. Need to return the virtual address here, not the physical address. This fixes some boot time crashes on x86-64. Cc: gregkh@suse.de Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Do more checking in the SRAT header codeAndi Kleen1-4/+15
- Check if the processor/memory affinity entries are long enough according to the ACPI 3.0 spec. - Ignore memory affinity entries that define a zero length region. All based on BIOS issues found in the field @) Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: data/functions wrongly marked as __init with cpu hotplug.Ashok Raj1-1/+1
attached patch is 2 more cases i found via running the reference_init.pl script. These were easy to spot just knowing the file names. There is one another about init/main.c that i cant exactly zero in. (partly because i dont know how to interpret the data thats spewed out of the tool). Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: mark two routines as __cpuinitShaohua Li2-2/+2
SIgned-off-by: Shaohua Li<shaohua.li@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Clear more state when ignoring empty node in SRAT parsingAndi Kleen1-6/+20
Might fix boot failures on systems with empty PXMs in SRAT Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Remove CONFIG_INIT_DEBUGAndi Kleen1-7/+0
It has been enabled by default for some time now and is cheap enough so it doesn't matter anyways. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Fix the node cpumask of a cpu going downRavikiran G Thirumalai1-0/+3
Currently, x86_64 and ia64 arches do not clear the corresponding bits in the node's cpumask when a cpu goes down or cpu bring up is cancelled. This is buggy since there are pieces of common code where the cpumask is checked in the cpu down code path to decide on things (like in the slab down path). PPC does the right thing, but x86_64 and ia64 don't (This was the reason Sonny hit upon a slab bug during cpu offline on ppc and could not reproduce on other arches). This patch fixes it for x86_64. I won't attempt ia64 as I cannot test it. Credit for spotting this should go to Alok. Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Undo the earlier changes to remove unrolled copy/memset ↵Andi Kleen6-23/+542
functions They cause quite bad performance regressions on Netburst This is temporary until we can get new optimized functions for these CPUs. This undoes changes that were done in 2.6.15 and in 2.6.16-rc1, essentially bringing the code back to 2.6.14 level. Only change is I renamed the X86_FEATURE_K8_C flag to X86_FEATURE_REP_GOOD and fixed the check for the flag and also fixed some comments. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Fix swiotlb dma_alloc_coherent fallbackAndi Kleen1-0/+3
This avoids BUG_ONs in the low level allocator when an illegal GFP mask is added. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: [PATCH] timer resumeShaohua Li2-0/+17
At resume time, TSC's value or something similar might be changed a lot against suspend time. This could make system gets a very big lost ticks. See http://bugzilla.kernel.org/show_bug.cgi?id=5825 Signed-off-by: Shaohua Li<shaohua.li@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Automatically enable apicmaintimer on ATI boardsAndi Kleen1-0/+8
They all have problems with IRQ 0 routing, so just use the APIC on them. Can be overwritten with "noapicmaintimer" Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Allow to run main time keeping from the local APIC interruptAndi Kleen2-9/+70
Another piece from the no-idle-tick patch. This can be enabled with the "apicmaintimer" option. This is mainly useful when the PIT/HPET interrupt is unreliable. Note there are some systems that are known to stop the APIC timer in C3. For those it will never work, but this case should be automatically detected. It also only works with PM timer right now. When HPET is used the way the main timer handler computes the delay doesn't work. It should be a bit more efficient because there is one less regular interrupt to process on the boot processor. Requires earlier bugfix from Venkatesh Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Disallow kprobes on NMI handlersAndi Kleen3-13/+18
A kprobe executes IRET early and that could cause NMI recursion and stack corruption. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-04[PATCH] x86_64: Update defconfigAndi Kleen1-7/+14
Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01Merge branch 'release' of ↵Linus Torvalds4-9/+86
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
2006-02-01[PATCH] Add faster __iowrite32_copy routine for x86_64Bryan O'Sullivan2-1/+27
This assembly version is measurably faster than the generic version in lib/iomap_copy.c. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01[PATCH] x86_64: compat_sys_futimesat fixAndrew Morton1-1/+1
We need to use the compat function here. Pointer out by Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-31[PATCH] PCI: handle bogus MCFG entriesAndi Kleen1-5/+14
Handle more bogus MCFG entries Some Asus P4 boards seem to have broken MCFG tables with only a single entry for busses 0-0. Special case these and assume they mean all busses can be accessed. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-24[ACPI] merge 3549 4320 4485 4588 4980 5483 5651 acpica asus fops pnpacpi ↵Len Brown4-9/+86
branches into release Signed-off-by: Len Brown <len.brown@intel.com>
2006-01-18[PATCH] vfs: *at functions: x86_64Ulrich Drepper1-0/+13
Wire up the x86_64 syscalls. Signed-off-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18[PATCH] x86_64: Fix MCE exception stack for boot CPUJan Beulich1-1/+1
Fix a typo/mis-merge in one of the previous patches. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18[PATCH] e1000: Added disable packet split capabilityJesse Brandeburg1-0/+1
Adds the ability to disability packet split at compile time and use the legacy receive path on PCI express hardware. Made this a CONFIG option and modified the Kconfig, to reflect the new option. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: John Ronciak <john.ronciak@intel.com> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2006-01-16[PATCH] x86_64: add x86-64 support for memory hot-addMatt Tolentino2-33/+134
Add x86-64 specific memory hot-add functions, Kconfig options, and runtime kernel page table update functions to make hot-add usable on x86-64 machines. Also, fixup the nefarious conditional locking and exports pointed out by Andi. Tested on Intel and IBM x86-64 memory hot-add capable systems. Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16[PATCH] x86_64: Flexmap for 32bit and randomized mappings for 64bitAndi Kleen4-2/+111
Another try at this. For 32bit follow the 32bit implementation from Ingo - mappings are growing down from the end of stack now and vary randomly by 1GB. Randomized mappings for 64bit just vary the normal mmap break by 1TB. I didn't bother implementing full flex mmap for 64bit because it shouldn't be needed there. Cc: mingo@elte.hu Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16[PATCH] x86_64: Remove elf32_map in 32bit ELF loaderAndi Kleen1-17/+0
It's identical to the standard elf_map. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16[PATCH] x86_64: eliminate empty_bad_{page,{pte,pmd}_table}Jan Beulich1-71/+37
... as they are no longer needed. Since there were hard-coded numbers in the file, the patch also adds a mechanism to avoid these (otherwise potential future changes would again and again require adjusting these numbers). Signed-Off-By: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16[PATCH] x86_64: Update defconfigAndi Kleen1-4/+17
Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16x86-64: fix initrd freeingLinus Torvalds1-1/+1
The comparison of the initrd start address against "&_end" is unnecessary and incorrect. Make it match the x86 code that just compares the passed-in arguments. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16[PATCH] x86_64: Don't try to put kernel page tables beyond ZONE_DMA32.Andi Kleen1-13/+5
For not fully explained reasons it broke mem=... on several setups. Also minor cleanup. Cc: axboe@suse.de Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16[PATCH] x86_64: set do_not_nx as cpuinitdataAndi Kleen1-1/+1
'check_efer' uses 'do_not_nx'. Hotpluged CPU could wrongly disable NX. Signed-off-by: Shaohua Li<shaohua.li@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16[PATCH] x86_64: lapic resume uses correct base addressShaohua Li1-4/+1
uses correct lapic base address. The set_fixmap appears useless. Signed-off-by: Shaohua Li<shaohua.li@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16[PATCH] x86_64: Only let user select PM timer support when EMBEDDEDAndi Kleen1-1/+1
To avoid mistakes. I got a few reports where people got broken timing because they didn't have the PMTIMER fallback. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-16[PATCH] x86_64: Allow nesting of int3 by default for kprobesAndi Kleen2-10/+7
This unbreaks recursive kprobes which didn't work anymore due to an earlier patch which converted the debug entry point to use an IST. This also allows nesting of the debug entry point too. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12[PATCH] amd64: task_stack_page()Al Viro2-4/+4
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12[PATCH] amd64: task_pt_regs()Al Viro5-18/+10
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12[PATCH] amd64: task_thread_info()Al Viro5-6/+6
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Fix SMP bootup with CONFIG_KDUMP enabledVivek Goyal1-2/+9
o This fix was posted for i386 long back. Posting it for x86_64. http://marc.theaimsgroup.com/?l=linux-kernel&m=110380103229830&w=2 o This patch fixes the problem of secondary cpus boot up. This situation is faced when kernel is built for default locations like 16MB and onwards. In this configuration, only primary cpu (BP) comes and secondary cpus don't boot. o Problem occurs because in trampoline code, lgdt is not able to load the GDT as it happens to be situated beyond 16MB. This is due to the fact that cpu is still in real mode and default operand size is 16bit. o This patch uses lgdtl instead of lgdt to force operand size to 32 instead of 16. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Don't confuse noapic with noapictimerAndi Kleen1-1/+3
Handling common prefixes is tricky. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: don't copy command line twiceJan Beulich1-4/+0
... reducing the amount of changes Xen has to do. Signed-Off-By: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] i386/x86-64: make setup_early_printk() usage consistentJan Beulich2-4/+2
The explicit and implicit calls to setup_early_printk() were passing inconsistent arguments. Signed-Off-By: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Allow kernel page tables upto the end of memoryAndi Kleen1-2/+14
Previously they would be only allocated before the kernel text at 1MB. This limited the maximum supported memory to 128GB. Now allow the e820 allocator to put them everywhere. Try to put them beyond any DMA zones to avoid filling them up. This should free some GFP_DMA memory compared to earlier kernels. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Use safe_smp_processor_id in MCE handlerAndi Kleen1-1/+2
hard_smp_processor_id would return the local APIC id instead of the Linux processor id. On big systems they are often not identical. safe_smp_processor_id is just a wrapper around it that does the necessary conversions. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Some housekeeping in local APIC codeAndi Kleen4-64/+48
Remove support for obsolete hardware and cleanup. - Remove checks for non integrated APICs - Replace apic_write_around with apic_write. - Remove apic_read_around - Remove APIC version reads used by old workarounds - Remove old workaround for Simics - Fix indentation Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Display meaningful part of filename during BUG()Jan Beulich2-7/+17
When building in a separate objtree, file names produced by BUG() & Co. can get fairly long; printing only the first 50 characters may thus result in (almost) no useful information. The following change makes it so that rather the last 50 characters of the filename get printed. Signed-Off-By: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Reduce screen space needed by stack traceJan Beulich1-13/+12
Especially under Xen, where the console cannot be adjusted to more than 25 lines, it is fairly important that the information displayed during a panic is as compact as possible. Below adjustments work towards that. Signed-Off-By: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Fix get_cmos_time()Jan Beulich1-7/+4
Due to a broken condition, the body of the loop that is intended to wait for the Update-In-Progress bit to get set and then cleared again was never entered; in fact, the entire loop was optimized out by the compiler. Here is a change to fix the condition (and to also move the initialization of locals out of the spin lock protected region). Signed-Off-By: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: No need to export get_cmos_time anymoreAndi Kleen2-4/+1
It was only needed for APM Pointed out by Jan Beulich Cc: jbeulich@novell.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Remove unused AMD K8 C stepping flagAndi Kleen1-6/+0
X86_FEATURE_K8_C was a synthetic Linux CPUID flag that was used for some code optimizations in Opteron C stepping or later. But support for pre C stepping optimizations has been removed, so this isn't needed anymore. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: sparse warning cleanupsStephen Hemminger1-1/+1
Fix some trivial sparse warnings in x86_64 code. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Move NUMA page_to_pfn/pfn_to_page functions out of lineAndi Kleen1-0/+36
Saves about ~18K .text in defconfig There would be more optimization potential, but that's for later. Suggestion originally from Bill Irwin. Fix from Andy Whitcroft. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Remove unused segmentsAndi Kleen1-3/+2
They used to be used by the reboot code, but not anymore. Noticed by Jan Beulich Cc: JBeulich@novell.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: ioapic virtual wire mode fixVivek Goyal1-1/+2
o Currently, during kexec reboot, IOAPIC is re-programmed back to virtual wire mode if there was an i8259 connected to it. This enables getting timer interrupts in second kernel in legacy mode. o After putting into virtual wire mode, IOAPIC delivers the i8259 interrupts to CPU0. This works well for kexec but not for kdump as we might crash on a different CPU and second kernel will not see timer interrupts. o This patch modifies the redirection table entry to deliver the timer interrupts to the cpu we are rebooting (instead of hardcoding to zero). This ensures that second kernel receives timer interrupts even on a non-boot cpu. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Inclusion of ScaleMP vSMP architecture patches - vsmp_archRavikiran G Thirumalai3-0/+64
Introduce vSMP arch to the kernel. This patch: 1. Adds CONFIG_X86_VSMP 2. Adds machine specific macros for local_irq_disabled, local_irq_enabled and irqs_disabled 3. Writes to the vSMP CTL device to indicate kernel compiled with CONFIG_VSMP Signed-off-by: Ravikiran Thirumalai <kiran@scalemp.com> Signed-off-by: Shai Fultheim <shai@scalemp.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Memorize location of i8259 for reboots.Eric W. Biederman1-36/+108
Currently we attempt to restore virtual wire mode on reboot, which only works if we can figure out where the i8259 is connected. This is very useful when we are kexec another kernel and likely helpful to an peculiar BIOS that make assumptions about how the system is setup. Since the acpi MADT table does not provide the location where the i8259 is connected we have to look at the hardware to figure it out. Most systems have the i8259 connected the local apic of the cpu so won't be affected but people running Opteron and some serverworks chipsets should be able to use kexec now. In addition this patch removes the hard coded assumption that the io_apic that delivers isa interrups is always known to the kernel as io_apic 0. There does not appear to be anything to guarantee that assumption is true. And From: Vivek Goyal <vgoyal@in.ibm.com> A minor fix to the patch which remembers the location of where i8259 is connected. Now counter i has been replaced by apic. counter i is having some junk value which was leading to non-detection of i8259 connected to IOAPIC. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: allow setting RF in EFLAGSChuck Ebbert2-6/+12
Setting RF (resume flag) allows a debugger to resume execution after a code breakpoint without tripping the breakpoint again. It is reset by the CPU after executing one instruction. Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: "invalid operand" -> "invalid opcode"Chuck Ebbert1-1/+1
The manual says Int 6 is "invalid opcode", not "invalid operand". Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Sparse warnings fix.Luiz Fernando Capitulino1-2/+2
Fixes the following sparse warnings: arch/x86_64/kernel/mce_amd.c:321:29: warning: Using plain integer as NULL pointer arch/x86_64/kernel/mce_amd.c:410:41: warning: Using plain integer as NULL pointer Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Remove useless KDB vectorAndi Kleen4-17/+1
It was set as an NMI, but the NMI bit always forces an interrupt to end up at vector 2. So it was never used. Remove. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Tell user to enable GART_IOMMU when neededAndi Kleen1-2/+3
Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Fix warning in nmi.c on uniprocessor kernelsAndi Kleen1-0/+2
Fix CC arch/x86_64/kernel/nmi.o linux/arch/x86_64/kernel/nmi.c: In function ???check_nmi_watchdog???: linux/arch/x86_64/kernel/nmi.c:155: warning: statement with no effect on Uniprocessor builds. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Allocate PDAs in the local nodeRavikiran G Thirumalai3-1/+22
Patch uses a static PDA array early at boot and reallocates processor PDA with node local memory when kmalloc is ready, just before pda_init. The boot_cpu_pda is needed since the cpu_pda is used even before pda_init for that cpu is called (to set the static per-cpu areas offset table etc) Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Node local pda take 2 -- cpu_pda preparationRavikiran G Thirumalai7-18/+17
Helper patch to change cpu_pda users to use macros to access cpu_pda instead of the cpu_pda[] array. Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Early initialization of cpu_to_nodeRavikiran Thirumalai2-0/+27
Patch enables early intialization of cpu_to_node. apicid_to_node is built by reading the SRAT table, from acpi_numa_init with ACPI_NUMA and k8_scan_nodes with K8_NUMA. x86_cpu_to_apicid is built by parsing the ACPI MADT table, from acpi_boot_init. We combine these two tables and setup cpu_to_node. Early intialization helps the static per_cpu_areas in getting pages from correct node. Change since last release: Do not initialize early init_cpu_to_node for faking node cases. Patch tested on TYAN dual core 4P board with K8 only, ACPI_NUMA. Tested on EM64T NUMA. Also tested with numa=off, numa=fake, and running a kernel compiled with NUMA on a regular EM64 2 way SMP. Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Fix up white space in time.cAndi Kleen1-39/+32
No functional changes. And remove one redundant prototype. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Use standard __always_inline in vsyscall.cAndi Kleen1-6/+5
Replacing the old home brewn __force_inline. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: On Intel CPUs don't do an additional CPU sync before RDTSCAndi Kleen3-21/+9
RDTSC serialization using cpuid is not needed for Intel platforms. This increases gettimeofday performance. Cc: vojtech@suse.cz Cc: rohit.seth@intel.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Support alternative() in vsyscallsAndi Kleen1-2/+10
The real vsyscall .text addresses are not mapped when the alternative() replacement runs early, so use some black magic to access them using the direct mapping. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Don't try to synchronize the TSC over CPUs on Intel CPUs at ↵Andi Kleen2-3/+9
boot. They already do this in hardware and the Linux algorithm actually adds errors. Cc: mingo@elte.hu Cc: rohit.seth@intel.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: x86_64 write apic id fixVivek Goyal1-1/+1
o Apic id is in most significant 8 bits of APIC_ID register. Current code is trying to write apic id to least significant 8 bits. This patch fixes it. o This fix enables booting uni kdump capture kernel on a cpu with non-zero apic id. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Remove duplicate exportsBrian Gerst1-22/+0
Remove exports that are already exported from the object's source file. Signed-off-by: Brian Gerst <bgerst@didntduck.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: unexport pci_*_consistentBrian Gerst1-5/+0
These functions are inlines and shouldn't be exported. Signed-off-by: Brian Gerst <bgerst@didntduck.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Make it clear in machine checks that it's an hardware problemAndi Kleen1-0/+4
Hopefully the users will take the hint. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Clean up copy_*_userAndi Kleen1-221/+23
- Remove optimization for old B stepping Opteron - Make the fast path for copies with a multiple of eight length faster. - Minor instruction rearrangement to hopefully avoid a pipeline stall or two. - Add comment about errata to consider. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Use function pointers to call DMA mapping functionsMuli Ben-Yehuda8-451/+477
AK: I hacked Muli's original patch a lot and there were a lot of changes - all bugs are probably to blame on me now. There were also some changes in the fall back behaviour for swiotlb - in particular it doesn't try to use GFP_DMA now anymore. Also all DMA mapping operations use the same core dma_alloc_coherent code with proper fallbacks now. And various other changes and cleanups. Known problems: iommu=force swiotlb=force together breaks needs more testing. This patch cleans up x86_64's DMA mapping dispatching code. Right now we have three possible IOMMU types: AGP GART, swiotlb and nommu, and in the future we will also have Xen's x86_64 swiotlb and other HW IOMMUs for x86_64. In order to support all of them cleanly, this patch: - introduces a struct dma_mapping_ops with function pointers for each of the DMA mapping operations of gart (AMD HW IOMMU), swiotlb (software IOMMU) and nommu (no IOMMU). - gets rid of: if (swiotlb) return swiotlb_xxx(); - PCI_DMA_BUS_IS_PHYS is now checked against the dma_ops being set This makes swiotlb faster by avoiding double copying in some cases. Signed-Off-By: Muli Ben-Yehuda <mulix@mulix.org> Signed-Off-By: Jon D. Mason <jdmason@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Reject SRAT tables that don't cover all memoryAndi Kleen1-0/+33
Broken BIOS on Iwill 8way systems reports these and it causes the bootmem allocator to crash. Add a sanity check if all the PXMs in the SRAT table cover all memory as reported by e820. If the sanity check fails the SRAT is rejected and the code will fall back to discover the NUMA topology using the K8 northbridge registers when applicable. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Add idle notifiersAndi Kleen6-0/+60
This adds a new notifier chain that is called with IDLE_START when a CPU goes idle and IDLE_END when it goes out of idle. The context can be idle thread or interrupt context. Since we cannot rely on MONITOR/MWAIT existing the idle end check currently has to be done in all interrupt handlers. They were originally inspired by the similar s390 implementation. They have a variety of applications: - They will be needed for CONFIG_NO_IDLE_HZ - They can be used for oprofile to fix up the missing time in idle when performance counters don't tick. - They can be used for better C state management in ACPI - They could be used for microstate accounting. This is just infrastructure so far, no users. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Clean up some printks in NUMA codeAndi Kleen1-4/+4
Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Fix up coding style in numa.cAndi Kleen1-2/+2
No functional changes Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Fix off by one in IOMMU checkAndi Kleen4-4/+6
Fix off by one when checking if the machine has enougn memory to need IOMMU This caused the IOMMUs to be needlessly enabled for mem=4G Based on a patch from Jon Mason Signed-off-by: jdmason@us.ibm.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Handle missing local APIC timer interrupts on C3 stateVenkatesh Pallipadi2-2/+58
Whenever we see that a CPU is capable of C3 (during ACPI cstate init), we disable local APIC timer and switch to using a broadcast from external timer interrupt (IRQ 0). Patch below adds the code for x86_64. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] i386/x86-64: Remove sub jiffy profile timer supportVenkatesh Pallipadi1-51/+2
Remove the finer control of local APIC timer. We cannot provide a sub-jiffy control like this when we use broadcast from external timer in place of local APIC. Instead of removing this only on systems that may end up using broadcast from external timer (due to C3), I am going the "I'm feeling lucky" way to remove this fully. Basically, I am not sure about usefulness of this code today. Few other architectures also don't seem to support this today. If you are using profiling and fine grained control and don't like this going away in normal case, yell at me right now. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Report hardware breakpoints in user space when triggered by ↵John Blackwood1-4/+2
the kernel I would like to throw out a suggestion for a possible change in the way that the debug register traps are handled in do_debug() when the trap occurs in kernel-mode. In the x86_64 version of do_debug(), the code will skip around sending a SIGTRAP to the current task if the trap occurred while in kernel mode. On the i386-side of things, if the access happens to occur in kernel mode (say during a read(2) of user's buffer that matches the address of a debug register trap), then the do_debug() routine for i386 will go ahead and call send_sigtrap() and send the SIGTRAP signal. The send_sigtrap() code will also set the info.si_addr to NULL in this case (even though I don't understand why, since the SIGTRAP siginfo processing doesn't use the si_addr field...). So I would like to suggest that the x86_64 do_debug() routine also follow this type of behavior and have it go ahead and send the SIGTRAP signal to the current task, even if the debug register trap happens to have occurred in kernel mode. I have taken a stab at a patch for this change below. (It includes the i386-ish change for setting si_addr to NULL when the trap occurred in kernel mode.) It seems like a useful feature to be able to 'watch' a user location that might also be modified in the kernel via a system service call, and have the debugger report that information back to the user, rather than to just silently ignore the trap. Additionally, I realize that users that pull in a kernel debugger such as KGDB into their kernel might want to remove this change below when they add in KGDB support. However, they could alternatively look at the current task's thread.debugreg[] values to see if the trap occurred due to KGDB or instead because of a user-space debugger trap, and still honor the user SIGTRAP processing (instead of the KGDB breakpoint processing) if the trap matches up with the thread.debugreg[] registers. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Convert page fault error codes to symbolic constants.Andi Kleen1-17/+17
Much better to deal with these than with the magic numbers. And remove the comment describing the bits - kernel source is no replacement for an architecture manual. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Implement is_compat_task the right wayAndi Kleen3-0/+7
By setting a flag during a 32bit system call only Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Remove unnecessary case from the page fault handlerAndi Kleen1-4/+7
Don't need to do the vmalloc check for the module range because its PML4 is shared with the kernel text. Also removed an unnecessary TLB flush. Pointed out by Jan Beulich Cc: jbeulich@novell.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Align and pad x86_64 GDT on page boundaryRavikiran G Thirumalai4-10/+16
This patch is on the same lines as Zachary Amsden's i386 GDT page alignemnt patch in -mm, but for x86_64. Patch to align and pad x86_64 GDT on page boundries. [AK: some minor cleanups and fixed incorrect TLS initialization in CPU init.] Signed-off-by: Nippun Goel <nippung@calsoftinc.com> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Allow compilation on a 32bit biarch toolchainAndi Kleen5-0/+8
This might help on distributions that use a 32bit biarch compiler. First pass -m64 by default. Secondly add some more .code32s because at least the Ubuntu biarch 32bit as called by gcc doesn't seem to handle -m64 -m32 as generated by the Makefile without such assistance. And finally make sure the linker script can be preprocessed with a 32bit cpp. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Make udelay more accurateRoss Biro1-1/+1
The attempt to avoid overflow in __delay caused varying precision on different CPUs depending on differences in the CPU speed. We should be able to do this multiplication with out overflowing provided the cpu is running at less than about 128 GHz. xloops < 20000 * 0x10c6. loops_per_jiffy * HZ <= cpu_clock_speed. So if the cpu clock speed < 2^64/(20000 * 0x10c6) = 2^64/ 51E6CC0 < 2^64/2^27 = 2^37 = 128G we will not overflow the calculation. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Return -1 for unknown PCI bus affinityAndi Kleen1-3/+4
When we don't know the node a PCI bus is connected to return -1. This matches the generic code. Noticed by Ravikiran G Thirumalai <kiran@scalex86.org> Cc: Ravikiran G Thirumalai <kiran@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Validate SLIT tableAndi Kleen1-0/+27
A lot of Opteron BIOS just pass 10 in all SLIT entries (10 is the normalized unit). This is actually worse than the default heuristic because it leads to pci_distance not knowing the difference between local and remote nodes anymore. This messes up some NUMA heuristics in generic code. In this case it's better to fall back to the default heuristic which just does nodea == nodeb ? 10 : 20. This patch does some basic sanity checking on the SLIT and only accepts the SLIT when it passes. Invariants enforced are: - Node to itself shall be 10 - Any other distance shouldn't be 10 - Distances smaller than 10 are illegal Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Generalize DMI and enable for x86-64Andi Kleen3-1/+17
Some people need it now on 64bit so reuse the i386 code for x86-64. This will be also useful for future bug workarounds. It is a bit simplified there because there is no need to do it very early on x86-64. This means it doesn't need early ioremap et.al. We run it as a core initcall right now. I hope it's not needed for early setup. I added a general CONFIG_DMI symbol in case IA64 or someone else wants to reuse the code later too. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Remove bogus file in arch/x86_64/pciAndi Kleen1-22/+0
This was a backup file that somehow made it into the official tree. Never used for anything. Remove. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Add missing newline in IOMMU error messageAndi Kleen1-1/+1
Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: fix page fault from show_trace()Jan Beulich1-7/+5
The introduction of call_softirq switching to the interrupt stack several releases earlier resulted in a problem with the code in show_trace, which assumes that it can pick the previous stack pointer from the end of the interrupt stack. Cc: Andi Kleen <ak@muc.de> Cc: Arjan van de Ven <arjanv@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: fix single step handling for 32bit processesPeter Beutner1-19/+7
Be more careful with TF handling to fix some copy protection codes in wine patch originally for i386 by Linus, then ported to x86_64 by Andi Kleen see: [PATCH] x86_64: Some fixes for single step handling commit: be61bff789fe44bfb6d9282d8f7eccc860bdcfb6 But it was never applied to the ia32 emulation code which breaks some copy-protection schemes under wine when running on x86_64. Signed-off-by: Peter Beutner <p.beutner@gmx.net> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] i386/x86-64: Don't IPI to offline cpus on shutdownEric W. Biederman1-4/+6
So why are we calling smp_send_stop from machine_halt? We don't. Looking more closely at the bug report the problem here is that halt -p is called which triggers not a halt but an attempt to power off. machine_power_off calls machine_shutdown which calls smp_send_stop. If pm_power_off is set we should never make it out machine_power_off to the call of do_exit. So pm_power_off must not be set in this case. When pm_power_off is not set we expect machine_power_off to devolve into machine_halt. So how do we fix this? Playing too much with smp_send_stop is dangerous because it must also be safe to be called from panic. It looks like the obviously correct fix is to only call machine_shutdown when pm_power_off is defined. Doing that will make Andi's assumption about not scheduling true and generally simplify what must be supported. This turns machine_power_off into a noop like machine_halt when pm_power_off is not defined. If the expected behavior is that sys_reboot(LINUX_REBOOT_CMD_POWER_OFF) becomes sys_reboot(LINUX_REBOOT_CMD_HALT) if pm_power_off is NULL this is not quite a comprehensive fix as we pass a different parameter to the reboot notifier and we set system_state to a different value before calling device_shutdown(). Unfortunately any fix more comprehensive I can think of is not obviously correct. The core problem is that there is no architecture independent way to detect if machine_power will become a noop, without calling it. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64/i386: Remove preempt disable calls in lowlevel IPIZwane Mwaikambo2-10/+5
I noticed that some lowlevel send_IPI_mask helpers had a hotplug/preempt race whereupon the cpu_online_map was read before disabling preemption; ... cpumask_t mask = cpu_online_map; int cpu = get_cpu(); cpu_clear(cpu, mask); ... But then i realised that there is no need for these lowlevel functions to be going through all this trouble when all the callers are already made hotplug/preempt safe. Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: increase MCE bank countsShaohua Li1-11/+14
There is one CPU here whose MCE bank count is 6. This patch increases x86_64's MCE bank count. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: another mb() for smpboot.cBenjamin LaHaise1-0/+1
The following is probably a good idea given that the atomic_set() isn't a barrier here either. Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Move int 3 handler to debug stack and allow to increase it.Jan Beulich4-11/+76
This - switches the INT3 handler to run on an IST stack (to cope with breakpoints set by a kernel debugger on places where the kernel's %gs base hasn't been set up, yet); the IST stack used is shared with the INT1 handler's [AK: this also allows setting a kprobe on the interrupt/exception entry points] - allows nesting of INT1/INT3 handlers so that one can, with a kernel debugger, debug (at least) the user-mode portions of the INT1/INT3 handling; the nesting isn't actively enabled here since a kernel- debugger-free kernel doesn't need it Signed-Off-By: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Don't confuse apic=... command line option with apicAndi Kleen1-1/+3
Previously apic was foced with apic=logopt was specified. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Dont't disable early PCI scan with apicAndi Kleen1-3/+0
It might be still needed for non APIC related issues. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] i386/x86-64: Update AMD CPUID flagsAndi Kleen1-6/+9
Print bits for RDTSCP, SVM, CR8-LEGACY. Also now print power flags on i386 like x86-64 always did. This will add a new line in the 386 cpuinfo, but that shouldn't be an issue - did that in the past too and I haven't heard of any breakage. I shrunk some of the fields in the i386 cpuinfo_x86 to chars to make up for the new int "x86_power" field. Overall it's smaller than before. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] i386/x86-64: Generalize X86_FEATURE_CONSTANT_TSC flagAndi Kleen1-2/+3
Define it for i386 too. This is a synthetic flag that signifies that the CPU's TSC runs at a constant P state invariant frequency. Fix up the logic on x86-64/i386 to set it on all known CPUs. Use the AMD defined bit to set it on future AMD CPUs. Cc: venkatesh.pallipadi@intel.com Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Remove enable/disable_hltAndi Kleen1-30/+9
Was only used by the floppy driver to work around some ancient hardware bug that should never occur on any 64bit system. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Don't reserve hotplug CPUs by defaultAndi Kleen1-7/+4
Most users don't need it so no need to waste memory. This means an user has to specify the appropiate number of hotplug CPUs on the command line with additional_cpus=... or fix their BIOS to follow the convention in Documentation/x86-64/cpu-hotplug-spec Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: No need to remove NT during CPU setupAndi Kleen1-7/+0
head.S already clears EFLAGS completely. Following an i386 patch from Zachary Amsden. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Adjust page fault handlingJan Beulich1-3/+4
Adjust page fault protection error check before considering it to be a vmalloc synchronization candidate. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Remove unprotected iretJan Beulich1-1/+1
Make sure no iret can fault without attached recovery code. Cannot happen in the normal case, but might be useful with kernel debuggers Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Clean up double fault handlingJan Beulich1-1/+17
Since a double fault always implies that kernel data structures are corrupt, this fault should neither be handed to user mode handling, nor should the handler allow resuming the faulting code stream (since architecturally this isn't a fault, but an abort). Note that this slightly depends on the previously submitted patch adjusting the prototype of notify_die() (a compiler warning will result without that other patch). AK: Removed obsolete CONFIG_CHECKING code, added comments Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: make trap information available to die notification handlersJan Beulich3-16/+27
This adjusts things so that handlers of the die() notifier will have sufficient information about the trap currently being handled. It also adjusts the notify_die() prototype to (again) match that of i386. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Removing unused function die_if_kernel().Jan Beulich1-5/+0
Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: fix bound check IDT gateJan Beulich1-2/+2
Other than apparently commonly assumed, the bound instruction does not require the corresponding IDT entry to have DPL 3. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Separate CONFIG_UNWIND_INFO from CONFIG_DEBUG_INFOJan Beulich2-2/+4
As a follow-up to the introduction of CONFIG_UNWIND_INFO, this separates the generation of frame unwind information for x86-64 from that of full debug information. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Support constant TSC feature in future AMD CPUs.Andi Kleen1-0/+6
Based on the documentation recently posted by Richard Brunner. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: More CFI fixes for 32bit entry codeJan Beulich1-5/+28
Frame unwind information was still incorrect for ia32_ptregs_common (sorry, my fault), and could be improved for some of the other entry points. Signed-Off-By: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] x86_64: Update defconfigAndi Kleen1-47/+69
Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] capable/capability.h (arch/)Randy Dunlap2-0/+2
arch: Use <linux/capability.h> where capable() is used. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11[PATCH] kprobes: fix race in recovery of reentrant probeKeshavamurthy Anil S1-0/+9
There is a window where a probe gets removed right after the probe is hit on some different cpu. In this case probe handlers can't find a matching probe instance related to break address. In this case we need to read the original instruction at break address to see if that is not a break/int3 instruction and recover safely. Previous code had a bug where we were not checking for the above race in case of reentrant probes and the below patch fixes this race. Tested on IA64, Powerpc, x86_64. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10Merge ssh://master.kernel.org/pub/scm/linux/kernel/git/sam/kbuildLinus Torvalds3-41/+6
Fix up some trivial conflicts in {i386|ia64}/Makefile
2006-01-10[PATCH] kprobes: fix build breakageAnanth N Mavinakayanahalli1-3/+3
The following patch (against 2.6.15-rc5-mm3) fixes a kprobes build break due to changes introduced in the kprobe locking in 2.6.15-rc5-mm3. In addition, the patch reverts back the open-coding of kprobe_mutex. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Acked-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10[PATCH] kprobes: arch_remove_kprobeAnil S Keshavamurthy1-1/+3
Currently arch_remove_kprobes() is only implemented/required for x86_64 and powerpc. All other architecture like IA64, i386 and sparc64 implementes a dummy function which is being called from arch independent kprobes.c file. This patch removes the dummy functions and replaces it with #define arch_remove_kprobe(p, s) do { } while(0) Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10[PATCH] kprobes-changed-from-using-spinlock-to-mutex fixKeshavamurthy Anil S1-2/+2
Based on some feedback from Oleg Nesterov, I have made few changes to previously posted patch. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10[PATCH] kprobes: changed from using spinlock to mutexAnil S Keshavamurthy1-5/+2
Since Kprobes runtime exception handlers is now lock free as this code path is now using RCU to walk through the list, there is no need for the register/unregister{_kprobe} to use spin_{lock/unlock}_isr{save/restore}. The serialization during registration/unregistration is now possible using just a mutex. In the above process, this patch also fixes a minor memory leak for x86_64 and powerpc. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10[PATCH] don't include ioctl32.h in driversChristoph Hellwig1-1/+0
These days ioctl32.h is only used for communication of fs/compat.c and fs/compat_ioctl.c and doesn't contain anything of interest to drivers. Remove inclusion in various drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10[PATCH] sanitize building of fs/compat_ioctl.cChristoph Hellwig2-35/+1
Now that all these entries in the arch ioctl32.c files are gone [1], we can build fs/compat_ioctl.c as a normal object and kill tons of cruft. We need a special do_ioctl32_pointer handler for s390 so the compat_ptr call is done. This is not needed but harmless on all other architectures. Also remove some superflous includes in fs/compat_ioctl.c Tested on ppc64. [1] parisc still had it's PPP handler left, which is not fully correct for ppp and besides that ppp uses the generic SIOCPRIV ioctl so it'd kick in for all netdevice users. We can introduce a proper handler in one of the next patch series by adding a compat_ioctl method to struct net_device but for now let's just kill it - parisc doesn't compile in mainline anyway and I don't want this to block this patchset. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <willy@debian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10[PATCH] move rtc compat ioctl handling to fs/compat_ioctl.cChristoph Hellwig1-47/+0
This patch implements generic handling of RTC_IRQP_READ32, RTC_IRQP_SET32, RTC_EPOCH_READ32 and RTC_EPOCH_SET32 in fs/compat_ioctl.c. It's based on the x86_64 code which needed a little massaging to be endian-clean. parisc used COMPAT_IOCTL or generic w_long handlers for these whichce is wrong and can't work because the ioctls encode sizeof(unsigned long) in their ioctl number. parisc also duplicated COMPAT_IOCTL entries for other rtc ioctls which I remove in this patch, too. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Matthew Wilcox <matthew@wil.cx> Acked-by: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10[PATCH] common compat_sys_timer_createChristoph Hellwig2-20/+1
The comment in compat.c is wrong, every architecture provides a get_compat_sigevent() for the IPC compat code already. This basically moves the x86_64 version to common code and removes all the others. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Paul Mackerras <paulus@samba.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10[PATCH] kexec: change CONFIG_PHYSICAL_START dependencyManeesh Soni1-12/+19
I have heard some complaints about people not finding CONFIG_CRASH_DUMP option and also some objections about its dependency on CONFIG_EMBEDDED. The following patch ends that dependency. I thought of hiding it under CONFIG_KEXEC, but CONFIG_PHYSICAL_START could also be used for some reasons other than kexec/kdump and hence left it visible. I will also update the documentation accordingly. o Following patch removes the config dependency of CONFIG_PHYSICAL_START on CONFIG_EMBEDDED. The reason being CONFIG_CRASH_DUMP option for kdump needs CONFIG_PHYSICAL_START which makes CONFIG_CRASH_DUMP depend on CONFIG_EMBEDDED. It is not always obvious for kdump users to choose CONFIG_EMBEDDED. o It also shifts the palce where this option appears, to make it closer to kexec and kdump options. Signed-off-by: Maneesh Soni <maneesh@in.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Haren Myneni <haren@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10[PATCH] kdump: read previous kernel's memoryVivek Goyal2-0/+48
- Moving the crash_dump.c file to arch dependent part as kmap_atomic_pfn is specific to i386 and highmem may not exist in other archs. - Use ioremap for x86_64 to map the previous kernel memory. - In copy_oldmem_page(), we now directly copy to the user/kernel buffer and avoid the unneccesary copy to a kmalloc'd page. Signed-off-by: Rachita Kothiyal <rachita@in.ibm.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10[PATCH] kdump: x86_64 save cpu registers upon crashVivek Goyal2-0/+77
- Saving the cpu registers of all cpus before booting in to the crash kernel. - crash_setup_regs will save the registers of the cpu on which panic has occured. One of the concerns ppc64 folks raised is that after capturing the register states, one should not pop the current call frame and push new one. Hence it has been inlined. More call frames later get pushed on to stack (machine_crash_shutdown() and machine_kexec()), but one will not want to backtrace those. - Not very sure about the CFI annotations. With this patch I am getting decent backtrace with gdb. Assuming, compiler has generated enough debugging information for crash_kexec(). Coding crash_setup_regs() in pure assembly makes it tricky because then it can not be inlined and we don't want to return back after capturing register states we don't want to pop this call frame. - Saving the non-panicing cpus registers will be done in the NMI handler while shooting down them in machine_crash_shutdown. - Introducing CRASH_DUMP option in Kconfig for x86_64. Signed-off-by: Murali M Chakravarthy <muralim@in.ibm.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Cc: Andi Kleen <ak@muc.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10[PATCH] kdump: x86_64 kexec on panicakpm@osdl.org1-1/+85
) From: Vivek Goyal <vgoyal@in.ibm.com> - Implementing the machine_crash_shutdown for x86_64 which will be called by crash_kexec (called in case of a panic, sysrq etc.). Here we do things similar to i386. Disable the interrupts, shootdown the cpus and shutdown LAPIC and IOAPIC. Changes in this version: - As the Eric's APIC initialization patches are reverted back, reintroducing LAPIC and IOAPIC shutdown. - Added some comments on CPU hotplug, modified code as suggested by Andi kleen. Signed-off-by: Murali M Chakravarthy <muralim@in.ibm.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10[PATCH] kdump: x86_64: add elfcorehdr command line optionVivek Goyal1-0/+9
- elfcorehdr= specifies the location of elf core header stored by the crashed kernel. This command line option will be passed by the kexec-tools to capture kernel. Changes in this version : - Added more comments in kernel-parameters.txt and in code. Signed-off-by: Murali M Chakravarthy <muralim@in.ibm.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10[PATCH] kdump: x86_64: add memmmap command line optionakpm@osdl.org2-0/+48
) From: Vivek Goyal <vgoyal@in.ibm.com> - This patch introduces the memmap option for x86_64 similar to i386. - memmap=exactmap enables setting of an exact E820 memory map, as specified by the user. Changes in this version: - Used e820_end_of_ram() to find the max_pfn as suggested by Andi kleen. - removed PFN_UP & PFN_DOWN macros - Printing the user defined map also. Signed-off-by: Murali M Chakravarthy <muralim@in.ibm.com> Signed-off-by: Hariprasad Nellitheertha <nharipra@gmail.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10[PATCH] kdump: dynamic per cpu allocation of memory for saving cpu registersVivek Goyal1-2/+0
- In case of system crash, current state of cpu registers is saved in memory in elf note format. So far memory for storing elf notes was being allocated statically for NR_CPUS. - This patch introduces dynamic allocation of memory for storing elf notes. It uses alloc_percpu() interface. This should lead to better memory usage. - Introduced based on Andi Kleen's and Eric W. Biederman's suggestions. - This patch also moves memory allocation for elf notes from architecture dependent portion to architecture independent portion. Now crash_notes is architecture independent. The whole idea is that size of memory to be allocated per cpu (MAX_NOTE_BYTES) can be architecture dependent and allocation of this memory can be architecture independent. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-09[CRYPTO] Allow AES C/ASM implementations to coexistHerbert Xu1-0/+2
As the Crypto API now allows multiple implementations to be registered for the same algorithm, we no longer have to play tricks with Kconfig to select the right AES implementation. This patch sets the driver name and priority for all the AES implementations and removes the Kconfig conditions on the C implementation for AES. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-01-09[CRYPTO] Use standard byte order macros wherever possibleHerbert Xu1-12/+11
A lot of crypto code needs to read/write a 32-bit/64-bit words in a specific gender. Many of them open code them by reading/writing one byte at a time. This patch converts all the applicable usages over to use the standard byte order macros. This is based on a previous patch by Denis Vlasenko. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-01-09kbuild: drop vmlinux dependency from "make install"H. Peter Anvin3-41/+6
This removes the dependency from vmlinux to install, thus avoiding the current situation where "make install" has a nasty tendency to leave root-turds in the working directory. It also updates x86-64 to be in sync with i386. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2006-01-08[PATCH] tiny: Make *[ug]id16 support optionalMatt Mackall1-5/+0
Configurable 16-bit UID and friends support This allows turning off the legacy 16 bit UID interfaces on embedded platforms. text data bss dec hex filename 3330172 529036 190556 4049764 3dcb64 vmlinux-baseline 3328268 529040 190556 4047864 3dc3f8 vmlinux From: Adrian Bunk <bunk@stusta.de> UID16 was accidentially disabled for !EMBEDDED. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] Split out screen_info from tty.hBrian Gerst2-40/+1
This makes it possible for boot code to use screen_info without dragging in all of tty.h. Signed-off-by: Brian Gerst <bgerst@didntduck.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] use ptrace_get_task_struct in various placesChristoph Hellwig1-33/+11
The ptrace_get_task_struct() helper that I added as part of the ptrace consolidation is useful in variety of places that currently opencode it. Switch them to the common helpers. Add a ptrace_traceme() helper that needs to be explicitly called, and simplify the ptrace_get_task_struct() interface. We don't need the request argument now, and we return the task_struct directly, using ERR_PTR() for error returns. It's a bit more code in the callers, but we have two sane routines that do one thing well now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] move rtc_interrupt() prototype to rtc.hAdrian Bunk1-2/+0
This patch moves the rtc_interrupt() prototype to rtc.h and removes the prototypes from C files. It also renames static rtc_interrupt() functions in arch/arm/mach-integrator/time.c and arch/sh64/kernel/time.c to avoid compile problems. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Paul Gortmaker <p_gortmaker@yahoo.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] Change maxaligned_in_smp alignemnt macros to internodealigned_in_smp ↵Ravikiran G Thirumalai1-1/+1
macros ____cacheline_maxaligned_in_smp is currently used to align critical structures and avoid false sharing. It uses per-arch L1_CACHE_SHIFT_MAX and people find L1_CACHE_SHIFT_MAX useless. However, we have been using ____cacheline_maxaligned_in_smp to align structures on the internode cacheline size. As per Andi's suggestion, following patch kills ____cacheline_maxaligned_in_smp and introduces INTERNODE_CACHE_SHIFT, which defaults to L1_CACHE_SHIFT for all arches. Arches needing L3/Internode cacheline alignment can define INTERNODE_CACHE_SHIFT in the arch asm/cache.h. Patch replaces ____cacheline_maxaligned_in_smp with ____cacheline_internodealigned_in_smp With this patch, L1_CACHE_SHIFT_MAX can be killed Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08[PATCH] Swap Migration V5: sys_migrate_pages interfaceChristoph Lameter1-0/+1
sys_migrate_pages implementation using swap based page migration This is the original API proposed by Ray Bryant in his posts during the first half of 2005 on linux-mm@kvack.org and linux-kernel@vger.kernel.org. The intent of sys_migrate is to migrate memory of a process. A process may have migrated to another node. Memory was allocated optimally for the prior context. sys_migrate_pages allows to shift the memory to the new node. sys_migrate_pages is also useful if the processes available memory nodes have changed through cpuset operations to manually move the processes memory. Paul Jackson is working on an automated mechanism that will allow an automatic migration if the cpuset of a process is changed. However, a user may decide to manually control the migration. This implementation is put into the policy layer since it uses concepts and functions that are also needed for mbind and friends. The patch also provides a do_migrate_pages function that may be useful for cpusets to automatically move memory. sys_migrate_pages does not modify policies in contrast to Ray's implementation. The current code here is based on the swap based page migration capability and thus is not able to preserve the physical layout relative to it containing nodeset (which may be a cpuset). When direct page migration becomes available then the implementation needs to be changed to do a isomorphic move of pages between different nodesets. The current implementation simply evicts all pages in source nodeset that are not in the target nodeset. Patch supports ia64, i386 and x86_64. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-07Pull pnpacpi into acpica branchLen Brown17-33/+151
2006-01-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuildLinus Torvalds1-1/+0
2006-01-06gitignore: ignore shared objectsBrian Gerst1-1/+0
Many arches make shared objects for VDSOs. Generally exclude them. Signed-off-by: Brian Gerst <bgerst@didntduck.org> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2006-01-06[PATCH] cpu hotplug/x86_64: disable interrupt in play_deadShaohua Li1-2/+3
With physical CPU hotplug, the CPU is hot removed and it should not receive any interrupts. Disabling interrupt is much safer. This basically is what we do in ia64 & x86. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06[PATCH] x86/x86_64: mark rodata section read-only: make some datastructures ↵Arjan van de Ven2-2/+2
const Mark some key kernel datastructures readonly. This patch was previously posted on Jun 28th but was back then not merged because nothing was enforcing rodata anyway.. well that changed now :) Patch by Christoph Lameter <christoph@lameter.com> and Dave Jones <davej@redhat.com> Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06[PATCH] x86/x86_64: mark rodata section read-only: x86-64 supportArjan van de Ven2-0/+33
x86-64 specific parts to make the .rodata section read only Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06[PATCH] x86/x86_64: mark rodata section read only: generic x86-64 bugfixArjan van de Ven1-2/+7
Bug fix required for the .rodata work on x86-64: when change_page_attr() and friends need to break up a 2Mb page into 4Kb pages, it always set the NX bit on the PMD, which causes the cpu to consider the entire 2Mb region to be NX regardless of the actual PTE perms. This is fine in general, with one big exception: the 2Mb page that covers the last part of the kernel .text! The fix is to not invent a new permission for the new PMD entry, but to just inherit the existing one minus the PSE bit. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuildLinus Torvalds3-0/+5
2006-01-04Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreqLinus Torvalds1-1/+5
2006-01-01gitignore: x86_64 filesBrian Gerst3-0/+5
Add filters for x86_64 generated files. Signed-off-by: Brian Gerst <bgerst@didntduck.org> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2005-12-29[PATCH] x86_64: Fix incorrect node_present_pages on NUMARavikiran G Thirumalai1-1/+1
Currently, we do not pass the correct start_pfn to e820_hole_size, to calculate holes. Following patch fixes that. The bug results in incorrect number of node_present_pages for each pgdat and causes ugly output in /sys and probably VM inbalances. Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Sighed-off-by: Shair Fultheim <shai@scalex86.org> Sighed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-12-28[ACPI] acpi_register_gsi() fix needed for ACPICA 20051021Len Brown1-1/+1
Use the #define for ACPI_LEVEL_SENSITIVE instead of assuming non-zero, because ACPICA 20051021 changes its value to zero. Also, use uniform variable names: edge_level -> triggering active_high_low -> polarity Signed-off-by: Len Brown <len.brown@intel.com>