- 13 Feb, 2007 29 commits
-
-
Zachary Amsden authored
Kprobes bugfix for paravirt compatibility - RPL on the CS when inserting BPs must match running kernel. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andi Kleen <ak@suse.de> CC: Eric Biederman <ebiederm@xmission.com>
-
Zachary Amsden authored
Profile_pc was broken when using paravirtualization because the assumption the kernel was running at CPL 0 was violated, causing bad logic to read a random value off the stack. The only way to be in kernel lock functions is to be in kernel code, so validate that assumption explicitly by checking the CS value. We don't want to be fooled by BIOS / APM segments and try to read those stacks, so only match KERNEL_CS. I moved some stuff in segment.h to make it prettier. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andi Kleen <ak@suse.de>
-
Zachary Amsden authored
VMI timer code. It works by taking over the local APIC clock when APIC is configured, which requires a couple hooks into the APIC code. The backend timer code could be commonized into the timer infrastructure, but there are some pieces missing (stolen time, in particular), and the exact semantics of when to do accounting for NO_IDLE need to be shared between different hypervisors as well. So for now, VMI timer is a separate module. [Adrian Bunk: cleanups] Subject: VMI timer patches Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Zachary Amsden authored
Fairly straightforward implementation of VMI backend for paravirt-ops. [Adrian Bunk: some cleanups] Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Zachary Amsden authored
Add VMI SMP boot hook. We emulate a regular boot sequence and use the same APIC IPI initiation, we just poke magic values to load into the CPU state when the startup IPI is received, rather than having to jump through a real mode trampoline. This is all that was needed to get SMP to work. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Zachary Amsden authored
I found a clever way to make the extra IOPL switching invisible to non-paravirt compiles - since kernel_rpl is statically defined to be zero there, and only non-zero rpl kernel have a problem restoring IOPL, as popf does not restore IOPL flags unless run at CPL-0. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Zachary Amsden authored
The VMI ROM has a mode where hypercalls can be queued and batched. This turns out to be a significant win during context switch, but must be done at a specific point before side effects to CPU state are visible to subsequent instructions. This is similar to the MMU batching hooks already provided. The same hooks could be used by the Xen backend to implement a context switch multicall. To explain a bit more about lazy modes in the paravirt patches, basically, the idea is that only one of lazy CPU or MMU mode can be active at any given time. Lazy MMU mode is similar to this lazy CPU mode, and allows for batching of multiple PTE updates (say, inside a remap loop), but to avoid keeping some kind of state machine about when to flush cpu or mmu updates, we just allow one or the other to be active. Although there is no real reason a more comprehensive scheme could not be implemented, there is also no demonstrated need for this extra complexity. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Zachary Amsden authored
The VMI backend uses explicit page type notification to track shadow page tables. The allocation of page table roots is especially tricky. We need to clone the root for non-PAE mode while it is protected under the pgd lock to correctly copy the shadow. We don't need to allocate pgds in PAE mode, (PDPs in Intel terminology) as they only have 4 entries, and are cached entirely by the processor, which makes shadowing them rather simple. For base page table level allocation, pmd_populate provides the exact hook point we need. Also, we need to allocate pages when splitting a large page, and we must release pages before returning the page to any free pool. Despite being required with these slightly odd semantics for VMI, Xen also uses these hooks to determine the exact moment when page tables are created or released. AK: All nops for other architectures Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Adrian Bunk authored
Every file should #include the headers containing the prototypes for its global functions. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de>
-
Catalin Marinas authored
It makes more sense to end the stack trace with ULONG_MAX only if nr_entries < max_entries. Otherwise, we lose one entry in the long stack traces and cannot know whether the trace was complete or not. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Karsten Weiss authored
- add SWIOTLB config help text - mention Documentation/x86_64/boot-options.txt in Documentation/kernel-parameters.txt - remove the duplication of the iommu kernel parameter documentation. - Better explanation of some of the iommu kernel parameter options. - "32MB<<order" instead of "32MB^order". - Mention the default "order" value. - list the four existing PCI-DMA mapping implementations of arch x86_64 - group the iommu= option keywords by PCI-DMA mapping implementation. - Distinguish iommu= option keywords from number arguments. - Explain the meaning of DAC and SAC. Signed-off-by: Karsten Weiss <knweiss@science-computing.de> Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Eric Dumazet authored
ARCH_HAVE_XTIME_LOCK is used by x86_64 arch . This arch needs to place a read only copy of xtime_lock into vsyscall page. This read only copy is named __xtime_lock, and xtime_lock is defined in arch/x86_64/kernel/vmlinux.lds.S as an alias. So the declaration of xtime_lock in kernel/timer.c was guarded by ARCH_HAVE_XTIME_LOCK define, defined to true on x86_64. We can get same result with _attribute__((weak)) in the declaration. linker should do the job. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
OGAWA Hirofumi authored
This is just cleanup. It moves to e820 check into pci_mmcfg_reject_broken(). Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
-
OGAWA Hirofumi authored
Currently, unreachable_devices() compares value of mmconfig and value of conf1. But it doesn't check the device is reachable or not. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
-
OGAWA Hirofumi authored
This just cleans up. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
-
OGAWA Hirofumi authored
MMCONFIG_APER_XXX is unneeded in arch/x86_64/pci/mmconfig.c. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
-
OGAWA Hirofumi authored
This rejects broken MCFG tables on Asus. When the table looks bogus just disable mmconfig Arjan and Andi suggested this. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
-
OGAWA Hirofumi authored
Current mmconfig has some problems of remapped range. a) In the case of broken MCFG tables on Asus etc., we need to remap 256M range, but currently only remap 1M. b) The base address always corresponds to bus number 0, but currently we are assuming it corresponds to start bus number. This patch fixes the above problems. (akpm: Arjan suggests that if the MCFG table is broken we just shouldn't use it, rather than try to work around things). Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Olivier Galibert authored
Put back the resource reservation as per 4c6e052a but use it *only* when the range(s) come from a chipset probe instead of the bios. Signed-off-by: Olivier Galibert <galibert@pobox.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Olivier Galibert authored
It seems that the only way to reliably support mmconfig in the presence of funky biosen is to detect the hostbridge and read where the window is mapped from its registers. Do that for the E7520 and the 945G/GZ/P/PL for a start. Signed-off-by: Olivier Galibert <galibert@pobox.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Olivier Galibert authored
unreachable_devices compares between the results of pci configuration accesses through type1 and mmconfig, so it should be called only if type1 actually works in the first place. Signed-off-by: Olivier Galibert <galibert@pobox.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Olivier Galibert authored
i386 and x86-64 pci mmconfig code have a lot in common. So share what's shareable between the two. Signed-off-by: Olivier Galibert <galibert@pobox.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Maciej W. Rozycki authored
The "fasteoi" IRQ handler is named "fasteio" incorrectly. This is a fix. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Jeremy Fitzhardinge authored
Convert the PDA code to use %fs rather than %gs as the segment for per-processor data. This is because some processors show a small but measurable performance gain for reloading a NULL segment selector (as %fs generally is in user-space) versus a non-NULL one (as %gs generally is). On modern processors the difference is very small, perhaps undetectable. Some old AMD "K6 3D+" processors are noticably slower when %fs is used rather than %gs; I have no idea why this might be, but I think they're sufficiently rare that it doesn't matter much. This patch also fixes the math emulator, which had not been adjusted to match the changed struct pt_regs. [frederik.deweerdt@gmail.com: fixit with gdb] [mingo@elte.hu: Fix KVM too] Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Ian Campbell <Ian.Campbell@XenSource.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Zachary Amsden <zach@vmware.com> Cc: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Amul Shah authored
- Removed an extraneous debug message from allocate_cachealigned_map - Changed extract_lsb_from_nodes to return 63 for the case where there was only one memory node. The prevents the creation of the dynamic hashmap. - Changed extract_lsb_from_nodes to use only the starting memory address of a node. On an ES7000, our nodes overlap the starting and ending address, meaning, that we see nodes like 00000 - 10000 10000 - 20000 But other systems have nodes whose start and end addresses do not overlap. For example: 00000 - 0FFFF 10000 - 1FFFF In this case, using the ending address will result in an LSB much lower than what is possible. In this case an LSB of 1 when in reality it should be 16. Cc: Andi Kleen <ak@suse.de> Cc: Rohit Seth <rohitseth@google.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de>
-
Amul Shah authored
Remove the statically allocated memory to NUMA node hash map in favor of a dynamically allocated memory to node hash map (it is cache aligned). This patch has the nice side effect in that it allows the hash map to grow for systems with large amounts of memory (256GB - 1TB), but suffer from having small PCI space tacked onto the boot node (which is somewhere between 192MB to 512MB on the ES7000). Signed-off-by: Amul Shah <amul.shah@unisys.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Rohit Seth <rohitseth@google.com> Signed-off-by: Andrew Morton <akpm@osdl.org>
-
Andi Kleen authored
This does user copies in fs write() into the page cache with write combining. This pushes the destination out of the CPU's cache, but allows higher bandwidth in some case. The theory is that the page cache data is usually not touched by the CPU again and it's better to not pollute the cache with it. Also it is a little faster. Signed-off-by: Andi Kleen <ak@suse.de>
-
Andi Kleen authored
Signed-off-by: Andi Kleen <ak@suse.de>
-
Andi Kleen authored
Signed-off-by: Andi Kleen <ak@suse.de>
-
- 12 Feb, 2007 11 commits
-
-
Linus Torvalds authored
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC]: Re-export saved_command_line to modules. [SPARC64]: Increase command line size to 2048 like other arches. [SPARC64]: We do not need ZONE_DMA.
-
Linus Torvalds authored
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (25 commits) [XFRM]: Fix OOPSes in xfrm_audit_log(). [TCP]: cleanup of htcp (resend) [TCP]: Use read mostly for CUBIC parameters. [NETFILTER]: nf_conntrack_tcp: make sysctl variables static [NETFILTER]: ip6t_mh: drop piggyback payload packet on MH packets [NETFILTER]: Fix whitespace errors [NETFILTER]: Kconfig: improve dependency handling [NETFILTER]: xt_mac/xt_CLASSIFY: use IPv6 hook names for IPv6 registration [NETFILTER]: nf_conntrack: change nf_conntrack_l[34]proto_unregister to void [NETFILTER]: nf_conntrack: properly use RCU for nf_conntrack_destroyed callback [NETFILTER]: ip_conntrack: properly use RCU for ip_conntrack_destroyed callback [NETFILTER]: nf_conntrack: fix invalid conntrack statistics RCU assumption [NETFILTER]: ip_conntrack: fix invalid conntrack statistics RCU assumption [NETFILTER]: nf_conntrack: properly use RCU API for nf_ct_protos/nf_ct_l3protos arrays [NETFILTER]: ip_conntrack: properly use RCU API for ip_ct_protos array [NETFILTER]: nf_nat: properly use RCU API for nf_nat_protos array [NETFILTER]: ip_nat: properly use RCU API for ip_nat_protos array [NETFILTER]: nf_log: minor cleanups [NETFILTER]: nf_log: switch logger registration/unregistration to mutex [NETFILTER]: nf_log: make nf_log_unregister_pf return void ...
-
David S. Miller authored
This reverts some bogosity from the dynamic command-line changes made on sparc32 and sparc64. Drivers such as drivers/sbus/char/openprom.c reference saved_command_line, and can be modular. The boot_command_line is __initdata, yet the dynamic command-line changes add modular exports of that symbol, obviously wrong. Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Linus Torvalds authored
Since we look in both source and object directories for localversion* files, we accidentally ended up getting them twice. Use 'sort -u' to avoid that. Reported-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David S. Miller authored
Make sure that this function is called correctly, and add BUG() checking to ensure the arguments are sane. Based upon a patch by Joy Latten. Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
Minor non-invasive cleanups: * white space around operators and line wrapping * use const * use __read_mostly Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
These module parameters should be in the read mostly area to avoid cache pollution. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Patrick McHardy authored
sysctls are registered by the protocol module itself since 2.6.19, no need to have them visible to others. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Masahide NAKAMURA authored
Regarding RFC3775, MH payload proto field should be IPPROTO_NONE. Otherwise it must be discarded (and the receiver should send ICMP error). We assume filter should drop such piggyback everytime to disallow slipping through firewall rules, even the final receiver will discard it. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-