Commits · 4c690a1e8667a84b61a6114a4ad293681f32cb11 · linux / linux-davinci-2.6.23

03 May, 2007 40 commits

KVM: Allow passing 64-bit values to the emulated read/write API · 4c690a1e

Avi Kivity authored Apr 22, 2007

This simplifies the API somewhat (by eliminating the special-case
cmpxchg8b on i386).
Signed-off-by: Avi Kivity <avi@qumranet.com>

4c690a1e

KVM: Per-vcpu statistics · 1165f5fe

Avi Kivity authored Apr 19, 2007

Make the exit statistics per-vcpu instead of global.  This gives a 3.5%
boost when running one virtual machine per core on my two socket dual core
(4 cores total) machine.
Signed-off-by: Avi Kivity <avi@qumranet.com>

1165f5fe

KVM: VMX: Avoid unnecessary vcpu_load()/vcpu_put() cycles · 3fca0365

Yaozu Dong authored Apr 25, 2007

By checking if a reschedule is needed, we avoid dropping the vcpu.

[With changes by me, based on Anthony Liguori's observations]
Signed-off-by: Avi Kivity <avi@qumranet.com>

3fca0365

KVM: MMU: Avoid heavy ASSERT at non debug mode. · d6c69ee9

Yaozu Dong authored Apr 25, 2007

Signed-off-by: Avi Kivity <avi@qumranet.com>

d6c69ee9

KVM: VMX: Only save/restore MSR_K6_STAR if necessary · 4d56c8a7

Avi Kivity authored Apr 19, 2007

Intel hosts only support syscall/sysret in long mode (and only if efer.sce
is enabled), so only reload the related MSR_K6_STAR if the guest will
actually be able to use it.

This reduces vmexit cost by about 500 cycles (6400 -> 5870) on my setup.
Signed-off-by: Avi Kivity <avi@qumranet.com>

4d56c8a7
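
The condition described in this commit can be sketched as a small predicate. This is only an illustration, not the code from the patch: the helper name `guest_needs_star` is hypothetical, and the sketch assumes the architectural EFER layout (SCE in bit 0, LMA in bit 10).

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural EFER bit positions. */
#define EFER_SCE (1ULL << 0)   /* syscall/sysret enable */
#define EFER_LMA (1ULL << 10)  /* long mode active */

/*
 * Hypothetical helper: on an Intel host the guest can only use
 * syscall/sysret when it is in long mode with efer.sce set, so only
 * then is it worth reloading MSR_K6_STAR.
 */
static bool guest_needs_star(uint64_t guest_efer)
{
    return (guest_efer & EFER_LMA) && (guest_efer & EFER_SCE);
}
```

A caller would consult such a predicate when the guest's efer changes and skip the MSR reload whenever it returns false.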

KVM: Fold drivers/kvm/kvm_vmx.h into drivers/kvm/vmx.c · 35cc7f97

Avi Kivity authored Apr 19, 2007

No meat in that file.
Signed-off-by: Avi Kivity <avi@qumranet.com>

35cc7f97

KVM: VMX: Don't switch 64-bit msrs for 32-bit guests · e38aea3e

Avi Kivity authored Apr 19, 2007

Some msrs are only used by x86_64 instructions, and are therefore
not needed when the guest is in legacy mode.  By not bothering to switch
them, we reduce vmexit latency by 2400 cycles (from about 8800) when
running a 32-bit guest on a 64-bit host.
Signed-off-by: Avi Kivity <avi@qumranet.com>

e38aea3e

KVM: VMX: Reduce unnecessary saving of host msrs · 2345df8c

Avi Kivity authored Apr 17, 2007

The automatically switched msrs are never changed on the host (with
the exception of MSR_KERNEL_GS_BASE) and thus there is no need to save
them on every vm entry.

This reduces vmexit latency by ~400 cycles on i386 and by ~900 cycles (10%)
on x86_64.
Signed-off-by: Avi Kivity <avi@qumranet.com>

2345df8c

KVM: Handle guest page faults when emulating mmio · c9047f53

Avi Kivity authored Apr 17, 2007

Usually, guest page faults are detected by the kvm page fault handler,
which detects if they are shadow faults, mmio faults, pagetable faults,
or normal guest page faults.

However, in certain circumstances, we can detect a page fault much later.
One of these events is the following combination:

- A two memory operand instruction (e.g. movsb) is executed.
- The first operand is in mmio space (which is the fault reported to kvm)
- The second operand is in an unmapped address (e.g. a guest page fault)

The Windows 2000 installer does such an access, and promptly hangs.  Fix
by adding the missing page fault injection on that path.
Signed-off-by: Avi Kivity <avi@qumranet.com>

c9047f53

KVM: SVM: Report hardware exit reason to userspace instead of dmesg · 364b625b

Avi Kivity authored Apr 16, 2007

Signed-off-by: Avi Kivity <avi@qumranet.com>

364b625b

KVM: Retry sleeping allocation if atomic allocation fails · 8c438502

Avi Kivity authored Apr 16, 2007

This avoids -ENOMEM under memory pressure.
Signed-off-by: Avi Kivity <avi@qumranet.com>

8c438502

KVM: Use slab caches to allocate mmu data structures · b5a33a75

Avi Kivity authored Apr 15, 2007

Better leak detection, statistics, memory use, speed -- goodness all
around.
Signed-off-by: Avi Kivity <avi@qumranet.com>

b5a33a75

KVM: Handle partial pae pdptr · 417726a3

Avi Kivity authored Apr 12, 2007

Some guests (Solaris) do not set up all four pdptrs, but leave some invalid.
kvm incorrectly treated these as valid page directories, pinning the
wrong pages and causing general confusion.

Fix by checking the valid bit of a pae pdpte.  This closes sourceforge bug
1698922.
Signed-off-by: Avi Kivity <avi@qumranet.com>

417726a3
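
As a rough illustration of the fix, the check amounts to testing the present bit of each pdpte before treating it as a page directory. The helper name and constants below are hypothetical; the only assumption taken from the architecture is that bit 0 of a PAE pdpte is the present (valid) bit.

```c
#include <stdint.h>

#define PDPTE_PRESENT  (1ULL << 0)  /* bit 0 of a PAE pdpte */
#define PAE_NUM_PDPTES 4

/*
 * Hypothetical helper: return a bitmask of the pdptes that are marked
 * present. Only those entries should be treated as page directories
 * (and have their pages pinned); the rest must be skipped.
 */
static unsigned valid_pdpte_mask(const uint64_t pdpte[PAE_NUM_PDPTES])
{
    unsigned mask = 0;

    for (int i = 0; i < PAE_NUM_PDPTES; i++)
        if (pdpte[i] & PDPTE_PRESENT)
            mask |= 1u << i;
    return mask;
}
```

For a guest that sets up only entries 0 and 2, this returns a mask with bits 0 and 2 set, so entries 1 and 3 are never dereferenced.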

KVM: Initialize cr0 to indicate an fpu is present · d917a6b9

Avi Kivity authored Apr 12, 2007

Solaris panics if it sees a cpu with no fpu, and it seems to rely on this
bit.  Closes sourceforge bug 1698920.
Signed-off-by: Avi Kivity <avi@qumranet.com>

d917a6b9

KVM: Fix overflow bug in overflow detection code · 3964994b

Eric Sesterhenn / Snakebyte authored Apr 09, 2007

The expression

   sp - 6 < sp

where sp is a u16 is undefined in C since 'sp - 6' is promoted to int,
and signed overflow is undefined in C.  gcc 4.2 actually warns about it.
Replace with a simpler test.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Avi Kivity <avi@qumranet.com>

3964994b
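
The effect of integer promotion on this expression can be seen in a small sketch. It assumes a platform where `int` is wider than 16 bits, and the replacement test `sp < 6` is an assumed form of the "simpler test", not a quote from the patch; all function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Both operands are promoted to int before the subtraction, so for
 * sp = 3 this compares -3 < 3. The comparison never observes a
 * 16-bit wraparound and cannot serve as an overflow check.
 */
static bool promoted_compare(uint16_t sp)
{
    return sp - 6 < sp;
}

/* What the 16-bit stack pointer would actually become after the push. */
static uint16_t sp_after_push6(uint16_t sp)
{
    return (uint16_t)(sp - 6);
}

/* Assumed simpler test: would subtracting 6 wrap below zero? */
static bool push6_would_wrap(uint16_t sp)
{
    return sp < 6;
}
```

With `sp = 3`, `sp_after_push6` yields 65533 while `promoted_compare` still returns true, which is why the direct comparison against 6 is the more reliable check.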

KVM: Use kernel-standard types · 5008fdf5

Avi Kivity authored Apr 02, 2007

Noted by Joerg Roedel.
Signed-off-by: Avi Kivity <avi@qumranet.com>

5008fdf5

KVM: SVM: enable LBRV virtualization if available · 80b7706e

Joerg Roedel authored Mar 30, 2007

This patch enables the virtualization of the last branch record MSRs on
SVM if this feature is available in hardware. It also introduces a small
and simple check feature for specific SVM extensions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

80b7706e

KVM: Add fpu get/set operations · b8836737

Avi Kivity authored Apr 01, 2007

These are really helpful when migrating a floating point app to another
machine.
Signed-off-by: Avi Kivity <avi@qumranet.com>

b8836737

KVM: Add physical memory aliasing feature · e8207547

Avi Kivity authored Mar 30, 2007

With this, we can specify that accesses to one physical memory range will
be remapped to another. This is useful for the vga window at 0xa0000 which
is used as a movable window into the (much larger) framebuffer.
Signed-off-by: Avi Kivity <avi@qumranet.com>

e8207547
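
A minimal sketch of the remapping idea, working in guest frame numbers. The structure layout and the name `unalias` are hypothetical and chosen for illustration; the real interface may differ.

```c
#include <stdint.h>

typedef uint64_t gfn_t;

/* Hypothetical alias record: [base_gfn, base_gfn + npages) maps to target_gfn. */
struct mem_alias {
    gfn_t    base_gfn;
    uint64_t npages;
    gfn_t    target_gfn;
};

/*
 * Translate a guest frame number through the alias table. Frames
 * outside every alias range are returned unchanged.
 */
static gfn_t unalias(const struct mem_alias *aliases, int n, gfn_t gfn)
{
    for (int i = 0; i < n; i++) {
        const struct mem_alias *a = &aliases[i];

        if (gfn >= a->base_gfn && gfn - a->base_gfn < a->npages)
            return a->target_gfn + (gfn - a->base_gfn);
    }
    return gfn;
}
```

For the vga case, an alias with `base_gfn = 0xa0` and `npages = 0x20` would cover 0xa0000 to 0xbffff with 4 KiB pages; moving the window then only means changing `target_gfn`.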

KVM: Simplify gfn_to_page() · 954bbbc2

Avi Kivity authored Mar 30, 2007

Mapping a guest page to a host page is a common operation.  Currently,
one has first to find the memory slot where the page belongs (gfn_to_memslot),
then locate the page itself (gfn_to_page()).

This is clumsy, and also won't work well with memory aliases.  So simplify
gfn_to_page() not to require memory slot translation first, and instead do it
internally.
Signed-off-by: Avi Kivity <avi@qumranet.com>

954bbbc2

KVM: Add mmu cache clear function · e0fa826f

Dor Laor authored Mar 30, 2007

Functions that play around with the physical memory map
need a way to clear mappings to possibly nonexistent or
invalid memory.  Both the mmu cache and the processor tlb
are cleared.
Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

e0fa826f

KVM: x86 emulator: fix bit string operations operand size · df513e2c

Avi Kivity authored Mar 28, 2007

On x86, bit operations operate on a string of bits that can reside in
multiple words. For example, 'btsl %eax, (blah)' will touch the word
at blah+4 if %eax is between 32 and 63.

The x86 emulator compensates for that by advancing the operand address
by (bit offset / BITS_PER_LONG) and truncating the bit offset to the
range (0..BITS_PER_LONG-1). This has a side effect of forcing the operand
size to 8 bytes on 64-bit hosts.

Now, a 32-bit guest goes and fork()s a process. It write protects a stack
page at 0xbffff000 using the 'btr' instruction, at offset 0xffc in the page
table, with bit offset 1 (for the write permission bit).

The emulator now forces the operand size to 8 bytes as previously described,
and an innocent page table update turns into a cross-page-boundary write,
which is assumed by the mmu code not to be a page table, so it doesn't
actually clear the corresponding shadow page table entry. The guest and
host permissions are out of sync and guest memory is corrupted soon
afterwards, leading to guest failure.

Fix by not using BITS_PER_LONG as the word size; instead use the actual
operand size, so we get a 32-bit write in that case.

Note we still have to teach the mmu to handle cross-page-boundary writes
to guest page table; but for now this allows Damn Small Linux 0.4 (2.4.20)
to boot.
Signed-off-by: Avi Kivity <avi@qumranet.com>

df513e2c
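
The address adjustment described above can be sketched as follows. This is a simplified illustration that handles only non-negative bit offsets; the type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct bit_operand {
    uint64_t addr;  /* address of the word that holds the bit */
    unsigned bit;   /* bit index within that word */
};

/*
 * Advance the operand address by whole operand-sized words and reduce
 * the bit offset to the range 0..(operand bits - 1). op_bytes is the
 * instruction's operand size (2, 4 or 8), not the host word size.
 */
static struct bit_operand adjust_bit_operand(uint64_t addr,
                                             uint64_t bit_offset,
                                             unsigned op_bytes)
{
    unsigned bits = op_bytes * 8;
    struct bit_operand r;

    r.addr = addr + (bit_offset / bits) * op_bytes;
    r.bit = (unsigned)(bit_offset % bits);
    return r;
}

/* True if an access of op_bytes at addr spills into the next 4 KiB page. */
static bool crosses_page(uint64_t addr, unsigned op_bytes)
{
    return (addr & 0xfff) + op_bytes > 0x1000;
}
```

With a 4-byte operand, a bit offset of 40 lands at the base address plus 4, bit 8. For the page table case, a 4-byte access at page offset 0xffc stays within the page, while an 8-byte access at the same offset crosses into the next one.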

KVM: Remove debug message · afeb1f14

Avi Kivity authored Mar 27, 2007

No longer interesting.
Signed-off-by: Avi Kivity <avi@qumranet.com>

afeb1f14

KVM: Use list_move() · 36868f7b

Avi Kivity authored Mar 26, 2007

Use list_move() where possible.  Noticed by Dor Laor.
Signed-off-by: Avi Kivity <avi@qumranet.com>

36868f7b

KVM: Remove unused function · 55bf4028

Michal Piotrowski authored Mar 25, 2007

Remove unused function

CC      drivers/kvm/svm.o
drivers/kvm/svm.c:207: warning: ‘inject_db’ defined but not used
Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

55bf4028

KVM: SVM: Ensure timestamp counter monotonicity · 0cc5064d

Avi Kivity authored Mar 25, 2007

When a vcpu is migrated from one cpu to another, its timestamp counter
may lose its monotonic property if the host has unsynced timestamp counters.
This can confuse the guest, sometimes to the point of refusing to boot.

As the rdtsc instruction is rather fast on AMD processors (7-10 cycles),
we can simply record the last host tsc when we drop the cpu, and adjust
the vcpu tsc offset when we detect that we've migrated to a different cpu.
Signed-off-by: Avi Kivity <avi@qumranet.com>

0cc5064d
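
The offset adjustment can be sketched in plain C. The structure and function names here are hypothetical, and the host tsc value is passed in as a parameter rather than read with rdtsc so that the sketch stays self-contained.

```c
#include <stdint.h>

/* Hypothetical per-vcpu bookkeeping for the sketch. */
struct vcpu_tsc {
    int      last_cpu;       /* cpu the vcpu last ran on */
    uint64_t last_host_tsc;  /* host tsc recorded when the cpu was dropped */
    uint64_t tsc_offset;     /* guest tsc = host tsc + tsc_offset */
};

/* Called when the vcpu is dropped from a cpu. */
static void vcpu_tsc_put(struct vcpu_tsc *v, int cpu, uint64_t host_tsc)
{
    v->last_cpu = cpu;
    v->last_host_tsc = host_tsc;
}

/*
 * Called when the vcpu is loaded. If it has moved to a different cpu,
 * fold the difference between the two host counters into the offset.
 * Unsigned arithmetic makes this work in either direction.
 */
static void vcpu_tsc_load(struct vcpu_tsc *v, int cpu, uint64_t host_tsc)
{
    if (cpu != v->last_cpu)
        v->tsc_offset += v->last_host_tsc - host_tsc;
}

static uint64_t guest_tsc(const struct vcpu_tsc *v, uint64_t host_tsc)
{
    return host_tsc + v->tsc_offset;
}
```

After a migration the guest-visible counter resumes from the value it had when the vcpu was dropped, so it never appears to run backwards even if the new cpu's counter is behind.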

KVM: MMU: Fix hugepage pdes mapping same physical address with different access · d28c6cfb

Avi Kivity authored Mar 23, 2007

The kvm mmu keeps a shadow page for hugepage pdes; if several such pdes map
the same physical address, they share the same shadow page. This is a fairly
common case (kernel mappings on i386 nonpae Linux, for example).

However, if the two pdes map the same memory but with different permissions, kvm
will happily use the cached shadow page. If the access through the more
permissive pde occurs after the access through the strict pde, an endless pagefault
loop is generated and the guest makes no progress.

Fix by making the access permissions part of the cache lookup key.

The fix allows Xen pae to boot on kvm and run guest domains.

Thanks to Jeremy Fitzhardinge for reporting the bug and testing the fix.
Signed-off-by: Avi Kivity <avi@qumranet.com>

d28c6cfb
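
The idea of the fix can be illustrated with a toy lookup in which the access permissions are part of the key. The types and the linear search are hypothetical simplifications, not the mmu's actual data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache key: the mapped frame plus the pde's access bits. */
struct shadow_key {
    uint64_t gfn;
    unsigned access;
};

struct shadow_page {
    struct shadow_key key;
    bool in_use;
};

/*
 * Look up a shadow page. Two pdes that map the same gfn with different
 * permissions no longer match the same entry, because the access bits
 * are compared along with the gfn.
 */
static struct shadow_page *shadow_lookup(struct shadow_page *cache, size_t n,
                                         struct shadow_key key)
{
    for (size_t i = 0; i < n; i++)
        if (cache[i].in_use &&
            cache[i].key.gfn == key.gfn &&
            cache[i].key.access == key.access)
            return &cache[i];
    return NULL;
}
```

A miss for a new permission combination leads to a separate shadow page being created, at the cost of some extra memory for mappings that differ only in access rights.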

KVM: SVM: forbid guest to execute monitor/mwait · 916ce236

Joerg Roedel authored Mar 21, 2007

This patch forbids the guest to execute monitor/mwait instructions on
SVM. This is necessary because the guest can execute these instructions
if they are available, even if the kvm cpuid doesn't report their
existence.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

916ce236

KVM: Handle writes to MCG_STATUS msr · 0e5bf0d0

Sergey Kiselev authored Mar 22, 2007

Some older (~2.6.7) kernels write the MCG_STATUS register during kernel
boot (mce_clear_all() function, called from mce_init()). It's not
currently handled by kvm and will cause it to inject a GPF.
The following patch adds a "nop" handler for this.
Signed-off-by: Sergey Kiselev <sergey.kiselev@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

0e5bf0d0
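
A "nop" handler of this kind might look like the sketch below. The function name and return convention are hypothetical; the MSR index 0x17a is the architectural number of IA32_MCG_STATUS.

```c
#include <stdint.h>

#define MSR_IA32_MCG_STATUS 0x17a

/*
 * Hypothetical write-msr dispatcher for the sketch.
 * Returns 0 if the write was handled, 1 if the caller should inject
 * a general protection fault into the guest.
 */
static int sketch_set_msr(uint32_t index, uint64_t data)
{
    (void)data;  /* the value is deliberately ignored */

    switch (index) {
    case MSR_IA32_MCG_STATUS:
        return 0;  /* accept the write and do nothing */
    default:
        return 1;  /* unhandled msr */
    }
}
```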

KVM: Remove unused and write-only variables · fcd34108

Avi Kivity authored Mar 21, 2007

Trivial cleanup.
Signed-off-by: Avi Kivity <avi@qumranet.com>

fcd34108

KVM: Don't allow the guest to turn off the cpu cache · 6da63cf9

Avi Kivity authored Mar 21, 2007

The cpu cache is a host resource; the guest should not be able to turn
it off (even for itself).
Signed-off-by: Avi Kivity <avi@qumranet.com>

6da63cf9
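
One way to express this restriction is to mask the cache-control bits out of the value that reaches the hardware. This is a sketch under that assumption, with a hypothetical helper name; it relies only on the architectural positions of CR0.NW (bit 29) and CR0.CD (bit 30).

```c
#include <stdint.h>

#define CR0_NW (1ULL << 29)  /* not write-through */
#define CR0_CD (1ULL << 30)  /* cache disable */

/*
 * Hypothetical helper: compute the cr0 value actually loaded for the
 * guest, with the cache-disable bits forced off regardless of what
 * the guest wrote.
 */
static uint64_t hw_cr0(uint64_t guest_cr0)
{
    return guest_cr0 & ~(CR0_CD | CR0_NW);
}
```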

KVM: Hack real-mode segments on vmx from KVM_SET_SREGS · 038881c8

Avi Kivity authored Mar 21, 2007

As usual, we need to mangle segment registers when emulating real mode
as vm86 has specific constraints.  We special case the reset segment base,
and set the "access rights" (or descriptor flags) to vm86 compatible values.

This fixes reboot on vmx.
Signed-off-by: Avi Kivity <avi@qumranet.com>

038881c8

KVM: Modify guest segments after potentially switching modes · 024aa1c0

Avi Kivity authored Mar 21, 2007

The SET_SREGS ioctl modifies both cr0.pe (real mode/protected mode) and
guest segment registers.  Since segment handling is modified by the mode on
Intel processors, update the segment registers after the mode switch has taken
place.
Signed-off-by: Avi Kivity <avi@qumranet.com>

024aa1c0

KVM: Remove set_cr0_no_modeswitch() arch op · f6528b03