summaryrefslogtreecommitdiffstats
path: root/Documentation/virtual/kvm/msr.txt
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/virtual/kvm/msr.txt')
-rw-r--r--Documentation/virtual/kvm/msr.txt225
1 files changed, 0 insertions, 225 deletions
diff --git a/Documentation/virtual/kvm/msr.txt b/Documentation/virtual/kvm/msr.txt
deleted file mode 100644
index 96b41bd..0000000
--- a/Documentation/virtual/kvm/msr.txt
+++ /dev/null
@@ -1,225 +0,0 @@
-KVM-specific MSRs.
-Glauber Costa <glommer@redhat.com>, Red Hat Inc, 2010
-=====================================================
-
-KVM makes use of some custom MSRs to service some requests.
-
-Custom MSRs have a range reserved for them, that goes from
-0x4b564d00 to 0x4b564dff. There are MSRs outside this area,
-but they are deprecated and their use is discouraged.
-
-Custom MSR list
---------
-
-The current supported Custom MSR list is:
-
-MSR_KVM_WALL_CLOCK_NEW: 0x4b564d00
-
- data: 4-byte alignment physical address of a memory area which must be
- in guest RAM. This memory is expected to hold a copy of the following
- structure:
-
- struct pvclock_wall_clock {
- u32 version;
- u32 sec;
- u32 nsec;
- } __attribute__((__packed__));
-
- whose data will be filled in by the hypervisor. The hypervisor is only
- guaranteed to update this data at the moment of MSR write.
- Users that want to reliably query this information more than once have
- to write more than once to this MSR. Fields have the following meanings:
-
- version: guest has to check version before and after grabbing
- time information and check that they are both equal and even.
- An odd version indicates an in-progress update.
-
- sec: number of seconds for wallclock.
-
- nsec: number of nanoseconds for wallclock.
-
- Note that although MSRs are per-CPU entities, the effect of this
- particular MSR is global.
-
- Availability of this MSR must be checked via bit 3 in 0x4000001 cpuid
- leaf prior to usage.
-
-MSR_KVM_SYSTEM_TIME_NEW: 0x4b564d01
-
- data: 4-byte aligned physical address of a memory area which must be in
- guest RAM, plus an enable bit in bit 0. This memory is expected to hold
- a copy of the following structure:
-
- struct pvclock_vcpu_time_info {
- u32 version;
- u32 pad0;
- u64 tsc_timestamp;
- u64 system_time;
- u32 tsc_to_system_mul;
- s8 tsc_shift;
- u8 flags;
- u8 pad[2];
- } __attribute__((__packed__)); /* 32 bytes */
-
- whose data will be filled in by the hypervisor periodically. Only one
- write, or registration, is needed for each VCPU. The interval between
- updates of this structure is arbitrary and implementation-dependent.
- The hypervisor may update this structure at any time it sees fit until
- anything with bit0 == 0 is written to it.
-
- Fields have the following meanings:
-
- version: guest has to check version before and after grabbing
- time information and check that they are both equal and even.
- An odd version indicates an in-progress update.
-
- tsc_timestamp: the tsc value at the current VCPU at the time
- of the update of this structure. Guests can subtract this value
- from current tsc to derive a notion of elapsed time since the
- structure update.
-
- system_time: a host notion of monotonic time, including sleep
- time at the time this structure was last updated. Unit is
- nanoseconds.
-
- tsc_to_system_mul: a function of the tsc frequency. One has
- to multiply any tsc-related quantity by this value to get
- a value in nanoseconds, besides dividing by 2^tsc_shift
-
- tsc_shift: cycle to nanosecond divider, as a power of two, to
- allow for shift rights. One has to shift right any tsc-related
- quantity by this value to get a value in nanoseconds, besides
- multiplying by tsc_to_system_mul.
-
- With this information, guests can derive per-CPU time by
- doing:
-
- time = (current_tsc - tsc_timestamp)
- time = (time * tsc_to_system_mul) >> tsc_shift
- time = time + system_time
-
- flags: bits in this field indicate extended capabilities
- coordinated between the guest and the hypervisor. Availability
- of specific flags has to be checked in 0x40000001 cpuid leaf.
- Current flags are:
-
- flag bit | cpuid bit | meaning
- -------------------------------------------------------------
- | | time measures taken across
- 0 | 24 | multiple cpus are guaranteed to
- | | be monotonic
- -------------------------------------------------------------
- | | guest vcpu has been paused by
- 1 | N/A | the host
- | | See 4.70 in api.txt
- -------------------------------------------------------------
-
- Availability of this MSR must be checked via bit 3 in 0x4000001 cpuid
- leaf prior to usage.
-
-
-MSR_KVM_WALL_CLOCK: 0x11
-
- data and functioning: same as MSR_KVM_WALL_CLOCK_NEW. Use that instead.
-
- This MSR falls outside the reserved KVM range and may be removed in the
- future. Its usage is deprecated.
-
- Availability of this MSR must be checked via bit 0 in 0x4000001 cpuid
- leaf prior to usage.
-
-MSR_KVM_SYSTEM_TIME: 0x12
-
- data and functioning: same as MSR_KVM_SYSTEM_TIME_NEW. Use that instead.
-
- This MSR falls outside the reserved KVM range and may be removed in the
- future. Its usage is deprecated.
-
- Availability of this MSR must be checked via bit 0 in 0x4000001 cpuid
- leaf prior to usage.
-
- The suggested algorithm for detecting kvmclock presence is then:
-
- if (!kvm_para_available()) /* refer to cpuid.txt */
- return NON_PRESENT;
-
- flags = cpuid_eax(0x40000001);
- if (flags & 3) {
- msr_kvm_system_time = MSR_KVM_SYSTEM_TIME_NEW;
- msr_kvm_wall_clock = MSR_KVM_WALL_CLOCK_NEW;
- return PRESENT;
- } else if (flags & 0) {
- msr_kvm_system_time = MSR_KVM_SYSTEM_TIME;
- msr_kvm_wall_clock = MSR_KVM_WALL_CLOCK;
- return PRESENT;
- } else
- return NON_PRESENT;
-
-MSR_KVM_ASYNC_PF_EN: 0x4b564d02
- data: Bits 63-6 hold 64-byte aligned physical address of a
- 64 byte memory area which must be in guest RAM and must be
- zeroed. Bits 5-2 are reserved and should be zero. Bit 0 is 1
- when asynchronous page faults are enabled on the vcpu 0 when
- disabled. Bit 2 is 1 if asynchronous page faults can be injected
- when vcpu is in cpl == 0.
-
- First 4 byte of 64 byte memory location will be written to by
- the hypervisor at the time of asynchronous page fault (APF)
- injection to indicate type of asynchronous page fault. Value
- of 1 means that the page referred to by the page fault is not
- present. Value 2 means that the page is now available. Disabling
- interrupt inhibits APFs. Guest must not enable interrupt
- before the reason is read, or it may be overwritten by another
- APF. Since APF uses the same exception vector as regular page
- fault guest must reset the reason to 0 before it does
- something that can generate normal page fault. If during page
- fault APF reason is 0 it means that this is regular page
- fault.
-
- During delivery of type 1 APF cr2 contains a token that will
- be used to notify a guest when missing page becomes
- available. When page becomes available type 2 APF is sent with
- cr2 set to the token associated with the page. There is special
- kind of token 0xffffffff which tells vcpu that it should wake
- up all processes waiting for APFs and no individual type 2 APFs
- will be sent.
-
- If APF is disabled while there are outstanding APFs, they will
- not be delivered.
-
- Currently type 2 APF will be always delivered on the same vcpu as
- type 1 was, but guest should not rely on that.
-
-MSR_KVM_STEAL_TIME: 0x4b564d03
-
- data: 64-byte alignment physical address of a memory area which must be
- in guest RAM, plus an enable bit in bit 0. This memory is expected to
- hold a copy of the following structure:
-
- struct kvm_steal_time {
- __u64 steal;
- __u32 version;
- __u32 flags;
- __u32 pad[12];
- }
-
- whose data will be filled in by the hypervisor periodically. Only one
- write, or registration, is needed for each VCPU. The interval between
- updates of this structure is arbitrary and implementation-dependent.
- The hypervisor may update this structure at any time it sees fit until
- anything with bit0 == 0 is written to it. Guest is required to make sure
- this structure is initialized to zero.
-
- Fields have the following meanings:
-
- version: a sequence counter. In other words, guest has to check
- this field before and after grabbing time information and make
- sure they are both equal and even. An odd version indicates an
- in-progress update.
-
- flags: At this point, always zero. May be used to indicate
- changes in this structure in the future.
-
- steal: the amount of time in which this vCPU did not run, in
- nanoseconds. Time during which the vcpu is idle, will not be
- reported as steal time.