Chapter 9. Miscellaneous administration tasks
This chapter contains useful hints and tips to improve virtualization performance, scale, and stability.
9.1. Automatically starting guests
This section covers how to make guests start automatically during the host system's boot phase.
This example uses virsh to set a guest, TestServer, to automatically start when the host boots.
# virsh autostart TestServer
Domain TestServer marked as autostarted
The guest now automatically starts with the host.
To stop a guest from automatically booting, use the --disable parameter:
# virsh autostart --disable TestServer
Domain TestServer unmarked as autostarted
The guest no longer automatically starts with the host.
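To confirm the setting, check the Autostart field reported by virsh dominfo. This is a quick verification sketch reusing the TestServer name from the example above; the exact output formatting may vary between libvirt versions.
# virsh dominfo TestServer | grep -i autostart
Autostart:      enable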
9.2. Guest memory allocation
The following procedure shows how to allocate memory for a guest. This allocation and assignment works only at boot time, and any changes to the memory values will not take effect until the next reboot.
Valid memory units include:
b or bytes for bytes
KB for kilobytes (10^3 or blocks of 1,000 bytes)
k or KiB for kibibytes (2^10 or blocks of 1,024 bytes)
MB for megabytes (10^6 or blocks of 1,000,000 bytes)
M or MiB for mebibytes (2^20 or blocks of 1,048,576 bytes)
GB for gigabytes (10^9 or blocks of 1,000,000,000 bytes)
G or GiB for gibibytes (2^30 or blocks of 1,073,741,824 bytes)
TB for terabytes (10^12 or blocks of 1,000,000,000,000 bytes)
T or TiB for tebibytes (2^40 or blocks of 1,099,511,627,776 bytes)
Note that all values are rounded up to the nearest kibibyte by libvirt, and may be further rounded to the granularity supported by the hypervisor. Some hypervisors also enforce a minimum, such as 4000 KiB (4000 x 2^10, or 4,096,000 bytes). The units for this value are determined by the optional unit attribute of the memory element, which defaults to kibibytes (KiB), where the value given is multiplied by 2^10 (blocks of 1,024 bytes).
In cases where the guest crashes, the optional dumpCore attribute can be used to control whether the guest's memory should be included in the generated coredump (dumpCore='on') or not included (dumpCore='off'). Note that the default setting is on, so unless the parameter is set to off, the guest memory will be included in the coredump file.
The currentMemory element determines the actual memory allocation for the guest. This value can be less than the maximum allocation, to allow for ballooning up the guest's memory on the fly. If it is omitted, it defaults to the same value as the memory element. Its unit attribute behaves the same as for memory.
In all cases for this section, the domain XML needs to be altered as follows:
<domain>
  <memory unit='KiB' dumpCore='off'>524288</memory>
  <!-- changes the memory unit to KiB and does not allow the guest's memory to be included in the generated coredump file -->
  <currentMemory unit='KiB'>524288</currentMemory>
  <!-- makes the current memory unit 524288 KiB -->
  ...
</domain>
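After rebooting the guest, the new allocation can be checked from the host. The following is a minimal verification sketch; the guest name TestServer is reused from the earlier autostart example.
# virsh edit TestServer            # apply the <memory> and <currentMemory> changes shown above
# virsh dominfo TestServer         # the Max memory and Used memory fields reflect the new values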
9.3. Using qemu-img
The qemu-img command line tool is used for formatting, modifying, and verifying the various disk image formats used by KVM. qemu-img options and usages are listed below.
# qemu-img check [-f format] filename
Perform a consistency check on the disk image filename.
Only the qcow2
and vdi
formats support consistency checks.
# qemu-img commit [-f fmt] [-t cache] filename
Commit any changes recorded in the specified image file (filename) to the file's base image.
Command format:
# qemu-img convert [-c] [-p] [-f fmt] [-t cache] [-O output_fmt] [-o options] [-S sparse_size] filename output_filename
The -p parameter shows the progress of the command (optional and not available for every command), and -S indicates the number of consecutive bytes that must contain only zeros for qemu-img to create a sparse image during conversion.
Convert the disk image filename to disk image output_filename using format output_fmt. The disk image can be optionally compressed with the -c option, or encrypted with the -o option by setting -o encryption. Note that the options available with the -o parameter differ with the selected format.
Only the qcow2
format supports encryption or compression. qcow2
encryption uses the AES format with secure 128-bit keys. qcow2
compression is read-only, so if a compressed sector is converted from qcow2
format, it is written to the new format as uncompressed data.
Image conversion is also useful to get a smaller image when using a format which can grow, such as qcow or cow. The empty sectors are detected and suppressed from the destination image.
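For example, the following sketch converts a raw image to a compressed qcow2 image with a progress display; the image paths are placeholders.
# qemu-img convert -c -p -f raw -O qcow2 /var/lib/libvirt/images/guest.img /var/lib/libvirt/images/guest.qcow2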
# qemu-img create [-f format] [-o options] filename [size]
Create the new disk image filename of size size and format format.
If a base image is specified with -o backing_file=filename
, the image will only record differences between itself and the base image. The backing file will not be modified unless you use the commit
command. No size needs to be specified in this case.
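As an illustrative sketch (the file names are placeholders), the following creates a qcow2 overlay that records only the differences from an existing base image, so no size argument is given:
# qemu-img create -f qcow2 -o backing_file=/var/lib/libvirt/images/rhel6-base.img /var/lib/libvirt/images/guest-overlay.qcow2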
# qemu-img info [-f format] filename
This command is often used to discover the size reserved on disk, which can be different from the displayed size. If snapshots are stored in the disk image, they are also displayed.
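For example (the image path is a placeholder), comparing qemu-img info with ls -ls shows the difference between the virtual size and the space actually allocated on disk:
# qemu-img info /var/lib/libvirt/images/guest.qcow2
# ls -ls /var/lib/libvirt/images/guest.qcow2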
# qemu-img rebase [-f fmt] [-t cache] [-p] [-u] -b backing_file [-F backing_fmt] filename
The backing file is changed to backing_file and (if the format of filename supports the feature) the backing file format is changed to backing_fmt.
Only the qcow2
format supports changing the backing file (rebase).
There are two different modes in which rebase
can operate: Safe and Unsafe.
Safe mode is used by default and performs a real rebase operation. The new backing file may differ from the old one, and the qemu-img rebase command will take care of keeping the guest-visible content of filename unchanged. In order to achieve this, any clusters that differ between backing_file and the old backing file of filename are merged into filename before any changes are made to the backing file.
Note that safe mode is an expensive operation, comparable to converting an image. The old backing file is required for it to complete successfully.
Unsafe mode is used if the -u option is passed to qemu-img rebase. In this mode, only the backing file name and format of filename are changed, without any checks taking place on the file contents. Make sure the new backing file is specified correctly or the guest-visible content of the image will be corrupted.
This mode is useful for renaming or moving the backing file. It can be used without an accessible old backing file. For instance, it can be used to fix an image whose backing file has already been moved or renamed.
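For instance, if a base image has been renamed on disk, the overlay can be pointed at the new name without touching the image contents. A sketch with placeholder paths:
# qemu-img rebase -u -b /var/lib/libvirt/images/rhel6-base-new.img /var/lib/libvirt/images/guest-overlay.qcow2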
Use the following command to set the size of the disk image filename to size bytes:
# qemu-img resize filename size
You can also resize relative to the current size of the disk image. To give a size relative to the current size, prefix the number of bytes with +
to grow, or -
to reduce the size of the disk image by that number of bytes. Adding a unit suffix allows you to set the image size in kilobytes (K), megabytes (M), gigabytes (G) or terabytes (T).
# qemu-img resize filename [+|-]size[K|M|G|T]
Before using this command to shrink a disk image, you must use file system and partitioning tools inside the VM itself to reduce allocated file systems and partition sizes accordingly. Failure to do so will result in data loss.
After using this command to grow a disk image, you must use file system and partitioning tools inside the VM to actually begin using the new space on the device.
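For example, the following sketch grows a placeholder image by 10 gigabytes; the partitions and file systems inside the guest must then be extended separately:
# qemu-img resize /var/lib/libvirt/images/guest.img +10G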
# qemu-img snapshot [ -l | -a snapshot | -c snapshot | -d snapshot ] filename
-l lists all snapshots associated with the specified disk image. The apply option, -a, reverts the disk image (filename) to the state of a previously saved snapshot. -c creates a snapshot (snapshot) of an image (filename). -d deletes the specified snapshot.
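A short illustrative sequence (the snapshot name and image path are placeholders) that creates, lists and then deletes an internal snapshot:
# qemu-img snapshot -c before-upgrade /var/lib/libvirt/images/guest.qcow2
# qemu-img snapshot -l /var/lib/libvirt/images/guest.qcow2
# qemu-img snapshot -d before-upgrade /var/lib/libvirt/images/guest.qcow2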
raw
Raw disk image format (default). This can be the fastest file-based format. If your file system supports holes (for example in ext2 or ext3 on Linux or NTFS on Windows), then only the written sectors will reserve space. Use qemu-img info to obtain the real size used by the image, or ls -ls on Unix/Linux. Although raw images give optimal performance, only very basic features are available with a raw image (no snapshots etc.).
qcow2
QEMU image format, the most versatile format with the best feature set. Use it to have optional AES encryption, zlib-based compression, support of multiple VM snapshots, and smaller images, which are useful on file systems that do not support holes (non-NTFS file systems on Windows). Note that this expansive feature set comes at the cost of performance.
Although only the formats above can be used to run on a guest or host machine, qemu-img also recognizes and supports the following formats in order to convert from them into either raw
or qcow2
format. The format of an image is usually detected automatically. In addition to converting these formats into raw
or qcow2
, they can be converted back from raw
or qcow2
to the original format.
bochs
Bochs disk image format.
cloop
Linux Compressed Loop image, useful only to reuse directly compressed CD-ROM images present for example in the Knoppix CD-ROMs.
cow
User Mode Linux Copy On Write image format. The cow
format is included only for compatibility with previous versions. It does not work with Windows.
dmg
Mac disk image format.
nbd
Network block device.
parallels
Parallels virtualization disk image format.
qcow
Old QEMU image format. Only included for compatibility with older versions.
vdi
Oracle VM VirtualBox hard disk image format.
vmdk
VMware 3 and 4 compatible image format.
vpc
Windows Virtual PC disk image format. Also referred to as vhd
, or Microsoft virtual hard disk image format.
vvfat
Virtual VFAT disk image format.
9.4. Verifying virtualization extensions
Use this section to determine whether your system has the hardware virtualization extensions. Virtualization extensions (Intel VT-x or AMD-V) are required for full virtualization.
Run the following command to verify the CPU virtualization extensions are available:
$ grep -E 'svm|vmx' /proc/cpuinfo
Analyze the output.
The following output contains a vmx
entry indicating an Intel processor with the Intel VT-x extension:
flags   : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
The following output contains an svm
entry indicating an AMD processor with the AMD-V extensions:
flags   : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm cr8_legacy ts fid vid ttp tm stc
If any output is received, the processor has the hardware virtualization extensions. However, in some circumstances manufacturers disable the virtualization extensions in the BIOS.
The "flags:
" output content may appear multiple times, once for each hyperthread, core or CPU on the system.
Ensure KVM subsystem is loaded
As an additional check, verify that the kvm
modules are loaded in the kernel:
# lsmod | grep kvm
If the output includes kvm_intel
or kvm_amd
then the kvm
hardware virtualization modules are loaded and your system meets requirements.
If the libvirt package is installed, the virsh
command can output a full list of virtualization system capabilities. Run virsh capabilities
as root to receive the complete list.
9.5. Setting KVM processor affinities
libvirt refers to a NUMA node as a cell.
This section covers setting processor and processing core affinities with libvirt and KVM guests.
By default, libvirt provisions guests using the hypervisor's default policy. For most hypervisors, the policy is to run guests on any available processing core or CPU. There are times when an explicit policy may be better, particularly for systems with a NUMA (Non-Uniform Memory Access) architecture. A guest on a NUMA system can be pinned to a processing core so that its memory allocations are always local to the node it is running on. This avoids cross-node memory transports which have less bandwidth and can significantly degrade performance.
On non-NUMA systems some form of explicit placement across the host's sockets, cores and hyperthreads may be more efficient.
The virsh nodeinfo command provides information about the host's CPU and memory topology:
# virsh nodeinfo
CPU model:           x86_64
CPU(s):              8
CPU frequency:       1000 MHz
CPU socket(s):       2
Core(s) per socket:  4
Thread(s) per core:  1
NUMA cell(s):        2
Memory size:         8179176 kB
This output shows that the system has eight CPU cores and two sockets. Each CPU socket has four cores. This splitting of CPU cores across multiple sockets suggests that the system has Non-Uniform Memory Access (NUMA) architecture.
NUMA architecture can be more complex than other architectures. Use the virsh capabilities
command to get additional output data about the CPU configuration.
# virsh capabilities
<capabilities>
  <host>
    <cpu>
      <arch>x86_64</arch>
    </cpu>
    <migration_features>
      <live/>
      <uri_transports>
        <uri_transport>tcp</uri_transport>
      </uri_transports>
    </migration_features>
    <topology>
      <cells num='2'>
        <cell id='0'>
          <cpus num='4'>
            <cpu id='0'/>
            <cpu id='1'/>
            <cpu id='2'/>
            <cpu id='3'/>
          </cpus>
        </cell>
        <cell id='1'>
          <cpus num='4'>
            <cpu id='4'/>
            <cpu id='5'/>
            <cpu id='6'/>
            <cpu id='7'/>
          </cpus>
        </cell>
      </cells>
    </topology>
    <secmodel>
      <model>selinux</model>
      <doi>0</doi>
    </secmodel>
  </host>
[ Additional XML removed ]
</capabilities>
This output shows two NUMA nodes (also known as NUMA cells), each containing four logical CPUs (four processing cores). This system has two sockets, therefore it can be inferred that each socket is a separate NUMA node. For a guest with four virtual CPUs, it is optimal to lock the guest to physical CPUs 0 to 3, or 4 to 7, to avoid accessing non-local memory, which is significantly slower than accessing local memory.
If a guest requires eight virtual CPUs, you could run two sets of four virtual CPU guests and split the work between them, since each NUMA node only has four physical CPUs. Running across multiple NUMA nodes significantly degrades performance for physical and virtualized tasks.
Use the virsh freecell --all command to display the free memory on each NUMA node:
# virsh freecell --all
0: 2203620 kB
1: 3354784 kB
If a guest requires 3 GB of RAM allocated, then the guest should be run on NUMA node (cell) 1. Node 0 only has 2.2 GB free, which may not be sufficient for certain guests.
Extract from the virsh capabilities
output.
<topology>
  <cells num='2'>
    <cell id='0'>
      <cpus num='4'>
        <cpu id='0'/>
        <cpu id='1'/>
        <cpu id='2'/>
        <cpu id='3'/>
      </cpus>
    </cell>
    <cell id='1'>
      <cpus num='4'>
        <cpu id='4'/>
        <cpu id='5'/>
        <cpu id='6'/>
        <cpu id='7'/>
      </cpus>
    </cell>
  </cells>
</topology>
Observe that node 1, <cell id='1'>, uses physical CPUs 4 to 7.
The guest can be locked to a set of CPUs by appending the cpuset
attribute to the configuration file.
While the guest is offline, open the configuration file with virsh edit
.
Locate the guest's virtual CPU count, defined in the vcpus
element.
<vcpus>4</vcpus>
The guest in this example has four CPUs.
Add a cpuset
attribute with the CPU numbers for the relevant NUMA cell.
<vcpus cpuset='4-7'>4</vcpus>
Save the configuration file and restart the guest.
The guest has been locked to CPUs 4 to 7.
The cpuset option for virt-install can accept either a set of processors or the parameter auto. The auto parameter automatically determines the optimal CPU locking using the available NUMA data.
For a NUMA system, use the --cpuset=auto option with the virt-install command when creating new guests.
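A minimal sketch of such an invocation follows; the guest name, disk path and installation media are placeholders, and additional virt-install options will usually be required:
# virt-install --name rhel6guest --ram 2048 --vcpus 4 --cpuset=auto \
    --disk path=/var/lib/libvirt/images/rhel6guest.img,size=8 \
    --cdrom /var/lib/libvirt/images/rhel6.iso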
The virsh vcpuinfo
command gives up to date information about where each virtual CPU is running.
In this example, guest1 is a guest with four virtual CPUs running on a KVM host.
# virsh vcpuinfo guest1
VCPU:           0
CPU:            3
State:          running
CPU time:       0.5s
CPU Affinity:   yyyyyyyy
VCPU:           1
CPU:            1
State:          running
CPU Affinity:   yyyyyyyy
VCPU:           2
CPU:            1
State:          running
CPU Affinity:   yyyyyyyy
VCPU:           3
CPU:            2
State:          running
CPU Affinity:   yyyyyyyy
The virsh vcpuinfo
output (the yyyyyyyy
value of CPU Affinity
) shows that the guest can presently run on any CPU.
To lock the virtual CPUs to the second NUMA node (CPUs four to seven), run the following commands.
# virsh vcpupin guest1 0 4
# virsh vcpupin guest1 1 5
# virsh vcpupin guest1 2 6
# virsh vcpupin guest1 3 7
The virsh vcpuinfo
command confirms the change in affinity.
# virsh vcpuinfo guest1
VCPU:           0
CPU:            4
State:          running
CPU time:       32.2s
CPU Affinity:   ----y---
VCPU:           1
CPU:            5
State:          running
CPU time:       16.9s
CPU Affinity:   -----y--
VCPU:           2
CPU:            6
State:          running
CPU time:       11.9s
CPU Affinity:   ------y-
VCPU:           3
CPU:            7
State:          running
CPU time:       14.6s
CPU Affinity:   -------y
9.6. Generating a new unique MAC address
In some cases you will need to generate a new and unique MAC address for a guest. There is no command line tool available to generate a new MAC address at the time of writing. The script provided below can generate a new MAC address for your guests. Save the script to your guest as macgen.py
. Now from that directory you can run the script using ./macgen.py
and it will generate a new MAC address. A sample output would look like the following:
$ ./macgen.py
00:16:3e:20:b0:11
#!/usr/bin/python
# macgen.py script to generate a MAC address for guests
#
import random
#
def randomMAC():
    mac = [ 0x00, 0x16, 0x3e,
        random.randint(0x00, 0x7f),
        random.randint(0x00, 0xff),
        random.randint(0x00, 0xff) ]
    return ':'.join(map(lambda x: "%02x" % x, mac))
#
print randomMAC()
You can also use the built-in modules of python-virtinst to generate a new MAC address and UUID for use in a guest configuration file:
# echo 'import virtinst.util ; print\
 virtinst.util.uuidToString(virtinst.util.randomUUID())' | python
# echo 'import virtinst.util ; print virtinst.util.randomMAC()' | python
The script above can also be implemented as a script file as seen below.
#!/usr/bin/env python
# -*- mode: python; -*-
print ""
print "New UUID:"
import virtinst.util ; print virtinst.util.uuidToString(virtinst.util.randomUUID())
print "New MAC:"
import virtinst.util ; print virtinst.util.randomMAC()
print ""
9.7. Improving guest response time
Guests can sometimes be slow to respond with certain workloads and usage patterns. Examples of situations which may cause slow or unresponsive guests:
Severely overcommitted memory.
Overcommitted memory with high processor usage
Other (not qemu-kvm
processes) busy or stalled processes on the host.
These types of workload may cause guests to appear slow or unresponsive. Usually, the guest's memory is eventually fully loaded into the host's main memory from swap. Once the guest is loaded in main memory, the guest will perform normally. Note, the process of loading a guest from swap to main memory may take several seconds per gigabyte of RAM assigned to the guest, depending on the type of storage used for swap and the performance of the components.
KVM guests function as Linux processes. Linux processes are not permanently kept in main memory (physical RAM). The kernel scheduler swaps process memory into virtual memory (swap). Swap, with conventional hard disk drives, is thousands of times slower than main memory in modern computers. If a guest is inactive for long periods of time, the guest may be placed into swap by the kernel.
KVM guest processes may be moved to swap regardless of whether memory is overcommitted or overall memory usage is high.
Using unsafe overcommit levels, or overcommitting with swap turned off, is not recommended, as guest processes or other critical processes may be killed. Always ensure the host has sufficient swap space when overcommitting memory.
Virtual memory allows a Linux system to use more memory than there is physical RAM on the system. Underused processes are swapped out which allows active processes to use memory, improving memory utilization. Disabling swap reduces memory utilization as all processes are stored in physical RAM.
If swap is turned off, do not overcommit guests. Overcommitting guests without any swap can cause guests or the host system to crash.
The swapoff
command can disable all swap partitions and swap files on a system.
# swapoff -a
To make this change permanent, remove swap
lines from the /etc/fstab
file and restart the host system.
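As a hedged sketch, the swap entries can be commented out rather than deleted; review /etc/fstab manually before rebooting, as device names differ between systems.
# swapoff -a
# sed -i '/[[:space:]]swap[[:space:]]/ s/^/#/' /etc/fstab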
Using RAID arrays, faster disks or separate drives dedicated to swap may also improve performance.
9.8. Disable SMART disk monitoring for guests
SMART disk monitoring can be safely disabled as virtual disks and the physical storage devices are managed by the host.
# service smartd stop
# chkconfig --del smartd
9.9. Configuring a VNC Server
To configure a VNC server, use the Remote Desktop preferences application (System > Preferences > Remote Desktop). Alternatively, you can run the vino-preferences command.
Use the following steps to set up a dedicated VNC server session:
If needed, create and then edit the ~/.vnc/xstartup file to start a GNOME session whenever vncserver is started. The first time you run the vncserver script, it will ask you for a password to use for your VNC session. For more information on VNC server files, refer to the Red Hat Enterprise Linux Installation Guide.
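A minimal ~/.vnc/xstartup sketch that launches a GNOME session is shown below; it assumes a GNOME desktop is installed, and your environment may require additional setup.
#!/bin/sh
# Minimal xstartup sketch: start a full GNOME session for the VNC display
unset SESSION_MANAGER
exec gnome-session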
9.10. Gracefully shutting down guests
Installing virtualized Red Hat Enterprise Linux 6 guests with the Minimal installation
option will not install the acpid package.
Without the acpid package, the Red Hat Enterprise Linux 6 guest does not shut down when the virsh shutdown
command is executed. The virsh shutdown
command is designed to gracefully shut down guests.
Using virsh shutdown is easier and safer for system administration. Without graceful shut down with the virsh shutdown command, a system administrator must log into a guest manually or send the Ctrl-Alt-Del key combination to each guest.
Other virtualized operating systems may be affected by this issue. The virsh shutdown
command requires that the guest operating system is configured to handle ACPI shut down requests. Many operating systems require additional configuration on the guest operating system to accept ACPI shut down requests.
Procedure 9.1. Workaround for Red Hat Enterprise Linux 6
Install the acpid package
The acpid
service listens for and processes ACPI requests.
Log into the guest and install the acpid package on the guest:
# yum install acpid
Enable the acpid service
Set the acpid
service to start during the guest boot sequence and start the service:
# chkconfig acpid on
# service acpid start
The guest is now configured to shut down when the virsh shutdown
command is used.
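Once acpid is running in the guest, the graceful shutdown can be tested from the host; the guest name guest1 is a placeholder.
# virsh shutdown guest1
# virsh list --all          # the guest should move to the "shut off" state once it powers down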
9.11. Virtual machine timer management with libvirt
Accurate time keeping on guests is a key challenge for virtualization platforms. Different hypervisors attempt to handle the problem of time keeping in a variety of ways. Libvirt provides hypervisor-independent configuration settings for time management, using the <clock> and <timer> elements in the domain XML. The domain XML can be edited using the virsh edit command. See Editing a guest's configuration file for details.
Table 9.1. Offset attribute values
Value | Description |
---|---
utc | The guest clock will be synchronized to UTC when booted. |
localtime | The guest clock will be synchronized to the host's configured timezone when booted, if any. |
timezone | The guest clock will be synchronized to a given timezone, specified by the timezone attribute. |
variable | The guest clock will be synchronized to an arbitrary offset from UTC. The delta relative to UTC is specified in seconds, using the adjustment attribute. The guest is free to adjust the Real Time Clock (RTC) over time and expect that it will be honored following the next reboot. This is in contrast to utc mode, where any RTC adjustments are lost at each reboot. |
The value utc is set as the clock offset in a virtual machine by default. However, if the guest clock is run with the localtime value, the clock offset needs to be changed to a different value in order to have the guest clock synchronized with the host clock.
Example 9.1. Always synchronize to UTC
<clock offset="utc" />
Example 9.2. Always synchronize to the host timezone
<clock offset="localtime" />
Example 9.3. Synchronize to an arbitrary timezone
<clock offset="timezone" timezone="Europe/Paris" />
Example 9.4. Synchronize to UTC + arbitrary offset
<clock offset="variable" adjustment="123456" />
Table 9.2. name attribute values
Value | Description |
---|---
platform | The master virtual time source which may be used to drive the policy of other time sources. |
pit | Programmable Interval Timer - a timer with periodic interrupts. |
rtc | Real Time Clock - a continuously running timer with periodic interrupts. |
hpet | High Precision Event Timer - multiple timers with periodic interrupts. |
tsc | Time Stamp Counter - counts the number of ticks since reset, no interrupts. |
kvmclock | KVM clock - recommended clock source for KVM guests. KVM pvclock, or kvm-clock, lets guests read the host's wall clock time. |
Table 9.3. track attribute values
Value | Description |
---|---
boot | Corresponds to old host option, this is an unsupported tracking option. |
guest | RTC always tracks guest time. |
wall | RTC always tracks host time. |
Table 9.4. tickpolicy attribute values
Value | Description |
---|---
delay | Continue to deliver at normal rate (i.e. ticks are delayed). |
catchup | Deliver at a higher rate to catch up. |
merge | Ticks merged into one single tick. |
discard | All missed ticks are discarded. |
Table 9.5. mode attribute values
Value | Description |
---|---
auto | Emulate TSC if it is unstable, otherwise allow native TSC access. |
native | Always allow native TSC access. |
emulate | Always emulate TSC. |
smpsafe | Always emulate TSC and interlock SMP. |
Table 9.6. present attribute values
Value | Description |
---|---
yes | Force this timer to be visible to the guest. |
no | Force this timer to not be visible to the guest. |
Example 9.5. Clock synchronizing to local time with RTC and PIT timers, and the HPET timer disabled
<clock offset="localtime">
  <timer name="rtc" tickpolicy="catchup" track="guest" />
  <timer name="pit" tickpolicy="delay" />
  <timer name="hpet" present="no" />
</clock>
9.12. Using PMU to monitor guest performance
In Red Hat Enterprise Linux 6.4, vPMU (virtual PMU) was introduced as a Technology Preview. vPMU is based on Intel's PMU (Performance Monitoring Units) and may only be used on Intel machines. PMU allows the tracking of statistics which indicate how a guest virtual machine is functioning.
Using performance monitoring allows developers to use the CPU's PMU counters while using the perf tool for profiling. The virtual performance monitoring unit feature allows virtual machine users to identify sources of possible performance problems in their guest virtual machines, thereby improving the ability to profile a KVM guest virtual machine.
To enable the feature, the -cpu host
flag must be set.
This feature is only supported with guests running Red Hat Enterprise Linux 6 and is disabled by default. This feature only works using the Linux perf tool. Make sure the perf package is installed using the following command:
# yum install perf
See the man page on perf
for more information on the perf commands.
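As a brief example, once the guest has been started with -cpu host and the perf package is installed inside the guest, hardware counters can be read with perf stat; the workload shown here is arbitrary.
# perf stat -e cycles,instructions,cache-misses sleep 5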
9.13. Guest virtual machine power management
It is possible to forcibly enable or disable BIOS advertisements to the guest virtual machine's operating system by changing the following parameters in the Domain XML for Libvirt:
...
<pm>
  <suspend-to-disk enabled='no'/>
  <suspend-to-mem enabled='yes'/>
</pm>
...
The pm element enables ('yes') or disables ('no') BIOS support for the S3 (suspend-to-mem) and S4 (suspend-to-disk) ACPI sleep states. If nothing is specified, the hypervisor is left with its default value.
9.14. QEMU Guest Agent Protocol
The QEMU guest agent protocol (QEMU GA) uses the same protocol as QMP. qemu-ga is provided as a Technology Preview in Red Hat Enterprise Linux 6.4. There are a couple of issues regarding its isa-serial/virtio-serial transport, and the following caveats have been noted:
There is no way for qemu-ga to detect whether or not a client has connected to the channel.
There is no way for a client to detect whether or not qemu-ga has disconnected or reconnected to the backend.
If the virtio-serial device resets and qemu-ga has not connected to the channel (generally caused by a reboot or hotplug), data from the client will be dropped.
If qemu-ga has connected to the channel following a virtio-serial device reset, data from the client will be queued (and eventually throttled if available buffers are exhausted), regardless of whether or not qemu-ga is still running/connected.
qemu-ga uses the guest-sync or guest-sync-delimited command to address the problem of re-synchronizing the channel after re-connection or client-side timeouts. These are described below.
9.14.1. guest-sync
The guest-sync request/response exchange is simple. The client provides a unique numerical token, and the agent sends it back in a response:
> { "execute": "guest-sync", "arguments": { "id": 123456 } } < { "return": 123456}
A successful exchange guarantees that the channel is now in sync and no unexpected data/responses will be sent. Note that for the reasons mentioned above there's no guarantee this request will be answered, so a client should implement a timeout and re-issue this periodically until a response is received for the most recent request.
This alone does not handle synchronization issues in all cases. For example, if qemu-ga's parser previously received a partial request from a previous client connection, subsequent attempts to issue the guest-sync request can be misconstrued as being part of the previous partial request. Eventually qemu-ga will hit its recursion or token size limit and flush its parser state, at which point it will begin processing the backlog of requests, but there is no guarantee this will occur before the channel is throttled due to exhausting all available buffers. Thus, there is a potential for a deadlock situation in certain instances.
To avoid this, qemu-ga/QEMU's JSON parser has special handling for the 0xFF byte, which is an invalid UTF-8 character. Client requests should precede the guest-sync request with a 0xFF byte to ensure that qemu-ga flushes its parser state as soon as possible. As long as all clients abide by this, the deadlock state should be reliably avoidable.
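The following is a minimal sketch of issuing such a request from the host with socat; the UNIX socket path is an assumption and depends on how the guest's virtio-serial channel is configured. The leading \xff byte flushes qemu-ga's parser state before the guest-sync request is processed.
# printf '\xff{"execute":"guest-sync","arguments":{"id":123456}}\n' | socat - UNIX-CONNECT:/var/lib/libvirt/qemu/guest1.agent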
9.14.2. guest-sync-delimited
If qemu-ga attempts to communicate with a client, and the client receives a partial response from a previous qemu-ga instance, the client might misconstrue responses to guest-sync as being part of this previous request. For client implementations that treat newlines as a delimiter for qemu-ga responses, use guest-sync-delimited.
Even for JSON stream-based implementations that do not rely on newline delimiters, it may be considered invasive to require the client's response/JSON handling to deal with the same deadlock scenario described previously. Using guest-sync-delimited on the client tells qemu-ga to place the same 0xFF character in front of the response, thereby preventing this confusion.
> { "execute": "guest-sync-delimited", "arguments": { "id": 123456 } }< { "return": 123456}
Actual hex values sent:
> 7b 27 65 78 65 63 75 74 65 27 3a 27 67 75 65 73 74 2d 73 79 6e 63 2d 64 65 6c 69 6d 69 74 65 64 27 2c 27 61 72 67 75 6d 65 6e 74 73 27 3a 7b 27 69 64 27 3a 31 32 33 34 35 36 7d 7d 0a
< ff 7b 22 72 65 74 75 72 6e 22 3a 20 31 32 33 34 35 36 7d 0a
As stated above, the request should also be preceded with a 0xFF to flush qemu-ga's parser state.
9.15. Setting a limit on device redirection
To filter out certain devices from redirection, pass the filter property to -device usb-redir
. The filter property takes a string consisting of filter rules, the format for a rule is:
<class>:<vendor>:<product>:<version>:<allow>
Use the value -1 to accept any value for a particular field. You may use multiple rules on the same command line, using | as a separator. Note that if a device matches none of the rules passed in, redirecting it will not be allowed.
Example 9.6. An example of limiting redirection with a Windows guest virtual machine
Prepare a Windows 7 guest.
Add the following code excerpt to the guest's domain XML file:
<redirdev bus='usb' type='spicevmc'>
  <alias name='redir0'/>
  <address type='usb' bus='0' port='3'/>
</redirdev>
<redirfilter>
  <usbdev class='0x08' vendor='0x1234' product='0xBEEF' version='2.0' allow='yes'/>
  <usbdev class='-1' vendor='-1' product='-1' version='-1' allow='no'/>
</redirfilter>
Start the guest and confirm the setting changes by running the following:
# ps -ef | grep $guest_name
-device usb-redir,chardev=charredir0,id=redir0,/
filter=0x08:0x1234:0xBEEF:0x0200:1|-1:-1:-1:-1:0,bus=usb.0,port=3
Plug a USB device into host, and use virt-viewer to connect to the guest.
Click in the menu, which will produce the following message: "Some USB devices are blocked by host policy". Click to confirm and continue.
The filter takes effect.
To make sure that the filter captures properly check the USB device vendor and product, then make the following changes in the host's domain XML to allow for USB redirection.
<redirfilter>
  <usbdev class='0x08' vendor='0x0951' product='0x1625' version='2.0' allow='yes'/>
  <usbdev allow='no'/>
</redirfilter>
Restart the guest, then use virt-viewer to connect to the guest. The USB device will now redirect traffic to the guest.
9.16. Dynamically changing a host or a network bridge that is attached to a virtual NIC
This section demonstrates how to move the vNIC of a guest from one bridge to another while the guest is running, without compromising the guest.
Prepare the guest with a configuration similar to the following:
<interface type='bridge'>
  <mac address='52:54:00:4a:c9:5e'/>
  <source bridge='virbr0'/>
  <model type='virtio'/>
</interface>
Prepare an XML file for interface update:
# cat br1.xml
<interface type='bridge'>
  <mac address='52:54:00:4a:c9:5e'/>
  <source bridge='virbr1'/>
  <model type='virtio'/>
</interface>
Start the guest, confirm the guest's network functionality, and check that the guest's vnetX is connected to the bridge you indicated.
# brctl show
bridge name     bridge id               STP enabled     interfaces
virbr0          8000.5254007da9f2       yes             virbr0-nic
                                                        vnet0
virbr1          8000.525400682996       yes             virbr1-nic
Update the guest's network with the new interface parameters with the following command:
# virsh update-device test1 br1.xml
Device updated successfully
On the guest, run service network restart. The guest gets a new IP address for virbr1. Check that the guest's vnet0 is connected to the new bridge (virbr1):
# brctl show
bridge name     bridge id               STP enabled     interfaces
virbr0          8000.5254007da9f2       yes             virbr0-nic
virbr1          8000.525400682996       yes             virbr1-nic
                                                        vnet0