author | Richard Jones <rjones@redhat.com> | 2009-12-31 12:26:04 +0000 |
---|---|---|
committer | Richard Jones <rjones@redhat.com> | 2009-12-31 12:26:04 +0000 |
commit | 8980c01b46eafcf4b5dc127e4696c2cbe1bff09f (patch) | |
tree | bc8dc1c3867f44cd0b5d5650202d832b102a6010 /src/guestfs.pod | |
parent | 500ab86509e819cdf9c379f0bf6f0076bdd2ab68 (diff) | |
Move guestfs(3) and guestfish(1) man pages into subdirectories.
These manual pages have for a very long time 'lived' in the top
source directory.
Clean up this situation by moving those manual pages (plus associated
generated files) into the src/ and fish/ subdirectories respectively.
Diffstat (limited to 'src/guestfs.pod')
-rw-r--r-- | src/guestfs.pod | 1300 |
1 files changed, 1300 insertions, 0 deletions
diff --git a/src/guestfs.pod b/src/guestfs.pod new file mode 100644 index 00000000..30759602 --- /dev/null +++ b/src/guestfs.pod @@ -0,0 +1,1300 @@ +=encoding utf8 + +=head1 NAME + +guestfs - Library for accessing and modifying virtual machine images + +=head1 SYNOPSIS + + #include <guestfs.h> + + guestfs_h *handle = guestfs_create (); + guestfs_add_drive (handle, "guest.img"); + guestfs_launch (handle); + guestfs_mount (handle, "/dev/sda1", "/"); + guestfs_touch (handle, "/hello"); + guestfs_sync (handle); + guestfs_close (handle); + +=head1 DESCRIPTION + +Libguestfs is a library for accessing and modifying guest disk images. +Amongst the things this is good for: making batch configuration +changes to guests, getting disk used/free statistics (see also: +virt-df), migrating between virtualization systems (see also: +virt-p2v), performing partial backups, performing partial guest +clones, cloning guests and changing registry/UUID/hostname info, and +much else besides. + +Libguestfs uses Linux kernel and qemu code, and can access any type of +guest filesystem that Linux and qemu can, including but not limited +to: ext2/3/4, btrfs, FAT and NTFS, LVM, many different disk partition +schemes, qcow, qcow2, vmdk. + +Libguestfs provides ways to enumerate guest storage (eg. partitions, +LVs, what filesystem is in each LV, etc.). It can also run commands +in the context of the guest. Also you can access filesystems over FTP. + +Libguestfs is a library that can be linked with C and C++ management +programs (or management programs written in OCaml, Perl, Python, Ruby, Java +or Haskell). You can also use it from shell scripts or the command line. + +You don't need to be root to use libguestfs, although obviously you do +need enough permissions to access the disk images. + +Libguestfs is a large API because it can do many things. For a gentle +introduction, please read the L</API OVERVIEW> section next. 
+ +=head1 API OVERVIEW + +This section provides a gentler overview of the libguestfs API. We +also try to group API calls together, where that may not be obvious +from reading about the individual calls below. + +=head2 HANDLES + +Before you can use libguestfs calls, you have to create a handle. +Then you must add at least one disk image to the handle, followed by +launching the handle, then performing whatever operations you want, +and finally closing the handle. So the general structure of all +libguestfs-using programs looks like this: + + guestfs_h *handle = guestfs_create (); + + /* Call guestfs_add_drive additional times if there are + * multiple disk images. + */ + guestfs_add_drive (handle, "guest.img"); + + /* Most manipulation calls won't work until you've launched + * the handle. You have to do this _after_ adding drives + * and _before_ other commands. + */ + guestfs_launch (handle); + + /* Now you can examine what partitions, LVs etc are available. + */ + char **partitions = guestfs_list_partitions (handle); + char **logvols = guestfs_lvs (handle); + + /* To access a filesystem in the image, you must mount it. + */ + guestfs_mount (handle, "/dev/sda1", "/"); + + /* Now you can perform filesystem actions on the guest + * disk image. + */ + guestfs_touch (handle, "/hello"); + + /* You only need to call guestfs_sync if you have made + * changes to the guest image. + */ + guestfs_sync (handle); + + /* Close the handle. */ + guestfs_close (handle); + +The code above doesn't include any error checking. In real code you +should check return values carefully for errors. In general all +functions that return integers return C<-1> on error, and all +functions that return pointers return C<NULL> on error. See section +L</ERROR HANDLING> below for how to handle errors, and consult the +documentation for each function call below to see precisely how they +return error indications. 
+ +=head2 DISK IMAGES + +The image filename (C<"guest.img"> in the example above) could be a +disk image from a virtual machine, a L<dd(1)> copy of a physical hard +disk, an actual block device, or simply an empty file of zeroes that +you have created through L<posix_fallocate(3)>. Libguestfs lets you +do useful things to all of these. + +You can add a disk read-only using C<guestfs_add_drive_ro>, in which +case libguestfs won't modify the file. + +Be extremely cautious if the disk image is in use, eg. if it is being +used by a virtual machine. Adding it read-write will almost certainly +cause disk corruption, but adding it read-only is safe. + +You must add at least one disk image, and you may add multiple disk +images. In the API, the disk images are usually referred to as +C</dev/sda> (for the first one you added), C</dev/sdb> (for the second +one you added), etc. + +Once C<guestfs_launch> has been called you cannot add any more images. +You can call C<guestfs_list_devices> to get a list of the device +names, in the order that you added them. See also L</BLOCK DEVICE +NAMING> below. + +=head2 MOUNTING + +Before you can read or write files, create directories and so on in a +disk image that contains filesystems, you have to mount those +filesystems using C<guestfs_mount>. If you already know that a disk +image contains (for example) one partition with a filesystem on that +partition, then you can mount it directly: + + guestfs_mount (handle, "/dev/sda1", "/"); + +where C</dev/sda1> means literally the first partition (C<1>) of the +first disk image that we added (C</dev/sda>). If the disk contains +Linux LVM2 logical volumes you could refer to those instead (eg. C</dev/VG/LV>). + +If you are given a disk image and you don't know what it contains then +you have to find out. 
Libguestfs can do that too: use +C<guestfs_list_partitions> and C<guestfs_lvs> to list possible +partitions and LVs, and either try mounting each to see what is +mountable, or else examine them with C<guestfs_file>. But you might +find it easier to look at higher level programs built on top of +libguestfs, in particular L<virt-inspector(1)>. + +To mount a disk image read-only, use C<guestfs_mount_ro>. There are +several other variations of the C<guestfs_mount_*> call. + +=head2 FILESYSTEM ACCESS AND MODIFICATION + +The majority of the libguestfs API consists of fairly low-level calls +for accessing and modifying the files, directories, symlinks etc on +mounted filesystems. There are over a hundred such calls which you +can find listed in detail below in this man page, and we don't even +pretend to cover them all in this overview. + +Specify filenames as full paths including the mount point. + +For example, if you mounted a filesystem at C<"/"> and you want to +read the file called C<"etc/passwd"> then you could do: + + char *data = guestfs_cat (handle, "/etc/passwd"); + +This would return C<data> as a newly allocated buffer containing the +full content of that file (with some conditions: see also +L</DOWNLOADING> below), or C<NULL> if there was an error. + +As another example, to create a top-level directory on that filesystem +called C<"var"> you would do: + + guestfs_mkdir (handle, "/var"); + +To create a symlink you could do: + + guestfs_ln_s (handle, "/etc/init.d/portmap", + "/etc/rc3.d/S30portmap"); + +Libguestfs will reject attempts to use relative paths. There is no +concept of a current working directory. Libguestfs can return errors +in many situations: for example if the filesystem isn't writable, or +if a file or directory that you requested doesn't exist. If you are +using the C API (documented here) you have to check for those error +conditions after each call. (Other language bindings turn these +errors into exceptions). 
+ +File writes are affected by the per-handle umask, set by calling +C<guestfs_umask> and defaulting to 022. + +=head2 PARTITIONING + +Libguestfs contains API calls to read, create and modify partition +tables on disk images. + +In the common case where you want to create a single partition +covering the whole disk, you should use the C<guestfs_part_disk> +call: + + const char *parttype = "mbr"; + if (disk_is_larger_than_2TB) + parttype = "gpt"; + guestfs_part_disk (g, "/dev/sda", parttype); + +Obviously this effectively wipes anything that was on that disk image +before. + +In general MBR partitions are both unnecessarily complicated and +depend on archaic details, namely the Cylinder-Head-Sector (CHS) +geometry of the disk. C<guestfs_sfdiskM> can be used to +create more complex arrangements where the relative sizes are +expressed in megabytes instead of cylinders, which is a small win. +C<guestfs_sfdiskM> will choose the nearest cylinder to approximate the +requested size. There's a lot of crazy stuff to do with IDE and +virtio disks having different, incompatible CHS geometries, that you +probably don't want to know about. + +My advice: make a single partition to cover the whole disk, then use +LVM on top. + +=head2 LVM2 + +Libguestfs provides access to a large part of the LVM2 API, such as +C<guestfs_lvcreate> and C<guestfs_vgremove>. It won't make much sense +unless you familiarize yourself with the concepts of physical volumes, +volume groups and logical volumes. + +This author strongly recommends reading the LVM HOWTO, online at +L<http://tldp.org/HOWTO/LVM-HOWTO/>. + +=head2 DOWNLOADING + +Use C<guestfs_cat> to download small, text only files. This call +is limited to files which are less than 2 MB and which cannot contain +any ASCII NUL (C<\0>) characters. However it has a very simple +to use API. + +C<guestfs_read_file> can be used to read files which contain +arbitrary 8 bit data, since it returns a (pointer, size) pair. 
+However it is still limited to "small" files, less than 2 MB. + +C<guestfs_download> can be used to download any file, with no +limits on content or size (even files larger than 4 GB). + +To download multiple files, see C<guestfs_tar_out> and +C<guestfs_tgz_out>. + +=head2 UPLOADING + +It's often the case that you want to write a file or files to the disk +image. + +For small, single files, use C<guestfs_write_file>. This call +currently contains a bug which limits the call to plain text files +(not containing ASCII NUL characters). + +To upload a single file, use C<guestfs_upload>. This call has no +limits on file content or size (even files larger than 4 GB). + +To upload multiple files, see C<guestfs_tar_in> and C<guestfs_tgz_in>. + +However the fastest way to upload I<large numbers of arbitrary files> +is to turn them into a squashfs or CD ISO (see L<mksquashfs(8)> and +L<mkisofs(8)>), then attach this using C<guestfs_add_drive_ro>. If +you add the drive in a predictable way (eg. adding it last after all +other drives) then you can get the device name from +C<guestfs_list_devices> and mount it directly using +C<guestfs_mount_ro>. Note that squashfs images are sometimes +non-portable between kernel versions, and they don't support labels or +UUIDs. If you want to pre-build an image or you need to mount it +using a label or UUID, use an ISO image instead. + +=head2 COPYING + +There are various different commands for copying between files and +devices and in and out of the guest filesystem. These are summarised +in the table below. + +=over 4 + +=item B<file> to B<file> + +Use L</guestfs_cp> to copy a single file, or +L</guestfs_cp_a> to copy directories recursively. + +=item B<file or device> to B<file or device> + +Use L</guestfs_dd> which efficiently uses L<dd(1)> +to copy between files and devices in the guest. 
+ +Example: duplicate the contents of an LV: + + guestfs_dd (g, "/dev/VG/Original", "/dev/VG/Copy"); + +The destination (C</dev/VG/Copy>) must be at least as large as the +source (C</dev/VG/Original>). + +=item B<file on the host> to B<file or device> + +Use L</guestfs_upload>. See L</UPLOADING> above. + +=item B<file or device> to B<file on the host> + +Use L</guestfs_download>. See L</DOWNLOADING> above. + +=back + +=head2 LISTING FILES + +C<guestfs_ll> is just designed for humans to read (mainly when using +the L<guestfish(1)>-equivalent command C<ll>). + +C<guestfs_ls> is a quick way to get a list of files in a directory +from programs, as a flat list of strings. + +C<guestfs_readdir> is a programmatic way to get a list of files in a +directory, plus additional information about each one. It is more +equivalent to using the L<readdir(3)> call on a local filesystem. + +C<guestfs_find> can be used to recursively list files. + +=head2 RUNNING COMMANDS + +Although libguestfs is primarily an API for manipulating files +inside guest images, we also provide some limited facilities for +running commands inside guests. + +There are many limitations to this: + +=over 4 + +=item * + +The kernel version that the command runs under will be different +from what it expects. + +=item * + +If the command needs to communicate with daemons, then most likely +they won't be running. + +=item * + +The command will be running in limited memory. + +=item * + +Only supports Linux guests (not Windows, BSD, etc). + +=item * + +Architecture limitations (eg. won't work for a PPC guest on +an X86 host). + +=item * + +For SELinux guests, you may need to enable SELinux and load policy +first. See L</SELINUX> in this manpage. + +=back + +The two main API calls to run commands are C<guestfs_command> and +C<guestfs_sh> (there are also variations). + +The difference is that C<guestfs_sh> runs commands using the shell, so +any shell globs, redirections, etc will work. 
+ +=head2 CONFIGURATION FILES + +To read and write configuration files in Linux guest filesystems, we +strongly recommend using Augeas. For example, Augeas understands how +to read and write, say, a Linux shadow password file or X.org +configuration file, and so avoids you having to write that code. + +The main Augeas calls are bound through the C<guestfs_aug_*> APIs. We +don't document Augeas itself here because there is excellent +documentation on the L<http://augeas.net/> website. + +If you don't want to use Augeas (you fool!) then try calling +C<guestfs_read_lines> to get the file as a list of lines which +you can iterate over. + +=head2 SELINUX + +We support SELinux guests. To ensure that labeling happens correctly +in SELinux guests, you need to enable SELinux and load the guest's +policy: + +=over 4 + +=item 1. + +Before launching, do: + + guestfs_set_selinux (g, 1); + +=item 2. + +After mounting the guest's filesystem(s), load the policy. This +is best done by running the L<load_policy(8)> command in the +guest itself: + + guestfs_sh (g, "/usr/sbin/load_policy"); + +(Older versions of C<load_policy> require you to specify the +name of the policy file). + +=item 3. + +Optionally, set the security context for the API. The correct +security context to use can only be known by inspecting the +guest. As an example: + + guestfs_setcon (g, "unconfined_u:unconfined_r:unconfined_t:s0"); + +=back + +This will work for running commands and editing existing files. + +When new files are created, you may need to label them explicitly, +for example by running the external command +C<restorecon pathname>. + +=head2 SPECIAL CONSIDERATIONS FOR WINDOWS GUESTS + +Libguestfs can mount NTFS partitions. It does this using the +L<http://www.ntfs-3g.org/> driver. 
+ +DOS and Windows still use drive letters, and the filesystems are +always treated as case insensitive by Windows itself, and therefore +you might find a Windows configuration file referring to a path like +C<c:\windows\system32>. When the filesystem is mounted in libguestfs, +that directory might be referred to as C</WINDOWS/System32>. + +Drive letter mappings are outside the scope of libguestfs. You have +to use libguestfs to read the appropriate Windows Registry and +configuration files, to determine yourself how drives are mapped (see +also L<virt-inspector(1)>). + +Replacing backslash characters with forward slash characters is also +outside the scope of libguestfs, but something that you can easily do. + +Where we can help is in resolving the case insensitivity of paths. +For this, call C<guestfs_case_sensitive_path>. + +Libguestfs also provides some help for decoding Windows Registry +"hive" files, through the library C<libhivex> which is part of +libguestfs. You have to locate and download the hive file(s) +yourself, and then pass them to C<libhivex> functions. See also the +programs L<hivexml(1)>, L<hivexget(1)> and L<virt-win-reg(1)> for more +help on this issue. + +=head2 USING LIBGUESTFS WITH OTHER PROGRAMMING LANGUAGES + +Although we don't want to discourage you from using the C API, we will +mention here that the same API is also available in other languages. + +The API is broadly identical in all supported languages. This means +that the C call C<guestfs_mount(handle,path)> is +C<$handle-E<gt>mount($path)> in Perl, C<handle.mount(path)> in Python, +and C<Guestfs.mount handle path> in OCaml. In other words, a +straightforward, predictable isomorphism between each language. + +Error messages are automatically transformed +into exceptions if the language supports it. 
+ +We don't try to "object orientify" parts of the API in OO languages, +although contributors are welcome to write higher level APIs above +what we provide in their favourite languages if they wish. + +=over 4 + +=item B<C++> + +You can use the I<guestfs.h> header file from C++ programs. The C++ +API is identical to the C API. C++ classes and exceptions are +not implemented. + +=item B<Haskell> + +This is the only language binding that is incomplete. Only calls +which return simple integers have been bound in Haskell, and we are +looking for help to complete this binding. + +=item B<Java> + +Full documentation is contained in the Javadoc which is distributed +with libguestfs. + +=item B<OCaml> + +For documentation see the file C<guestfs.mli>. + +=item B<Perl> + +For documentation see L<Sys::Guestfs(3)>. + +=item B<Python> + +For documentation do: + + $ python + >>> import guestfs + >>> help (guestfs) + +=item B<Ruby> + +Use the Guestfs module. There is no Ruby-specific documentation, but +you can find examples written in Ruby in the libguestfs source. + +=item B<shell scripts> + +For documentation see L<guestfish(1)>. + +=back + +=head1 CONNECTION MANAGEMENT + +=head2 guestfs_h * + +C<guestfs_h> is the opaque type representing a connection handle. +Create a handle by calling C<guestfs_create>. Call C<guestfs_close> +to free the handle and release all resources used. + +For information on using multiple handles and threads, see the section +L</MULTIPLE HANDLES AND MULTIPLE THREADS> below. + +=head2 guestfs_create + + guestfs_h *guestfs_create (void); + +Create a connection handle. + +You have to call C<guestfs_add_drive> on the handle at least once. + +This function returns a non-NULL pointer to a handle on success or +NULL on error. + +After configuring the handle, you have to call C<guestfs_launch>. + +You may also want to configure error handling for the handle. See +L</ERROR HANDLING> section below. 
+ +=head2 guestfs_close + + void guestfs_close (guestfs_h *handle); + +This closes the connection handle and frees up all resources used. + +=head1 ERROR HANDLING + +The convention in all functions that return C<int> is that they return +C<-1> to indicate an error. You can get additional information on +errors by calling C<guestfs_last_error> and/or by setting up an error +handler with C<guestfs_set_error_handler>. + +The default error handler prints the information string to C<stderr>. + +Out of memory errors are handled differently. The default action is +to call L<abort(3)>. If this is undesirable, then you can set a +handler using C<guestfs_set_out_of_memory_handler>. + +=head2 guestfs_last_error + + const char *guestfs_last_error (guestfs_h *handle); + +This returns the last error message that happened on C<handle>. If +there has not been an error since the handle was created, then this +returns C<NULL>. + +The lifetime of the returned string is until the next error occurs, or +C<guestfs_close> is called. + +The error string is not localized (ie. is always in English), because +this makes searching for error messages in search engines give the +largest number of results. + +=head2 guestfs_set_error_handler + + typedef void (*guestfs_error_handler_cb) (guestfs_h *handle, + void *data, + const char *msg); + void guestfs_set_error_handler (guestfs_h *handle, + guestfs_error_handler_cb cb, + void *data); + +The callback C<cb> will be called if there is an error. The +parameters passed to the callback are an opaque data pointer and the +error message string. + +Note that the message string C<msg> is freed as soon as the callback +function returns, so if you want to stash it somewhere you must make +your own copy. + +The default handler prints messages on C<stderr>. + +If you set C<cb> to C<NULL> then I<no> handler is called. 
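Because C<msg> is freed as soon as the callback returns, a handler that wants to keep the error text has to copy it. The following is a minimal sketch of such a handler, not part of libguestfs: C<struct saved_error> and C<save_error_cb> are hypothetical names, and C<guestfs_h> is forward-declared only so that the fragment compiles on its own.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in so this fragment compiles alone.  In a real program,
 * replace this line with #include <guestfs.h>. */
typedef struct guestfs_h guestfs_h;

/* Hypothetical container for the most recent error text. */
struct saved_error {
  char text[256];   /* private copy of the last message (truncated) */
  int count;        /* number of errors seen so far */
};

/* Has the same shape as guestfs_error_handler_cb shown above. */
void
save_error_cb (guestfs_h *handle, void *data, const char *msg)
{
  struct saved_error *saved = data;

  (void) handle;    /* unused in this sketch */

  /* msg is only valid until we return, so take a copy. */
  snprintf (saved->text, sizeof saved->text, "%s", msg);
  saved->count++;
}
```

It would be registered with C<guestfs_set_error_handler (handle, save_error_cb, &saved)>, where C<saved> is a C<struct saved_error> that outlives the handle.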
+ +=head2 guestfs_get_error_handler + + guestfs_error_handler_cb guestfs_get_error_handler (guestfs_h *handle, + void **data_rtn); + +Returns the current error handler callback. + +=head2 guestfs_set_out_of_memory_handler + + typedef void (*guestfs_abort_cb) (void); + int guestfs_set_out_of_memory_handler (guestfs_h *handle, + guestfs_abort_cb cb); + +The callback C<cb> will be called if there is an out of memory +situation. I<Note this callback must not return>. + +The default is to call L<abort(3)>. + +You cannot set C<cb> to C<NULL>. You can't ignore out of memory +situations. + +=head2 guestfs_get_out_of_memory_handler + + guestfs_abort_cb guestfs_get_out_of_memory_handler (guestfs_h *handle); + +This returns the current out of memory handler. + +=head1 PATH + +Libguestfs needs a kernel and initrd.img, which it finds by looking +along an internal path. + +By default it looks for these in the directory C<$libdir/guestfs> +(eg. C</usr/local/lib/guestfs> or C</usr/lib64/guestfs>). + +Use C<guestfs_set_path> or set the environment variable +C<LIBGUESTFS_PATH> to change the directories that libguestfs will +search in. The value is a colon-separated list of paths. The current +directory is I<not> searched unless the path contains an empty element +or C<.>. For example C<LIBGUESTFS_PATH=:/usr/lib/guestfs> would +search the current directory and then C</usr/lib/guestfs>. + +=head1 HIGH-LEVEL API ACTIONS + +=head2 ABI GUARANTEE + +We guarantee the libguestfs ABI (binary interface), for public, +high-level actions as outlined in this section. Although we will +deprecate some actions, for example if they get replaced by newer +calls, we will keep the old actions forever. This allows you the +developer to program in confidence against libguestfs. + +@ACTIONS@ + +=head1 STRUCTURES + +@STRUCTS@ + +=head1 AVAILABILITY + +=head2 GROUPS OF FUNCTIONALITY IN THE APPLIANCE + +Using L</guestfs_available> you can test availability of +the following groups of functions. 
This test queries the +appliance to see if the appliance you are currently using +supports the functionality. + +@AVAILABILITY@ + +=head2 SINGLE CALLS AT COMPILE TIME + +If you need to test whether a single libguestfs function is +available at compile time, we recommend using build tools +such as autoconf or cmake. For example in autotools you could +use: + + AC_CHECK_LIB([guestfs],[guestfs_create]) + AC_CHECK_FUNCS([guestfs_dd]) + +which would result in C<HAVE_GUESTFS_DD> being either defined +or not defined in your program. + +=head2 SINGLE CALLS AT RUN TIME + +Testing at compile time doesn't guarantee that a function really +exists in the library. The reason is that you might be dynamically +linked against a previous I<libguestfs.so> (dynamic library) +which doesn't have the call. This situation unfortunately results +in a segmentation fault, which is a shortcoming of the C dynamic +linking system itself. + +You can use L<dlopen(3)> to test if a function is available +at run time, as in this example program (note that you still +need the compile time check as well): + + #include <config.h> + + #include <stdio.h> + #include <stdlib.h> + #include <unistd.h> + #include <dlfcn.h> + #include <guestfs.h> + + main () + { + #ifdef HAVE_GUESTFS_DD + void *dl; + int has_function; + + /* Test if the function guestfs_dd is really available. */ + dl = dlopen (NULL, RTLD_LAZY); + if (!dl) { + fprintf (stderr, "dlopen: %s\n", dlerror ()); + exit (1); + } + has_function = dlsym (dl, "guestfs_dd") != NULL; + dlclose (dl); + + if (!has_function) + printf ("this libguestfs.so does NOT have guestfs_dd function\n"); + else { + printf ("this libguestfs.so has guestfs_dd function\n"); + /* Now it's safe to call + guestfs_dd (g, "foo", "bar"); + */ + } + #else + printf ("guestfs_dd function was not found at compile time\n"); + #endif + } + +You may think the above is an awful lot of hassle, and it is. 
+There are other ways outside of the C linking system to ensure +that this kind of incompatibility never arises, such as using +package versioning: + + Requires: libguestfs >= 1.0.80 + +=begin html + +<!-- old anchor for the next section --> +<a name="state_machine_and_low_level_event_api"/> + +=end html + +=head1 ARCHITECTURE + +Internally, libguestfs is implemented by running an appliance (a +special type of small virtual machine) using L<qemu(1)>. Qemu runs as +a child process of the main program. + + ___________________ + / \ + | main program | + | | + | | child process / appliance + | | __________________________ + | | / qemu \ + +-------------------+ RPC | +-----------------+ | + | libguestfs <--------------------> guestfsd | | + | | | +-----------------+ | + \___________________/ | | Linux kernel | | + | +--^--------------+ | + \_________|________________/ + | + _______v______ + / \ + | Device or | + | disk image | + \______________/ + +The library, linked to the main program, creates the child process and +hence the appliance in the L</guestfs_launch> function. + +Inside the appliance is a Linux kernel and a complete stack of +userspace tools (such as LVM and ext2 programs) and a small +controlling daemon called C<guestfsd>. The library talks to +C<guestfsd> using remote procedure calls (RPC). There is a mostly +one-to-one correspondence between libguestfs API calls and RPC calls +to the daemon. Lastly the disk image(s) are attached to the qemu +process which translates device access by the appliance's Linux kernel +into accesses to the image. + +A common misunderstanding is that the appliance "is" the virtual +machine. Although the disk image you are attached to might also be +used by some virtual machine, libguestfs doesn't know or care about +this. (But you will care if both libguestfs's qemu process and your +virtual machine are trying to update the disk image at the same time, +since this usually results in massive disk corruption). 
+ +=head1 STATE MACHINE + +libguestfs uses a state machine to model the child process: + + | + guestfs_create + | + | + ____V_____ + / \ + | CONFIG | + \__________/ + ^ ^ ^ \ + / | \ \ guestfs_launch + / | _\__V______ + / | / \ + / | | LAUNCHING | + / | \___________/ + / | / + / | guestfs_launch + / | / + ______ / __|____V + / \ ------> / \ + | BUSY | | READY | + \______/ <------ \________/ + +The normal transitions are (1) CONFIG (when the handle is created, but +there is no child process), (2) LAUNCHING (when the child process is +booting up), (3) alternating between READY and BUSY as commands are +issued to, and carried out by, the child process. + +The guest may be killed by C<guestfs_kill_subprocess>, or may die +asynchronously at any time (eg. due to some internal error), and that +causes the state to transition back to CONFIG. + +Configuration commands for qemu such as C<guestfs_add_drive> can only +be issued when in the CONFIG state. + +The high-level API offers two calls that go from CONFIG through +LAUNCHING to READY. C<guestfs_launch> blocks until the child process +is READY to accept commands (or until some failure or timeout). +C<guestfs_launch> internally moves the state from CONFIG to LAUNCHING +while it is running. + +High-level API actions such as C<guestfs_mount> can only be issued +when in the READY state. These high-level API calls block waiting for +the command to be carried out (ie. the state to transition to BUSY and +then back to READY). But using the low-level event API, you get +non-blocking versions. (But you can still only carry out one +operation per handle at a time - that is a limitation of the +communications protocol we use). + +Finally, the child process sends asynchronous messages back to the +main program, such as kernel log messages. Mostly these are ignored +by the high-level API, but using the low-level event API you can +register to receive these messages. 
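The transitions described above can be written down as a small table-free state function. This is only an illustrative sketch of the rules stated in this section, not how the library implements them; every identifier below (C<handle_state>, C<handle_event>, C<handle_transition>) is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical names mirroring the four states in the diagram. */
enum handle_state { ST_CONFIG, ST_LAUNCHING, ST_READY, ST_BUSY };

/* Hypothetical events that move the handle between states. */
enum handle_event {
  EV_LAUNCH,          /* guestfs_launch was called */
  EV_LAUNCH_DONE,     /* the child process finished booting */
  EV_COMMAND_START,   /* a high-level action was issued */
  EV_COMMAND_DONE,    /* that action completed */
  EV_SUBPROCESS_QUIT  /* the child was killed or died */
};

/* Apply one event.  Returns true and updates *state if the
 * transition is allowed; returns false and leaves *state alone
 * otherwise. */
bool
handle_transition (enum handle_state *state, enum handle_event ev)
{
  enum handle_state next;

  switch (ev) {
  case EV_SUBPROCESS_QUIT:
    /* From any state the handle falls back to CONFIG. */
    next = ST_CONFIG;
    break;
  case EV_LAUNCH:
    if (*state != ST_CONFIG) return false;
    next = ST_LAUNCHING;
    break;
  case EV_LAUNCH_DONE:
    if (*state != ST_LAUNCHING) return false;
    next = ST_READY;
    break;
  case EV_COMMAND_START:
    /* Actions such as guestfs_mount need the READY state, and only
     * one operation per handle can be in flight. */
    if (*state != ST_READY) return false;
    next = ST_BUSY;
    break;
  case EV_COMMAND_DONE:
    if (*state != ST_BUSY) return false;
    next = ST_READY;
    break;
  default:
    return false;
  }

  *state = next;
  return true;
}
```

Rejecting C<EV_COMMAND_START> outside READY models both rules from the text: actions cannot be issued before launch, and a second action cannot start while the handle is BUSY.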
+ +=head2 SETTING CALLBACKS TO HANDLE EVENTS + +The child process generates events in some situations. Current events +include receiving a log message and the child process exiting. + +Use the C<guestfs_set_*_callback> functions to set a callback for +different types of events. + +Only I<one callback of each type> can be registered for each handle. +Calling C<guestfs_set_*_callback> again overwrites the previous +callback of that type. Cancel all callbacks of this type by calling +this function with C<cb> set to C<NULL>. + +=head2 guestfs_set_log_message_callback + + typedef void (*guestfs_log_message_cb) (guestfs_h *g, void *opaque, + char *buf, int len); + void guestfs_set_log_message_callback (guestfs_h *handle, + guestfs_log_message_cb cb, + void *opaque); + +The callback function C<cb> will be called whenever qemu or the guest +writes anything to the console. + +Use this function to capture kernel messages and similar. + +Normally there is no log message handler, and log messages are just +discarded. + +=head2 guestfs_set_subprocess_quit_callback + + typedef void (*guestfs_subprocess_quit_cb) (guestfs_h *g, void *opaque); + void guestfs_set_subprocess_quit_callback (guestfs_h *handle, + guestfs_subprocess_quit_cb cb, + void *opaque); + +The callback function C<cb> will be called when the child process +quits, either asynchronously or if killed by +C<guestfs_kill_subprocess>. (This corresponds to a transition from +any state to the CONFIG state). + +=head2 guestfs_set_launch_done_callback + + typedef void (*guestfs_launch_done_cb) (guestfs_h *g, void *opaque); + void guestfs_set_launch_done_callback (guestfs_h *handle, + guestfs_launch_done_cb cb, + void *opaque); + +The callback function C<cb> will be called when the child process +becomes ready for the first time after it has been launched. (This +corresponds to a transition from LAUNCHING to the READY state). 
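As a sketch of the log message callback described above, the function below appends each console fragment to a fixed-size buffer. It assumes C<buf> holds C<len> bytes and is not necessarily NUL-terminated (the text above does not say either way). C<struct log_capture> and C<capture_log_cb> are hypothetical names, and C<guestfs_h> is forward-declared only to keep the fragment self-contained.

```c
#include <string.h>

/* Stand-in so this fragment compiles alone.  In a real program,
 * replace this line with #include <guestfs.h>. */
typedef struct guestfs_h guestfs_h;

/* Hypothetical accumulator for console output. */
struct log_capture {
  char data[4096];  /* always kept NUL-terminated */
  size_t used;      /* bytes stored, not counting the NUL */
};

/* Has the same shape as guestfs_log_message_cb shown above. */
void
capture_log_cb (guestfs_h *g, void *opaque, char *buf, int len)
{
  struct log_capture *cap = opaque;
  size_t room, n;

  (void) g;         /* unused in this sketch */
  if (len <= 0)
    return;

  /* Keep one byte free for the terminator; drop what doesn't fit. */
  room = sizeof cap->data - 1 - cap->used;
  n = (size_t) len < room ? (size_t) len : room;
  memcpy (cap->data + cap->used, buf, n);
  cap->used += n;
  cap->data[cap->used] = '\0';
}
```

It would be registered with C<guestfs_set_log_message_callback (handle, capture_log_cb, &cap)>, with C<cap> zero-initialized beforehand.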
+ +=head1 BLOCK DEVICE NAMING + +In the kernel there is now quite a profusion of schemata for naming +block devices (in this context, by I<block device> I mean a physical +or virtual hard drive). The original Linux IDE driver used names +starting with C</dev/hd*>. SCSI devices have historically used a +different naming scheme, C</dev/sd*>. When the Linux kernel I<libata> +driver became a popular replacement for the old IDE driver +(particularly for SATA devices) those devices also used the +C</dev/sd*> scheme. Additionally we now have virtual machines with +paravirtualized drivers. This has created several different naming +systems, such as C</dev/vd*> for virtio disks and C</dev/xvd*> for Xen +PV disks. + +As discussed above, libguestfs uses a qemu appliance running an +embedded Linux kernel to access block devices. We can run a variety +of appliances based on a variety of Linux kernels. + +This causes a problem for libguestfs because many API calls use device +or partition names. Working scripts and the recipe (example) scripts +that we make available over the internet could fail if the naming +scheme changes. + +Therefore libguestfs defines C</dev/sd*> as the I<standard naming +scheme>. Internally C</dev/sd*> names are translated, if necessary, +to other names as required. For example, under RHEL 5 which uses the +C</dev/hd*> scheme, any device parameter C</dev/sda2> is translated to +C</dev/hda2> transparently. + +Note that this I<only> applies to parameters. The +C<guestfs_list_devices>, C<guestfs_list_partitions> and similar calls +return the true names of the devices and partitions as known to the +appliance. + +=head2 ALGORITHM FOR BLOCK DEVICE NAME TRANSLATION + +Usually this translation is transparent. However in some (very rare) +cases you may need to know the exact algorithm. 
Such cases include +using C<guestfs_config> to add a mixture of virtio and IDE +devices to the qemu-based appliance, which then has a mixture of C</dev/sd*> +and C</dev/vd*> devices. + +The algorithm is applied only to I<parameters> which are known to be +either device or partition names. Return values from functions such +as C<guestfs_list_devices> are never changed. + +=over 4 + +=item * + +Is the string a parameter which is a device or partition name? + +=item * + +Does the string begin with C</dev/sd>? + +=item * + +Does the named device exist? If so, we use that device. +However if I<not> then we continue with this algorithm. + +=item * + +Replace the initial C</dev/sd> string with C</dev/hd>. + +For example, change C</dev/sda2> to C</dev/hda2>. + +If that named device exists, use it. If not, continue. + +=item * + +Replace the initial C</dev/sd> string with C</dev/vd>. + +If that named device exists, use it. If not, return an error. + +=back + +=head2 PORTABILITY CONCERNS + +Although the standard naming scheme and automatic translation are +useful for simple programs and guestfish scripts, for larger programs +it is best not to rely on this mechanism. + +Where possible, for maximum future portability, programs using +libguestfs should use these future-proof techniques: + +=over 4 + +=item * + +Use C<guestfs_list_devices> or C<guestfs_list_partitions> to list +actual device names, and then use those names directly. + +Since those device names exist by definition, they will never be +translated. + +=item * + +Use higher level ways to identify filesystems, such as LVM names, +UUIDs and filesystem labels. + +=back + +=head1 INTERNALS + +=head2 COMMUNICATION PROTOCOL + +Don't rely on using this protocol directly. This section documents +how it currently works, but it may change at any time. + +The protocol used to talk between the library and the daemon running +inside the qemu virtual machine is a simple RPC mechanism built on top +of XDR (RFC 1014, RFC 1832, RFC 4506).
+ +The detailed format of structures is in C<src/guestfs_protocol.x> +(note: this file is automatically generated). + +There are two broad cases. Ordinary functions, which don't have any +C<FileIn> or C<FileOut> parameters, are handled with very +simple request/reply messages. Functions that have +C<FileIn> or C<FileOut> parameters use the same request and +reply messages, but these may also be followed by files sent using a +chunked encoding. + +=head3 ORDINARY FUNCTIONS (NO FILEIN/FILEOUT PARAMS) + +For ordinary functions, the request message is: + + total length (header + arguments, + but not including the length word itself) + struct guestfs_message_header (encoded as XDR) + struct guestfs_<foo>_args (encoded as XDR) + +The total length field allows the daemon to allocate a fixed size +buffer into which it slurps the rest of the message. As a result, the +total length is limited to C<GUESTFS_MESSAGE_MAX> bytes (currently +4MB), which means the effective size of any request is limited to +somewhere under this size. + +Note also that many functions don't take any arguments, in which case +the C<guestfs_I<foo>_args> is completely omitted. + +The header contains the procedure number (C<guestfs_proc>) which is +how the receiver knows what type of args structure to expect, or whether to expect none +at all. + +The reply message for ordinary functions is: + + total length (header + ret, + but not including the length word itself) + struct guestfs_message_header (encoded as XDR) + struct guestfs_<foo>_ret (encoded as XDR) + +As above, the C<guestfs_I<foo>_ret> structure may be completely omitted +for functions that return no formal return values. + +As above, the total length of the reply is limited to +C<GUESTFS_MESSAGE_MAX>.
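The length-prefixed framing of request and reply messages can be illustrated with a minimal sketch. It treats the XDR-encoded header and args/ret structures as an opaque byte string, and it rests on two assumptions that are not stated above: that the length word is a 4-byte big-endian unsigned integer (the way XDR encodes one), and that the limit is exactly 4MB. C<DEMO_MESSAGE_MAX>, C<demo_frame_message> and C<demo_frame_body_length> are hypothetical names, not part of libguestfs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed limit, following the "currently 4MB" remark in the text. */
#define DEMO_MESSAGE_MAX (4 * 1024 * 1024)

/* Write one frame (length word + body) into out.
 * body is the already XDR-encoded header plus args, ret or error.
 * Returns the number of bytes written, or 0 if the body exceeds the
 * limit or out is too small. */
size_t
demo_frame_message (const unsigned char *body, size_t body_len,
                    unsigned char *out, size_t out_size)
{
  uint32_t len;

  if (body_len > DEMO_MESSAGE_MAX || out_size < 4 || body_len > out_size - 4)
    return 0;

  /* Assumption: 4-byte big-endian length word that does not count itself. */
  len = (uint32_t) body_len;
  out[0] = (unsigned char) (len >> 24);
  out[1] = (unsigned char) (len >> 16);
  out[2] = (unsigned char) (len >> 8);
  out[3] = (unsigned char) len;
  if (body_len > 0)
    memcpy (out + 4, body, body_len);
  return 4 + body_len;
}

/* Read the length word of a received frame.
 * Returns the body length, or -1 if the frame is incomplete or the
 * announced length exceeds the limit. */
long
demo_frame_body_length (const unsigned char *in, size_t in_size)
{
  uint32_t len;

  if (in_size < 4)
    return -1;
  len = ((uint32_t) in[0] << 24) | ((uint32_t) in[1] << 16)
      | ((uint32_t) in[2] << 8) | (uint32_t) in[3];
  if (len > DEMO_MESSAGE_MAX || in_size - 4 < len)
    return -1;
  return (long) len;
}
```

Checking the announced length before allocating or reading anything is what lets a receiver use a fixed-size buffer, as described above.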
+ +In the case of an error, a flag is set in the header, and the reply +message is slightly changed: + + total length (header + error, + but not including the length word itself) + struct guestfs_message_header (encoded as XDR) + struct guestfs_message_error (encoded as XDR) + +The C<guestfs_message_error> structure contains the error message as a +string. + +=head3 FUNCTIONS THAT HAVE FILEIN PARAMETERS + +A C<FileIn> parameter indicates that we transfer a file I<into> the +guest. The normal request message is sent (see above). However this +is followed by a sequence of file chunks. + + total length (header + arguments, + but not including the length word itself, + and not including the chunks) + struct guestfs_message_header (encoded as XDR) + struct guestfs_<foo>_args (encoded as XDR) + sequence of chunks for FileIn param #0 + sequence of chunks for FileIn param #1 etc. + +The "sequence of chunks" is: + + length of chunk (not including length word itself) + struct guestfs_chunk (encoded as XDR) + length of chunk + struct guestfs_chunk (encoded as XDR) + ... + length of chunk + struct guestfs_chunk (with data.data_len == 0) + +The final chunk has the C<data_len> field set to zero. Additionally a +flag is set in the final chunk to indicate either successful +completion or early cancellation. + +At time of writing there are no functions that have more than one +FileIn parameter. However this is (theoretically) supported, by +sending the sequence of chunks for each FileIn parameter one after +another (from left to right). + +Both the library (sender) I<and> the daemon (receiver) may cancel the +transfer. The library does this by sending a chunk with a special +flag set to indicate cancellation. When the daemon sees this, it +cancels the whole RPC, does I<not> send any reply, and goes back to +reading the next request. + +The daemon may also cancel. It does this by writing a special word +C<GUESTFS_CANCEL_FLAG> to the socket. 
The library listens for this +during the transfer, and if it receives it, it cancels the transfer +by sending a cancel chunk. The special word is chosen so that even if +cancellation happens right at the end of the transfer (after the +library has finished writing and has started listening for the reply), +the "spurious" cancel flag will not be confused with the reply +message. + +This protocol allows the transfer of arbitrarily sized files (no 32 bit +limit), and also files where the size is not known in advance +(eg. from pipes or sockets). However the chunks are rather small +(C<GUESTFS_MAX_CHUNK_SIZE>), so that neither the library nor the +daemon needs to keep much in memory. + +=head3 FUNCTIONS THAT HAVE FILEOUT PARAMETERS + +The protocol for FileOut parameters is exactly the same as for FileIn +parameters, but with the roles of daemon and library reversed. + + total length (header + ret, + but not including the length word itself, + and not including the chunks) + struct guestfs_message_header (encoded as XDR) + struct guestfs_<foo>_ret (encoded as XDR) + sequence of chunks for FileOut param #0 + sequence of chunks for FileOut param #1 etc. + +=head3 INITIAL MESSAGE + +Because the underlying channel (qemu -net channel) doesn't have any +sort of connection control, when the daemon launches it sends an +initial word (C<GUESTFS_LAUNCH_FLAG>) which indicates that the guest +and daemon are alive. This is what C<guestfs_launch> waits for. + +=head1 MULTIPLE HANDLES AND MULTIPLE THREADS + +All high-level libguestfs actions are synchronous. If you want +to use libguestfs asynchronously then you must create a thread. + +Only use the handle from a single thread at a time. Either use the handle +exclusively from one thread, or provide your own mutex so that two +threads cannot issue calls on the same handle at the same time.
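One way to "provide your own mutex" is to pair the handle with a POSIX mutex and route every call through a single locking helper. The following is a sketch under assumptions: C<locked_handle>, C<handle_action>, C<with_locked_handle> and C<count_call> are hypothetical names, and the C<g> field stands in for a C<guestfs_h> pointer so that the example compiles without the library.

```c
#include <stddef.h>
#include <pthread.h>

/* Hypothetical wrapper pairing a handle with the mutex that guards it. */
struct locked_handle {
  pthread_mutex_t lock;
  void *g;                      /* stands in for a guestfs_h pointer */
};

/* An action is any function that issues calls on the handle. */
typedef int (*handle_action) (void *g, void *arg);

/* Run action while holding the lock, so that two threads can never
 * issue calls on the same handle at the same time.
 * Returns the action's result, or -1 if the lock could not be taken. */
int
with_locked_handle (struct locked_handle *lh, handle_action action, void *arg)
{
  int r;

  if (pthread_mutex_lock (&lh->lock) != 0)
    return -1;
  r = action (lh->g, arg);
  pthread_mutex_unlock (&lh->lock);
  return r;
}

/* Example action: count how many times the handle has been used. */
int
count_call (void *g, void *arg)
{
  int *counter = arg;

  (void) g;
  return ++*counter;
}
```

Each thread would call C<with_locked_handle> instead of touching the handle directly. Because the lock is held for the whole duration of a synchronous call, other threads block until that call returns.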
+ +=head1 QEMU WRAPPERS + +If you want to compile your own qemu, run qemu from a non-standard +location, or pass extra arguments to qemu, then you can write a +shell-script wrapper around qemu. + +There is one important rule to remember: you I<must C<exec qemu>> as +the last command in the shell script (so that qemu replaces the shell +and becomes the direct child of the libguestfs-using program). If you +don't do this, then the qemu process won't be cleaned up correctly. + +Here is an example of a wrapper, where I have built my own copy of +qemu from source: + + #!/bin/sh - + qemudir=/home/rjones/d/qemu + exec $qemudir/x86_64-softmmu/qemu-system-x86_64 -L $qemudir/pc-bios "$@" + +Save this script as C</tmp/qemu.wrapper> (or wherever), C<chmod +x>, +and then use it by setting the LIBGUESTFS_QEMU environment variable. +For example: + + LIBGUESTFS_QEMU=/tmp/qemu.wrapper guestfish + +Note that libguestfs also calls qemu with the -help and -version +options in order to determine features. + +=head1 ENVIRONMENT VARIABLES + +=over 4 + +=item LIBGUESTFS_APPEND + +Pass additional options to the guest kernel. + +=item LIBGUESTFS_DEBUG + +Set C<LIBGUESTFS_DEBUG=1> to enable verbose messages. This +has the same effect as calling C<guestfs_set_verbose (handle, 1)>. + +=item LIBGUESTFS_MEMSIZE + +Set the memory allocated to the qemu process, in megabytes. For +example: + + LIBGUESTFS_MEMSIZE=700 + +=item LIBGUESTFS_PATH + +Set the path that libguestfs uses to search for kernel and initrd.img. +See the discussion of paths in section PATH above. + +=item LIBGUESTFS_QEMU + +Set the default qemu binary that libguestfs uses. If not set, then +the qemu which was found at compile time by the configure script is +used. + +See also L</QEMU WRAPPERS> above. + +=item LIBGUESTFS_TRACE + +Set C<LIBGUESTFS_TRACE=1> to enable command traces. This +has the same effect as calling C<guestfs_set_trace (handle, 1)>. + +=item TMPDIR + +Location of temporary directory, defaults to C</tmp>. 
+ +If libguestfs was compiled to use the supermin appliance then each +handle will require rather a large amount of space in this directory +for short periods of time (~ 80 MB). You can use C<$TMPDIR> to +configure another directory to use in case C</tmp> is not large +enough. + +=back + +=head1 SEE ALSO + +L<guestfish(1)>, +L<qemu(1)>, +L<febootstrap(1)>, +L<http://libguestfs.org/>. + +Tools with a similar purpose: +L<fdisk(8)>, +L<parted(8)>, +L<kpartx(8)>, +L<lvm(8)>, +L<disktype(1)>. + +=head1 BUGS + +To get a list of bugs against libguestfs use this link: + +L<https://bugzilla.redhat.com/buglist.cgi?component=libguestfs&product=Virtualization+Tools> + +To report a new bug against libguestfs use this link: + +L<https://bugzilla.redhat.com/enter_bug.cgi?component=libguestfs&product=Virtualization+Tools> + +When reporting a bug, please check: + +=over 4 + +=item * + +That the bug hasn't been reported already. + +=item * + +That you are testing a recent version. + +=item * + +Describe the bug accurately, and give a way to reproduce it. + +=item * + +Run libguestfs-test-tool and paste the B<complete, unedited> +output into the bug report. + +=back + +=head1 AUTHORS + +Richard W.M. Jones (C<rjones at redhat dot com>) + +=head1 COPYRIGHT + +Copyright (C) 2009 Red Hat Inc. +L<http://libguestfs.org/> + +This library is free software; you can redistribute it and/or +modify it under the terms of the GNU Lesser General Public +License as published by the Free Software Foundation; either +version 2 of the License, or (at your option) any later version. + +This library is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +Lesser General Public License for more details. 
+ +You should have received a copy of the GNU Lesser General Public +License along with this library; if not, write to the Free Software +Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA |