=encoding utf8

=head1 NAME

guestfs - Library for accessing and modifying virtual machine images

=head1 SYNOPSIS

 #include <guestfs.h>

 guestfs_h *handle = guestfs_create ();
 guestfs_add_drive (handle, "guest.img");
 guestfs_launch (handle);
 guestfs_wait_ready (handle);
 guestfs_mount (handle, "/dev/sda1", "/");
 guestfs_touch (handle, "/hello");
 guestfs_sync (handle);
 guestfs_close (handle);

=head1 DESCRIPTION

Libguestfs is a library for accessing and modifying guest disk images.
Amongst the things this is good for: making batch configuration
changes to guests, getting disk used/free statistics (see also:
virt-df), migrating between virtualization systems (see also:
virt-p2v), performing partial backups, performing partial guest
clones, cloning guests and changing registry/UUID/hostname info, and
much else besides.

Libguestfs uses Linux kernel and qemu code, and can access any type of
guest filesystem that Linux and qemu can, including but not limited
to: ext2/3/4, btrfs, FAT and NTFS, LVM, many different disk partition
schemes, qcow, qcow2, vmdk.

Libguestfs provides ways to enumerate guest storage (eg. partitions,
LVs, what filesystem is in each LV, etc.).  It can also run commands
in the context of the guest.  You can also access filesystems over FTP.

Libguestfs is a library that can be linked with C and C++ management
programs (or management programs written in OCaml, Perl, Python, Ruby, Java
or Haskell).  You can also use it from shell scripts or the command line.

You don't need to be root to use libguestfs, although obviously you do
need enough permissions to access the disk images.

=head1 CONNECTION MANAGEMENT

If you are using the high-level API, then you should call the
functions in the following order:

 guestfs_h *handle = guestfs_create ();

 guestfs_add_drive (handle, "guest.img");
 /* call guestfs_add_drive additional times if the guest has
  * multiple disks
  */

 guestfs_launch (handle);
 guestfs_wait_ready (handle);

 /* now you can examine what partitions, LVs etc are available;
  * you have to mount / at least
  */
 guestfs_mount (handle, "/dev/sda1", "/");

 /* now you can perform actions on the guest disk image */
 guestfs_touch (handle, "/hello");

 /* you only need to call guestfs_sync if you have made
  * changes to the guest image
  */
 guestfs_sync (handle);

 guestfs_close (handle);

C<guestfs_wait_ready> and all of the actions including C<guestfs_sync>
are blocking calls.  You can use the low-level event API to do
non-blocking operations instead.

All functions that return integers return C<-1> on error.  See
section ERROR HANDLING below for how to handle errors.

=head2 guestfs_h *

C<guestfs_h> is the opaque type representing a connection handle.
Create a handle by calling C<guestfs_create>.  Call C<guestfs_close>
to free the handle and release all resources used.

For information on using multiple handles and threads, see the section
MULTIPLE HANDLES AND MULTIPLE THREADS below.

=head2 guestfs_create

 guestfs_h *guestfs_create (void);

Create a connection handle.

You have to call C<guestfs_add_drive> on the handle at least once.

This function returns a non-NULL pointer to a handle on success or
NULL on error.

After configuring the handle, you have to call C<guestfs_launch> and
C<guestfs_wait_ready>.

You may also want to configure error handling for the handle.  See
the ERROR HANDLING section below.

=head2 guestfs_close

 void guestfs_close (guestfs_h *handle);

This closes the connection handle and frees up all resources used.

=head1 ERROR HANDLING

The convention in all functions that return C<int> is that they return
C<-1> to indicate an error.  You can get additional information on
errors by calling C<guestfs_last_error> and/or by setting up an error
handler with C<guestfs_set_error_handler>.

The default error handler prints the information string to C<stderr>.

Out of memory errors are handled differently.  The default action is
to call L<abort(3)>.  If this is undesirable, then you can set a
handler using C<guestfs_set_out_of_memory_handler>.

=head2 guestfs_last_error

 const char *guestfs_last_error (guestfs_h *handle);

This returns the last error message that happened on C<handle>.  If
there has not been an error since the handle was created, then this
returns C<NULL>.

The lifetime of the returned string is until the next error occurs, or
C<guestfs_close> is called.

The error string is not localized (ie. is always in English), because
this makes searching for error messages in search engines give the
largest number of results.

=head2 guestfs_set_error_handler

 typedef void (*guestfs_error_handler_cb) (guestfs_h *handle,
                                           void *data,
                                           const char *msg);
 void guestfs_set_error_handler (guestfs_h *handle,
                                 guestfs_error_handler_cb cb,
                                 void *data);

The callback C<cb> will be called if there is an error.  The
parameters passed to the callback are an opaque data pointer and the
error message string.

Note that the message string C<msg> is freed as soon as the callback
function returns, so if you want to stash it somewhere you must make
your own copy.
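
The copying requirement above can be sketched as follows.  This is
only an illustration: C<struct saved_error> and C<save_error_cb> are
hypothetical names, and the C<guestfs_h> typedef is a stand-in so the
fragment compiles on its own (a real program would include
C<guestfs.h> instead).

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for the opaque handle type; a real program would
 * include <guestfs.h> instead of declaring this. */
typedef struct guestfs_h guestfs_h;

/* Hypothetical application structure passed as the 'data' pointer. */
struct saved_error {
  char *msg;                    /* private copy of the last message, or NULL */
};

/* Has the same shape as guestfs_error_handler_cb.  It stores a private
 * copy of 'msg', because the original string is freed once the
 * callback returns. */
void
save_error_cb (guestfs_h *handle, void *data, const char *msg)
{
  struct saved_error *saved = data;
  size_t len = strlen (msg) + 1;
  char *copy = malloc (len);

  (void) handle;                /* unused in this sketch */
  if (copy == NULL)
    return;                     /* keep the previous message if allocation fails */
  memcpy (copy, msg, len);
  free (saved->msg);            /* free (NULL) is a no-op */
  saved->msg = copy;
}
```

Such a callback would be registered with something like
C<guestfs_set_error_handler (handle, save_error_cb, &saved)>, where
C<saved> is a C<struct saved_error> initialized to C<{ NULL }>.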

The default handler prints messages on C<stderr>.

If you set C<cb> to C<NULL> then I<no> handler is called.

=head2 guestfs_get_error_handler

 guestfs_error_handler_cb guestfs_get_error_handler (guestfs_h *handle,
                                                     void **data_rtn);

Returns the current error handler callback.

=head2 guestfs_set_out_of_memory_handler

 typedef void (*guestfs_abort_cb) (void);
 int guestfs_set_out_of_memory_handler (guestfs_h *handle,
                                        guestfs_abort_cb cb);

The callback C<cb> will be called if there is an out of memory
situation.  I<Note this callback must not return>.

The default is to call L<abort(3)>.

You cannot set C<cb> to C<NULL>.  You can't ignore out of memory
situations.

=head2 guestfs_get_out_of_memory_handler

 guestfs_abort_cb guestfs_get_out_of_memory_handler (guestfs_h *handle);

This returns the current out of memory handler.

=head1 PATH

Libguestfs needs a kernel and initrd.img, which it finds by looking
along an internal path.

By default it looks for these in the directory C<$libdir/guestfs>
(eg. C</usr/local/lib/guestfs> or C</usr/lib64/guestfs>).

Use C<guestfs_set_path> or set the environment variable
C<LIBGUESTFS_PATH> to change the directories that libguestfs will
search in.  The value is a colon-separated list of paths.  The current
directory is I<not> searched unless the path contains an empty element
or C<.>.  For example C<LIBGUESTFS_PATH=:/usr/lib/guestfs> would
search the current directory and then C</usr/lib/guestfs>.
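
The way such a colon-separated value breaks down into directories can
be sketched like this.  It is not the library's own parser:
C<path_element> is a hypothetical helper, written only to show that an
empty element stands for the current directory.

```c
#include <string.h>

/* Copy element number 'index' of the colon-separated list 'path' into
 * 'out' (of size 'outlen').  An empty element is returned as ".",
 * meaning the current directory.  Returns 0 on success, or -1 if there
 * is no such element or it does not fit in 'out'. */
int
path_element (const char *path, size_t index, char *out, size_t outlen)
{
  const char *start = path;

  for (;;) {
    size_t len = strcspn (start, ":");   /* length up to ':' or end */

    if (index == 0) {
      if (len == 0) {                    /* empty element */
        start = ".";
        len = 1;
      }
      if (len + 1 > outlen)
        return -1;
      memcpy (out, start, len);
      out[len] = '\0';
      return 0;
    }
    if (start[len] == '\0')
      return -1;                         /* ran off the end of the list */
    start += len + 1;                    /* skip past the ':' */
    index--;
  }
}
```

For the value C<:/usr/lib/guestfs>, element 0 comes back as C<.> and
element 1 as C</usr/lib/guestfs>.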

=head1 API OVERVIEW

This section provides additional documentation for groups of API
calls, which may not be obvious from reading about the individual
calls below.

=head2 LVM2

Libguestfs provides access to a large part of the LVM2 API.  It won't
make much sense unless you familiarize yourself with the concepts of
physical volumes, volume groups and logical volumes.

This author strongly recommends reading the LVM HOWTO, online at
L<http://tldp.org/HOWTO/LVM-HOWTO/>.

=head2 PARTITIONING

To create MBR-style (ie. normal PC) partitions use one of the
C<guestfs_sfdisk*> variants.  These calls use the external
L<sfdisk(8)> command.

The simplest call is:

 char *lines[] = { ",", NULL };
 guestfs_sfdiskM (g, "/dev/sda", lines);

This will create a single partition on C</dev/sda> called
C</dev/sda1> covering the whole disk.

In general MBR partitions are both unnecessarily complicated and
depend on archaic details, namely the Cylinder-Head-Sector (CHS)
geometry of the disk.  C<guestfs_sfdiskM> allows you to specify sizes
in megabytes instead of cylinders, which is a small win.
C<guestfs_sfdiskM> will choose the nearest cylinder to approximate the
requested size.  There's a lot of crazy stuff to do with IDE and
virtio disks having different, incompatible CHS geometries, that you
probably don't want to know about.  My advice: make a single partition
to cover the whole disk, then use LVM on top.

In future we aim to provide access to libparted.

=head2 UPLOADING

For small, single files, use C<guestfs_write_file>.  In some versions
of libguestfs there was a bug which limited this call to text files
(not containing ASCII NUL characters).

To upload a single file, use C<guestfs_upload>.  This call has no
limits on file content or size (even files larger than 4 GB).

To upload multiple files, see C<guestfs_tar_in> and C<guestfs_tgz_in>.

However the fastest way to upload I<large numbers of arbitrary files>
is to turn them into a squashfs or CD ISO (see L<mksquashfs(8)> and
L<mkisofs(8)>), then attach this using C<guestfs_add_drive_ro>.  If
you add the drive in a predictable way (eg. adding it last after all
other drives) then you can get the device name from
C<guestfs_list_devices> and mount it directly using
C<guestfs_mount_ro>.  Note that squashfs images are sometimes
non-portable between kernel versions, and they don't support labels or
UUIDs.  If you want to pre-build an image or you need to mount it
using a label or UUID, use an ISO image instead.

=head2 DOWNLOADING

Use C<guestfs_cat> to download small, text-only files.  This call
is limited to files which are less than 2 MB and which cannot contain
any ASCII NUL (C<\0>) characters.  However, it has a very simple
API.

C<guestfs_read_file> can be used to read files which contain
arbitrary 8 bit data, since it returns a (pointer, size) pair.
However it is still limited to "small" files, less than 2 MB.

C<guestfs_download> can be used to download any file, with no
limits on content or size (even files larger than 4 GB).

To download multiple files, see C<guestfs_tar_out> and
C<guestfs_tgz_out>.

=head2 RUNNING COMMANDS

Although libguestfs is primarily an API for manipulating files
inside guest images, we also provide some limited facilities for
running commands inside guests.

There are many limitations to this:

=over 4

=item *

The kernel version that the command runs under will be different
from what it expects.

=item *

If the command needs to communicate with daemons, then most likely
they won't be running.

=item *

The command will be running in limited memory.

=item *

Only supports Linux guests (not Windows, BSD, etc).

=item *

Architecture limitations (eg. won't work for a PPC guest on
an X86 host).

=back

The two main API calls to run commands are C<guestfs_command> and
C<guestfs_sh> (there are also variations).

The difference is that C<guestfs_sh> runs commands using the shell, so
any shell globs, redirections, etc will work.

=head2 LISTING FILES

C<guestfs_ll> is just designed for humans to read (mainly when using
the L<guestfish(1)>-equivalent command C<ll>).

C<guestfs_ls> is a quick way to get a list of files in a directory
from programs.

C<guestfs_readdir> is a programmatic way to get a list of files in a
directory, plus additional information about each one.

C<guestfs_find> can be used to recursively list files.

=head1 HIGH-LEVEL API ACTIONS

=head2 ABI GUARANTEE

We guarantee the libguestfs ABI (binary interface), for public,
high-level actions as outlined in this section.  Although we will
deprecate some actions, for example if they get replaced by newer
calls, we will keep the old actions forever.  This allows you the
developer to program in confidence against libguestfs.

@ACTIONS@

=head1 STRUCTURES

@STRUCTS@

=head1 STATE MACHINE AND LOW-LEVEL EVENT API

Internally, libguestfs is implemented by running a virtual machine
using L<qemu(1)>.  QEmu runs as a child process of the main program,
and most of this discussion won't make sense unless you understand
that the complexity is dealing with the (asynchronous) actions of the
child process.

                            child process
  ___________________       _________________________
 /                   \     /                         \
 | main program      |     | qemu +-----------------+|
 |                   |     |      | Linux kernel    ||
 +-------------------+     |      +-----------------+|
 | libguestfs     <-------------->| guestfsd        ||
 |                   |     |      +-----------------+|
 \___________________/     \_________________________/

The diagram above shows libguestfs communicating with the guestfsd
daemon running inside the qemu child process.  There are several
points of failure here: qemu can fail to start, the virtual machine
inside qemu can fail to boot, guestfsd can fail to start or not
establish communication, any component can start successfully but fail
asynchronously later, and so on.

=head2 STATE MACHINE

libguestfs uses a state machine to model the child process:

                         |
                    guestfs_create
                         |
                         |
                     ____V_____
                    /          \
                    |  CONFIG  |
                    \__________/
                     ^ ^   ^  \
                    /  |    \  \ guestfs_launch
                   /   |    _\__V______
                  /    |   /           \
                 /     |   | LAUNCHING |
                /      |   \___________/
               /       |       /
              /        |  guestfs_wait_ready
             /         |     /
    ______  /        __|____V
   /      \ ------> /        \
   | BUSY |         | READY  |
   \______/ <------ \________/

The normal transitions are (1) CONFIG (when the handle is created, but
there is no child process), (2) LAUNCHING (when the child process is
booting up), (3) alternating between READY and BUSY as commands are
issued to, and carried out by, the child process.

The guest may be killed by C<guestfs_kill_subprocess>, or may die
asynchronously at any time (eg. due to some internal error), and that
causes the state to transition back to CONFIG.

Configuration commands for qemu such as C<guestfs_add_drive> can only
be issued when in the CONFIG state.

The high-level API offers two calls that go from CONFIG through
LAUNCHING to READY.  C<guestfs_launch> is a non-blocking call that
starts up the child process, immediately moving from CONFIG to
LAUNCHING.  C<guestfs_wait_ready> blocks until the child process is
READY to accept commands (or until some failure or timeout).  The
low-level event API described below provides a non-blocking way to
replace C<guestfs_wait_ready>.

High-level API actions such as C<guestfs_mount> can only be issued
when in the READY state.  These high-level API calls block waiting for
the command to be carried out (ie. the state to transition to BUSY and
then back to READY).  But using the low-level event API, you get
non-blocking versions.  (But you can still only carry out one
operation per handle at a time - that is a limitation of the
communications protocol we use).
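
The transitions described above can be written down as a small table
in code.  This is a sketch for reading alongside the diagram, not the
library's internal representation: the C<enum> names and the
C<next_state> function are hypothetical.

```c
/* States from the diagram above. */
enum state { CONFIG, LAUNCHING, READY, BUSY };

/* Hypothetical event names for the labelled transitions. */
enum event {
  EV_LAUNCH,            /* guestfs_launch called */
  EV_LAUNCH_DONE,       /* child process finished booting */
  EV_ACTION_SENT,       /* a high-level action was issued */
  EV_REPLY,             /* reply to that action received */
  EV_SUBPROCESS_QUIT    /* child killed or died asynchronously */
};

/* Return the next state, or -1 if 'ev' is not allowed in state 'cur'. */
int
next_state (enum state cur, enum event ev)
{
  if (ev == EV_SUBPROCESS_QUIT)
    return CONFIG;                      /* from any state back to CONFIG */

  switch (cur) {
  case CONFIG:    return ev == EV_LAUNCH      ? LAUNCHING : -1;
  case LAUNCHING: return ev == EV_LAUNCH_DONE ? READY     : -1;
  case READY:     return ev == EV_ACTION_SENT ? BUSY      : -1;
  case BUSY:      return ev == EV_REPLY       ? READY     : -1;
  }
  return -1;
}
```

In this model an action issued in the CONFIG state is rejected, which
mirrors the rule that high-level actions can only be issued when
READY.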

Finally, the child process sends asynchronous messages back to the
main program, such as kernel log messages.  Mostly these are ignored
by the high-level API, but using the low-level event API you can
register to receive these messages.

=head2 SETTING CALLBACKS TO HANDLE EVENTS

The child process generates events in some situations.  Current events
include: receiving a reply message after some action, receiving a log
message, the child process exiting, etc.

Use the C<guestfs_set_*_callback> functions to set a callback for
different types of events.

Only I<one callback of each type> can be registered for each handle.
Calling C<guestfs_set_*_callback> again overwrites the previous
callback of that type.  Cancel the callback of a given type by calling
the corresponding function with C<cb> set to C<NULL>.

=head2 NON-BLOCKING ACTIONS

XXX This section was documented in previous versions but never
implemented in a way which matched the documentation.  For now I have
removed the documentation, pending a working implementation.  See also
C<src/guestfs-actions.c> in the source.


=head2 guestfs_set_send_callback

 typedef void (*guestfs_send_cb) (guestfs_h *g, void *opaque);
 void guestfs_set_send_callback (guestfs_h *handle,
                                 guestfs_send_cb cb,
                                 void *opaque);

The callback function C<cb> will be called whenever a message
that was queued for sending has been sent.

=head2 guestfs_set_reply_callback

 typedef void (*guestfs_reply_cb) (guestfs_h *g, void *opaque, XDR *xdr);
 void guestfs_set_reply_callback (guestfs_h *handle,
                                  guestfs_reply_cb cb,
                                  void *opaque);

The callback function C<cb> will be called whenever a reply is
received from the child process.  (This corresponds to a transition
from the BUSY state to the READY state).

Note that the C<xdr> that you get in the callback is in C<XDR_DECODE>
mode, and you need to consume it before you return from the callback
function (since it gets destroyed after).

=head2 guestfs_set_log_message_callback

 typedef void (*guestfs_log_message_cb) (guestfs_h *g, void *opaque,
                                         char *buf, int len);
 void guestfs_set_log_message_callback (guestfs_h *handle,
                                        guestfs_log_message_cb cb,
                                        void *opaque);

The callback function C<cb> will be called whenever qemu or the guest
writes anything to the console.

Use this function to capture kernel messages and similar.

Normally there is no log message handler, and log messages are just
discarded.
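
A log message callback might collect console output into a buffer, as
in the sketch below.  The names C<struct log_buffer> and
C<collect_log_cb> are hypothetical, the C<guestfs_h> typedef is a
stand-in for including C<guestfs.h>, and the code assumes that C<buf>
holds C<len> bytes with no terminating NUL.

```c
#include <string.h>

/* Stand-in for the opaque handle type; a real program would
 * include <guestfs.h> instead of declaring this. */
typedef struct guestfs_h guestfs_h;

/* Hypothetical fixed-size buffer passed as the 'opaque' pointer. */
struct log_buffer {
  char data[4096];
  size_t used;                  /* bytes stored, not counting the NUL */
};

/* Has the same shape as guestfs_log_message_cb.  Appends the console
 * output to the buffer and keeps it NUL-terminated. */
void
collect_log_cb (guestfs_h *g, void *opaque, char *buf, int len)
{
  struct log_buffer *log = opaque;
  size_t room = sizeof log->data - 1 - log->used;   /* keep space for NUL */
  size_t n = len > 0 ? (size_t) len : 0;

  (void) g;                     /* unused in this sketch */
  if (n > room)
    n = room;                   /* drop whatever does not fit */
  memcpy (log->data + log->used, buf, n);
  log->used += n;
  log->data[log->used] = '\0';
}
```

It would be registered with something like
C<guestfs_set_log_message_callback (handle, collect_log_cb, &log)>.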

=head2 guestfs_set_subprocess_quit_callback

 typedef void (*guestfs_subprocess_quit_cb) (guestfs_h *g, void *opaque);
 void guestfs_set_subprocess_quit_callback (guestfs_h *handle,
                                            guestfs_subprocess_quit_cb cb,
                                            void *opaque);

The callback function C<cb> will be called when the child process
quits, either asynchronously or if killed by
C<guestfs_kill_subprocess>.  (This corresponds to a transition from
any state to the CONFIG state).

=head2 guestfs_set_launch_done_callback

 typedef void (*guestfs_launch_done_cb) (guestfs_h *g, void *opaque);
 void guestfs_set_launch_done_callback (guestfs_h *handle,
                                        guestfs_launch_done_cb cb,
                                        void *opaque);

The callback function C<cb> will be called when the child process
becomes ready for the first time after it has been launched.  (This
corresponds to a transition from LAUNCHING to the READY state).

You can use this instead of C<guestfs_wait_ready> to implement a
non-blocking wait for the child process to finish booting up.

=head2 EVENT MAIN LOOP

To use the low-level event API and/or to use handles from multiple
threads, you have to provide an event "main loop".  You can write your
own, but if you don't want to write one, two types are provided for
you:

=over 4

=item libguestfs-select

A simple main loop that is implemented using L<select(2)>.

This is the default main loop for new guestfs handles, unless you
call C<guestfs_set_main_loop> after a handle is created.

=item libguestfs-glib

An implementation which can be used with GLib and GTK+ programs.  You
can use this to write graphical (GTK+) programs which use libguestfs
without hanging during long or slow operations.

=back

=head2 MULTIPLE HANDLES AND MULTIPLE THREADS

The support for multiple handles and multiple threads is modelled
after glib (although it doesn't require glib, if you use the
select-based main loop).

L<http://library.gnome.org/devel/glib/unstable/glib-The-Main-Event-Loop.html>

You will need to create one main loop for each thread that wants to
use libguestfs.  Each guestfs handle should be confined to one thread.
If you try to pass guestfs handles between threads, you will get
undefined results.

If you only want to use guestfs handles from one thread in your
program, but your program has other threads doing other things, then
you don't need to do anything special.

=head2 SINGLE THREAD CASE

In the single thread case, there is a single select-based main loop
created for you.  All guestfs handles will use this main loop to
execute high level API actions.

=head2 MULTIPLE THREADS CASE

In the multiple threads case, you will need to create a main loop for
each thread that wants to use libguestfs.

To create main loops for other threads, use
C<guestfs_create_main_loop> or C<guestfs_glib_create_main_loop>.

Then you will need to attach each handle to the thread-specific main
loop by calling:

 handle = guestfs_create ();
 guestfs_set_main_loop (handle, main_loop_of_current_thread);

=head2 guestfs_set_main_loop

 void guestfs_set_main_loop (guestfs_h *handle,
                             guestfs_main_loop *main_loop);

Sets the main loop used by high level API actions for this handle.  By
default, the select-based main loop is used (see
C<guestfs_get_default_main_loop>).

You only need to use this in multi-threaded programs, where multiple
threads want to use libguestfs.  Create a main loop for each thread,
then call this function.

You cannot pass guestfs handles between threads.

=head2 guestfs_get_main_loop

 guestfs_main_loop *guestfs_get_main_loop (guestfs_h *handle);

Return the main loop used by C<handle>.

=head2 guestfs_get_default_main_loop

 guestfs_main_loop *guestfs_get_default_main_loop (void);

Return the default select-based main loop.

=head2 guestfs_create_main_loop

 guestfs_main_loop *guestfs_create_main_loop (void);

This creates a select-based main loop.  You should create one main
loop for each additional thread that needs to use libguestfs.

=head2 guestfs_free_main_loop

 void guestfs_free_main_loop (guestfs_main_loop *);

Free the select-based main loop which was previously allocated with
C<guestfs_create_main_loop>.

=head2 WRITING A CUSTOM MAIN LOOP

This isn't documented.  Please see the libguestfs-select and
libguestfs-glib implementations.

=head1 BLOCK DEVICE NAMING

In the kernel there is now quite a profusion of schemata for naming
block devices (in this context, by I<block device> I mean a physical
or virtual hard drive).  The original Linux IDE driver used names
starting with C</dev/hd*>.  SCSI devices have historically used a
different naming scheme, C</dev/sd*>.  When the Linux kernel I<libata>
driver became a popular replacement for the old IDE driver
(particularly for SATA devices) those devices also used the
C</dev/sd*> scheme.  Additionally we now have virtual machines with
paravirtualized drivers.  This has created several different naming
systems, such as C</dev/vd*> for virtio disks and C</dev/xvd*> for Xen
PV disks.

As discussed above, libguestfs uses a qemu appliance running an
embedded Linux kernel to access block devices.  We can run a variety
of appliances based on a variety of Linux kernels.

This causes a problem for libguestfs because many API calls use device
or partition names.  Working scripts and the recipe (example) scripts
that we make available over the internet could fail if the naming
scheme changes.

Therefore libguestfs defines C</dev/sd*> as the I<standard naming
scheme>.  Internally C</dev/sd*> names are translated, if necessary,
to other names as required.  For example, under RHEL 5 which uses the
C</dev/hd*> scheme, any device parameter C</dev/sda2> is translated to
C</dev/hda2> transparently.

Note that this I<only> applies to parameters.  The
C<guestfs_list_devices>, C<guestfs_list_partitions> and similar calls
return the true names of the devices and partitions as known to the
appliance.

=head2 ALGORITHM FOR BLOCK DEVICE NAME TRANSLATION

Usually this translation is transparent.  However in some (very rare)
cases you may need to know the exact algorithm.  Such cases include
where you use C<guestfs_config> to add a mixture of virtio and IDE
devices to the qemu-based appliance, so have a mixture of C</dev/sd*>
and C</dev/vd*> devices.

The algorithm is applied only to I<parameters> which are known to be
either device or partition names.  Return values from functions such
as C<guestfs_list_devices> are never changed.

=over 4

=item *

Is the string a parameter which is a device or partition name?

=item *

Does the string begin with C</dev/sd>?

=item *

Does the named device exist?  If so, we use that device.
However if I<not> then we continue with this algorithm.

=item *

Replace initial C</dev/sd> string with C</dev/hd>.

For example, change C</dev/sda2> to C</dev/hda2>.

If that named device exists, use it.  If not, continue.

=item *

Replace initial C</dev/sd> string with C</dev/vd>.

If that named device exists, use it.  If not, return an error.

=back
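
The steps above can be sketched in C as follows.  This is not the
library's implementation: C<translate_device_name> and the
C<device_exists_fn> predicate are hypothetical, and
C<fake_ide_appliance> only exists to give the sketch something to
test against.  The first step (deciding whether a parameter is a
device or partition name at all) is left to the caller.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical predicate: nonzero if 'device' exists in the appliance. */
typedef int (*device_exists_fn) (const char *device);

/* Translate 'name' into 'out' (of size 'outlen').  Names that do not
 * begin with /dev/sd are copied unchanged.  Returns 0 on success, or
 * -1 if no candidate device exists or 'out' is too small. */
int
translate_device_name (const char *name, device_exists_fn exists,
                       char *out, size_t outlen)
{
  static const char *const prefixes[] = { "/dev/sd", "/dev/hd", "/dev/vd" };
  size_t i;

  if (strncmp (name, "/dev/sd", 7) != 0) {
    if (strlen (name) + 1 > outlen)
      return -1;
    strcpy (out, name);                 /* not a standard name: pass through */
    return 0;
  }

  /* Try /dev/sd* first, then /dev/hd*, then /dev/vd*. */
  for (i = 0; i < sizeof prefixes / sizeof prefixes[0]; i++) {
    int n = snprintf (out, outlen, "%s%s", prefixes[i], name + 7);

    if (n < 0 || (size_t) n >= outlen)
      return -1;
    if (exists (out))
      return 0;
  }
  return -1;                            /* none of the candidates exists */
}

/* Example predicate: pretends the appliance only has IDE-style names. */
int
fake_ide_appliance (const char *device)
{
  return strcmp (device, "/dev/hda") == 0
      || strcmp (device, "/dev/hda2") == 0;
}
```

With the example predicate, C</dev/sda2> is translated to
C</dev/hda2>, while C</dev/sdb> results in an error.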

=head2 PORTABILITY CONCERNS

Although the standard naming scheme and automatic translation are
useful for simple programs and guestfish scripts, for larger programs
it is best not to rely on this mechanism.

Where possible, for maximum future portability, programs using
libguestfs should use these future-proof techniques:

=over 4

=item *

Use C<guestfs_list_devices> or C<guestfs_list_partitions> to list
actual device names, and then use those names directly.

Since those device names exist by definition, they will never be
translated.

=item *

Use higher level ways to identify filesystems, such as LVM names,
UUIDs and filesystem labels.

=back

=head1 INTERNALS

=head2 COMMUNICATION PROTOCOL

Don't rely on using this protocol directly.  This section documents
how it currently works, but it may change at any time.

The protocol used to talk between the library and the daemon running
inside the qemu virtual machine is a simple RPC mechanism built on top
of XDR (RFC 1014, RFC 1832, RFC 4506).

The detailed format of structures is in C<src/guestfs_protocol.x>
(note: this file is automatically generated).

There are two broad cases.  Ordinary functions, which don't have any
C<FileIn> or C<FileOut> parameters, are handled with very simple
request/reply messages.  Functions that do have C<FileIn> or
C<FileOut> parameters use the same request and reply messages, but
these may also be followed by files sent using a chunked encoding.

=head3 ORDINARY FUNCTIONS (NO FILEIN/FILEOUT PARAMS)

For ordinary functions, the request message is:

 total length (header + arguments,
      but not including the length word itself)
 struct guestfs_message_header (encoded as XDR)
 struct guestfs_<foo>_args (encoded as XDR)

The total length field allows the daemon to allocate a fixed size
buffer into which it slurps the rest of the message.  As a result, the
total length is limited to C<GUESTFS_MESSAGE_MAX> bytes (currently
4MB), which means the effective size of any request is limited to
somewhere under this size.
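
The framing can be sketched as below.  Several details are
assumptions made for illustration: the length word is taken to be a
32-bit big-endian integer (which is how XDR encodes an unsigned int),
C<EXAMPLE_MESSAGE_MAX> stands in for C<GUESTFS_MESSAGE_MAX>, and
C<frame_message> is a hypothetical helper that expects the header and
arguments to be XDR-encoded already.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for GUESTFS_MESSAGE_MAX; the text above gives 4MB. */
#define EXAMPLE_MESSAGE_MAX ((size_t) 4 * 1024 * 1024)

/* Write the length word followed by 'payload' (the already encoded
 * header and arguments) into 'out'.  The length word does not count
 * itself.  Returns the number of bytes written, or 0 if the payload
 * is over the limit or 'out' is too small. */
size_t
frame_message (const unsigned char *payload, size_t payload_len,
               unsigned char *out, size_t outlen)
{
  uint32_t len;

  if (payload_len > EXAMPLE_MESSAGE_MAX)
    return 0;
  if (outlen < 4 || payload_len > outlen - 4)
    return 0;

  len = (uint32_t) payload_len;
  out[0] = (unsigned char) (len >> 24);   /* assumed big-endian order */
  out[1] = (unsigned char) (len >> 16);
  out[2] = (unsigned char) (len >> 8);
  out[3] = (unsigned char) len;
  memcpy (out + 4, payload, payload_len);
  return payload_len + 4;
}
```

A receiver built on the same assumptions would read the four-byte
word first, check it against the limit, and only then read that many
further bytes into its buffer.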

Note also that many functions don't take any arguments, in which case
the C<guestfs_I<foo>_args> is completely omitted.

The header contains the procedure number (C<guestfs_proc>) which is
how the receiver knows what type of args structure to expect, or none
at all.

The reply message for ordinary functions is:

 total length (header + ret,
      but not including the length word itself)
 struct guestfs_message_header (encoded as XDR)
 struct guestfs_<foo>_ret (encoded as XDR)

As above the C<guestfs_I<foo>_ret> structure may be completely omitted
for functions that return no formal return values.

As above the total length of the reply is limited to
C<GUESTFS_MESSAGE_MAX>.

In the case of an error, a flag is set in the header, and the reply
message is slightly changed:

 total length (header + error,
      but not including the length word itself)
 struct guestfs_message_header (encoded as XDR)
 struct guestfs_message_error (encoded as XDR)

The C<guestfs_message_error> structure contains the error message as a
string.

=head3 FUNCTIONS THAT HAVE FILEIN PARAMETERS

A C<FileIn> parameter indicates that we transfer a file I<into> the
guest.  The normal request message is sent (see above).  However this
is followed by a sequence of file chunks.

 total length (header + arguments,
      but not including the length word itself,
      and not including the chunks)
 struct guestfs_message_header (encoded as XDR)
 struct guestfs_<foo>_args (encoded as XDR)
 sequence of chunks for FileIn param #0
 sequence of chunks for FileIn param #1 etc.

The "sequence of chunks" is:

 length of chunk (not including length word itself)
 struct guestfs_chunk (encoded as XDR)
 length of chunk
 struct guestfs_chunk (encoded as XDR)
   ...
 length of chunk
 struct guestfs_chunk (with data.data_len == 0)

The final chunk has the C<data_len> field set to zero.  Additionally a
flag is set in the final chunk to indicate either successful
completion or early cancellation.
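
The chunk sequence can be sketched in C as follows.  This is only an
illustration under stated assumptions: C<send_as_chunks>, C<chunk_fn>,
C<chunk_stats> and C<count_chunk> are hypothetical names, the data is
taken from a memory buffer rather than a file, and the callback stands
in for whatever would actually encode a C<guestfs_chunk> and write it
to the socket.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type, called once per chunk.  A 'len' of 0
 * marks the final chunk; 'cancel' is non-zero if the transfer was
 * abandoned early.  Returns 0 on success or -1 to stop the transfer.
 */
typedef int (*chunk_fn) (const unsigned char *buf, size_t len,
                         int cancel, void *opaque);

/* Split 'data' into chunks of at most 'max_chunk' bytes and finish
 * with a zero-length chunk.  Returns 0 on success, -1 on failure.
 */
int
send_as_chunks (const unsigned char *data, size_t size, size_t max_chunk,
                chunk_fn emit, void *opaque)
{
  size_t off = 0;

  assert (emit != NULL && max_chunk > 0);

  while (off < size) {
    size_t n = size - off;
    if (n > max_chunk)
      n = max_chunk;
    if (emit (data + off, n, 0, opaque) == -1) {
      /* Give up, but still end the sequence, flagged as cancelled. */
      emit (NULL, 0, 1, opaque);
      return -1;
    }
    off += n;
  }

  /* Final chunk: no data, flagged as successful completion. */
  return emit (NULL, 0, 0, opaque);
}

/* Example callback that only records what it was given. */
struct chunk_stats { size_t chunks; size_t bytes; int cancelled; };

int
count_chunk (const unsigned char *buf, size_t len, int cancel, void *opaque)
{
  struct chunk_stats *st = opaque;

  (void) buf;
  st->chunks++;
  st->bytes += len;
  if (cancel)
    st->cancelled = 1;
  return 0;
}
```

With a 10 byte buffer and a maximum chunk size of 4, the callback is
invoked four times: for chunks of 4, 4 and 2 bytes, and then for the
terminating zero-length chunk.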

At time of writing there are no functions that have more than one
FileIn parameter.  However this is (theoretically) supported, by
sending the sequence of chunks for each FileIn parameter one after
another (from left to right).

Both the library (sender) I<and> the daemon (receiver) may cancel the
transfer.  The library does this by sending a chunk with a special
flag set to indicate cancellation.  When the daemon sees this, it
cancels the whole RPC, does I<not> send any reply, and goes back to
reading the next request.

The daemon may also cancel.  It does this by writing a special word
C<GUESTFS_CANCEL_FLAG> to the socket.  The library listens for this
during the transfer, and if it gets it, it will cancel the transfer
(it sends a cancel chunk).  The special word is chosen so that even if
cancellation happens right at the end of the transfer (after the
library has finished writing and has started listening for the reply),
the "spurious" cancel flag will not be confused with the reply
message.
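
One property that would make the cancel flag unambiguous is that its
value is larger than C<GUESTFS_MESSAGE_MAX>, so that it can never be a
valid length word.  That is an assumption made for this sketch; the
text above does not give the actual value or the reasoning behind it.
C<classify_word> and C<word_kind> are hypothetical names, and the flag
and limit are passed in as parameters rather than hard-coded.

```c
#include <assert.h>
#include <stdint.h>

enum word_kind { WORD_LENGTH, WORD_CANCEL, WORD_INVALID };

/* Hypothetical helper: decide what a 32-bit word read from the socket
 * means when the reader expects the start of a reply message.
 */
enum word_kind
classify_word (uint32_t word, uint32_t cancel_flag, uint32_t message_max)
{
  /* Assumption of this sketch: the flag cannot be a valid length. */
  assert (cancel_flag > message_max);

  if (word == cancel_flag)
    return WORD_CANCEL;         /* late cancel: skip it, keep reading */
  if (word <= message_max)
    return WORD_LENGTH;         /* start of a normal reply message */
  return WORD_INVALID;          /* neither: protocol error */
}
```

A reader built this way can discard a "spurious" cancel word and carry
on reading, because the next word it sees will classify as a length.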

This protocol allows the transfer of arbitrarily sized files (no 32-bit
limit), and also files where the size is not known in advance
(e.g. from pipes or sockets).  However the chunks are rather small
(C<GUESTFS_MAX_CHUNK_SIZE>), so that neither the library nor the
daemon needs to keep much in memory.

=head3 FUNCTIONS THAT HAVE FILEOUT PARAMETERS

The protocol for FileOut parameters is exactly the same as for FileIn
parameters, but with the roles of daemon and library reversed.

 total length (header + ret,
      but not including the length word itself,
      and not including the chunks)
 struct guestfs_message_header (encoded as XDR)
 struct guestfs_<foo>_ret (encoded as XDR)
 sequence of chunks for FileOut param #0
 sequence of chunks for FileOut param #1 etc.

=head3 INITIAL MESSAGE

Because the underlying channel (qemu -net channel) doesn't have any
sort of connection control, when the daemon launches it sends an
initial word (C<GUESTFS_LAUNCH_FLAG>) which indicates that the guest
and daemon are alive.  This is what C<guestfs_wait_ready> waits for.

=head1 QEMU WRAPPERS

If you want to compile your own qemu, run qemu from a non-standard
location, or pass extra arguments to qemu, then you can write a
shell-script wrapper around qemu.

There is one important rule to remember: you I<must C<exec qemu>> as
the last command in the shell script (so that qemu replaces the shell
and becomes the direct child of the libguestfs-using program).  If you
don't do this, then the qemu process won't be cleaned up correctly.

Here is an example of a wrapper, where I have built my own copy of
qemu from source:

 #!/bin/sh -
 qemudir=/home/rjones/d/qemu
 exec $qemudir/x86_64-softmmu/qemu-system-x86_64 -L $qemudir/pc-bios "$@"

Save this script as C</tmp/qemu.wrapper> (or wherever), make it
executable with C<chmod +x>, and then use it by setting the
C<LIBGUESTFS_QEMU> environment variable.
For example:

 LIBGUESTFS_QEMU=/tmp/qemu.wrapper guestfish

Note that libguestfs also calls qemu with the -help and -version
options in order to determine features.

=head1 ENVIRONMENT VARIABLES

=over 4

=item LIBGUESTFS_APPEND

Pass additional options to the guest kernel.

=item LIBGUESTFS_DEBUG

Set C<LIBGUESTFS_DEBUG=1> to enable verbose messages.  This
has the same effect as calling C<guestfs_set_verbose (handle, 1)>.

=item LIBGUESTFS_MEMSIZE

Set the memory allocated to the qemu process, in megabytes.  For
example:

 LIBGUESTFS_MEMSIZE=700

=item LIBGUESTFS_PATH

Set the path that libguestfs uses to search for kernel and initrd.img.
See the discussion of paths in section PATH above.

=item LIBGUESTFS_QEMU

Set the default qemu binary that libguestfs uses.  If not set, then
the qemu which was found at compile time by the configure script is
used.

See also L<QEMU WRAPPERS> above.

=item TMPDIR

Location of temporary directory, defaults to C</tmp>.

If libguestfs was compiled to use the supermin appliance then each
handle will briefly require a rather large amount of space in this
directory (about 80 MB).  You can use C<$TMPDIR> to configure another
directory to use in case C</tmp> is not large enough.

=back
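
A program can also set these variables for itself before it starts
using libguestfs.  The following C sketch shows one way to do that
with the POSIX C<setenv> function.  C<configure_libguestfs_env> is a
hypothetical helper, and the sketch assumes that the variables are
read when the handle is created, which the text above does not state.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: export settings for libguestfs.  Call this
 * before creating the handle (assumption: that is when the variables
 * are read).  'qemu_wrapper' may be NULL to keep the default qemu.
 * Returns 0 on success, -1 if setenv fails.
 */
int
configure_libguestfs_env (const char *qemu_wrapper, int memsize_mb,
                          int debug)
{
  char buf[32];

  assert (memsize_mb > 0);

  if (qemu_wrapper != NULL &&
      setenv ("LIBGUESTFS_QEMU", qemu_wrapper, 1) == -1)
    return -1;

  /* LIBGUESTFS_MEMSIZE is given in megabytes. */
  snprintf (buf, sizeof buf, "%d", memsize_mb);
  if (setenv ("LIBGUESTFS_MEMSIZE", buf, 1) == -1)
    return -1;

  /* LIBGUESTFS_DEBUG=1 enables verbose messages. */
  if (debug && setenv ("LIBGUESTFS_DEBUG", "1", 1) == -1)
    return -1;

  return 0;
}
```

For example, C<configure_libguestfs_env ("/tmp/qemu.wrapper", 700, 1)>
would select the wrapper script, give qemu 700 megabytes of memory,
and turn on verbose messages.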

=head1 SEE ALSO

L<guestfish(1)>,
L<qemu(1)>,
L<febootstrap(1)>,
L<http://libguestfs.org/>.

Tools with a similar purpose:
L<fdisk(8)>,
L<parted(8)>,
L<kpartx(8)>,
L<lvm(8)>,
L<disktype(1)>.

=head1 BUGS

To get a list of bugs against libguestfs use this link:

L<https://bugzilla.redhat.com/buglist.cgi?component=libguestfs&product=Virtualization+Tools>

To report a new bug against libguestfs use this link:

L<https://bugzilla.redhat.com/enter_bug.cgi?component=libguestfs&product=Virtualization+Tools>

When reporting a bug, please:

=over 4

=item *

Check that the bug hasn't been reported already.

=item *

Check that you are testing a recent version.

=item *

Describe the bug accurately, and give a way to reproduce it.

=item *

Run libguestfs-test-tool and paste the B<complete, unedited>
output into the bug report.

=back

=head1 AUTHORS

Richard W.M. Jones (C<rjones at redhat dot com>)

=head1 COPYRIGHT

Copyright (C) 2009 Red Hat Inc.
L<http://libguestfs.org/>

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA