TODO list for libguestfs
======================================================================
This list contains random ideas and musings on features we could add
to libguestfs in future.
- RWMJ
FUSE API
--------
The API needs more test coverage, particularly lesser-used system
calls.
The big unresolved issue is UID/GID mapping between guest filesystem
IDs and the host. It's not easy to automate this because you need
extra details about the guest itself in order to get to its
UID->username map (eg. /etc/passwd from the guest).
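As a rough illustration of the extra detail involved, the sketch below
builds a UID->username map from the text of a guest's /etc/passwd. It
assumes the file content has already been fetched from the guest by
some other means; parse_passwd is a hypothetical helper (nothing like
it exists in libguestfs) and the sample entries are made up.

```python
def parse_passwd(text):
    """Return a dict mapping numeric UID -> username.

    `text` is the content of a guest's /etc/passwd (hypothetical input;
    how it is fetched from the guest is outside this sketch).
    """
    uid_map = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # passwd(5): name:password:UID:GID:GECOS:directory:shell
        if len(fields) < 4:
            continue
        name, uid = fields[0], fields[2]
        if not uid.isdigit():
            continue  # skips NIS-style "+" entries and malformed lines
        # Keep the first entry seen for a UID, as a top-down lookup would.
        uid_map.setdefault(int(uid), name)
    return uid_map


# Made-up sample content, for illustration only.
guest_passwd = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "rjones:x:500:500::/home/rjones:/bin/bash\n"
)
uid_map = parse_passwd(guest_passwd)
```

The same approach would apply to /etc/group for GIDs. It does not
help with guests that take their users from a network directory.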
Haskell bindings
----------------
Complete the Haskell bindings (see discussion on haskell-cafe).
PHP bindings
------------
Add bindtests to PHP bindings.
Complete bind tests
-------------------
Complete the bind tests - must test the return values and error cases.
virt-inspector - make libvirt XML
---------------------------------
It should be possible to generate libvirt XML from virt-inspector
data, at least partially. This would be just another output type so:
virt-inspector --libvirt guest.img
Note that recent versions of libvirt/virt-install allow guests to be
imported, so this is not so useful any more.
"Standalone/local mode"
-----------------------
Instead of running guestfsd (the daemon) inside qemu, there should be
an option to just run guestfsd directly.
The architecture in this mode would look like:
+------------------+
|   main program   |
|------------------|
|    libguestfs    |
+--------^---------+
     |   | reply
 cmd |   |
+----v-------------+
|     guestfsd     |
+------------------+
Notes:
(1) This only makes sense if we are running as root.
(2) There are no console / kernel messages in this configuration, but
we might consider capturing stderr from the daemon.
(3) guestfs_config and guestfs_add_drive become no-ops.
Obviously in this configuration, commands are run directly on the
local machine's disks. You could just run the commands themselves
directly, but libguestfs provides a convenient API and language
bindings. It also deals with tricky stuff like parsing the output of
the LVM commands, and we get to leverage other code such as
virt-inspector.
This is mainly useful from live CDs, ie. virt-p2v.
Should we bother having the daemon at all and just link the guestfsd
code directly into libguestfs?
Ideas for extra commands
------------------------
General glibc / core programs:
chgrp
more mk*temp calls
ext2 properties:
chattr
lsattr
badblocks
debugfs
dumpe2fs
e2image
e2undo
filefrag
findfs
logsave
mklost+found
SELinux:
chcat
restorecon
ch???
Oddball:
pivot_root
fts(3) / ftw(3)
Other initrd-* commands
-----------------------
Such as:
initrd-extract
initrd-replace
Simple editing of configuration files
-------------------------------------
Some easy non-Augeas methods to edit configuration files.
I'm thinking:
replace /etc/file key value
which would look in /etc/file for any instances of
key=...
key ...
key:...
and replace them with
key=value
key value
key:value
That would solve about 50% of reconfiguration needs, and for the
rest you'd use Augeas, 'download'+'upload' or 'edit'.
RWMJ: I had a go at implementing this, but it's quite error-prone to
do this sort of editing inside the C-based daemon code. It's far
better to do it with Augeas, or else to use an external language like
Perl.
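In the spirit of doing this outside the daemon, the three patterns
above could be prototyped in a few lines of a scripting language. The
following is only a sketch in Python, not a proposed implementation:
replace_key is a hypothetical helper, and keeping whatever separator
the matching line already uses is an assumption about the desired
behaviour.

```python
import re


def replace_key(text, key, value):
    """Replace 'key=...', 'key ...' and 'key:...' lines in `text`.

    The separator found on each matching line is kept as-is.
    Returns (new_text, number_of_lines_changed).
    """
    # Optional leading whitespace, the literal key, then one separator:
    # '=' or ':' (with optional spaces around it) or plain whitespace.
    pattern = re.compile(
        r"^(\s*" + re.escape(key) + r")(\s*[=:]\s*|\s+)(.*)$"
    )
    changed = 0
    out = []
    for line in text.split("\n"):
        m = pattern.match(line)
        if m:
            # Concatenate instead of using re.sub so that backslashes
            # in `value` are never interpreted as escapes.
            out.append(m.group(1) + m.group(2) + value)
            changed += 1
        else:
            out.append(line)
    return "\n".join(out), changed


# Made-up sample file content, for illustration only.
sample = "HOSTNAME=old.example.com\n# HOSTNAME=commented\nkeyring=1\n"
new_text, count = replace_key(sample, "HOSTNAME", "new.example.com")
```

Commented-out lines and keys that merely share a prefix (such as
'keyring' when replacing 'key') are left alone. Quoting, line
continuations and section headers are not handled, and those are the
cases where Augeas is the better tool.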
Quick Perl scripts
------------------
Currently we can't do Perl "one-liners", ie. the current syntax for
any short Perl one-liner would be:
perl -MSys::Guestfs -e '$g = Sys::Guestfs->new(); $g->add_drive ("foo"); $g->launch; $g->mount ("/dev/sda1", "/"); ....'
You can see we're well beyond a single line just getting to the point
of adding drives and mounting.
First suggestion:
$h = create ($filename, "/dev/sda1" => "/");
$h = create ([$file1, $file2], "/dev/sda1" => "/");
To mount read-only, add 'ro => 1' like this:
$h = create ($filename, "/dev/sda1" => "/", ro => 1);
which is equivalent to the following sequence of calls:
$h = Sys::Guestfs->new ();
$h->add_drive_ro ($filename);
$h->launch ();
$h->mount_ro ("/dev/sda1", "/");
Command-line form would be:
perl -MSys::Guestfs=:all -e '$_=create("guest.img", "/dev/sda1" => "/"); $_->cat ("/etc/fstab");'
That's not brief enough for one-liners, so we could have an extra
autogenerated module which creates a Sys::Guestfs handle singleton
(the handle is an implicit global variable as in guestfish), eg:
perl -MSys::Guestfs::One -e 'inspect("guest.img"); cat ("/etc/fstab");'
How would editing files work?
virt-rescue pty
---------------
See:
http://search.cpan.org/~rgiersig/IO-Tty-1.08/Pty.pm
http://www.perlmonks.org/index.pl?node_id=582185
Note that pty requires cooperation inside the C code too (there are
two sides to a pty, and one has to be handled after the fork).
[I tried to implement this in the new C virt-rescue, but it doesn't
work. qemu is implementing its own ptys, and they are broken. Need
to fix qemu.]
Windows-based daemon/appliance
------------------------------
See discussion on list:
https://www.redhat.com/archives/libguestfs/2009-November/msg00165.html
qemu locking
------------
Add -drive file=...,lock=exclusive and -drive file=...,lock=shared
Change libguestfs and libvirt to do the right thing, so that multiple
instances of qemu cannot stomp on each other.
virt-disk-explore
-----------------
For multi-level disk images such as live CDs:
http://rwmj.wordpress.com/2009/07/15/unpack-the-russian-doll-of-a-f11-live-cd/
It's possible with libguestfs to recursively look for anything that
might be a filesystem, mount-{,loop} it and look in those, revealing
anything in a disk image.
However this won't work easily for VM disk images nested inside the
disk image.
One would have to download those to the host and launch another
libguestfs instance.
[Not sure this is such a good idea. See also live CD inspection idea below.]
Map filesystems to disk blocks
------------------------------
Map files/filesystems/(any other object) to the actual disk
blocks they occupy.
And vice versa.
Is it even possible?
See also contrib/visualize-alignment/
Integration with host intrusion systems
---------------------------------------
Perfect way to monitor VMs from outside the VM. Look for file
hashes, log events, login/logout etc.
http://www.ossec.net/
http://la-samhna.de/samhain/
http://sourceforge.net/projects/aide/
http://osiris.shmoo.com/
http://sourceforge.net/projects/tripwire/
Fix 'file'
----------
https://www.redhat.com/archives/libguestfs/2010-June/msg00053.html
https://www.redhat.com/archives/libguestfs/2010-June/msg00079.html
Freeze/thaw filesystems
-----------------------
Access to these ioctls:
http://git.kernel.org/linus/fcccf502540e3d7
Tips for new users in guestfish
-------------------------------
$ guestfish
Tip: You need to 'add disk.img' or 'alloc disk.img nn' to make a new image.
Type 'notips' to disable tips permanently.
><fs> add mydisk
Tip: You need to type 'run' before you can see into the disk image.
><fs> run
Tip: Use 'list-filesystems' to see what filesystems are available.
><fs> list-filesystems
/dev/vda1
Tip: Use 'mount fs /' to mount a filesystem.
><fs> mount /dev/vda1 /
Tip: Use 'll /' to view the filesystem or ...
><fs> ll /
Could we make guestfish interactive if commands are used without params?
------------------------------------------------------------------------
><fs> sparse
[[Prints man page]]
Image name? disk.img
Size of image? 10M
Common problems
---------------
How can we solve these common user problems?
[space for common problems here]
Better support for encrypted devices
------------------------------------
Currently LUKS support only works if the device contains volume
groups. If it contains, eg., partitions, you cannot access them.
We would like to add:
- Direct access to the /dev/mapper device (eg. if it contains
anything apart from VGs).
Display image as PS
-------------------
Display the structure of an image file as a PS.
Greater use of blkid / libblkid
-------------------------------
There are various useful functions in libblkid for listing partitions,
devices etc which we are essentially duplicating in the daemon. It
would make more sense to just use libblkid for this.
There are some places where we call out to the 'blkid' program. This
might be replaced by direct use of the library (if this is easier).
Visualization
-------------
Eric Sandeen pointed out the blktrace tool which is a better way of
capturing traces than using patched qemu (see
contrib/visualize-alignment). We would still use the same
visualization tools in conjunction with blktrace traces.
guestfish parsing
-----------------
At the moment guestfish uses an ad hoc parser which has many
shortcomings. We should change to using a lex/yacc-based scanner and
parser (there are better parsers out there, but yacc is sufficient and
very widely available).
The scanner must deal with the case of parsing a whole command string,
eg. for a command that the user types in:
><fs> add-drive-opts "/tmp/foo" readonly:true
and also with parsing single words from the command line:
guestfish add-drive-opts /tmp/foo readonly:true
Note the quotes are for scanning and don't indicate types.
We should also allow variables and expressions as part of this new
parsing code, eg:
set roots inspect-os
set product inspect-get-product-name %{roots[0]}
% is better than $ because of shell escaping and confusion with shell
variables.
Can we combine this with ability to set and read environment
variables? Currently guestfish uses many environment variables like
$EDITOR without any corresponding ability to set them.
set EDITOR /usr/bin/emacs
echo $EDITOR # or %{EDITOR}
edit /etc/resolv.conf
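To make the two scanning cases and the %{...} syntax concrete, here is
a small sketch. It is not the lex/yacc parser proposed above: it
swaps in Python's shlex module for the scanner and a regular
expression for variable expansion, purely to show the intended
behaviour. All function names are hypothetical, and the handling of
list-valued variables is an assumption.

```python
import re
import shlex

# %{name} or %{name[index]}, following the examples above.
VAR_RE = re.compile(r"%\{([A-Za-z_][A-Za-z0-9_]*)(?:\[(\d+)\])?\}")


def expand(word, variables):
    """Expand %{name} and %{name[N]} references inside one word."""
    def lookup(m):
        value = variables[m.group(1)]       # KeyError if the variable is unset
        if m.group(2) is not None:
            value = value[int(m.group(2))]  # index into a list-valued variable
        return str(value)
    return VAR_RE.sub(lookup, word)


def parse_command_string(line, variables):
    """Interactive case: split a whole typed line, honouring quotes."""
    return [expand(w, variables) for w in shlex.split(line)]


def parse_argv(words, variables):
    """Command-line case: the shell has already split the words."""
    return [expand(w, variables) for w in words]


# Made-up variable values, for illustration only.
variables = {"roots": ["/dev/sda1"], "EDITOR": "/usr/bin/emacs"}
typed = parse_command_string('add-drive-opts "/tmp/foo" readonly:true', variables)
from_shell = parse_argv(["add-drive-opts", "/tmp/foo", "readonly:true"], variables)
```

Both entry points yield the same word list, which is the property the
new scanner needs. As noted above, the quotes only affect scanning
and carry no type information.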
live CD inspection for Windows 7
--------------------------------
Windows 7 install CDs are quite different and pretty impenetrable.
There are no obvious files to parse.
More ntfs tools
---------------
ntfsprogs actually has a lot more useful tools than we currently
use. Interesting ones are:
ntfscluster: display file(s) that occupy a cluster or sector
ntfsinfo: print various information about NTFS volume and files
ntfs streams: extract alternate streams from NTFS files
ntfsck: checker for NTFS filesystems
Undelete files
--------------
Two useful tools:
- ext2undelete
- ntfsundelete
More mkfs_opts options
----------------------
Useful options to offer:
- Set label.
- Set UUID.
Use /proc/self/mountinfo
------------------------
This file contains lots of interesting information about
what is mounted and where. eg:
16 21 0:3 / /proc rw,relatime - proc /proc rw
17 21 0:16 / /sys rw,relatime - sysfs /sys rw,seclabel
18 23 0:5 / /dev rw,relatime - devtmpfs udev rw,seclabel,size=1906740k,nr_inodes=476685,mode=755
26 21 253:3 / /home rw,relatime - ext4 /dev/mapper/vg-lv_home rw,seclabel,barrier=1,data=ordered
This could be used instead of the current hairy code to parse the output
of the 'mount' command. We could add new APIs to return kernel mount
options, type of filesystem at a mountpoint etc.
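For reference, the fields are easy to pick apart. The sketch below
parses one line, using the field layout documented in proc(5). The
MountInfo tuple and the function name are made up for illustration,
and the real code would live in the C daemon rather than in Python.

```python
from collections import namedtuple

MountInfo = namedtuple(
    "MountInfo",
    "mount_id parent_id dev root mountpoint options fstype source super_options",
)


def parse_mountinfo_line(line):
    """Parse one line of /proc/self/mountinfo (layout as in proc(5)).

    Paths containing spaces are octal-escaped by the kernel (\\040) and
    are returned still escaped by this sketch.
    """
    fields = line.split()
    # Zero or more optional fields (eg. "shared:1") sit between the mount
    # options and a lone "-" separator, so locate the separator first.
    sep = fields.index("-", 6)
    fstype, source, super_options = fields[sep + 1 : sep + 4]
    return MountInfo(
        mount_id=int(fields[0]),
        parent_id=int(fields[1]),
        dev=fields[2],
        root=fields[3],
        mountpoint=fields[4],
        options=fields[5].split(","),
        fstype=fstype,
        source=source,
        super_options=super_options.split(","),
    )


# One of the example lines above.
home = parse_mountinfo_line(
    "26 21 253:3 / /home rw,relatime - ext4 "
    "/dev/mapper/vg-lv_home rw,seclabel,barrier=1,data=ordered"
)
```

Here home.fstype and home.super_options carry the "type of
filesystem at a mountpoint" and "kernel mount options" that the new
APIs could return.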
guestfish drive letters
-----------------------
There should be an option to mount all Windows drives as separate
paths, like C: => /c/, D: => /d/ etc.
More inspection features
------------------------
- last shutdown time
- DHCP address
- last time the software was updated
- last user who logged in
- lastlog, last, who
Integrate virt-inspector with CMDBs
-----------------------------------
Either integrate virt-inspector with Configuration Management
Databases (CMDBs) or at least check that virt-inspector produces the
right range of data so that integration would be possible. The
standards for CMDBs come from the DMTF, see eg:
http://dmtf.org/news/pr/2009/7/dmtf-releases-cmdbf-standard-federating-configuration-management-data
Efficient way to visit all files
--------------------------------
https://rwmj.wordpress.com/2010/12/15/tip-audit-virtual-machine-for-setuid-files/#content
A naive method would look like:
g#visit ~return_stats:true "/" (
fun pathname stat ->
...
)
However this has two disadvantages:
- requires hand-written custom bindings in each language
- unclear about locking, thread-safety and re-entrancy of handle g
A better way would be to have some sort of explicit "download all
filenames and stat structures", which could then be iterated over:
let files = g#find_opts ~return_stats:true "/" in
List.iter (
fun (pathname, stat) ->
...
) files
The problem with this is that 'files' is going to be larger than a
protocol buffer.
This leads to thinking about changes to the protocol / generator to
make this simpler. The proposal would be to add RBigStringList,
RBigStructList [or RBig (Ranytype ...)]. These would work like
FileOut, in that they would use file streaming to stream XDR
structures (probably written to a file on the library side).
Generated code would hide most of the implementation.
We also need to think about security issues: is it possible for the
daemon to keep sending back data forever, and if so what happens on
the library side.
[Users can now use virt-ls to solve some of these problems, but it is
not a general solution at the API level]
Interactive disk creator
------------------------
An interactive disk creator program.
Attach method for disconnected operation
----------------------------------------
http://libguestfs.org/guestfs.3.html#guestfs_set_attach_method
"Librarian" has an idea that he should be able to attach to a regular
appliance, but disconnect from it and reconnect to it later. This
would be some sort of modified attach method (see link above).
The complexity here is that we would no longer have access to
stdin/stdout (or we'd have to direct that somewhere else).
GObject Introspection
---------------------
We periodically get asked to implement gobject-introspection (it's a
GNOME thing):
http://live.gnome.org/GObjectIntrospection
This would require a separate Gtk C API since the main guestfs handle
would have to be encapsulated in a GObject. However the main
difficulty is that the annotations supported to define types are not
very rich. Notably missing are support for optional arguments
(defined but not implemented) and support for structs (unless mapped
to other objects).
Also note that the libguestfs API is not "object oriented".
libosinfo mappings for virt-inspector
-------------------------------------
Return libosinfo mappings from inspection API.
virt-sysprep ideas
------------------
- touch /.unconfigured ?
- other Spacewalk / RHN IDs (?)
- Kerberos keys
- Puppet registration
- user accounts
- Windows sysprep
(see: https://github.com/clalancette/oz/blob/e74ce83283d468fd987583d6837b441608e5f8f0/oz/Windows.py )
- blue skies: change the background image
- (librarian suggests ...)
. install a firstboot script virt-sysprep --script=/tmp/foo.sh
. run an external shell script
. run external guestfish script virt-sysprep --fish=/tmp/foo.fish
. rm /var/cache/apt/archives/*
- /var/run/* and pam_faillock's data files
- homedirs/.ssh directory, especially /root/.ssh (Steve Grubb)
- if drives are encrypted, then dm-crypt key should be changed
and drives all re-encrypted
- /etc/pki
(Steve says ...)
Rpm uses nss. Nss sets up its crypto database in
/etc/pki. Depending on how long the machine ran before cloning, you
may have picked up some certificates or things. This is an area
that you would want to look into.
- secure erase of inodes etc using scrub (Steve Grubb)
- other directories that could require cleaning include:
/var/run/*
/var/lib/sss/db/*
/var/lib/samba/*
/var/lib/samba/*/*
(thanks Marko Myllynen, James Antill)
- remove or modify UUIDs in /etc/fstab (eg. on Ubuntu)
(thanks Joshua Daniel Franklin)
Launch remote sessions over ssh
-------------------------------
We had an idea you could add a launch method that uses ssh, ie. all
febootstrap and qemu commands happen the same as now, but prefixed by
ssh so it happens on a remote machine.
Note that proper remote support and integration with libvirt is
different from this, and people are working on that. ssh would just
be "remote-lite".
virt-make-fs and virt-win-reg need to not be in Perl
----------------------------------------------------
Probably they should be in C or OCaml.
Integrate snap-type functionality in inspection tools
-----------------------------------------------------
Mo Morsi's "snap" program lets you describe a guest as the list of
packages (eg. RPMs) installed + changes made to those RPMs + files
added.
http://projects.morsi.org/wiki/Snap
This results in a compact description of the guest. He even managed
to do a kind of migration of guests by simply recreating the guest
from the description on the target machine.
It would be ideal to integrate this and/or use inspection to do this.
Ongoing code cleanups
---------------------
Examine every use of 'int' in C code for signed overflow problems.
All file descriptors in the library and daemon should normally be
opened with O_CLOEXEC. Therefore we need to examine every call to:
- open, openat
- creat
- pipe (see also: pipe2)
- dup, dup2 (see also: dup3)
- socket, socketpair
- accept (see also: accept4)
- signalfd, timerfd, epoll_create
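A first pass over the source could be automated. The sketch below is
a crude textual audit, not a real C parser: it flags calls from the
list above that show no *_CLOEXEC flag on the same line. The function
name is hypothetical, and 'timerfd' is assumed to mean
timerfd_create.

```python
import re

# Calls from the list above that create file descriptors.
# ("timerfd" is taken to mean timerfd_create.)
FD_CALLS = (
    "open", "openat", "creat", "pipe", "dup", "dup2",
    "socket", "socketpair", "accept",
    "signalfd", "timerfd_create", "epoll_create",
)

CALL_RE = re.compile(r"\b(" + "|".join(FD_CALLS) + r")\s*\(")
CLOEXEC_RE = re.compile(r"\b(?:O|SOCK|SFD|TFD|EPOLL)_CLOEXEC\b")


def find_suspect_calls(source):
    """Return (line_number, call_name) for fd-creating calls that show
    no *_CLOEXEC flag on the same line.

    Purely textual: multi-line calls, macros and comments will produce
    false positives, so every hit still needs a human to look at it.
    """
    suspects = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for m in CALL_RE.finditer(line):
            if not CLOEXEC_RE.search(line):
                suspects.append((lineno, m.group(1)))
    return suspects


# Made-up C fragment, for illustration only.
sample_c = """
fd = open (path, O_RDONLY);
fd = open (path, O_RDONLY|O_CLOEXEC);
if (pipe (fds) == -1) return -1;
if (pipe2 (fds, O_CLOEXEC) == -1) return -1;
fp = fopen (path, "r");
"""
suspects = find_suspect_calls(sample_c)
```

Calls such as pipe, dup, dup2, creat and accept are always flagged
because they cannot take the flag at all. Those are the ones to
convert to pipe2, dup3, open and accept4.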
virt-sparsify enhancements
--------------------------
TMPDIR should be checked to ensure that we won't run out of space
during the conversion, since current behaviour is very bad when this
happens (it usually causes virt-sparsify to hang). This requires
writing a small C binding to statvfs for OCaml.
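The check itself is small. The sketch below shows the arithmetic in
Python rather than as the OCaml binding mentioned above.
has_enough_space is a hypothetical name, and estimating how much
space the conversion will need is left to the caller.

```python
import os
import tempfile


def has_enough_space(directory, needed_bytes):
    """True if the filesystem holding `directory` has at least
    `needed_bytes` available to an unprivileged process."""
    st = os.statvfs(directory)
    # f_bavail counts free blocks usable by non-root, in units of f_frsize.
    available = st.f_bavail * st.f_frsize
    return available >= needed_bytes


tmpdir = tempfile.gettempdir()  # honours $TMPDIR
# The 10 MB threshold is an arbitrary figure for illustration.
enough = has_enough_space(tmpdir, 10 * 1024 * 1024)
```

Using f_bavail rather than f_bfree matters when virt-sparsify is not
run as root, since blocks reserved for root would otherwise be
counted as usable.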
Passing file descriptors using attach-method fd:N
-------------------------------------------------
The idea is that you can pass the file descriptor connected to the
appliance to another process, which can then attach to it by setting
'attach-method' to 'fd:N' (where N = file descriptor).
The process(es) cooperating like this would have to arrange for mutual
exclusion on the file descriptor, since the protocol itself does not
and cannot support this.
One issue with this is whether just passing the fd is sufficient, or
if other fields in the guestfs_h struct need to be passed too.
Another issue is that the parent process still has to handle
verbose/debug messages, and has to remain around to regain and kill
off the appliance at the end. Thus the parent cannot do much more
than wait(2) and at the same time select(2) on g->fd.
Virt tools would have to have a new --attach-fd=N option.