From b93cda0f8143c9f5026bfe15ce71fd1486cc4a3c Mon Sep 17 00:00:00 2001
From: Ales Kozumplik
Date: Thu, 27 Oct 2011 11:25:39 +0200
Subject: Document iscsi and multipath implementations.

---
 docs/multipath.txt | 143 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 143 insertions(+)
 create mode 100644 docs/multipath.txt

diff --git a/docs/multipath.txt b/docs/multipath.txt
new file mode 100644
index 000000000..e8af24ec9
--- /dev/null
+++ b/docs/multipath.txt
@@ -0,0 +1,143 @@
+======================
+Multipath and Anaconda
+======================
+
+
+Introduction
+------------
+
+If there are two block devices in your /dev for which udev reports the same
+'ID_SERIAL', you can create a device mapper device which uses either of them
+to access the one physical device behind them. And that is Multipath [1].
+
+For instance, suppose there are
+
+/dev/sda, with ID_SERIAL of 20090ef12700001d2, and
+/dev/sdb, with the same ID_SERIAL.
+
+Those probably show up through two adapters in the system that connect your box
+to a storage area network (SAN) somewhere. There are perhaps two cables, one for
+sda, one for sdb, and if one of the cables gets cut the other can still transmit
+data. Normally the system won't recognize that sda and sdb have this special
+relation to each other, but by creating a suitable device map using multipath
+tools [2] we can create a DM device /dev/mapper/mpatha and use it for storing
+and retrieving data.
+
+The device mapper then automatically routes IO requests for /dev/mapper/mpatha
+to either sda or sdb depending on the load of the line, network congestion on
+the particular network etc.
+
+The nomenclature I will use here is:
+
+- 'multipath device' for the smart /dev/mapper/mpathX device.
+- 'multipath member device' for the '/dev/sdX' devices. Also 'a path'.
+
+
+What is expected from Anaconda
+------------------------------
+
+Anaconda is expected to:
+- detect that there are multipath devices present.
+- coalesce all relevant (e.g. exclusiveDisks) multipath devices.
+- only let the user interact with the multipath devices in the filtering,
+  cleardiskssel and partition screens: once we know 'sdc' and 'sdd' are part
+  of 'mpathb', show only 'mpathb' and never the paths.
+- install the bootloader to and boot from an mpath device.
+- make sure all the multipath devices we used for installation (whether or not
+  they carry the root filesystem) are correctly coalesced in the booted
+  system. This is achieved by generating a suitable /etc/multipath.conf and
+  writing it into sysroot.
+- be able to refer to mpath devices from kickstart, either by name like 'mpatha'
+  or by their id like 'disk/by-id/scsi-20090ef12700001d2'.
+
+
+How Anaconda handles multipath
+------------------------------
+
+To detect the presence of multipath devices we rely on multipath tools. We do
+the same for coalescing, see pyanaconda/storage/devicelibs/mpath.py, the file
+that provides some abstraction over the mpath tools. During the device scan we
+use the 'multipath -d' output to find out what devices are going to end up as
+multipath members. The MultipathTopology object also enhances the multipath
+members' udev dictionaries with 'ID_FS_TYPE' set to 'multipath_member' (yes,
+this is a hack surviving from the original mpath implementation, and righteous
+is he who eradicates it). This information is picked up by DeviceTree when
+populating itself. That means if 'sda' and 'sdb' are multipath member devices,
+DeviceTree gives them the MultipathMember format and creates one MultipathDevice
+for them (we know its name from 'multipath -d'). We end up with:
+
+DiskDevice 'sda', format 'MultipathMember'
+DiskDevice 'sdb', format 'MultipathMember'
+MultipathDevice 'mpatha', parents are 'sda' and 'sdb'.
+
+From then on, Anaconda only deals with the MultipathDevice and generally leaves
+anything with the 'MultipathMember' format alone (note that this is an inert
+format that really is not there, we use it just to mark the device as
+"useless beyond a multipath member", kind of like MDRaidMember).
+
+Partitioning happens on the multipath device and during the preinstallconfig
+step /mnt/sysimage/etc/multipath.conf is created and filled with information
+about the coalesced devices. This is handled in the Storage.write() method. It
+is important that this file and /etc/multipath/wwids (autogenerated by mpath
+tools) make it to the sysimage before the dracut image is generated.
+
+
+Debugging multipath bugs
+------------------------
+
+Unlike with iSCSI, to reproduce a multipath bug one does not need the same
+specific hardware as the reporter. Just find any box connected to a multipathed
+SAN and you are fine (at the moment, connecting to the same iSCSI target through
+its IPv4 and IPv6 addresses also produces a multipathed device).
+
+On top of that, much of the necessary information is already included in the
+anaconda logs or can be easily obtained from the reporter. The things to
+particularly look at are:
+
+- storage.log, the output around 'devices to scan for multipath' and 'devices
+  post multipath scan'. The latter shows a triple with regular disks, disks
+  comprising multipath devices and partitions. This helps you quickly find out
+  what the target system is about.
+
+- this information is also in program.log's calls to 'multipath' [3]. If mpath
+  devices are mysteriously appearing/disappearing between filtering and
+  partitioning screens look at those. 'multipath -ll' is called to display
+  currently coalesced mpath devices, 'multipath -d' is called to show the mpath
+  devices that would be coalesced if we ran 'multipath' now. This is exploited
+  by the device filtering screen.
+
+
+Future of multipath in Anaconda
+-------------------------------
+
+Overall as of RHEL6.2, the shape of multipath in Anaconda is good and, what's
+more important, it is flexible enough to sustain new RFEs and bugs. These are
+the ones that I expect to appear sometime soon:
+
+- enable or disable mpath_friendly_names in kickstart. Disabling friendly names
+  just means the mpath devices are called by their wwid,
+  e.g. /dev/mapper/360334332345343234, not '/dev/mapper/mpathc'. This is
+  straightforward to implement.
+- extend support for mpath devices in kickstart in general. Currently mpath
+  devices should be accepted in most commands but I am sure there will be corner
+  cases. Difficulty medium.
+- [rawhide] stop extending the udev info dictionary with 'ID_FS_TYPE' and
+  'ID_MPATH_NAME'. Doing it this way is asking for trouble if the dictionary
+  of a particular mpath device is reloaded from udev without running it through
+  the MultipathTopology object as it will miss those entries (and DeviceTree
+  depends on them a lot). Difficulty hard, but includes a lot of pleasant
+  refactoring.
+- Improve support for multipathing iSCSI devices. Someone might ask for it one
+  day (in fact, with the NIC bonding they already did), and it will make mpath
+  debugging possible on any virt machine with multiple virt NICs.
+
+
+
+[1] http://akozumpl.fedorapeople.org/archive/Multipass.jpg
+[2] http://christophe.varoqui.free.fr/
+[3] 'man 8 multipath'
+
+
+
+---
+Red Hat Author(s): Ales Kozumplik
--
cgit
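
As an illustration of the idea from the Introduction (two paths to the same
physical device share an 'ID_SERIAL'), here is a minimal Python sketch that
groups device dictionaries by serial. It is not how Anaconda does it: as
described above, Anaconda relies on the 'multipath -d' output. The function
name group_paths_by_serial, the simplified dictionaries and the serial of
'sdc' are assumptions made up for this example.

```python
from collections import defaultdict


def group_paths_by_serial(devices):
    """Split udev-style device dicts into single disks and multipath groups.

    `devices` is an iterable of dicts with a 'name' key and, optionally, an
    'ID_SERIAL' key (a hypothetical, simplified stand-in for the udev info
    dictionaries). Returns (singles, groups): a sorted list of names of
    devices that share their serial with nobody, and a dict mapping each
    shared serial to the sorted names of its member devices.
    """
    by_serial = defaultdict(list)
    singles = []
    for dev in devices:
        serial = dev.get("ID_SERIAL")
        if not serial:
            # Without a serial there is nothing to match on.
            singles.append(dev["name"])
            continue
        by_serial[serial].append(dev["name"])

    groups = {}
    for serial, names in by_serial.items():
        if len(names) > 1:
            groups[serial] = sorted(names)  # candidate multipath members
        else:
            singles.extend(names)
    return sorted(singles), groups


if __name__ == "__main__":
    scan = [
        {"name": "sda", "ID_SERIAL": "20090ef12700001d2"},
        {"name": "sdb", "ID_SERIAL": "20090ef12700001d2"},
        {"name": "sdc", "ID_SERIAL": "hypothetical-local-disk"},
    ]
    singles, groups = group_paths_by_serial(scan)
    print("regular disks:", singles)
    print("multipath candidates:", groups)
```

With the sda/sdb example from the Introduction, both devices land in one group
keyed by their common serial, which is the pair a multipath device such as
'mpatha' would be built on.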