path: root/libdm/libdm-deptree.c
Commit message | Author | Age | Files | Lines
...
* Fix alignment warning in bitcount calculation for raid segment. | Milan Broz | 2011-10-17 | 1 | -4/+4
|
* Use pool for dm_tree allocation | Zdenek Kabelac | 2011-10-14 | 1 | -11/+7
| Using the same pool allocation strategy as we use for vg, so the dm_tree structure is part of the pool itself.
* This patch fixes issues with improper udev flags on sub-LVs. | Jonathan Earl Brassow | 2011-10-06 | 1 | -0/+12
|
| The current code does not always assign proper udev flags to sub-LVs (e.g. mirror images and log LVs). This shows up especially during a splitmirror operation in which an image is split off from a mirror to form a new LV.
|
| A mirror with a disk log is actually composed of 4 different LVs: the 2 mirror images, the log, and the top-level LV that "glues" them all together. When a 2-way mirror is split into two linear LVs, two of those LVs must be removed. The segments of the image which is not split off to form the new LV are transferred to the top-level LV. This is done so that the original LV can maintain its major/minor, UUID, and name. The sub-LV from which the segments were transferred gets an error segment as a transitory process before it is eventually removed. (Note that if the error target was not put in place, a resume_lv would result in two LVs pointing to the same segment! If the machine crashes before the eventual removal of the sub-LV, the result would be a residual LV with the same mapping as the original (now linear) LV.) So, the two LVs that need to be removed are now the log device and the sub-LV with the error segment.
|
| If udev_flags are not properly set, a resume will cause the error LV to come up and be scanned by udev. This causes I/O errors. Additionally, when udev scans sub-LVs (or former sub-LVs), it can cause races when we are trying to remove those LVs. This is especially bad during failure conditions.
|
| When the mirror is suspended, the top-level LV along with its sub-LVs are suspended. The changes (now 2 linear devices and the yet-to-be-removed log and error LV) are committed. When the resume takes place on the original LV, there are no longer links to the other sub-LVs through the LVM metadata. The links are implicitly handled by querying the kernel for a list of dependencies. This is done in the '_add_dev' function (which is recursively called for each dependency found) - called through the following chain:
|
|     _add_dev
|     dm_tree_add_dev_with_udev_flags
|     <*** DM / LVM divide ***>
|     _add_dev_to_dtree
|     _add_lv_to_dtree
|     _create_partial_dtree
|     _tree_action
|     dev_manager_activate
|     _lv_activate_lv
|     _lv_resume
|     lv_resume_if_active
|
| When udev flags are calculated by '_get_udev_flags', it is done by referencing the 'logical_volume' structure. Those flags are then passed down into 'dm_tree_add_dev_with_udev_flags', which in turn passes them to '_add_dev'. Unfortunately, when '_add_dev' is finding the dependencies, it has no way to calculate their proper udev_flags. This is because it is below the DM/LVM divide - it doesn't have access to the logical_volume structure. In fact, '_add_dev' simply reuses the udev_flags given for the initial device! This virtually guarantees the udev_flags are wrong for all the dependencies unless they are reset by some other mechanism. The current code provides no such mechanism. Even if '_add_new_lv_to_dtree' were called on the sub-devices - which it isn't - entries already in the tree are simply passed over, failing to reset any udev_flags. The solution must retain its implicit nature of discovering dependencies and be able to go back over the dependencies found to properly set the udev_flags.
|
| My solution simply calls a new function before leaving '_add_new_lv_to_dtree' that iterates over the dtree nodes to properly reset the udev_flags of any children. It is important that this function occur after '_add_dev' has done its job of querying the kernel for a list of dependencies. It is this list of children that we use to look up their respective LVs and properly calculate the udev_flags.
|
| This solution has worked for single machine, cluster, and cluster w/ exclusive activation.
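A minimal sketch of the approach described in the commit above, under simplifying assumptions: once the dependencies have been discovered, walk the children of a node and recompute each child's flags from a lookup instead of letting them inherit the flags of the initial device. 'struct node', 'lookup_flags_fn', 'example_lookup' and 'reset_children_udev_flags' are hypothetical names for illustration and are not libdevmapper's API; the flag value and the name matching are likewise made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a dependency-tree node. */
struct node {
	const char *name;
	uint16_t udev_flags;
	struct node **children;   /* NULL-terminated array, or NULL */
};

/* Hypothetical callback: maps a device name to the flags it should carry. */
typedef uint16_t (*lookup_flags_fn)(const char *name);

/* Illustrative lookup only: flag sub-LVs (mirror images and logs) with 0x1. */
static uint16_t example_lookup(const char *name)
{
	if (strstr(name, "_mimage_") || strstr(name, "_mlog"))
		return 0x1;
	return 0;
}

/*
 * Recompute udev_flags for every descendant of 'parent'.
 * Returns the number of nodes whose flags were reset.
 */
static int reset_children_udev_flags(struct node *parent, lookup_flags_fn lookup)
{
	int updated = 0;
	size_t i;

	assert(parent && lookup);
	if (!parent->children)
		return 0;

	for (i = 0; parent->children[i]; i++) {
		struct node *child = parent->children[i];

		child->udev_flags = lookup(child->name);
		updated++;
		updated += reset_children_udev_flags(child, lookup);
	}

	return updated;
}
```

The point of the design is ordering: the walk only makes sense after the children are known, which is why the commit insists it run after the kernel has been queried for dependencies.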
* Move defines to header | Zdenek Kabelac | 2011-10-06 | 1 | -17/+9
| Make limits for thin data_block_size and device_id part of public API.
| FIXME: possibly read them from some kernel header file in the future? But we may need to support different values for different versions?
* Name changes | Zdenek Kabelac | 2011-10-04 | 1 | -8/+8
| typo: zeroeing -> zeroing
| add size: low_water_mark -> low_water_mark_size, so it's more obvious it's a sector-related variable.
* Add initial code to check transaction_id | Zdenek Kabelac | 2011-10-03 | 1 | -3/+73
| Fix typo in transaction_id. Add this as a node property, so it can be easily checked on resume. Code is not yet finished.
* Move priority check in front | Zdenek Kabelac | 2011-10-03 | 1 | -3/+3
| Just a minor code move - make a test for priority before more complex uuid checks.
* Update error path tracing for _resume_node | Zdenek Kabelac | 2011-10-03 | 1 | -8/+11
| dm_task_create & dm_task_set_name produce their own log_error.
| Add missing stacks for dm_task_set_cookie, dm_task_run, dm_task_get_info.
* Transaction_id is property of thin_pool | Zdenek Kabelac | 2011-10-03 | 1 | -1/+2
| Remove Transaction_id from thin target. Store device_id for thin target.
* Add supporting function for thinp | Zdenek Kabelac | 2011-09-29 | 1 | -1/+115
| New dm_tree_node_add_thin_pool_target() and dm_tree_node_add_thin_target().
| This API is highly experimental and unstable for now.
* Just add warning about potential problem extending dm_segtypes | Zdenek Kabelac | 2011-09-29 | 1 | -0/+5
| Since the raid target is now using dm_segtypes also for search purposes.
* Introduce revert_lv for better pvmove cleanup. | Alasdair Kergon | 2011-09-27 | 1 | -0/+2
| (One further fix needed to remove the stray pvmove LVs left behind.)
* Add log_error even for general device in use when we can't do the sysfs checks. | Peter Rajnoha | 2011-09-26 | 1 | -2/+9
|
* Add dm_tree_retry_remove to use retry logic for device removal in a dm_tree. | Peter Rajnoha | 2011-09-22 | 1 | -3/+14
|
* Replace open_count check with holders/mounted_fs check on lvremove path. | Peter Rajnoha | 2011-09-22 | 1 | -2/+28
| Before, we used to display "Can't remove open logical volume" which was generic. There are 3 possibilities of how a device could be opened:
|   - used by another device
|   - having a filesystem on that device which is mounted
|   - opened directly by an application
| With the help of sysfs info, we can distinguish the first two situations. The third one will be subject to "remove retry" logic - if it's opened quickly (e.g. a parallel scan from within a udev rule run), this will finish quickly and we can remove it once it has finished. If it's a legitimate application that keeps the device opened, we'll do our best to remove the device, but we will fail finally after a few retries.
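As an illustration of the first case in the list above (a device used by another device), here is a minimal sketch that checks whether a sysfs 'holders' directory has any entries. It assumes the '/sys/dev/block/<major>:<minor>/holders' layout; whether libdevmapper builds the path this way is an assumption, and 'dir_has_entries' and 'device_has_holders' are hypothetical helper names.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 if 'path' contains at least one entry besides "." and "..",
 * 0 if it is empty, and -1 if it cannot be opened.
 */
static int dir_has_entries(const char *path)
{
	DIR *d;
	struct dirent *e;
	int found = 0;

	assert(path);
	if (!(d = opendir(path)))
		return -1;

	while ((e = readdir(d))) {
		if (strcmp(e->d_name, ".") && strcmp(e->d_name, "..")) {
			found = 1;
			break;
		}
	}

	closedir(d);
	return found;
}

/* Hypothetical helper: is the block device major:minor held by another device? */
static int device_has_holders(unsigned major, unsigned minor)
{
	char path[64];
	int n = snprintf(path, sizeof(path), "/sys/dev/block/%u:%u/holders",
			 major, minor);

	if (n < 0 || (size_t) n >= sizeof(path))
		return -1;

	return dir_has_entries(path);
}
```

A caller would treat 1 as "in use by another device", 0 as "no holders, check for a mounted filesystem next", and -1 as "sysfs information unavailable", in which case only the generic message can be shown.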
* Remove unused passed parameters | Zdenek Kabelac | 2011-09-07 | 1 | -6/+3
|
* spaces->tabs | Alasdair Kergon | 2011-08-19 | 1 | -23/+23
|
* restrict dm_tree_node_add_null_area | Alasdair Kergon | 2011-08-19 | 1 | -0/+16
|
* Add support for m-way to n-way up-convert in RAID1 (no linear to n-way yet) | Jonathan Earl Brassow | 2011-08-18 | 1 | -2/+15
| This patch adds the ability to upconvert a raid1 array - say from 2-way to 3-way. It does not yet support upconverting linear to n-way.
|
| The 'raid' device-mapper target allows for individual components (images) of an array to be specified for rebuild. This mechanism is used when adding new images to the array so that the new images can be resync'ed while the rest of the images in the array can remain 'in-sync'. (There is no mirror-on-mirror layering required.)
* Add the ability to split an image from the mirror and track changes. | Jonathan Earl Brassow | 2011-08-18 | 1 | -4/+26
|
|     ~> lvconvert --splitmirrors 1 --trackchanges vg/lv
|
| The '--trackchanges' option allows a user the ability to use an image of a RAID1 array for the purposes of temporary read-only access. The image can be merged back into the array at a later time and only the blocks that have changed in the array since the split will be resync'ed. This operation can be thought of as a partial split. The image is never completely extracted from the array, in that the array reserves the position the device occupied and tracks the differences between the array and the split image via a bitmap. The image itself is rendered read-only and the name (<LV>_rimage_*) cannot be changed. The user can complete the split (permanently splitting the image from the array) by re-issuing the 'lvconvert' command without the '--trackchanges' argument and specifying the '--name' argument.
|
|     ~> lvconvert --splitmirrors 1 --name my_split vg/lv
|
| Merging the tracked image back into the array is done with the '--merge' option (included in a follow-on patch).
|
|     ~> lvconvert --merge vg/lv_rimage_<n>
|
| The internal mechanics of this are relatively simple. The 'raid' device-mapper target allows for the specification of an empty slot in an array via '- -'. This is what will be used if a partial activation of an array is ever required. (It would also be possible to use 'error' targets in place of the '- -'.) If a RAID image is found to be both read-only and visible, then it is considered separate from the array and '- -' is used to hold its position in the array. So, all that needs to be done to temporarily split an image from the array /and/ cause the kernel target's bitmap to track (aka "mark") changes made is to make the specified image visible and read-only. To merge the device back into the array, the image needs to be returned to the read/write state of the top-level LV and made invisible.
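The placeholder rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the code from the patch: 'struct raid_image' and 'emit_raid_dev_pairs' are hypothetical names, and the function builds just the per-image "<metadata dev> <data dev>" portion of a table line, substituting '- -' for any image that is both visible and read-only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one RAID image (metadata + data device). */
struct raid_image {
	const char *meta_dev;   /* e.g. "253:1" */
	const char *data_dev;   /* e.g. "253:2" */
	int visible;
	int read_only;
};

/*
 * Write space-separated device pairs for 'count' images into 'buf'.
 * Images that are visible and read-only are treated as split-and-tracked
 * and emitted as "- -" so their slot in the array stays reserved.
 * Returns the number of characters written, or -1 if 'buf' is too small.
 */
static int emit_raid_dev_pairs(char *buf, size_t size,
			       const struct raid_image *imgs, size_t count)
{
	size_t pos = 0, i;

	assert(buf && size > 0 && (imgs || !count));
	buf[0] = '\0';

	for (i = 0; i < count; i++) {
		const char *sep = i ? " " : "";
		int tracked = imgs[i].visible && imgs[i].read_only;
		int n;

		if (tracked)
			n = snprintf(buf + pos, size - pos, "%s- -", sep);
		else
			n = snprintf(buf + pos, size - pos, "%s%s %s", sep,
				     imgs[i].meta_dev, imgs[i].data_dev);

		if (n < 0 || (size_t) n >= size - pos)
			return -1;

		pos += (size_t) n;
	}

	return (int) pos;
}
```

With a two-image array whose second image has been made visible and read-only, the result is "253:1 253:2 - -": the first image keeps its devices and the second slot is held open.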
* Add some log_error msg's and fix potential segfault | Jonathan Earl Brassow | 2011-08-11 | 1 | -0/+3
| Thanks to kabi for spotting these - especially the possibility for segfault if a loop runs all the way through without finding a match.
* Add basic RAID segment type(s) support. | Jonathan Earl Brassow | 2011-08-02 | 1 | -3/+116
| Implementation described in doc/lvm2-raid.txt.
|
| Basic support includes:
|   - ability to create RAID 1/4/5/6 arrays
|   - ability to delete RAID arrays
|   - ability to display RAID arrays
|
| Notable missing features (not included in this patch):
|   - ability to clean-up/repair failures
|   - ability to convert RAID segment types
|   - ability to monitor RAID segment types
* Downgrade error message - it isn't strictly an internal error in the | Alasdair Kergon | 2011-07-08 | 1 | -3/+5
| library, and the known cause within lvm2 got fixed.
* Report internal error when parameters are missing on table load | Zdenek Kabelac | 2011-06-30 | 1 | -0/+4
| When some target is passing empty parameters to some dm target, report this as an internal error to better catch some broken table construction (some mirror conversions seem to be doing this for now).
* Extend debug log messages to distinguish between the 3 states: | Alasdair Kergon | 2011-06-27 | 1 | -1/+1
| trust udev; verify udev; perform dev node operations directly.
* Move udev_only logic inside stacked node op code. | Alasdair Kergon | 2011-06-27 | 1 | -4/+3
| (We still need to treat add+readahead+del as a no-op.)
| Rename udev_fallback to verify_udev_operations.
| Rename --udevfallback to --verifyudev.
* Return immediately from dm_lib_exit() if called more than once. | Alasdair Kergon | 2011-06-24 | 1 | -1/+2
| (Avoiding calling it twice would involve some untangling.)
| Decrement the new suspended_counter if removing a suspended device.
* Add check for library fallback in _deactivate_node. | Peter Rajnoha | 2011-06-22 | 1 | -2/+3
| This fn calls rm_dev_node directly - an exceptional case. It needs to check the DM_UDEV_DISABLE_LIBRARY_FALLBACK flag directly (it's called in dm_task_run normally where it's checked already).
* Maintain a count of the number of suspended devices in libdevmapper | Alasdair Kergon | 2011-06-13 | 1 | -5/+11
| and use this for the LVM critical section logic. Also report an error if code tries to load a table while any device is known to be in the suspended state.
| (If the variety of problems these changes are showing up can't be fixed before the next release, the error messages can be reduced to debug level.)
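The bookkeeping described above can be pictured with a small sketch. It is an assumption-laden illustration rather than the library's code: the counter is a plain static variable and 'note_device_suspended', 'note_device_resumed', 'in_critical_section' and 'may_load_table' are hypothetical function names.

```c
#include <assert.h>
#include <stdio.h>

/* Number of devices currently known to be suspended (illustrative only). */
static int suspended_counter;

/* Call after a device has been successfully suspended. */
static void note_device_suspended(void)
{
	suspended_counter++;
}

/* Call after a suspended device has been resumed or removed. */
static void note_device_resumed(void)
{
	assert(suspended_counter >= 0);
	if (suspended_counter > 0)
		suspended_counter--;
}

/* The critical section lasts for as long as anything is suspended. */
static int in_critical_section(void)
{
	return suspended_counter > 0;
}

/* Returns 1 if a table load may proceed, 0 (with an error) otherwise. */
static int may_load_table(const char *name)
{
	if (suspended_counter > 0) {
		fprintf(stderr, "Refusing to load table for %s while %d "
			"device(s) are suspended.\n", name, suspended_counter);
		return 0;
	}

	return 1;
}
```

Counting rather than keeping a single flag matters when several devices are suspended at once: the critical section must not end until the last of them has been resumed.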
* Fix --mirrorlog mirrored. | Alasdair Kergon | 2011-06-11 | 1 | -0/+4
|
* Major pvmove fix to issue ioctls in the correct order when multiple LVs | Alasdair Kergon | 2011-06-11 | 1 | -6/+26
| are affected by the move. (Currently it's possible for I/O to become trapped between suspended devices amongst other problems.)
|
| The current fix was selected so as to minimise the testing surface. I hope eventually to replace it with a cleaner one that extends the deptree code.
|
| Some lvconvert scenarios still suffer from related problems.
* Fix another occurrence of linux kernel version check. | Milan Broz | 2011-06-09 | 1 | -4/+11
|
* Remove double braces | Zdenek Kabelac | 2011-03-29 | 1 | -2/+2
| Clang gives notice about possible confusion, as double braces are commonly used when some assignment is done inside them.
* Fix dm_udev_wait calls in dmsetup to occur before readahead display not after. | Alasdair Kergon | 2011-03-02 | 1 | -1/+0
| Include an implicit dm_task_update_nodes() within dm_udev_wait().
* Add debug message for open_count failure | Zdenek Kabelac | 2011-02-18 | 1 | -1/+4
| Report open_count problem as debug. The function using _node_has_closed_parents decides whether it's an error or can be ignored.
* Remove dead assignment in _mirror_emit_segment_line | Zdenek Kabelac | 2010-11-29 | 1 | -2/+1
| Remove unused 'r' assignment.
* Remove dead assignment in dm_tree_node_add_mirror_target_log | Zdenek Kabelac | 2010-11-29 | 1 | -3/+1
| 'seg' is never used - remove it.
* Do not call dm_task_destroy with NULL | Zdenek Kabelac | 2010-11-23 | 1 | -1/+0
|
* Add dm_zalloc and use it and dm_pool_zalloc throughout. | Alasdair Kergon | 2010-09-30 | 1 | -2/+1
|
* Use __attribute__ consistently throughout. | Alasdair Kergon | 2010-07-09 | 1 | -1/+1
|
* Add printf format attributes to yes_no_prompt & dm_{sn,as}printf and fix a calle | Alasdair Kergon | 2010-07-02 | 1 | -6/+6
|
* Use early udev synchronisation and update of dev nodes for clustered mirrors. | Peter Rajnoha | 2010-06-21 | 1 | -0/+27
| When using clustered mirrors, we need device nodes to be created during processing of the device tree, not at its end like we normally do (we need to access the nodes in cmirror prematurely). Therefore we use a new flag called "immediate_dev_node" stored in deptree's load_properties struct to instruct the device tree processing code to immediately synchronize with udev and flush all stacked node operations so the nodes are prepared for use.
|
| For now, the immediate_dev_node is used for clustered mirrors during processing the dm_tree_preload_children code only. We can add more later if needed.
* Fix copy&paste detection of kernel release version. | Zdenek Kabelac | 2010-05-25 | 1 | -2/+4
| Add log_error to avoid return_0 without log_error.
* Replace strncmp kernel version number checks with proper ones | Alasdair Kergon | 2010-05-24 | 1 | -5/+4
|
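String comparison misorders release numbers: compared as text, "2.6.9" sorts after "2.6.33". A minimal sketch of a numeric check follows, assuming a release string of the form "major.minor.patch[-suffix]" such as uname reports; 'VERSION_CODE', 'parse_kernel_release' and 'release_at_least' are hypothetical names and this is not the code from the commit.

```c
#include <assert.h>
#include <stdio.h>

/* Pack major.minor.patch into one comparable integer. */
#define VERSION_CODE(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Parse a release string such as "2.6.33-rc1".
 * Returns 1 and stores the packed version in '*code' on success, 0 on failure.
 */
static int parse_kernel_release(const char *release, int *code)
{
	int major, minor, patch;

	assert(release && code);
	if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) != 3)
		return 0;

	*code = VERSION_CODE(major, minor, patch);
	return 1;
}

/* Returns 1 if 'release' is at least major.minor.patch, 0 otherwise. */
static int release_at_least(const char *release, int major, int minor, int patch)
{
	int code;

	if (!parse_kernel_release(release, &code))
		return 0;

	return code >= VERSION_CODE(major, minor, patch);
}
```

Treating an unparsable release string as "too old" is a design choice of this sketch; a caller that prefers to log an error in that case can use 'parse_kernel_release' directly.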
* Choose between clustered log versions based on kernel version. | Alasdair Kergon | 2010-05-24 | 1 | -0/+1
| Add fixmes for broken strcmp.
* Update Copyright date for recently modified files | Zdenek Kabelac | 2010-05-24 | 1 | -1/+1
|
* Replicator: check open_count for parents of presuspend_node | Zdenek Kabelac | 2010-05-21 | 1 | -1/+43
| For deactivation of Replicator, check in advance that all heads have open_count == 0. For this, presuspend_node is used, as all head nodes link to this control node.
* Replicator: support deactivate of replicator-dev nodes | Zdenek Kabelac | 2010-05-21 | 1 | -0/+21
| Introducing dm_tree_node_set_presuspend_node() for presuspending child node (i.e. replicator control target) before deactivation of parent node (i.e. replicator-dev target).
|
| This patch presents no functional change to current dtree - only replicator target currently sets presuspend node for dev nodes.
* Replicator: libdm support | Zdenek Kabelac | 2010-05-21 | 1 | -5/+291
| Introducing new API calls: dm_tree_node_add_replicator_target() and dm_tree_node_add_replicator_dev_target().
| Define new typedef dm_replicator_mode_t.
* Only fail if the top-level LV fails to be deactivated - allow deactivation | Alasdair Kergon | 2010-04-07 | 1 | -9/+24
| of its dependencies to fail.
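The policy in the commit above can be sketched as a recursive walk in which only the result for the top-level device decides the return value. This is a simplified illustration under stated assumptions, not the deptree implementation: 'struct dev', 'deactivate_one' and 'deactivate_tree' are hypothetical, and a 'busy' field stands in for whatever makes a real deactivation fail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified device node. */
struct dev {
	const char *name;
	int busy;             /* non-zero: deactivation will fail */
	int active;
	struct dev **deps;    /* NULL-terminated array, or NULL */
};

/* Try to deactivate one device; returns 1 on success, 0 on failure. */
static int deactivate_one(struct dev *d)
{
	if (d->busy)
		return 0;

	d->active = 0;
	return 1;
}

/*
 * Deactivate 'top' and then its dependencies. Only a failure on 'top'
 * itself is returned to the caller; failures further down are reported
 * and otherwise ignored.
 */
static int deactivate_tree(struct dev *top)
{
	size_t i;

	assert(top);
	if (!deactivate_one(top))
		return 0;

	for (i = 0; top->deps && top->deps[i]; i++)
		if (!deactivate_tree(top->deps[i]))
			fprintf(stderr, "Unable to deactivate dependency %s "
				"(ignored).\n", top->deps[i]->name);

	return 1;
}
```

A dependency may legitimately stay active, for example because another device still uses it, so reporting that as a failure of the whole operation would be misleading.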