Commit message  Author  Age  Files  Lines
...
| * test: udev-settle before testing device.NeilBrown2009-10-191-0/+1
| | | | | | | | | | | | | | I think we sometimes get way ahead of udev, and devices disappear and appear almost at random. So add some settling. Signed-off-by: NeilBrown <neilb@suse.de>
| * mdadm(8): fix spurious space after -e headerMike Frysinger2009-10-191-1/+1
| | | | | | | | | | Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: NeilBrown <neilb@suse.de>
| * Monitor: add option to specify rebuild incrementsZdenek Behan2009-10-195-17/+36
| | | | | | | | | | | | | | | | | | | | i.e. the percent increments after which a RebuildNN event is generated. This is particularly useful when using the --program option, rather than (only) syslog, for alerts. Signed-off-by: Zdenek Behan <rain@matfyz.cz> Signed-off-by: NeilBrown <neilb@suse.de>
| * mdmon: lock current memory as well as future memory.NeilBrown2009-10-191-1/+1
| | | | | | | | | | | | | | | | mlockall(MCL_FUTURE) only locks mappings that have not yet been created. To lock all memory used by the process, we need MCL_CURRENT | MCL_FUTURE Signed-off-by: NeilBrown <neilb@suse.de>
| * Merge git://github.com/djbw/mdadmNeilBrown2009-10-1914-289/+780
| |\
| | * mdmon: preserve socket over chrootDan Williams2009-10-136-12/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Connect to the monitor in the old namespace and use that connection for WaitClean requests when stopping the victim mdmon instance. This allows ping_monitor() to work post chroot(). Cc: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * mdmon: exec(2) when the switchroot argument is not "/"Dan Williams2009-10-131-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Try to execute mdmon from the target namespace. When used for initramfs handovers we need to drop all references to the initramfs filesystem for that memory to be freed. Cc: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * mdmon: avoid writes in the startup path for mdmon on root arraysDan Williams2009-10-132-46/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When killing a previous monitor be careful not to cause writes to the filesystem until the reads necessary to get the monitor operational have completed. The code is already prepared for errors creating the pid and socket files, so simply defer creation of these files until after the first call to manage(). Cc: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * Detail: export MD_UUID from mapfileDan Williams2009-10-133-4/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The load_super() from an mdadm --detail call may race against an mdmon update. When this happens the load_super sees an inconsistent metadata block and returns an error. The fallback path to use the map file contents lacks uuid reporting, so provide __fname_from_uuid for generically printing a uuid. Reported-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * imsm: regression test for prodigal array member scenarioDan Williams2009-10-132-0/+78
| | | | | | | | | | | | | | | | | | | | | | | | Provide a test to sanity check assembly and reassembly in the presence of conflicting family number information. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * imsm: add --update=uuid supportDan Williams2009-10-133-12/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When disks have conflicting container memberships (same container ids but incompatible member arrays) --update=uuid can be used to move offenders to a new container id by changing 'orig_family_num'. Note that this only supports random updates of the uuid as the actual uuid is synthesized. We also need to communicate the new 'orig_family_num' value to all disks involved in the update. A new field 'update_private' is added to struct mdinfo to allow this information to be transmitted. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * ddf: prevent superblock being zeroed on --updateDan Williams2009-10-131-8/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The full fix would be to support updating ddf metadata, but this minimal fix just prevents the superblock from being zeroed when someone inadvertently passes an unsupported --update option during assembly. Reported-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * imsm: fix/support --updateDan Williams2009-10-131-28/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix init_super_imsm() to return an empty mpb when info == NULL, and teach store_super_imsm() to simply write out the passed in mpb. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=523320 Reported-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * imsm: fix spare record writeout raceDan Williams2009-10-131-24/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | imsm_activate_spare() in the manager thread may race against write_super_imsm_spares() in the monitor thread. Give write_super_imsm_spares() its own private mpb buffer to prevent confusing the manager. This change uncovered cases where spares were not being assembled due to a failed metadata version number check. Spares can freely associate across metadata version number, so reduce the scope of the version check in the spare assembly case. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * imsm: disambiguate family_numDan Williams2009-09-301-132/+448
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a result of trawling through the Windows implementation to learn the mechanism of how it disambiguates family_num. It is a continuation of commit 148acb7b "imsm: fix family number handling" which introduced a regression when reassembling a container with stale disks and rebuilt members. When rebuilding, a new family number is assigned to protect against the "prodigal array member" problem. It prevents a former family member from returning to the system and causing a rebuild to go the wrong direction. However, this invalidates looking at the generation number to determine the most up-to-date disk when comparing across family numbers. Instead the assembly logic looks for agreement between a disk's local family membership compared against a global list of all families in the system. Whenever a disk's local metadata does not match a family number on the global list that family number is marked offline. It is possible that this logic results in multiple incompatible but valid family numbers existing in a container. In this case mdadm.conf cannot be consulted because it only records the uuid which is generated from static fields in the metadata. The metadata lacks the data needed to disambiguate "local" versus "foreign". The "foreign" array in this case requires updating to change its container-id information (orig_family_num), and possibly the member array names. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * imsm: kill close() of component deviceDan Williams2009-09-301-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | None of the other formats close the passed in fd at load, and this becomes a problem when trying to support --update where we need O_EXCL protection across the entire operation. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * imsm: cleanup disk status testsDan Williams2009-09-281-24/+29
| | | | | | | | | | | | | | | | | | | | | Add is_failed(), is_configured(), and is_spare() helpers to clean up disk status flag testing. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | | Grow: update backup-metadata mtime every time we write it.NeilBrown2009-10-221-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Originally the backup-metadata was only written once at the start of a raid5 reshape that made the array bigger. So we only set the mtime once. Now that we can be writing metadata continually during an in-place reshape, we need to update the mtime more often. Also, allow the metadata mtime to be slightly in advance of the array mtime. Normally the difference will be less than a second, so 10 minutes should be plenty. This guards against an old backup file being used to restart an array, but starting two reshapes in the 10 minutes is sufficiently unlikely, and the possibility of an accident is already sufficiently small, that 10 minutes is probably fine. Thanks to Guy Martin <gmsoft@tuxicoman.be> for discovering and reporting that .mtime wasn't being updated properly. Signed-off-by: NeilBrown <neilb@suse.de>
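The mtime tolerance described in the commit above might be pictured with the following sketch. It is written in Python purely for illustration and is not mdadm's C code. The function name and the `max_behind` parameter are hypothetical; the commit message only states the 10-minute allowance for a backup mtime that runs ahead of the array mtime, so the lower bound must be supplied by the caller.

```python
# Ten minutes, the slack named in the commit message.
MAX_AHEAD_SECONDS = 10 * 60


def backup_mtime_acceptable(backup_mtime, array_mtime, max_behind,
                            max_ahead=MAX_AHEAD_SECONDS):
    """Return True if the backup's mtime is plausibly current for the array.

    backup_mtime and array_mtime are seconds since the epoch.
    max_behind is an assumption of this sketch: how far the backup may lag
    behind the array (the commit message does not give this value).
    max_ahead is how far the backup may run ahead of the array.
    """
    return array_mtime - max_behind <= backup_mtime <= array_mtime + max_ahead
```

With a window like this, a backup written a moment after the array metadata is still accepted, while a much older or much newer backup file is rejected.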
* | | Compile fixes for mdassembleNeilBrown2009-10-202-0/+4
| | | | | | | | | | | | Signed-off-by: NeilBrown <neilb@suse.de>
* | | Grow: reject raid-disks reduction in RAID5 etc before 2.6.32NeilBrown2009-10-201-1/+9
| | | | | | | | | | | | | | | | | | | | | 2.6.31 has some bugs with restarting a RAID5 reduction, so refuse to try unless at least 2.6.32. Signed-off-by: NeilBrown <neilb@suse.de>
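The version gate described in the commit above can be sketched roughly as follows. This is a Python illustration, not mdadm's C code: the helper names `parse_kernel_release` and `reduction_allowed` are hypothetical, and only the 2.6.32 threshold comes from the commit message.

```python
import re

# Threshold taken from the commit message: reductions need at least 2.6.32.
MIN_KERNEL_FOR_REDUCTION = (2, 6, 32)


def parse_kernel_release(release):
    """Return the leading numeric components of a release string as a tuple.

    A missing third component is treated as 0. Hypothetical helper.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(part) if part is not None else 0
                 for part in match.groups())


def reduction_allowed(release):
    """True if a raid-disks reduction should be attempted on this kernel."""
    return parse_kernel_release(release) >= MIN_KERNEL_FOR_REDUCTION
```

On a running system the release string could be taken from `platform.release()`; comparing tuples rather than strings avoids ordering "2.6.9" after "2.6.32".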
* | | Assemble: print more verbose messages about restarting a reshapeNeilBrown2009-10-203-20/+63
| | | | | | | | | | | | Signed-off-by: NeilBrown <neilb@suse.de>
* | | Add missing 'continue' in Grow_restart.NeilBrown2009-10-201-0/+1
| | | | | | | | | | | | | | | | | | Thus we weren't checking the uuid properly. Signed-off-by: NeilBrown <neilb@suse.de>
* | | tests/imsm: allow for rounding of array size.NeilBrown2009-10-162-0/+8
| | | | | | | | | | | | | | | | | | | | | IMSM rounds array size to a multiple of 1024K, so our tests must assume this. Signed-off-by: NeilBrown <neilb@suse.de>
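One way a test could allow for this rounding is to compare sizes with a tolerance of one rounding unit. The sketch below is a Python illustration under that assumption; the commit message does not say in which direction IMSM rounds or how the test scripts actually account for it, and the name `sizes_match` is hypothetical.

```python
# 1024K expressed in KiB; the granularity comes from the commit message.
IMSM_ROUNDING_KIB = 1024


def sizes_match(expected_kib, reported_kib, granularity_kib=IMSM_ROUNDING_KIB):
    """True if the two sizes differ by less than one rounding unit."""
    return abs(expected_kib - reported_kib) < granularity_kib
```

A tolerance comparison sidesteps the question of rounding direction, at the cost of also accepting sizes that are off for some other reason by less than 1024K.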
* | | Test different r5/r6 layouts.NeilBrown2009-10-165-3/+155
| | | | | | | | | | | | | | | | | | Make sure kernel and restripe agree on all different layouts. Signed-off-by: NeilBrown <neilb@suse.de>
* | | restripe: fix assignment of raid6 blocks for syndrome calculation.NeilBrown2009-10-161-8/+19
| | | | | | | | | | | | | | | | | | Particularly for the _6 style. Signed-off-by: NeilBrown <neilb@suse.de>
* | | Handle negative delta_disks in super0 and super1.NeilBrown2009-10-162-16/+19
| | | | | | | | | | | | Signed-off-by: NeilBrown <neilb@suse.de>
* | | Grow_restart to handle reducing number of devices in an array.NeilBrown2009-10-161-10/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | FIXME: this is wrong. What direction does reshape_position move? If the device count in an array is shrinking, the critical region is different, so the tests need to be different when restarting. Signed-off-by: NeilBrown <neilb@suse.de>
* | | Grow: don't make 'blocks' too large during in-place reshape.NeilBrown2009-10-161-3/+7
| | | | | | | | | | | | | | | | | | | | | On small (test) arrays, multiplying by 16 can make the 'chunk' size larger than half the array, which is a problem. Signed-off-by: NeilBrown <neilb@suse.de>
* | | restripe: fix compile warning.NeilBrown2009-10-121-1/+1
| | | | | | | | | | | | | | | | | | Just a type cast... Signed-off-by: NeilBrown <neilb@suse.de>
* | | test changelevel: add tests for changing degraded arrays.NeilBrown2009-10-121-0/+56
| | | | | | | | | | | | Signed-off-by: NeilBrown <neilb@suse.de>
* | | restripe: various fixes for RAID6 2-failure recovery.NeilBrown2009-10-121-12/+40
| | | | | | | | | | | | Signed-off-by: NeilBrown <neilb@suse.de>
* | | Test level changes and related reshaping.NeilBrown2009-10-122-1/+54
| | | | | | | | | | | | Signed-off-by: NeilBrown <neilb@suse.de>
* | | Grow: ignore error from final wait_backupNeilBrown2009-10-121-12/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | The last time wait_backup is called, it might see reshape finish and so return an error indicator. But this is not an error, and we must go ahead and prepare the array for full access. Signed-off-by: NeilBrown <neilb@suse.de>
* | | Grow: make sure bsb2 is properly alignedNeilBrown2009-10-121-3/+2
| | | | | | | | | | | | | | | | | | | | | We do O_DIRECT io in bsb2, so it must be aligned properly. Easiest if it is static. Signed-off-by: NeilBrown <neilb@suse.de>
* | | testreshape5 - add tests for RAID6NeilBrown2009-10-121-5/+12
| | | | | | | | | | | | | | | | | | .. to make sure our raid6 calculations are working. Signed-off-by: NeilBrown <neilb@suse.de>
* | | Merge branch 'master' into devel-3.1NeilBrown2009-10-0174-2766/+1054
|\| | | | | | | | | | | | | | Conflicts: mdadm.8
| * | Fix null-dereference in set_member_infoNeilBrown2009-10-011-6/+9
| | | | | | | | | | | | | | | | | | | | | set_member_info would try to dereference ->metadata_version, without checking that it isn't NULL. Signed-off-by: NeilBrown <neilb@suse.de>
| * | Add missing space in "--detail --brief" output.NeilBrown2009-10-011-2/+2
| |/ | | | | | | | | | | We need a space between the device name and the word "level". Signed-off-by: NeilBrown <neilb@suse.de>
| * Release mdadm-3.0.2NeilBrown2009-09-258-5/+31
| | | | | | | | Just one bugfix.
| * super0: fix crash on assemble if homehost is not set.NeilBrown2009-09-251-3/+7
| | | | | | | | | | | | | | | | If homehost is not set (typically during early boot), an assemble of v0.90 metadata arrays will crash. Reported-by: Paweł Sikora <pluto@agmk.net> Signed-off-by: NeilBrown <neilb@suse.de>
| * Release mdadm-3.0.1NeilBrown2009-09-258-6/+34
| | | | | | | | | | | | Just bugfixes. Signed-off-by: NeilBrown <neilb@suse.de>
| * testreshape5 - flush devices between tests.NeilBrown2009-09-251-0/+1
| | | | | | | | | | | | We need to flush the block devices before reading different data. Signed-off-by: NeilBrown <neilb@suse.de>
| * Merge branch 'master' of git://github.com/djbw/mdadmNeilBrown2009-09-255-18/+36
| |\
| | * mdmon: fix freeing unallocated memoryHans de Goede2009-09-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mdmon was creating a supertype struct with malloc, and thus not necessarily getting zeroed memory. This was causing it to segfault when called like this from the initrd: /sbin/mdmon /proc/mdstat /sysroot The problem was that load_super_imsm would get called on the non-zeroed super struct, which in turn calls free_super_imsm, which checks st->sb, which should be zero but isn't, and then starts freeing bogus memory. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * imsm: clear CONFIGURED_DISK for failed drivesDan Williams2009-09-151-0/+1
| | | | | | | | | | | | | | | | | | | | | Synchronizing with what the Windows driver does. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * imsm: kill USABLE_DISK flagDan Williams2009-09-151-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | 'USABLE_DISK' is not a 'persistent' status flag it is an internal status flag used for the in memory representation of the disk in the Windows driver. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * Examine: don't count containers as sparesDan Williams2009-09-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mdadm -Ebs will include containers in the scanned device list. Examine() falsely thinks they are spares when MD_DISK_SYNC is not set. This could be fixed by forcing all formats to set this flag for container devices, but this flag is currently used by imsm to identify free-floating spares. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * Detail: fix for an imsm container with a spareDan Williams2009-09-152-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Spares for imsm arrays do not have any info about the container in their metadata records. If Detail() inadvertently picks such a device for ->get_array_info() it will end up with less than useful info for the container. So, continue to read from the disks until a non-spare device is found. This bug was found by timeouts waiting for udev to create the user-friendly container name. To detect future UUID reporting problems, add a debug print to the timeout case in wait_for(). Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * Examine: fixup output in the presence of containers with sparesDan Williams2009-09-151-3/+9
| | | | | | | | | | | | | | | | | | | | | If we dump any 'spare' or 'device' information for a container in the 'brief' case then we need a newline before printing member array info. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| | * imsm: fix spare promotionDan Williams2009-09-151-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | 1/ Fix an off-by-one error when detecting whether the device allocation loop succeeded or not. 2/ Update ->num_raid_devs before copying to avoid a segmentation fault. Signed-off-by: Dan Williams <dan.j.williams@intel.com>