path: root/tests/basic
Commit message | Author | Age | Files | Lines
...
* tests/shd: Mark "tests/basic/volume-scale-shd-mux.t" as badMohammed Rafi KC2019-09-131-0/+2
| | | | | | | | | This test case is failing in upstream. Marking this test as bad for now. Change-Id: I014c67628c14683c32a3c1dd770b10aaf35ad4cc Updates: bz#1752331 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* libgfapi: return correct errno on invalid volume nameSheetal Pamecha2019-08-193-8/+95
| | | | | | | | | | | | | glfs_init when called with volume name prefixed by '/' sets errno to 0. Setting errno to EINVAL to resolve the issue. Also volname is a parameter to glfs_new. Thus, validating volname in glfs_new itself and returning EINVAL from that function fixes: bz#1507896 Change-Id: I0d4d2423e26cc07644d50ec8cce788ecc639203d Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
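The validation described above can be sketched as follows. This is an illustration in Python, not the libgfapi C code: `validate_volname` is a hypothetical helper, and the only rule it encodes is the one the commit message states (a name prefixed by '/' is rejected with EINVAL).

```python
import errno


def validate_volname(volname):
    """Reject a volume name that starts with '/' (hypothetical helper)."""
    if volname.startswith("/"):
        # Report the failure as EINVAL, as the commit message describes.
        raise OSError(errno.EINVAL, "invalid volume name", volname)
    return volname


try:
    validate_volname("/patchy")
except OSError as exc:
    print(errno.errorcode[exc.errno])  # EINVAL
```

Checking at construction time means the caller sees the error from the first call that receives the name, rather than a later init step reporting errno 0.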
* tests: revive back volume-scale-shd-mux.tAtin Mukherjee2019-07-055-30/+28
| | | | | | | Fixes: bz#1708929 Change-Id: I9cc81a9047ff874df752ca5552e00bf033485bd8 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* cluster/ec: quorum-count implementationPranith Kumar K2019-09-052-0/+215
| | | | | | fixes: #721 Change-Id: I5333540e3c635ccf441cf1f4696e4c8986e38ea8 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* cluster/ec: Fail fsync/flush for files on update size/version failurePranith Kumar K2019-09-042-0/+150
| | | | | | | | | | | | | | | | | Problem: If update size/version is not successful on the file, updates on the same stripe could lead to data corruptions if the earlier un-aligned write is not successful on all the bricks. Application won't have any knowledge of this because update size/version happens in the background. Fix: Fail fsync/flush on fds that are opened before update-size-version went bad. fixes: bz#1748836 Change-Id: I9d323eddcda703bd27d55f340c4079d76e06e492 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* graph/cleanup: Fix race in graph cleanupMohammed Rafi KC2019-07-052-3/+68
| | | | | | | | | | | | | | | | | We were unconditionally cleaning up the graph when we got child_down followed by parent_down. But this is prone to a race condition when some of the bricks are already disconnected. In this case, even before the last child_down is executed in the client xlator code, we might have freed the graph, because the child_down event has already been received. To fix this race, we have introduced a check to see if all client xlators have cleared their reconnect chain and called child_down for the last time. Change-Id: I7d02813bc366dac733a836e0cd7b14a6fac52042 fixes: bz#1727329 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* tests/dht: Add a test file for file renamesN Balachandran2018-09-071-0/+1021
| | | | | | | | | Test the various combinations of hashed and cached subvols for the src and dst. Change-Id: I41416f9e5f2b7ea1c880d1913fdd6576da1ee868 fixes: bz#1626543 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* tests/shd: Break down shd mux tests into multiple .t fileMohammed Rafi KC2019-07-314-149/+191
| | | | | | | | | | Test file tests/basic/shd-mux.t was taking longer than 200 seconds in some iterations. So this patch is breaking the test case to three files Change-Id: I1430f58798f876edf6368d6f4b8b5a75f0114c31 Updates: bz#1708929 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* xdr: add code so we have more xdr functions coveredAmar Tumballi2019-08-011-0/+3
| | | | | | Updates: bz#1693692 Change-Id: Ia10ccca5e1fed6c4269842ebb4d507662ca0f6a6 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* gfapi: increase function-coverageAmar Tumballi2019-07-252-25/+49
| | | | | | | | | | | | | | * Add few more mgmt functions to the coverage * While testing mgmt function, found an issue, where if the 'glfs_set_volfile_server()' is not called before calling 'glfs_unset_volfile_server()', unset would cause a crash. Null check of few variables fixes the issue, which is handled in this patch itself. * Added a test for volfile API Updates: bz#1693692 Change-Id: Iba151f8da1b64107e2f436ddbfef9da45b1c1588 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* trace: add more coverage by testing it with glfs-coverage too.Amar Tumballi2019-07-291-0/+22
| | | | | | | | | | make sure to provide 'log-file' option, so we can see the logs. This test does test volgen inserting the trace xlator in server graph. Updates: bz#1693692 Change-Id: I26c736b04376674b4c094d48060660421e6c983c Signed-off-by: Amar Tumballi <amarts@redhat.com>
* cluster/ec: fix EIO error for concurrent writes on sparse filesXavi Hernandez2019-07-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | EC doesn't allow concurrent writes on overlapping areas, they are serialized. However non-overlapping writes are serviced in parallel. When a write is not aligned, EC first needs to read the entire chunk from disk, apply the modified fragment and write it again. The problem appears on sparse files because a write to an offset implicitly creates data on offsets below it (so, in some way, they are overlapping). For example, if a file is empty and we read 10 bytes from offset 10, read() will return 0 bytes. Now, if we write one byte at offset 1M and retry the same read, the system call will return 10 bytes (all containing 0's). So if we have two writes, the first one at offset 10 and the second one at offset 1M, EC will send both in parallel because they do not overlap. However, the first one will try to read missing data from the first chunk (i.e. offsets 0 to 9) to recombine the entire chunk and do the final write. This read will happen in parallel with the write to 1M. What could happen is that half of the bricks process the write before the read, and the other half do the read before the write. Some bricks will return 10 bytes of data while the others will return 0 bytes (because the file on the brick has not been expanded yet). When EC tries to recombine the answers from the bricks, it can't, because it needs more than half consistent answers to recover the data. So this read fails with EIO error. This error is propagated to the parent write, which is aborted and EIO is returned to the application. The issue happened because EC assumed that a write to a given offset implies that offsets below it exist. This fix prevents the read of the chunk from bricks if the current size of the file is smaller than the read chunk offset. This size is correctly tracked, so this fixes the issue. Also modifying ec-stripe.t file for Test #13 within it. 
In this patch, if a file size is less than the offset we are writing, we fill zeros in head and tail and do not consider it a stripe cache miss. That actually makes sense as we know what data that part holds and there is no need of reading it from bricks. Change-Id: Ic342e8c35c555b8534109e9314c9a0710b6225d6 Fixes: bz#1730715 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
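The sparse-file read behaviour this commit message relies on can be reproduced on a local POSIX filesystem. The sketch below is not EC code. It only repeats the example from the message (read 10 bytes at offset 10, write one byte at offset 1M, read again) against a temporary file. The function name is made up for illustration.

```python
import os
import tempfile


def sparse_read_demo():
    """Return the result of the same read before and after a far write."""
    fd, path = tempfile.mkstemp()
    try:
        before = os.pread(fd, 10, 10)      # file is empty: nothing to read
        os.pwrite(fd, b"x", 1024 * 1024)   # one byte at offset 1M
        after = os.pread(fd, 10, 10)       # now inside the file: a hole
        return before, after
    finally:
        os.close(fd)
        os.unlink(path)


before, after = sparse_read_demo()
print(len(before), len(after), after == b"\0" * 10)  # 0 10 True
```

A brick that has already applied the far write answers like the second read, while a brick that has not answers like the first. That mismatch is what the message describes.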
* ctime: Set mdata xattr on legacy filesKotresh HR2019-06-241-0/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | Problem: The files which were created before ctime enabled would not have "trusted.glusterfs.mdata"(stores time attributes) xattr. Upon fops which modifies either ctime or mtime, the xattr gets created with latest ctime, mtime and atime, which is incorrect. It should update only the corresponding time attribute and rest from backend Solution: Creating xattr with values from brick is not possible as each brick of replica set would have different times. So create the xattr upon successful lookup if the xattr is not created Note To Reviewers: The time attributes used to set xattr is got from successful lookup. Instead of sending the whole iatt over the wire via setxattr, a structure called mdata_iatt is sent. The mdata_iatt contains only time attributes. Change-Id: I5e535631ddef04195361ae0364336410a2895dd4 fixes: bz#1593542 Signed-off-by: Kotresh HR <khiremat@redhat.com>
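A toy model of the "create on successful lookup if missing" rule is sketched below. The xattr store is a plain dict and `MdataIatt` is an illustrative stand-in for the mdata_iatt structure named in the message. Its field names are assumptions, not the real structure layout.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class MdataIatt:
    """Time attributes only; the field names here are illustrative."""
    atime: float
    mtime: float
    ctime: float


MDATA_XATTR = "trusted.glusterfs.mdata"


def set_mdata_after_lookup(xattrs, lookup_times):
    """Create the xattr from lookup results, but only if it is missing."""
    if MDATA_XATTR in xattrs:
        return False
    xattrs[MDATA_XATTR] = asdict(lookup_times)
    return True


legacy_file = {}
times = MdataIatt(atime=100.0, mtime=90.0, ctime=95.0)
print(set_mdata_after_lookup(legacy_file, times))  # True: first lookup creates it
print(set_mdata_after_lookup(legacy_file, times))  # False: already present
```

Passing only the three time values mirrors the note to reviewers: the whole iatt is not needed to populate the xattr.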
* quick-read: rename cache-invalidation key to avoid redundant keysAtin Mukherjee2019-04-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | With group-metadata-cache group profile settings performance.cache-invalidation option when turned on enables both md-cache and quick-read xlator's cache-invalidation feature. While the intent of the group-metadata-cache is to set md-cache xlator's cache-invalidation feature, quick-read xlator also gets affected due to the same. While md-cache feature and its profile existed since release-3.9, quick-read cache-invalidation was introduced in release-4 and due to this op-version mismatch on any cluster which is >= glusterfs-4 when this group profile is applied it breaks backward compatibility with the old clients. The proposed fix here is to rename the key in quick-read to 'quick-read-cache-invalidation' so that both these features have distinct identification. This by itself brings in a backward compatibility challenge: where this feature is enabled in an existing cluster and the cluster is upgraded to a version where this change exists, it will lead to an unidentified old key. But as a workaround we can always ask users upgrading to release-7 version to turn off this option, upgrade the cluster and turn it back on with the new key. This needs to be documented once the patch is accepted. Fixes: bz#1698042 Change-Id: I30422ba6496208e21191a8d78ad29b2e21078664 Signed-off-by: Atin Mukherjee <amukherj@redhat.com> Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
* glusterd/thin-arbiter: Thin-arbiter integration with GD1Vishal Pandey2019-04-242-0/+70
| | | | | | | | | | | | | | | | | | | | | | | | gluster volume create <VOLNAME> replica 2 thin-arbiter 1 <host1>:<brick1> <host2>:<brick2> <thin-arbiter-host>:<path-to-store-replica-id-file> [force] The changes have been made in a way that the last brick in the bricks list will be treated as the thin-arbiter. GD1 will be manipulated to consider replica count to be 2 and continue creating the volume like any other replica 2 volume but since thin-arbiter volumes need ta-brick client xlator entries for each subvolume in fuse volfile, volfile generation is modified in a way to inject these entries separately in the volfile for every subvolume. Few more additions - 1- Save the volinfo with new fields ta_bricks list and thin_arbiter_count. 2- Introduce a new option client.ta-brick-port to add remote-port to ta-brick xlator entry in fuse volfiles. The option can be set using the following CLI syntax - gluster volume set <VOLNAME> client.ta-brick-port <PORTNO.> 3- Volume Info will contain a Thin-Arbiter-path entry to distinguish from other replicate volumes. Change-Id: Ib434e2313b29716f32476c6c211d282c4ef39406 Updates #687 Signed-off-by: Vishal Pandey <vpandey@redhat.com>
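The "last brick is the thin-arbiter" rule can be sketched as a list split. This is a hedged Python illustration, not glusterd code: `split_bricks` and the brick strings are hypothetical, and only the rule itself comes from the message.

```python
def split_bricks(bricks, thin_arbiter_count=1):
    """Return (data_bricks, ta_bricks), taking the thin-arbiter from the end."""
    if thin_arbiter_count < 1 or len(bricks) <= thin_arbiter_count:
        raise ValueError("need at least one data brick and one thin-arbiter")
    data = bricks[:-thin_arbiter_count]
    ta = bricks[-thin_arbiter_count:]
    return data, ta


# Hypothetical brick list in the order given on the command line.
bricks = ["host1:/bricks/b1", "host2:/bricks/b2", "ta-host:/bricks/ta"]
data_bricks, ta_bricks = split_bricks(bricks)
print(len(data_bricks), ta_bricks)  # 2 ['ta-host:/bricks/ta']
```

With the thin-arbiter entry set aside, the remaining two bricks can be handled like any other replica 2 set, which matches the description above.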
* lcov: add more tests to glfsxmp-coverageAmar Tumballi2019-06-181-19/+79
| | | | | | | | | * found a bug with quiesce fallocate() - fixed. * found a bug with cloudsync part of code in posix - fixed updates: bz#1693692 Change-Id: I4f315ffebb612de072ae08761b8cd0f47714080a Signed-off-by: Amar Tumballi <amarts@redhat.com>
* cluster/ec: Prevent double pre-op xattropsPranith Kumar K2019-06-201-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: Race between two threads: 1) Thread-1 does ec_get_size_version() to perform the pre-op fxattrop as part of write-1. 2) Thread-2 calls ec_set_dirty_flag() in ec_get_size_version() for write-2. This sets dirty[] to 1. 3) Thread-1 completes executing ec_prepare_update_cbk, leading to ctx->dirty[] = '1'. 4) Thread-2 takes LOCK(inode->lock) to check if there are any flags and sets dirty-flag because lock->waiting_flag is 0 now. This leads to fxattrop incrementing on-disk dirty[] to '2'. At the end of the writes the file will be marked for heal even when it doesn't need heal. Fix: Perform ec_set_dirty_flag() and other checks inside LOCK() to prevent dirty[] from being marked as '1' in step 2) above. Updates bz#1593224 Change-Id: Icac2ab39c0b1e7e154387800fbededc561612865 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
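The general shape of the fix, doing the check and the flag update under one lock, can be sketched with a toy model. This is not the EC implementation: `InodeCtx` and its fields are hypothetical, and the counter stands in for the on-disk dirty[] value.

```python
import threading


class InodeCtx:
    """Toy stand-in for per-inode state; the field names are hypothetical."""

    def __init__(self):
        self.lock = threading.Lock()
        self.dirty_requested = False
        self.on_disk_dirty = 0

    def pre_op(self):
        # Check and set happen under the same lock, so only one writer
        # can ever observe dirty_requested == False.
        with self.lock:
            if self.dirty_requested:
                return False
            self.dirty_requested = True
            self.on_disk_dirty += 1  # stands in for the pre-op fxattrop
            return True


ctx = InodeCtx()
threads = [threading.Thread(target=ctx.pre_op) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(ctx.on_disk_dirty)  # 1
```

If the flag were set outside the lock, two writers could each decide that a pre-op is still needed, which is the double increment described in the problem statement.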
* posix/ctime: Fix ctime upgrade issueKotresh HR2019-06-131-16/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: On an EC volume, during upgrade from the older version where ctime feature is not enabled (or not present) to the newer version where the ctime feature is available (enabled by default), the self heal hangs and doesn't complete. Cause: The ctime feature has both client side code (utime) and server side code (posix). The feature is driven from client. Only if the client side sets the time in the frame should the server side set the time attributes in xattr. But posix setattr/fsetattr was not doing that. When one of the server nodes is updated, since ctime is enabled by default, it starts setting xattr on setattr/fsetattr on the updated node/brick. On an EC volume the first two updated nodes (bricks) are not a problem because there are 4 other bricks with consistent data. However once the third brick is updated, the new attribute (mdata xattr) will cause an inconsistency on metadata on 3 bricks, which prevents the file from being repaired. Fix: Don't create mdata xattr with utimes/utimensat system call. Only update if already present. Change-Id: Ieacedecb8a738bb437283ef3e0f042fd49dc4c8c fixes: bz#1720201 Signed-off-by: Kotresh HR <khiremat@redhat.com>
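The "only update if already present" rule from the fix can be sketched as below. This is a toy model with a dict as the xattr store. `apply_utimes` is a hypothetical name, and the key is the mdata xattr name quoted elsewhere in this log.

```python
MDATA_XATTR = "trusted.glusterfs.mdata"


def apply_utimes(xattrs, atime, mtime):
    """Refresh times in an existing mdata xattr; never create a new one."""
    mdata = xattrs.get(MDATA_XATTR)
    if mdata is None:
        return False  # the utimes path must not introduce the xattr
    mdata.update(atime=atime, mtime=mtime)
    return True


old_file = {}  # file created before ctime was enabled
new_file = {MDATA_XATTR: {"atime": 1.0, "mtime": 1.0, "ctime": 1.0}}
print(apply_utimes(old_file, 5.0, 5.0), old_file)  # False {}
print(apply_utimes(new_file, 5.0, 5.0))            # True
```

Because an upgraded brick no longer adds the xattr on its own, bricks at different versions keep the same metadata for files that never had it.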
* core: improve timer accuracyXavier Hernandez2018-01-191-3/+4
| | | | | | | | Also fixed some issues on test ec-1468261.t. Change-Id: If156f86af986d9eed13cdd1f15c5a7214cd11706 Updates: bz#1193929 Signed-off-by: Xavier Hernandez <jahernan@redhat.com>
* gfapi: provide an api for setting statedump pathAmar Tumballi2019-03-141-0/+5
| | | | | | | | | | | | | | | Currently for an application using glfsapi to use glusterfs, when a statedump is taken, it uses /var/run/gluster dir to dump info. There can be concerns as this directory may be owned by some other user, and hence it may fail taking statedump. Such applications should have an option to use different path. This patch provides an API to do so. Updates: bz#1689097 Change-Id: I8918e002bc823d83614c972b6c738baa04681b23 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* tests: keep glfsxmp in tests directoryAmar Tumballi2019-06-104-9/+1815
| | | | | | | | | | | | | this is critical so all the tests will be contained in the same directory, and one can just 'cp -a tests/ <any-location>/' and run glusterfs tests. only 'glfsxmp.c' was an exception as it was just copying the file from api example directory. Now moved it to tests. updates: bz#1193929 Change-Id: I00359d64be580bffc5b3c3a090968d86c2c6952a Signed-off-by: Amar Tumballi <amarts@redhat.com>
* tests: Fix split-brain-favorite-child-policy.t failurekarthik-us2019-06-101-3/+4
| | | | | | | | | | | | | | | | | | Problem: The test case is failing to heal the volume within $HEAL_TIMEOUT @195. This is happening because as part of split-brain resolution the file gets expunged from the sink and the new entry mark for that file will be done on the source bricks as part of impunging. Since the source bricks shd-threads failed to get the heal-domain lock, they will wait for the heal-timeout of 10 minutes, which is greater than $HEAL_TIMEOUT. Fix: Set the cluster.heal-timeout to 5 seconds to trigger the heal so that one of the source brick heals the file within the $HEAL_TIMEOUT. Change-Id: Ie73c578cc5361c0d617a48ccc86026734d20ba8c fixes: bz#1718998 Signed-off-by: karthik-us <ksubrahm@redhat.com>
* uss: Ensure that snapshot is deleted before creating a new snapshotRaghavendra Bhat2019-05-141-0/+12
| | | | | | | | * Also some logging enhancements in snapview-server Change-Id: I6a7646771cedf4bd1c62806eea69d720bbaf0c83 fixes: bz#1715921 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
* tests/quick-read-upcall: mark it badAmar Tumballi2019-06-071-0/+5
| | | | | | | | | | | | | | | | Frequent intermittent failures observed. ``` 08:59:24 ok 11 [ 10/ 3] < 36> 'write_to /mnt/glusterfs/0/test.txt test-message1' 08:59:24 ok 12 [ 10/ 6] < 37> 'test-message1 cat /mnt/glusterfs/0/test.txt' 08:59:24 ok 13 [ 10/ 4] < 38> 'test-message0 cat /mnt/glusterfs/1/test.txt' 08:59:24 not ok 14 [ 3715/ 6] < 45> 'test-message1 cat /mnt/glusterfs/1/test.txt' -> 'Got "test-message0" instead of "test-message1"' 08:59:24 ok 15 [ 10/ 162] < 47> 'gluster --mode=script --wignore volume set patchy features.cache-invalidation on' 08:59:24 ok 16 [ 10/ 148] < 48> 'gluster --mode=script --wignore volume set patchy performance.qr-cache-timeout 15' ``` updates: bz#1718191 Change-Id: Ieb9e5a9a428995ff178f77bc4a5155b8298d3fa0 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* tests/volume-scale-shd-mux: mark as bad testAmar Tumballi2019-06-071-0/+3
| | | | | | | | | | | | The test is giving frequent failures in regression. Error seen is normally like below: `09:09:24 not ok 58 [ 14/ 80343] < 104> '^3$ number_healer_threads_shd patchy_distribute1 __afr_shd_healer_wait' -> 'Got "1" instead of "^3$"'` updates: bz#1708929 Change-Id: I240bdcfb76b1f953d75937a53c5dfabba134f282 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* tests/shd: Add test coverage for shd muxMohammed Rafi KC2019-05-093-0/+357
| | | | | | | | | | | | | | | | | | | | This patch adds more test cases for shd mux. The test cases include 1) Creating multiple volumes to check the attach and detach of self heal daemon requests. 2) Make sure the healing happens in all scenarios 3) After a volume detach make sure the threads of the detached volume are all cleaned. 4) Repeat all the above tests for ec volume 5) Node Reboot case 6) glusterd restart cases 7) Add-brick/remove brick 8) Convert a distributed volume to disperse volume 9) Convert a replicated volume to distributed volume Change-Id: I7c317ef9d23a45ffd831157e4890d7c83a8fce7b fixes: bz#1708929 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
* lcov: run more fops on translatorsAmar Tumballi2019-06-011-1/+13
| | | | | | | | | | | | | | Translators covered: * playground/template * debug/delay-gen * debug/error-gen * features/namespace * features/quiesce * meta updates: bz#1693692 Change-Id: Ic8fde8efcb309ea492d8e819241f786f7ff467a1 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* lcov: more coverage to shard, old-protocol, sdfsAmar Tumballi2019-05-314-6/+19
| | | | | | updates: bz#1693692 Change-Id: If4c30572d4501d169bb4b0871c677d974515867c Signed-off-by: Amar Tumballi <amarts@redhat.com>
* tests: add tests for different signal handlingAmar Tumballi2019-04-123-41/+6
| | | | | | | | | | | Also some cleanup: * old-protocol.t was actually added to make sure we have line-coverage * first-test.t should have been removed as per the comment. It doesn't do anything. * add statvfs to rpc-coverage so we can cover statvfs in few xlators. updates: bz#1693692 Change-Id: Ie8651ce007de484c4abced16b4de765aa5e517be Signed-off-by: Amar Tumballi <amarts@redhat.com>
* tests: Add changelog api testsKotresh HR2019-05-221-0/+37
| | | | | | updates: bz#1193929 Change-Id: Iee9aab8140882069165621189741f189fb2cc884 Signed-off-by: Kotresh HR <khiremat@redhat.com>
* tests: Add history api testsKotresh HR2019-05-212-0/+43
| | | | | | updates: bz#1193929 Change-Id: Ic26ab5277f720c734f083150c1c541763dfa64aa Signed-off-by: Kotresh HR <khiremat@redhat.com>
* gfapi:add missing api to increase code coverageSheetal Pamecha2019-05-071-18/+340
| | | | | | | | | | | | | | | add test for async Read/Write combinations glfs_read_async/write_async glfs_pread_async/pwrite_async glfs_readv_async/writev_async glfs_preadv_async/pwritev_async ftruncate/ftruncate_async fsync/fsync_async fdatasync/fdatasync_async Updates: #655 Change-Id: I12beb97029fd60bce79650a376d8fcd8d383ef16 Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
* api/glfsxmp.c: minor fixesSheetal Pamecha2019-04-151-0/+30
| | | | | | | | | | | * add more fops: f{get,set,list,remove}xattr(), access(), fstat(), fsetattr(), getxattr(), lgetxattr(), llistxattr(), lsetxattr(), fgetxattr() * handle some error cases (like volume not found) Updates: #655 Change-Id: I3334bdf3090eafd83a54e1be12036ea01b181089 Signed-off-by: Amar Tumballi <amarts@redhat.com> Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
* tests: Fix spurious failures in ta-write-on-bad-brick.tPranith Kumar K2019-05-213-15/+15
| | | | | | | | | | | | | | | | | | Problem: afr_child_up_status_meta works only when LOOKUP on $M0 is successful. There are cases where quorum is not met and LOOKUP fails on $M0 which leads to failures similar to: grep: /mnt/glusterfs/0/.meta/graphs/active/patchy-replicate-0/private: Transport endpoint is not connected This was happening once in a while based on attribute-timeout and md-cache not serving the lookup. Fix: Find child-up status based on statedump instead. Also changed mount options to include --entry-timeout=0 and --attribute-timeout=0 updates bz#1193929 Change-Id: Ic0de72c3006d7399a5feb3e4d10d4748949b2ab3 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* tests: Test openfd heal doesn't truncate filesPranith Kumar K2019-05-062-0/+218
| | | | | | fixes bz#1706603 Change-Id: I0bfd30f787f157b7a54f71088f767ccfd7621208 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* tests/quick-read-with-upcall.t: increase the timeoutAmar Tumballi2019-05-211-1/+5
| | | | | | | | | | | | | | | Running with 2 second sleep at this place caused failures like: `not ok 14 [ 2014/ 7] < 41> 'test-message1 cat /mnt/glusterfs/1/test.txt' -> 'Got "test-message0" instead of "test-message1"'` in few runs in 100 iterations. But when increased to higher than sleep 3, have not seen any failures in 100 runs. While I don't know the exact reasons for the behavior yet, looks like this increase in wait helps to pass the regression without failures. updates: bz#1693692 Change-Id: I0610b79bea53e36de3eea6c11234b7fc9dfd6232 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* tests: improve and fix some test scriptsXavier Hernandez2018-01-197-13/+37
| | | | | | Change-Id: Iceefe22af754096c599dc570d4894d14fce4deae Updates: bz#1193929 Signed-off-by: Xavier Hernandez <xhernandez@redhat.com>
* tests: delete the snapshots and the volume after the testsRaghavendra Bhat2019-04-301-0/+22
| | | | | | | | | | | | In uss.t multiple snapshots are taken and after all the tests things are left for the cleanup () function to get removed. Instead of that, delete the snapshots and the volume once all the tests are over so that cleanup operation becomes relatively a light operation. Change-Id: I2342740bbb185cd6c9a450eb3b4f5cbbba78974c fixes: bz#1704888 Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
* tests: Add changelog snapshot testcaseKotresh HR2019-04-161-0/+60
| | | | | | | | | | Add testcase to test snapshot creation while I/O is happening with changelog enabled. updates: bz#1193929 Change-Id: Ice4cb596286c583ed7308484d65902007a48396c Signed-off-by: Kotresh HR <khiremat@redhat.com>
* nl-cache:add test to increase code coverageSheetal Pamecha2019-04-251-0/+30
| | | | | | Change-Id: Ie0a5c522dfa0123ca45f9decf5015d39b92cb0f3 updates: bz#1693692 Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
* tests: add .t file to increase cli code coverageSanju Rakonde2019-04-221-0/+20
| | | | | | | updates: bz#1693692 Change-Id: I848e622d7b8562e864f0e208aafdc21d9cb757d3 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
* cluster/ec: fix fd reopenXavi Hernandez2019-04-121-11/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently EC tries to reopen fd's that have been opened while a brick was down. This is done as part of regular write operations, just after having acquired the locks, and it's sent as a sub-fop of the main write fop. There were two problems: 1. The reopen was attempted on all UP bricks, even if a previous lock didn't succeed. This is incorrect because most probably the open will fail. 2. If reopen is sent and fails, the error is propagated to the main operation, causing it to fail when it shouldn't. To fix this, we only attempt reopens on bricks where the current fop owns a lock, and we prevent any error from being propagated to the main fop. To implement this behaviour an argument used to indicate the minimum number of required answers has been overloaded to also include some flags. To make the change consistent, it has been necessary to rename the argument, which means that a lot of files have been changed. However there are no functional changes. This change has also uncovered a problem in discard code, which didn't correctly process requests of small sizes because no real discard fop was being processed, only a write of 0's on some region. In this case some fields of the fop remained uninitialized or with incorrect values. To fix this, a new function has been created to simulate success on a fop and it's used in the discard case. Thanks to Pranith for providing a test script that has also detected an issue in this patch. This patch includes a small modification of this script to force data to be written into bricks before stopping them. Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec Fixes: bz#1699866 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
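The two rules of the fix (reopen only where the fop holds a lock, and never let a reopen failure fail the main fop) can be sketched as follows. Representing brick sets as bitmasks is an assumption made for this illustration, and both function names are hypothetical. This is not the EC code.

```python
def reopen_targets(up_mask, locked_mask, open_mask):
    """Bricks that are up, locked by this fop, and still missing an open fd."""
    return up_mask & locked_mask & ~open_mask


def reopen_then_write(targets, reopen, write):
    """Try the reopens, ignore their failures, then run the main fop."""
    bit = 1
    while bit <= targets:
        if targets & bit:
            try:
                reopen(bit)
            except OSError:
                pass  # a failed reopen must not fail the write
        bit <<= 1
    return write()


# Six bricks up, lock held on five, fd already open on four of them.
targets = reopen_targets(0b111111, 0b011111, 0b001111)
print(format(targets, "06b"))  # 010000
```

Only the one brick that is both locked and missing an fd is selected. The sixth brick is up but not locked, so it is skipped, as the first rule requires.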
* tests: Heal should fail when read/write failsPranith Kumar K2019-04-161-0/+65
| | | | | | updates: bz#1699866 Change-Id: I7ccd1fc5fc134eeb6d443c755962a20819320d48 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
* posix/ctime: Fix stat(time attributes) inconsistency during readdirpKotresh HR2019-04-092-0/+79
| | | | | | | | | | | | | | | | | | | | Problem: Creation of tar file on gluster volume throws warning 'file changed as we read it' Cause: During readdirp, for few of the files whose inode is not present, time attributes were served from backend. This caused the ctime of few files to be different between before readdir and after readdir by tar. Solution: If ctime feature is enabled and inode is not present, don't serve the time attributes from backend file, serve it from xattr. fixes: bz#1698078 Change-Id: I427ef865f97399475faf5aa6ca495f7e317603ae Signed-off-by: Kotresh HR <khiremat@redhat.com>
* tests/dht: Test that lookups are sent post brick upN Balachandran2019-04-111-0/+83
| | | | | | Change-Id: I3556793c5e9d58cc6a08644b41dc5740fab2610b updates: bz#1628194 Signed-off-by: N Balachandran <nbalacha@redhat.com>
* mgmt/glusterd: Make changes related to cloudsync xlatorAnuradha Talur2018-11-191-0/+48
| | | | | | | | | | 1) The placement of cloudsync xlator has been changed to make it shard xlator's child. If cloudsync has to work with shard in the graph, it needs to be child of shard. Change-Id: Ib55424fdcb7ce8edae9f19b8a6e3d3ba86c1f0c4 fixes: bz#1642168 Signed-off-by: Anuradha Talur <atalur@commvault.com>
* protocol: add an option to force using old-protocolAmar Tumballi2019-03-291-0/+31
| | | | | | | | | | | | | | The protocol layer implements every fop and makes up a large part of the codebase. Because our regression is run mostly on 1 machine, there was no way of forcing the client to use the old protocol (while the new one is available). With this patch, a new 'testing' option is provided which forces the client to use the old protocol if found. This should help increase the code coverage by at least 10k lines overall. updates: bz#1693692 Change-Id: Ie45256f7dea250671b689c72b4b6f25037cef948 Signed-off-by: Amar Tumballi <amarts@redhat.com>
* ec: increase line coverage of ecXavi Hernandez2019-04-041-1/+2
| | | | | | | | | | | Test ec-cpu-extensions.t has been modified so that it uses a bigger matrix. This makes use of more functions from ec-code-c.c. Changing read-policy to round-robin increases even more the functions used, reaching 100% of line and function coverage for this file. Change-Id: I26e4d33269cbd67f5d76d862f4cf1e69285e85e1 updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* tests: add a tests for trace xlatorAmar Tumballi2019-03-291-0/+33
| | | | | | | | this test alone covers most of code of trace xlator updates: bz#1693692 Change-Id: I287c72ee89bd1c02d992b020d5644e8dac0b77ab Signed-off-by: Amar Tumballi <amarts@redhat.com>
* cluster/dht: refactor dht lookup functionsN Balachandran2019-03-251-0/+145
| | | | | | | | | | Part 1: refactor the dht_lookup_dir_cbk and dht_selfheal_directory functions. Added a simple dht selfheal directory test Change-Id: I1410c26359e3c14b396adbe751937a52bd2fcff9 updates: bz#1590385 Signed-off-by: N Balachandran <nbalacha@redhat.com>