Commit message | Author | Age | Files | Lines
* glusterd: After upgrade on release 9.1 glusterd protocol is broken (#2352) [devel] | mohit84 | 2021-04-23 | 2 | -4/+6
| After an upgrade to release-9 the glusterd protocol is broken because, on the upgraded nodes, glusterd cannot find an actor at the expected index in the rpc procedure table. A new proc (GLUSTERD_MGMT_V3_POST_COMMIT) was introduced by a patch (https://review.gluster.org/#/c/glusterfs/+/24771/) in the middle of the table, which changed the index of the existing actors, so glusterd fails on newly upgraded nodes.
|
| Solution: Move the proc (GLUSTERD_MGMT_V3_POST_COMMIT) to the last position in the proc table to avoid the issue.
|
| Fixes: #2351
| Change-Id: I36575fd4302944336a75a8d4a305401a7128fd84
| Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
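The index shift described in this message can be illustrated with a small Python sketch. It is not the glusterd code: the table entries other than GLUSTERD_MGMT_V3_POST_COMMIT are hypothetical names, and a plain list stands in for the rpc procedure table.

```python
# Hypothetical procedure names; only GLUSTERD_MGMT_V3_POST_COMMIT comes from the commit message.
OLD_TABLE = ["NULL", "LOCK", "PRE_VALIDATE", "COMMIT", "POST_VALIDATE", "UNLOCK"]
NEW_PROC = "GLUSTERD_MGMT_V3_POST_COMMIT"


def insert_in_middle(table, proc, position):
    """Return a new table with proc inserted at position (shifts later entries)."""
    return table[:position] + [proc] + table[position:]


def append_at_end(table, proc):
    """Return a new table with proc added after all existing entries."""
    return table + [proc]


def mismatched_procs(old, new):
    """Names from the old table that a peer using the new table would resolve differently."""
    return [name for i, name in enumerate(old) if i >= len(new) or new[i] != name]


def demo():
    middle = insert_in_middle(OLD_TABLE, NEW_PROC, 4)
    end = append_at_end(OLD_TABLE, NEW_PROC)
    return mismatched_procs(OLD_TABLE, middle), mismatched_procs(OLD_TABLE, end)
```

Peers that address procedures by index only stay compatible when existing indices never move, which is why appending is the safer choice in this sketch.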
* tests: avoid empty paths in environment variables (#2349) | Xavi Hernandez | 2021-04-23 | 1 | -10/+7
| * tests: avoid empty paths in environment variables
|
| Many variables containing paths in env.rc.in are defined in a way that leaves a trailing ':' in the variable when the previous value was empty or undefined.
|
| In the particular case of the 'LD_PRELOAD_PATH' variable, this causes the system to look for dynamic libraries in the current working directory. When this directory is inside a Gluster mount point, a significant delay is incurred each time a program is run (and the testing framework can run lots of programs for each test).
|
| This patch prevents variables containing paths from ending with a trailing ':'.
|
| Fixes: #2348
| Change-Id: I669f5a78e14f176c0a58824ba577330989d84769
| Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
| * Fix PYTHONPATH name and duplication
|
| Change-Id: Iaa0b092118bb86856bbe621eb03fef6fa7478971
| Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
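As a rough illustration of the trailing ':' problem, the sketch below builds a path list while dropping empty components. The real fix lives in the shell file env.rc.in; `prepend_path` and `naive_prepend` are hypothetical helpers written only for this example.

```python
SEP = ":"  # path-list separator used by variables such as PATH on Linux


def naive_prepend(new_entry, current):
    """The problematic pattern: leaves a trailing ':' when current is empty."""
    return new_entry + SEP + (current or "")


def prepend_path(new_entry, current):
    """Prepend new_entry to a path list without ever producing an empty component."""
    parts = [new_entry] + (current or "").split(SEP)
    return SEP.join(p for p in parts if p)
```

An empty component in such a list is commonly interpreted as the current working directory, which is the behaviour the commit message describes.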
* protocol/client: Fix lock memory leak (#2338) | Pranith Kumar Karampuri | 2021-04-22 | 7 | -55/+213
| Problem-1: When an overlapping lock is issued, the merged lock is not assigned the owner. When flush is issued on the fd, this particular lock is not freed, leading to a memory leak.
|
| Fix-1: Assign the owner while merging the locks.
|
| Problem-2: On fd-destroy, lock structs could be present in fdctx. For some reason, running the flock -x command and closing the bash fd leads to this code path, which leaks the lock structs.
|
| Fix-2: When fdctx is being destroyed in the client, make sure to clean up any lock structs.
|
| fixes: #2337
| Change-Id: I298124213ce5a1cf2b1f1756d5e8a9745d9c0a1c
| Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
* coverity: Removed structural dead code (#2320) | Nikhil Ladha | 2021-04-19 | 1 | -8/+2
| Issue: The `for` loop was executed only once, which Coverity reported as structural dead code.
|
| Fix: Updated the code to use an `if` condition instead of the `for` loop.
|
| CID: 1437779
| Updates: #1060
| Change-Id: I2ca1d2c9d2842d586161fe971bb8c7b3444dfb2b
| Signed-off-by: nik-redhat <nladha@redhat.com>
* logging: fix a relock deadlock (#2332) | chenglin130 | 2021-04-15 | 1 | -1/+10
| gf_log_inject_timer_event() takes the lock log.log_buf_lock. Any call to gf_msg made while that lock is held hangs the thread, because _gf_msg_internal() tries to take log.log_buf_lock again.
|
| Use a PTHREAD_MUTEX_RECURSIVE mutex instead of the default type for this lock to fix the deadlock.
|
| Fixes: #2330
| Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
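The difference between a default and a recursive mutex can be sketched in Python, where `threading.Lock` plays the role of the default pthread mutex and `threading.RLock` the role of PTHREAD_MUTEX_RECURSIVE. This is an analogy rather than the gluster logging code, and `nested_acquire_succeeds` is a hypothetical helper.

```python
import threading


def nested_acquire_succeeds(lock):
    """Hold lock, then try to take it again from the same thread.

    The inner attempt is non-blocking so that a plain Lock reports failure
    instead of hanging the way the logging thread did.
    """
    with lock:
        nested = lock.acquire(blocking=False)
        if nested:
            lock.release()
        return nested


def demo():
    # (plain lock result, recursive lock result)
    return nested_acquire_succeeds(threading.Lock()), nested_acquire_succeeds(threading.RLock())
```

With a blocking acquire, the plain-lock case would wait forever on itself, which corresponds to the hang described above.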
* group-samba: disable performance.write-behind translator. (#2329) | Günther Deschner | 2021-04-14 | 1 | -0/+1
| Fixes: #2328
|
| From the vfs_glusterfs(8) manpage:
|
| "The GlusterFS write-behind performance translator, when used with Samba, could be a source of data corruption. The translator, while processing a write call, immediately returns success but continues writing the data to the server in the background. This can cause data corruption when two clients relying on Samba to provide data consistency are operating on the same file."
|
| Guenther
|
| Signed-off-by: Günther Deschner <gd@samba.org>
* cli: Increased spacing in cli for option table (#2322) | Nikhil Ladha | 2021-04-13 | 1 | -3/+3
| * cli: Increased spacing in cli for option table
|
| Issue: Some option names are longer than 40 characters, so the output of `gluster vol get <volname> all` runs the option and value columns together for long option names.
|
| Fix: Increased the spacing in the `gluster vol get <volname> all` output to 50.
|
| Fixes: #2313
| Change-Id: I841730ced074547a81171a4432d15ec9c35f39cd
| Signed-off-by: nik-redhat <nladha@redhat.com>
|
| * Added separator
|
| Change-Id: I210877c89bc468ed6a3090cd14fde7ecee1d33b6
| Signed-off-by: nik-redhat <nladha@redhat.com>
|
| * Removed separator and added space
|
| Change-Id: Ic0eb9c9bc39a354465aabd939f72bc65be738f6c
| Signed-off-by: nik-redhat <nladha@redhat.com>
* Doc: Developer session 3 (#2323) | Pranith Kumar Karampuri | 2021-04-10 | 1 | -1/+3
| Adding links to developer session 3, which covers the xlator interface in gluster.
|
| updates: #2308
| Change-Id: I8dc84263c19613dba665a080d8adb99cdfe677b0
| Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
* dht: fix rebalance of sparse files (#2318) | Xavi Hernandez | 2021-04-09 | 3 | -56/+93
| Current implementation of rebalance for sparse files has a bug that, in some cases, causes a read of 0 bytes from the source subvolume. Posix xlator doesn't allow 0 byte reads and fails them with EINVAL, which causes rebalance to abort the migration.
|
| This patch implements a more robust way of finding data segments in a sparse file that avoids 0 byte reads, allowing the file to be migrated successfully.
|
| Fixes: #2317
| Change-Id: Iff168dda2fb0f2edf716b21eb04cc2cc8ac3915c
| Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* tests: Fix incorrect variables in throttle-rebal.t (#2316) | Jamie Nguyen | 2021-04-08 | 1 | -3/+3
| The final test doesn't test what it is meant to test. It still fails as expected, but only because at this point `THROTTLE_LEVEL` is still set to `garbage`.
|
| Easily fixed by correcting the typos in the variable names, and thus fixes https://github.com/gluster/glusterfs/issues/2315
|
| Signed-off-by: Jamie Nguyen <j@jamielinux.com>
* Removal of force option in snapshot create (#2110) | nishith-vihar | 2021-04-06 | 5 | -170/+37
| The force option fails for the snapshot create command even when quorum is satisfied, and is redundant.
|
| The change deprecates the force option for the snapshot create command and checks that all bricks are online, instead of checking for quorum, before creating a snapshot.
|
| Fixes: #2099
| Change-Id: I45d866e67052fef982a60aebe8dec069e78015bd
| Signed-off-by: Nishith Vihar Sakinala <nsakinal@redhat.com>
* Remove some strlen() calls if using DICT_LIST_IMP (#2311) | Rinku Kothiya | 2021-04-06 | 1 | -8/+22
| The code was optimized by avoiding some strlen() calls if using DICT_LIST_IMP
|
| fixes: #2294
| Change-Id: Ic5e784edb9538feb1d1b441c8514c76ba5266832
| Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
* Doc: Developer session 1, 2 (#2291) | Pranith Kumar Karampuri | 2021-04-04 | 1 | -0/+10
| Adding links to recordings, slides of the dev-session 1, 2
|
| updates: #2308
| Change-Id: I9e10173e2b3b0d70304fa8fa050734aba06a2c6b
| Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
* list.h: remove offensive language, introduce _list_del() (#2132) | Yaniv Kaul | 2021-04-02 | 1 | -8/+21
| 1. Replace offensive variables with the values the Linux kernel uses.
| 2. Introduce an internal function, _list_del(), that can be used when list->next and list->prev are going to be assigned later on.
|
| (Too bad we do not have enough uses of list_move() and list_move_tail() in the code, btw. They would also have contributed to code readability.)
|
| * list.h: defined LIST_POISON1, LIST_POISON2 similar to Linux kernel defines
|
| Fixes: #2025
| Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
* marker: initiate xattrs QUOTA_SIZE_KEY for empty volume (#2261) | chenglin130 | 2021-04-01 | 2 | -5/+12
| * marker: initiate quota xattrs for empty volume
|
| When a VOL is empty, listing quota info fails after setting limit-usage.
|
| # gluster volume quota gv0 list
| /    N/A    N/A    N/A    N/A    N/A    N/A
|
| This is because there is no QUOTA_SIZE_KEY in the xattrs of the VOL directory.
|
| # getfattr -d -m. -e hex /data/brick2/gv0
| getfattr: Removing leading '/' from absolute path names
| # file: data/brick2/gv0
| security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
| trusted.gfid=0x00000000000000000000000000000001
| trusted.glusterfs.mdata=0x01000000000000000000000000603e70f6000000003b3f3c8000000000603e70f6000000003351d14000000000603e70f9000000000ff95b00
| trusted.glusterfs.quota.limit-set.1=0x0000000000a00000ffffffffffffffff
| trusted.glusterfs.volume-id=0xe27d61be048c4195a9e1ee349775eb59
|
| This patch fixes it by setting QUOTA_SIZE_KEY for the empty VOL directory when quota is enabled.
|
| # gluster volume quota gv0 list
| Path    Hard-limit    Soft-limit    Used    Available    Soft-limit exceeded?    Hard-limit exceeded?
| -------------------------------------------------------------------------------------------------------------------------------
| /       4.0MB         80%(3.2MB)    0Bytes  4.0MB        No                      No
|
| Fixes: #2260
| Change-Id: I6ab3e43d6ef33e5ce9531b48e62fce9e8b3fc555
| Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
* xlators/mgmt: Fixing coverity issue 1445996 | Ashish Pandey | 2021-03-29 | 1 | -5/+7
| Fixing "Null pointer dereferences"
|
| fixes: #2129
| Signed-off-by: Ashish Pandey <aspandey@redhat.com>
* afr: don't reopen fds on which POSIX locks are held (#1980) | Karthik Subrahmanya | 2021-03-27 | 12 | -105/+651
| When client.strict-locks is enabled on a volume and there are POSIX locks held on the files, do not re-open such fds after the clients disconnect and reconnect. Re-opening them might lead to multiple clients acquiring the locks and cause data corruption.
|
| Change-Id: I8777ffbc2cc8d15ab57b58b72b56eb67521787c5
| Fixes: #1977
| Signed-off-by: karthik-us <ksubrahm@redhat.com>
* common-ha: ensure shared_storage is mounted before setup (#2296) | kalebskeithley | 2021-03-25 | 1 | -0/+13
| If gluster shared-storage isn't mounted, ganesha will fail to start
|
| Change-Id: I6ed7044ea6b6c61b013ebe17088bfde311b109b7
| fixes: #2278
| Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
* afr: make fsync post-op aware of inodelk count (#2273) | Ravishankar N | 2021-03-25 | 2 | -17/+24
| Problem: Since commit bd540db1e, eager-locking was enabled for fsync. But on certain VM workloads with sharding enabled, the shard xlator keeps sending fsync on the base shard. This can cause blocked inodelks from other clients (including shd) to time out due to call bail.
|
| Fix: Make afr fsync aware of inodelk count and not delay post-op + unlock when inodelk count > 1, just like writev.
|
| Code is restructured so that any fd based AFR_DATA_TRANSACTION can be made aware by setting GLUSTERFS_INODELK_DOM_COUNT in xdata request.
|
| Note: We do not know yet why VMs go into paused state because of the blocked inodelks, but this patch should be a first step in reducing the occurrence.
|
| Updates: #2198
| Change-Id: Ib91ebdd3101d590c326e69c829cf9335003e260b
| Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* common-ha: stability fixes for ganesha_grace and ganesha_mon RAs | Kaleb S KEITHLEY | 2021-03-24 | 2 | -49/+26
| Include fixes suggested by ClusterHA devs.
|
| 1) It turns out that crm_attribute attrs and attrd_updater attrs really are one and the same, despite what I was told years ago.
|
| attrs created with crm_attribute ... --lifetime=reboot ... or attrd_updater are one and the same. As per ClusterHA devs, having an attr created with crm_attribute ... --lifetime=forever and also creating/updating the same attr with attrd_updater is a recipe for weird things to happen that will be difficult to debug.
|
| 2) Using hostname -s or hostname for node names in crm_attribute and attrd_updater could potentially use the wrong name if the host has been renamed; use ocf_local_nodename() (in ocf-shellfuncs) instead.
|
| fixes: #2276
| Change-Id: If572d396fae9206628714fb2ce00f72e94f2258f
| Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
* configure: add linux-io_uring flag (#2060) | Ravishankar N | 2021-03-24 | 2 | -6/+33
| By default, if liburing is not present on the machine where gluster rpms are being built, then the built rpm won't have the feature present in posix.so. While this is obviously displayed in the ./configure's summary, it means the feature won't work on a target machine where the rpm is installed, even if the target has Linux kernel >=5.1 and liburing installed.
|
| I think it is better to have a configure option `--enable-linux-io_uring` which is on by default. That way, the build machines will error out by default and will need to either `./configure --disable-linux-io_uring` to compile, or install the library and headers on the build machine.
|
| Fixes: #2063
| Change-Id: Ide1daa11b3513210d12be8d2cb683a4084d41e18
| Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* CID 1412333 (#1 of 1): Copy into fixed size buffer (STRING_OVERFLOW) (#2264) | Ayush Ujjwal | 2021-03-22 | 1 | -2/+5
| * CID 1412333 (#1 of 1): Copy into fixed size buffer (STRING_OVERFLOW)
|
| CID: 1412333
| Description: `path` length might overrun the 108-character fixed-size string. Added a condition to check the size of `path`.
|
| Updates: #1060
| Change-Id: I4e7c58ab3a3f6807992dfc3023c21f762bff6b32
| Signed-off-by: aujjwal-redhat <aujjwal@redhat.com>
|
| * refactored the code
|
| Change-Id: I1eaa6fc59e43f76224f44b5f8c54495b67076651
| Signed-off-by: aujjwal-redhat <aujjwal@redhat.com>
|
| * used strncpy in place of strcpy to store only as many characters as fit in addr->sun_path
|
| Change-Id: I9b4eeed3dd0c00d052dcaaf6b34597fbfe7fe1a2
| Signed-off-by: aujjwal-redhat <aujjwal@redhat.com>
|
| * Removed goto err as it was already going to err
|
| Change-Id: Ib40c11537b57aea72d3095eda86bd5b541930550
| Signed-off-by: aujjwal-redhat <aujjwal@redhat.com>
* cluster/dht: use readdir for fix-layout in rebalance (#2243) | Pranith Kumar Karampuri | 2021-03-22 | 10 | -98/+131
| Problem: On a cluster with 15 million files, when fix-layout was started, it was not progressing at all. So we tried to do a os.walk() + os.stat() on the backend filesystem directly. It took 2.5 days. We removed os.stat() and re-ran it on another brick with similar data-set. It took 15 minutes. We realized that readdirp is extremely costly compared to readdir if the stat is not useful. fix-layout operation only needs to know that the entry is a directory so that fix-layout operation can be triggered on it. Most of the modern filesystems provide this information in readdir operation. We don't need readdirp i.e. readdir+stat.
|
| Fix: Use readdir operation in fix-layout. Do readdir+stat/lookup for filesystems that don't provide d_type in readdir operation.
|
| fixes: #2241
| Change-Id: I5fe2ecea25a399ad58e31a2e322caf69fc7f49eb
| Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
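Since the message mentions measuring with os.walk() and os.stat(), a short Python sketch of the "readdir only" idea may help. It uses `os.scandir`, whose `DirEntry.is_dir()` relies on the entry type returned with the directory listing where the filesystem provides it and only falls back to a stat call otherwise. `list_subdirs` and `demo` are hypothetical helpers, not part of the patch.

```python
import os
import tempfile


def list_subdirs(path):
    """Yield names of subdirectories of path without an explicit stat per entry."""
    with os.scandir(path) as entries:
        for entry in entries:
            # Uses the d_type information from the directory entry when available;
            # Python falls back to a stat call only if the type is unknown.
            if entry.is_dir(follow_symlinks=False):
                yield entry.name


def demo():
    """Build a small throwaway tree and list its directories."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "d1"))
        os.mkdir(os.path.join(root, "d2"))
        open(os.path.join(root, "f1"), "w").close()
        return sorted(list_subdirs(root))
```

This mirrors the fix only loosely: fix-layout needs to know which entries are directories, and that information is usually available without the per-entry stat that readdirp implies.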
* Cleanup unused this pointers (#2282) | Rinku Kothiya | 2021-03-22 | 8 | -43/+35
| fixes: #2268
| Change-Id: If00ee847e15ac7f7e5b0e12125a7d02a610b9708
| Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
* afr, dht: Add ramifications of disabling ensure-durability | Pranith Kumar K | 2021-03-19 | 3 | -2/+9
| Also moved options to NO_DOC
|
| Change-Id: I86623f4139d156812e622a87655483c9d2491052
| Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
* Coverity Issues: 1447088, 1447089 (#2217) | Ashish Pandey | 2021-03-18 | 2 | -37/+36
| 1447088 - Resource leak
| 1447089 - Buffer not null terminated
|
| updates: #2216
| Signed-off-by: Ashish Pandey <aspandey@redhat.com>
* afr: remove priv->root_inode (#2244) | Ravishankar N | 2021-03-17 | 4 | -8/+1
| priv->root_inode seems to be a remnant of the pump xlator and was getting populated in the discover code path. The thin-arbiter code used it to populate loc info, but it seems that in the case of some daemons like quotad, the discover path for the root gfid is not hit, causing it to crash.
|
| Fix: The root inode can be accessed via this->itable->root, so use that and remove priv->root_inode instances from the afr code.
|
| Fixes: #2234
| Change-Id: Iec59c157f963a4dc455652a5c85a797d00cba52a
| Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* extras: disable lookup-optimize in virt and block groups (#2254) | Xavi Hernandez | 2021-03-17 | 3 | -0/+3
| lookup-optimize doesn't provide any benefit for virtualized environments and gluster-block workloads, but it's known to cause corruption in some cases when sharding is also enabled and the volume is expanded or shrunk.
|
| For this reason, we disable lookup-optimize by default on those environments.
|
| Fixes: #2253
| Change-Id: I25861aa50b335556a995a9c33318dd3afb41bf71
| Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* cluster/dht: Provide option to disable fsync in data migration (#2259) | Pranith Kumar Karampuri | 2021-03-17 | 11 | -19/+85
| At the moment dht rebalance doesn't give any option to disable fsync after data migration. Making this an option lets admins take responsibility for their data in a way that is suitable for their cluster. The default value is still 'on', so the behavior is unchanged for people who don't care about this.
|
| For example: If the data that is going to be migrated is already backed up or snapshotted, there is no need for fsync to happen right after migration, which can affect active I/O on the volume from applications.
|
| fixes: #2258
| Change-Id: I7a50b8d3a2f270d79920ef306ceb6ba6451150c4
| Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
* String not null terminated (#2219) | nishith-vihar | 2021-03-11 | 1 | -0/+2
| CID: 1214629,1274235,1437648
| The buffer has been null terminated thus resolving the issue
|
| Change-Id: Ieb1d067d8dd860c55a8091dd6fbba1bcbb4dc19f
| Updates: #1060
| Signed-off-by: Nishith Vihar Sakinala <nsakinal@redhat.com>
* cluster/dht: Fix use-after-free bug dht_queue_readdir(p) (#2242) | Pranith Kumar Karampuri | 2021-03-11 | 1 | -2/+9
| Problem: In dht_queue_readdir(p) 'frame' is accessed after unwind. This will lead to undefined behavior as frame would be freed upon unwind.
|
| Fix: Store the variables that are needed in local variables and use them instead.
|
| fixes: #2239
| Change-Id: I6b2e48e87c85de27fad67a12d97abd91fa27c0c1
| Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
* features/index: Optimize link-count fetching code path (#1789) | Pranith Kumar Karampuri | 2021-03-10 | 17 | -112/+209
| * features/index: Optimize link-count fetching code path
|
| Problem: AFR requests 'link-count' in lookup to check if there are any pending heals. Based on this information, afr will set dirent->inode to NULL in readdirp when heals are ongoing to prevent serving bad data. Once heals are completed, every lookup that requests the link-count xattr triggers an opendir of the xattrop directory and a read of its contents, just to find out that no healing is needed. This was not detected until this github issue because ZFS in some cases can lead to very slow readdir() calls. Since Glusterfs does a lot of lookups, this was slowing down all operations and increasing load on the system.
|
| Code problem: On any xattrop operation the index xlator adds an index to the relevant dirs and, after the xattrop operation is done, deletes or keeps the index in that directory based on the value fetched in xattrop from posix. AFR sends all-zero xattrop for changelog xattrs. This leads to priv->pending_count manipulation which sets the count back to -1. The next lookup operation triggers opendir/readdir to find the actual link-count, because the in-memory priv->pending_count is negative.
|
| Fix:
| 1) Don't add to index on all-zero xattrop for a key.
| 2) Set pending-count to -1 when the first gfid is added into the xattrop directory, so that the next lookup can compute the link-count.
|
| fixes: #1764
| Change-Id: I8a02c7e811a72c46d78ddb2d9d4fdc2222a444e9
| Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
|
| * addressed comments
|
| Change-Id: Ide42bb1c1237b525d168bf1a9b82eb1bdc3bc283
| Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
|
| * tests: Handle base index absence
|
| Change-Id: I3cf11a8644ccf23e01537228766f864b63c49556
| Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
|
| * Addressed LOCK based comments, .t comments
|
| Change-Id: I5f53e40820cade3a44259c1ac1a7f3c5f2f0f310
| Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
* afr: fix directory entry count (#2233) | Xavi Hernandez | 2021-03-09 | 4 | -4/+129
| AFR may hide some existing entries from a directory when reading it because they are generated internally for private management. However the returned number of entries from readdir() function is not updated accordingly. So it may return a number higher than the real entries present in the gf_dirent list.
|
| This may cause unexpected behavior of clients, including gfapi which incorrectly assumes that there was an entry when the list was actually empty.
|
| This patch also makes the check in gfapi more robust to avoid similar issues that could appear in the future.
|
| Fixes: #2232
| Change-Id: I81ba3699248a53ebb0ee4e6e6231a4301436f763
| Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
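The invariant behind this fix, that the reported count must match the entries actually returned, can be sketched as follows. This is a simplified Python stand-in for the C readdir callback; `filter_entries` and the ".internal" naming rule are assumptions made up for the example.

```python
def filter_entries(entries, is_internal):
    """Drop internal entries and return (count, visible_entries).

    The count is derived from the filtered list, so it can never exceed
    the number of entries actually handed back to the caller.
    """
    visible = [entry for entry in entries if not is_internal(entry)]
    return len(visible), visible


def demo():
    # Hypothetical rule: names starting with ".internal" are private to the xlator.
    hidden = lambda name: name.startswith(".internal")
    return filter_entries([".internal-a"], hidden), filter_entries(["a", ".internal-b", "c"], hidden)
```

A caller that trusts the count alone would otherwise believe an empty list contains an entry, which is the gfapi symptom described above.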
* EC - Fixing a Coverity issue (Uninitialized lock use) | Barak Sason Rofman | 2021-03-05 | 1 | -2/+2
| CID: 1444461
|
| A lock is being destroyed but, in some code flows, might be used later on. Modified the code flow to make sure the destroyed lock is not used in any case.
|
| Change-Id: I9610d56d9cb8a8ab7062e9094493dba9afdd0b30
| updates: #1060
| Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
* quiesce: Resource leak coverity fix (#2215) | Sheetal Pamecha | 2021-03-05 | 1 | -0/+3
| Fixes CID: 1124725
|
| Updates: #1060
| Change-Id: Iced092c5ad1a9445e4c758f09a481501bae7275f
| Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
* cli: syntax check for arbiter volume creation (#2207) | Ravishankar N | 2021-03-05 | 2 | -1/+3
| commit 8e7bfd6a58b444b26cb50fb98870e77302f3b9eb changed the syntax for arbiter volume creation to 'replica 2 arbiter 1', while still allowing the old syntax of 'replica 3 arbiter 1'. But while doing so, it also removed a conditional check, thereby allowing replica count > 3. This patch fixes it.
|
| Fixes: #2192
| Change-Id: Ie109325adb6d78e287e658fd5f59c26ad002e2d3
| Signed-off-by: Ravishankar N <ravishankar@redhat.com>
* glusterd: Fix deadlock while concurrent quota enable (#2118) | zhangxianwei8 | 2021-03-04 | 1 | -1/+1
| In glusterd_svc_start:
| 1) synctaskA gets attach_lock and then releases big_lock to execute runner_run.
| 2) synctaskB then gets big_lock but cannot get attach_lock, so it waits.
| 3) After executing runner_run, synctaskA tries to get big_lock, but synctaskB holds it, so it waits.
| This leads to a deadlock.
|
| This patch uses runner_run_nowait to avoid the deadlock.
|
| fixes: #2117
| Signed-off-by: Zhang Xianwei <zhang.xianwei8@zte.com.cn>
* api: Fix a function name in the API doc | Aravinda Vishwanathapura | 2021-03-01 | 1 | -1/+1
| Wrong function name was mentioned in API doc for `glfs_get_volfile`.
|
| Change-Id: Id2251837f53270f1f03b8a5501ea335b7995873b
| Updates: #1000
| Signed-off-by: Aravinda Vishwanathapura <aravinda@kadalu.io>
* dict: avoid hash calculation when hash_size=1 (link list imp) (#2171) | Tamar Shacked | 2021-03-01 | 1 | -42/+75
| * dict: avoid hash calculation when hash_size=1 (link list imp)
|
| Currently dict_t is always constructed with dict::hash_size = 1. With this initialization the dict is implemented as a linked list, and searching for a key is done by iterating and comparing keys. Therefore we can avoid the hash calculation done on each set()/get() in this case.
|
| Fixes: #2013
| Change-Id: Id93286a8036064d43142bc2b2f8d5a3be4e97fc4
| Signed-off-by: Tamar Shacked <tshacked@redhat.com>
|
| * dict: avoid hash calculation when hash_size=1 (list imp)
|
| Currently dict_t is always constructed with dict::hash_size = 1. With this initialization the dict is implemented as a linked list, and searching for a key is done by iterating and comparing keys. Therefore we can avoid the hash calculation done on each set()/get() in this case.
|
| Fix: Use a new macro to delimit and avoid the blocks related to the hash implementation.
|
| Fixes: #2013
| Change-Id: I31180b434a6e9e7bbb456c7ad888c147c4ce3308
| Signed-off-by: Tamar Shacked <tshacked@redhat.com>
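The optimization can be pictured with a toy dictionary that skips hashing when it has a single bucket. This Python sketch is an assumption-laden illustration: the real change uses a compile-time macro in the C dict code, whereas `ListDict` below is a hypothetical class that decides at run time and counts hash calls for demonstration.

```python
class ListDict:
    """Toy key/value store with hash_size buckets, each a list of [key, value] pairs."""

    def __init__(self, hash_size=1):
        self.hash_size = hash_size
        self.buckets = [[] for _ in range(hash_size)]
        self.hash_calls = 0  # instrumentation only, to show when hashing happens

    def _bucket(self, key):
        if self.hash_size == 1:
            # Single bucket: the hash value cannot change the outcome, so skip it.
            return self.buckets[0]
        self.hash_calls += 1
        return self.buckets[hash(key) % self.hash_size]

    def set(self, key, value):
        bucket = self._bucket(key)
        for pair in bucket:
            if pair[0] == key:
                pair[1] = value
                return
        bucket.append([key, value])

    def get(self, key, default=None):
        for stored_key, value in self._bucket(key):
            if stored_key == key:
                return value
        return default


def demo(hash_size):
    d = ListDict(hash_size)
    d.set("a", 1)
    d.set("b", 2)
    d.set("a", 3)
    return d.get("a"), d.get("b"), d.get("missing"), d.hash_calls
```

With one bucket, lookups are a linear scan with key comparison either way, so the hash computation is pure overhead.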
* afr: fix coverity issue introduced by 90cefde (#2201) | Xavi Hernandez | 2021-03-01 | 1 | -2/+2
| Fixes coverity issues 1447029 and 1447028.
|
| Updates: #2161
| Change-Id: I6a564231d6aeb76de20675b7ced5d45eed8c377f
| Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* dht: fix use-after-free introduced by 70e6ee2 | Xavi Hernandez | 2021-02-26 | 2 | -3/+39
| Change-Id: I97e73c0aae74fc5d80c975f56f2f7a64e3e1ae95
| Updates: #2169
| Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* cluster/afr: Fix race in lockinfo (f)getxattr (#2162) | Xavi Hernandez | 2021-02-24 | 1 | -142/+112
| A shared dictionary was updated outside the lock after having updated the number of remaining answers. This means that one thread may be processing the last answer and unwinding the request before another thread completes updating the dict.
|
|     Thread 1                      Thread 2
|
|     LOCK()
|     call_cnt-- (=1)
|     UNLOCK()
|                                   LOCK()
|                                   call_cnt-- (=0)
|                                   UNLOCK()
|                                   update_dict(dict)
|                                   if (call_cnt == 0) {
|                                       STACK_UNWIND(dict);
|                                   }
|     update_dict(dict)
|     if (call_cnt == 0) {
|         STACK_UNWIND(dict);
|     }
|
| The updates from thread 1 are lost.
|
| This patch also reduces the work done inside the locked region and reduces code duplication.
|
| Fixes: #2161
| Change-Id: Idc0d34ab19ea6031de0641f7b05c624d90fac8fa
| Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
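One way to avoid the lost update is to merge each answer into the shared dictionary under the same lock that decrements the counter, so that whoever sees the counter reach zero also sees every update. The Python sketch below shows that ordering only; it is not the afr patch, and `ReplyCollector` and its fields are hypothetical names.

```python
import threading


class ReplyCollector:
    """Collect replies from several threads and hand over the merged result once."""

    def __init__(self, expected_replies):
        self.lock = threading.Lock()
        self.pending = expected_replies
        self.merged = {}
        self.unwound = None  # stands in for the result passed to STACK_UNWIND

    def on_reply(self, key, value):
        with self.lock:
            # Merge first, then decrement: the last replier sees all updates.
            self.merged[key] = value
            self.pending -= 1
            is_last = self.pending == 0
        if is_last:
            # No other reply can still be in flight at this point.
            self.unwound = dict(self.merged)


def demo(replies=8):
    collector = ReplyCollector(replies)
    threads = [
        threading.Thread(target=collector.on_reply, args=(i, i * 10))
        for i in range(replies)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return collector.unwound
```

The unwind itself stays outside the lock, which keeps the locked region short.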
* cluster/dht: Fix stack overflow in readdir(p) (#2170) | Xavi Hernandez | 2021-02-24 | 3 | -10/+114
| When parallel-readdir is enabled, readdir(p) requests sent by DHT can be immediately processed and answered in the same thread before the call to STACK_WIND_COOKIE() completes.
|
| This means that the readdir(p) cbk is processed synchronously. In some cases it may decide to send another readdir(p) request, which causes a recursive call.
|
| When some special conditions happen and the directories are big, it's possible that the number of nested calls is so high that the process crashes because of a stack overflow.
|
| This patch fixes this by not allowing nested readdir(p) calls. When a nested call is detected, it's queued instead of sending it. The queued request is processed when the current call finishes by the top level stack function.
|
| Fixes: #2169
| Change-Id: Id763a8a51fb3c3314588ec7c162f649babf33099
| Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
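The "queue instead of recurse" idea can be sketched with a small Python class in which a synchronous callback asks for more data. It is a conceptual model under stated assumptions, not the DHT code: `DirReader`, its fields, and the chunk list are hypothetical, and the depth counter exists only to show that nesting stays flat.

```python
from collections import deque


class DirReader:
    """Read chunks where each reply is delivered synchronously and may request more."""

    def __init__(self, chunks):
        self.chunks = deque(chunks)
        self.seen = []
        self.busy = False      # True while the top-level call is driving requests
        self.queued = False    # set when a nested request arrives
        self.depth = 0
        self.max_depth = 0     # instrumentation: deepest nesting of _send()

    def request(self):
        if self.busy:
            # Nested call from inside the callback: queue it instead of recursing.
            self.queued = True
            return
        self.busy = True
        try:
            while True:
                self.queued = False
                self._send()
                if not self.queued:
                    break
        finally:
            self.busy = False

    def _send(self):
        # The reply arrives before this function returns (synchronous callback).
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)
        if self.chunks:
            self.seen.append(self.chunks.popleft())
            self.request()  # the callback wants the next chunk
        self.depth -= 1


def demo(count):
    reader = DirReader(range(count))
    reader.request()
    return len(reader.seen), reader.max_depth
```

Without the `busy` check, each chunk would add a stack frame, and a large directory would exhaust the stack (or, in Python, hit the recursion limit).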
* dht: Ongoing IO is failed during volume shrink operation (#2188) | mohit84 | 2021-02-24 | 1 | -11/+30
| Commit c878174 introduced a check to avoid a stale layout issue. To avoid it, dht sets a key along with the layout when winding a create fop, and posix validates the parent layout based on the key value. If the layout does not match, it throws an error. During a volume shrink the layout is changed by the rebalance daemon, and if the layout does not match, dht is not able to wind a create fop successfully.
|
| Solution: To avoid the issue, populate the key only when dht winds the fop for the first time. After getting an error, on the 2nd attempt dht takes a lock and then reattempts to wind the fop.
|
| Fixes: #2187
| Change-Id: Ie018386e7823a11eea415496bb226ca032453a55
| Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* tests: Move tests/basic/glusterd-restart-shd-mux.t to flaky (#2191) | mohit84 | 2021-02-24 | 1 | -0/+0
| The test case (tests/basic/glusterd-restart-shd-mux.t) was introduced as part of the shd mux feature, but we observed that the feature is not stable and we already plan to revert it. For the time being I am moving the test case to flaky to avoid frequent regression failures.
|
| Fixes: #2190
| Change-Id: I4a06a5d9212fb952a864d0f26db8323690978bfc
| Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* fuse: add an option to specify the mount display name (#1989) | Amar Tumballi | 2021-02-22 | 4 | -29/+30
| There are two things this PR is fixing.
|
| 1. When a mount is specified with the volfile (-f) option, today you can't tell that it comes from glusterfs because only the volfile is added as 'fsname', so we add it as 'glusterfs:/<volname>'.
| 2. Provide an option for admins who want to show a mount source other than the default (useful when one is not providing 'mount.glusterfs', but using their own scripts).
|
| Updates: #1000
| Change-Id: I19e78f309a33807dc5f1d1608a300d93c9996a2f
| Signed-off-by: Amar Tumballi <amar@kadalu.io>
* glusterfs: the mount operation will get stuck when the vol doesn't exist (#2177) | zhangxyue | 2021-02-22 | 2 | -1/+5
| When a wrong volume name that doesn't exist is passed, the mount gets stuck. The errno is initialized to 0 in glusterd-handshake.c. After the errno is initialized, the process blocks in gf_fuse_umount.
* glusterd: Resolve use after free bug (#2181) | mohit84 | 2021-02-22 | 1 | -3/+2
| Commit 61ae58e67567ea4de8f8efc6b70a9b1f8e0f1bea introduced a Coverity bug: an object is used after it has been cleaned up.
|
| Clean up the memory after leaving the critical section.
|
| Fixes: #2180
| Change-Id: Iee2050c4883a0dd44b8523bb822b664462ab6041
| Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* dht/linkfile - Remove unused code | Barak Sason Rofman | 2021-02-18 | 2 | -58/+0
| A couple of methods are not being used, removing them.
|
| Change-Id: I5bb4b7f04bae9486cf9b7960cf5ed91d0b59c8c7
| updates: #1000
| Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
* glusterd: Rebalance cli is not showing correct status after reboot (#2172) | mohit84 | 2021-02-18 | 5 | -11/+91
| The rebalance CLI does not show the correct status after a reboot.
|
| The status is wrong because the defrag object is not valid at the time the rpc connection used to show the status is created. The defrag object is not valid because, when glusterd starts, glusterd_restart_rebalance can be called almost at the same time by two different synctasks; glusterd then gets a disconnect on the rpc object and cleans up the defrag object.
|
| Solution: To keep the defrag object valid, take a reference on it before creating the defrag rpc object.
|
| Fixes: #1339
| Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
| Change-Id: Ia284015d79beaa3d703ebabb92f26870a5aaafba