| Commit message (Collapse) | Author | Age | Files | Lines |
|
* glusterd: After upgrade on release 9.1 glusterd protocol is broken
After an upgrade to release-9 the glusterd protocol is broken
because glusterd on the upgraded nodes is not able to find an
actor at the expected index in the rpc procedure table. The new proc
(GLUSTERD_MGMT_V3_POST_COMMIT) was introduced by a patch
(https://review.gluster.org/#/c/glusterfs/+/24771/) in the middle of
the table, which changed the index of the existing actors, so glusterd
fails on the newly upgraded nodes.
Solution: Move the proc (GLUSTERD_MGMT_V3_POST_COMMIT) to the last
position in the proc table to avoid the issue.
Fixes: #2351
Change-Id: I36575fd4302944336a75a8d4a305401a7128fd84
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
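The index shift described in this commit can be illustrated with a minimal sketch. It is not the glusterd source: apart from the idea of a POST_COMMIT proc, every enum member below is a hypothetical stand-in, and the helper only checks whether an existing proc keeps the number an old peer expects.

```c
/* Hypothetical procedure numbers as a node that was not upgraded knows them. */
enum old_procs { OLD_NULL, OLD_LOCK, OLD_COMMIT, OLD_UNLOCK, OLD_MAXVALUE };

/* Broken layout: the new proc sits in the middle, so UNLOCK moves from 3 to 4. */
enum broken_procs { BRK_NULL, BRK_LOCK, BRK_COMMIT, BRK_POST_COMMIT, BRK_UNLOCK, BRK_MAXVALUE };

/* Fixed layout: the new proc is appended, so existing numbers are unchanged. */
enum fixed_procs { FIX_NULL, FIX_LOCK, FIX_COMMIT, FIX_UNLOCK, FIX_POST_COMMIT, FIX_MAXVALUE };

/* Returns 1 when a peer using the old numbering would reach the same actor. */
static int unlock_index_is_compatible(int new_unlock_index)
{
    return new_unlock_index == OLD_UNLOCK;
}
```

With the appended layout, `unlock_index_is_compatible(FIX_UNLOCK)` holds, while `unlock_index_is_compatible(BRK_UNLOCK)` does not, which mirrors the mismatch between upgraded and non-upgraded nodes.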
|
* tests: avoid empty paths in environment variables
Many variables containing paths in env.rc.in are defined in a way
that leaves a trailing ':' in the variable when the previous value
was empty or undefined.
In the particular case of the 'LD_PRELOAD_PATH' variable, this causes
the system to look for dynamic libraries in the current working
directory. When this directory is inside a Gluster mount point, a
significant delay is incurred each time a program is run (and the
testing framework can run lots of programs for each test).
This patch prevents variables containing paths from ending with
a trailing ':'.
Fixes: #2348
Change-Id: I669f5a78e14f176c0a58824ba577330989d84769
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* Fix PYTHONPATH name and duplicity
Change-Id: Iaa0b092118bb86856bbe621eb03fef6fa7478971
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
Problem-1:
When an overlapping lock is issued, the merged lock is not assigned the
owner. When flush is issued on the fd, this particular lock is not freed,
leading to a memory leak.
Fix-1:
Assign the owner while merging the locks.
Problem-2:
On fd-destroy, lock structs could be present in fdctx. For some reason,
running the flock -x command and then closing the bash fd leads to this
code path, which leaks the lock structs.
Fix-2:
When fdctx is being destroyed in the client, make sure to clean up any
lock structs.
fixes: #2337
Change-Id: I298124213ce5a1cf2b1f1756d5e8a9745d9c0a1c
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
|
Issue:
`for` loop was executed only once, leading to structural
dead code in coverity
Fix:
Updated the code to use `if` condition instead of
`for` loop for the same.
CID: 1437779
Updates: #1060
Change-Id: I2ca1d2c9d2842d586161fe971bb8c7b3444dfb2b
Signed-off-by: nik-redhat <nladha@redhat.com>
|
gf_log_inject_timer_event() takes the lock log.log_buf_lock. Any call
to gf_msg made while holding that lock hangs the thread, because
_gf_msg_internal() tries to lock log.log_buf_lock again.
Use a PTHREAD_MUTEX_RECURSIVE type instead of the default type for
this mutex to fix this deadlock.
Fixes: #2330
Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
|
Fixes: #2328
From the vfs_glusterfs(8) manpage:
"The GlusterFS write-behind performance translator, when used with
Samba, could be a source of data corruption. The translator, while
processing a write call, immediately returns success but continues
writing the data to the server in the background. This can cause data
corruption when two clients relying on Samba to provide data consistency
are operating on the same file."
Guenther
Signed-off-by: Günther Deschner <gd@samba.org>
|
* cli: Increased spacing in cli for option table
Issue:
Some option names are longer than 40 characters, so the output of
`gluster vol get <volname> all` runs the option and its value
together for long option names.
Fix:
Increased the column spacing in the `gluster vol get <volname> all`
output to 50.
Fixes: #2313
Change-Id: I841730ced074547a81171a4432d15ec9c35f39cd
Signed-off-by: nik-redhat <nladha@redhat.com>
* Added separator
Change-Id: I210877c89bc468ed6a3090cd14fde7ecee1d33b6
Signed-off-by: nik-redhat <nladha@redhat.com>
* Removed separator and added space
Change-Id: Ic0eb9c9bc39a354465aabd939f72bc65be738f6c
Signed-off-by: nik-redhat <nladha@redhat.com>
|
Adding links to developer session 3 which covers xlator interface in
gluster.
updates: #2308
Change-Id: I8dc84263c19613dba665a080d8adb99cdfe677b0
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
|
Current implementation of rebalance for sparse files has a bug that,
in some cases, causes a read of 0 bytes from the source subvolume.
Posix xlator doesn't allow 0 byte reads and fails them with EINVAL,
which causes rebalance to abort the migration.
This patch implements a more robust way of finding data segments in
a sparse file that avoids 0 byte reads, allowing the file to be
migrated successfully.
Fixes: #2317
Change-Id: Iff168dda2fb0f2edf716b21eb04cc2cc8ac3915c
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
The final test doesn't test what it is meant to test. It still fails as
expected, but only because at this point `THROTTLE_LEVEL` is still set
to `garbage`.
Easily fixed by correcting the typos in the variable names, and thus
fixes https://github.com/gluster/glusterfs/issues/2315
Signed-off-by: Jamie Nguyen <j@jamielinux.com>
|
The force option fails for the snapshot create command even though
the quorum is satisfied, and the option is redundant.
This change deprecates the force option for the snapshot create command
and, instead of checking for quorum, checks that all bricks are online
before creating a snapshot.
Fixes: #2099
Change-Id: I45d866e67052fef982a60aebe8dec069e78015bd
Signed-off-by: Nishith Vihar Sakinala <nsakinal@redhat.com>
|
The code was optimized by avoiding some strlen()
calls if using DICT_LIST_IMP
fixes: #2294
Change-Id: Ic5e784edb9538feb1d1b441c8514c76ba5266832
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
|
Adding links to recordings, slides of the dev-session 1, 2
updates: #2308
Change-Id: I9e10173e2b3b0d70304fa8fa050734aba06a2c6b
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
|
1. Replace offensive variables with the values the Linux kernel uses.
2. Introduce an internal function - _list_del() that can be used
when list->next and list->prev are going to be assigned later on.
(too bad we do not have enough uses of list_move() and
list_move_tail() in the code, btw. They would also have contributed to code readability)
* list.h: defined LIST_POISON1, LIST_POISON2 similar to the Linux kernel
defines
Fixes: #2025
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
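The split between an internal unlink helper and a poisoning delete can be sketched as below. This is an illustration under assumptions, not the GlusterFS list.h: the struct layout follows the usual kernel-style list, and the two poison values are placeholders chosen for the example.

```c
#include <stddef.h>

struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* Placeholder poison values (assumed for this sketch). */
#define LIST_POISON1 ((struct list_head *)0x100)
#define LIST_POISON2 ((struct list_head *)0x122)

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add(struct list_head *item, struct list_head *head)
{
    item->prev = head;
    item->next = head->next;
    item->next->prev = item;
    head->next = item;
}

/* Internal helper: unlink only; the caller assigns next/prev afterwards. */
static void _list_del(struct list_head *old)
{
    old->prev->next = old->next;
    old->next->prev = old->prev;
}

/* Public delete: unlink, then poison so a stale use is easy to spot. */
static void list_del(struct list_head *old)
{
    _list_del(old);
    old->next = LIST_POISON1;
    old->prev = LIST_POISON2;
}

/* Move: no poisoning needed because list_add() overwrites both pointers. */
static void list_move(struct list_head *item, struct list_head *head)
{
    _list_del(item);
    list_add(item, head);
}
```

Using `_list_del()` inside `list_move()` avoids writing the poison values only to overwrite them immediately.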
|
* marker: initiate quota xattrs for empty volume
When a volume is empty, listing the quota info fails after setting limit-usage.
# gluster volume quota gv0 list
/ N/A N/A N/A N/A N/A N/A
This happens because there is no QUOTA_SIZE_KEY in the xattrs of the volume directory.
# getfattr -d -m. -e hex /data/brick2/gv0
getfattr: Removing leading '/' from absolute path names
# file: data/brick2/gv0
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.mdata=0x01000000000000000000000000603e70f6000000003b3f3c8000000000603e70f6000000003351d14000000000603e70f9000000000ff95b00
trusted.glusterfs.quota.limit-set.1=0x0000000000a00000ffffffffffffffff
trusted.glusterfs.volume-id=0xe27d61be048c4195a9e1ee349775eb59
This patch fixes it by setting QUOTA_SIZE_KEY for the empty volume directory when quota is enabled.
# gluster volume quota gv0 list
Path  Hard-limit  Soft-limit  Used    Available  Soft-limit exceeded?  Hard-limit exceeded?
-------------------------------------------------------------------------------------------
/     4.0MB       80%(3.2MB)  0Bytes  4.0MB      No                    No
Fixes: #2260
Change-Id: I6ab3e43d6ef33e5ce9531b48e62fce9e8b3fc555
Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
|
Fixing "Null pointer dereferences"
fixes: #2129
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
|
When client.strict-locks is enabled on a volume and there are POSIX
locks held on the files, do not re-open such fds after the clients
disconnect and reconnect, since re-opening them might lead to multiple
clients acquiring the locks and cause data corruption.
Change-Id: I8777ffbc2cc8d15ab57b58b72b56eb67521787c5
Fixes: #1977
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
If gluster shared-storage isn't mounted, ganesha will fail to start
Change-Id: I6ed7044ea6b6c61b013ebe17088bfde311b109b7
fixes: #2278
Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
|
Problem:
Since commit bd540db1e, eager-locking was enabled for fsync. But on
certain VM workloads with sharding enabled, shard xlator keeps sending
fsync on the base shard. This can cause blocked inodelks from other
clients (including shd) to time out due to call bail.
Fix:
Make afr fsync aware of inodelk count and not delay post-op + unlock
when inodelk count > 1, just like writev.
Code is restructured so that any fd based AFR_DATA_TRANSACTION can be made
aware by setting GLUSTERFS_INODELK_DOM_COUNT in xdata request.
Note: We do not know yet why VMs go into a paused state because of the
blocked inodelks, but this patch should be a first step in reducing the
occurrence.
Updates: #2198
Change-Id: Ib91ebdd3101d590c326e69c829cf9335003e260b
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
Include fixes suggested by ClusterHA devs.
1) It turns out that crm_attribute attrs and attrd_updater attrs really are one and the same,
despite what I was told years ago.
Attrs created with crm_attribute ... --lifetime=reboot ... or with attrd_updater are one and the
same. Per the ClusterHA devs, having an attr created with crm_attribute ... --lifetime=forever and
also creating/updating the same attr with attrd_updater is a recipe for weird things to happen that
will be difficult to debug.
2) Using hostname -s or hostname for node names in crm_attribute and attrd_updater could
potentially use the wrong name if the host has been renamed; use ocf_local_nodename() (in
ocf-shellfuncs) instead.
fixes: #2276
Change-Id: If572d396fae9206628714fb2ce00f72e94f2258f
Signed-off-by: Kaleb S KEITHLEY <kkeithle@redhat.com>
|
By default, if liburing is not present on the machine where gluster rpms are
being built, then the built rpm won't have the feature present in posix.so.
While this is obviously displayed in the ./configure's summary, it means the
feature won't work on a target machine where the rpm is installed, even if the
target has Linux kernel >=5.1 and liburing installed.
I think it is better to have a configure option `--enable-linux-io_uring` which
is on by default. That way, builds error out by default when liburing is
missing, and the builder must either run `./configure --disable-linux-io_uring`
to compile without it or install the library and headers on the build machine.
Fixes: #2063
Change-Id: Ide1daa11b3513210d12be8d2cb683a4084d41e18
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
* CID 1412333 (#1 of 1): Copy into fixed size buffer (STRING_OVERFLOW)
CID: 1412333
Description:
`path` length might overrun the 108-character fixed-size string. Added a condition to check the size of `path`.
Updates: #1060
Change-Id: I4e7c58ab3a3f6807992dfc3023c21f762bff6b32
Signed-off-by: aujjwal-redhat <aujjwal@redhat.com>
* refactored the code
Change-Id: I1eaa6fc59e43f76224f44b5f8c54495b67076651
Signed-off-by: aujjwal-redhat <aujjwal@redhat.com>
* used strncpy in place of strcpy to copy no more characters than the size of addr->sun_path
Change-Id: I9b4eeed3dd0c00d052dcaaf6b34597fbfe7fe1a2
Signed-off-by: aujjwal-redhat <aujjwal@redhat.com>
* Removed goto err as it was already going to err
Change-Id: Ib40c11537b57aea72d3095eda86bd5b541930550
Signed-off-by: aujjwal-redhat <aujjwal@redhat.com>
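A minimal sketch of the length check described above, assuming a `struct sockaddr_un` destination. The helper name `fill_unix_addr` is hypothetical, and the sketch uses an explicit size check followed by `memcpy` rather than the `strncpy` call the commit mentions, so that the copied string is always NUL terminated.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fill `addr` with `path`. Returns 0 on success, or -1 when `path`
 * (including its terminating NUL) does not fit in addr->sun_path. */
static int fill_unix_addr(struct sockaddr_un *addr, const char *path)
{
    size_t len = strlen(path);

    if (len >= sizeof(addr->sun_path))
        return -1; /* reject instead of overflowing or truncating */

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path, path, len + 1); /* copies the NUL as well */
    return 0;
}
```

Rejecting an over-long path is one possible policy; silently truncating it would bind or connect to a different socket path than the caller asked for.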
|
Problem:
On a cluster with 15 million files, when fix-layout was started, it was
not progressing at all. So we tried to do a os.walk() + os.stat() on the
backend filesystem directly. It took 2.5 days. We removed os.stat() and
re-ran it on another brick with similar data-set. It took 15 minutes. We
realized that readdirp is extremely costly compared to readdir if the
stat is not useful. fix-layout operation only needs to know that the
entry is a directory so that fix-layout operation can be triggered on
it. Most of the modern filesystems provide this information in readdir
operation. We don't need readdirp i.e. readdir+stat.
Fix:
Use readdir operation in fix-layout. Do readdir+stat/lookup for
filesystems that don't provide d_type in readdir operation.
fixes: #2241
Change-Id: I5fe2ecea25a399ad58e31a2e322caf69fc7f49eb
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
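The idea of relying on `d_type` and falling back to a stat call only when the filesystem does not report it can be sketched as follows. This is a standalone illustration using the libc directory API, not the DHT fix-layout code; `count_subdirs` is a hypothetical helper.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for d_type and the DT_* constants on glibc */
#endif
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Count the subdirectories of `path`, calling lstat() only for entries
 * whose type readdir() could not report. Returns -1 if opendir() fails. */
static int count_subdirs(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *entry;
    int count = 0;

    if (dir == NULL)
        return -1;

    while ((entry = readdir(dir)) != NULL) {
        char child[4096];
        struct stat st;
        int n;

        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;
        if (entry->d_type == DT_DIR) { /* cheap path: no stat needed */
            count++;
            continue;
        }
        if (entry->d_type != DT_UNKNOWN) /* known to be a non-directory */
            continue;

        /* Fallback for filesystems that do not fill in d_type. */
        n = snprintf(child, sizeof(child), "%s/%s", path, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof(child))
            continue;
        if (lstat(child, &st) == 0 && S_ISDIR(st.st_mode))
            count++;
    }
    closedir(dir);
    return count;
}
```

On filesystems that populate `d_type`, the loop never touches the inodes of regular files, which is where the saving described in the commit comes from.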
|
fixes: #2268
Change-Id: If00ee847e15ac7f7e5b0e12125a7d02a610b9708
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
|
Also moved options to NO_DOC
Change-Id: I86623f4139d156812e622a87655483c9d2491052
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
|
1447088 - Resource leak
1447089 - Buffer not null terminated
updates: #2216
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
|
priv->root_inode seems to be a remnant of the pump xlator and was getting
populated in the discover code path. The thin-arbiter code used it to
populate loc info, but it seems that in the case of some daemons like
quotad, the discover path for the root gfid is not hit, causing a crash.
Fix:
The root inode can be accessed via this->itable->root, so use that and
remove the priv->root_inode instances from the afr code.
Fixes: #2234
Change-Id: Iec59c157f963a4dc455652a5c85a797d00cba52a
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
lookup-optimize doesn't provide any benefit for virtualized
environments and gluster-block workloads, but it's known to cause
corruption in some cases when sharding is also enabled and the volume
is expanded or shrunk.
For this reason, we disable lookup-optimize by default on those
environments.
Fixes: #2253
Change-Id: I25861aa50b335556a995a9c33318dd3afb41bf71
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
At the moment dht rebalance doesn't give any option to disable fsync
after data migration. Making this an option lets admins take
responsibility for the data in a way that is suitable for their cluster.
The default value is still 'on', so the behavior is unchanged for people
who don't care about this.
For example: If the data that is going to be migrated is already backed
up or snapshotted, there is no need for fsync to happen right after
migration which can affect active I/O on the volume from applications.
fixes: #2258
Change-Id: I7a50b8d3a2f270d79920ef306ceb6ba6451150c4
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
|
CID: 1214629,1274235,1437648
The buffer has been null terminated thus resolving the issue
Change-Id: Ieb1d067d8dd860c55a8091dd6fbba1bcbb4dc19f
Updates: #1060
Signed-off-by: Nishith Vihar Sakinala <nsakinal@redhat.com>
|
Problem:
In dht_queue_readdir(p) 'frame' is accessed after unwind. This will lead to
undefined behavior as frame would be freed upon unwind.
Fix:
Store the variables that are needed in local variables and use them
instead.
fixes: #2239
Change-Id: I6b2e48e87c85de27fad67a12d97abd91fa27c0c1
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
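The pattern of copying what is needed before the unwind can be sketched as below. The `frame` and `local` structs and the `unwind()` function are hypothetical stand-ins for the real call frame machinery; the only point is the ordering of the reads relative to the free.

```c
#include <stdlib.h>

struct local { int queue; };            /* hypothetical per-request state */
struct frame { struct local *local; };  /* hypothetical call frame */

/* Stand-in for an unwind: the frame and its local are freed here. */
static void unwind(struct frame *frame)
{
    free(frame->local);
    free(frame);
}

/* Read everything needed from the frame first, then unwind, and use
 * only the saved copies afterwards. */
static int finish_request(struct frame *frame)
{
    int queue = frame->local->queue; /* saved while the frame is still valid */

    unwind(frame);                   /* frame must not be touched after this */
    return queue;
}
```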
|
* features/index: Optimize link-count fetching code path
Problem:
AFR requests 'link-count' in lookup to check if there are any pending
heals. Based on this information, afr will set dirent->inode to NULL in
readdirp when heals are ongoing to prevent serving bad data. Once heals
have completed, every lookup that requests the link-count xattr still
does an opendir of the xattrop directory and reads its contents, only to
find out that no healing is needed. This was not detected until this
github issue because ZFS in some cases can lead to very slow readdir()
calls. Since Glusterfs does a lot of lookups, this was slowing down
all operations and increasing the load on the system.
Code problem:
On any xattrop operation, the index xlator adds an index to the relevant
dirs and, after the xattrop operation is done, deletes or keeps the index
in that directory based on the value fetched in xattrop from posix. AFR
sends all-zero xattrops for changelog xattrs. This leads to
priv->pending_count manipulation which sets the count back to -1. The next
lookup operation then triggers opendir/readdir to find the actual
link-count because the in-memory priv->pending_count is negative.
Fix:
1) Don't add to index on all-zero xattrop for a key.
2) Set pending-count to -1 when the first gfid is added into xattrop
directory, so that the next lookup can compute the link-count.
fixes: #1764
Change-Id: I8a02c7e811a72c46d78ddb2d9d4fdc2222a444e9
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
* addressed comments
Change-Id: Ide42bb1c1237b525d168bf1a9b82eb1bdc3bc283
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
* tests: Handle base index absence
Change-Id: I3cf11a8644ccf23e01537228766f864b63c49556
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
* Addressed LOCK based comments, .t comments
Change-Id: I5f53e40820cade3a44259c1ac1a7f3c5f2f0f310
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
|
AFR may hide some existing entries from a directory when reading it
because they are generated internally for private management. However
the returned number of entries from readdir() function is not updated
accordingly. So it may return a number higher than the real entries
present in the gf_dirent list.
This may cause unexpected behavior of clients, including gfapi which
incorrectly assumes that there was an entry when the list was actually
empty.
This patch also makes the check in gfapi more robust to avoid similar
issues that could appear in the future.
Fixes: #2232
Change-Id: I81ba3699248a53ebb0ee4e6e6231a4301436f763
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
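The mismatch between the returned count and the filtered list can be illustrated with a small sketch. The `entry` struct and its `hidden` flag are hypothetical; the real code works on a gf_dirent list, but the principle of returning the number of entries that remain after filtering is the same.

```c
#include <stddef.h>
#include <string.h>

struct entry {
    const char *name;
    int hidden; /* 1 for entries used only for internal management */
};

/* Compact the visible entries to the front of the array and return how
 * many remain, so the count always matches what the caller will see. */
static size_t filter_entries(struct entry *entries, size_t count)
{
    size_t kept = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (entries[i].hidden)
            continue;
        entries[kept++] = entries[i];
    }
    return kept;
}
```

A caller that treats a non-zero return value as "at least one entry is present" is then safe even when every entry was filtered out.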
|
CID: 1444461
A lock is destroyed but might still be used later on in some code flows.
The code flow was modified to make sure the destroyed lock is not used
in any case.
Change-Id: I9610d56d9cb8a8ab7062e9094493dba9afdd0b30
updates: #1060
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
|
Fixes CID: 1124725
Updates: #1060
Change-Id: Iced092c5ad1a9445e4c758f09a481501bae7275f
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
commit 8e7bfd6a58b444b26cb50fb98870e77302f3b9eb changed the syntax for
arbiter volume creation to 'replica 2 arbiter 1', while still allowing
the old syntax of 'replica 3 arbiter 1'. But while doing so, it also
removed a conditional check, thereby allowing replica count > 3. This
patch fixes it.
Fixes: #2192
Change-Id: Ie109325adb6d78e287e658fd5f59c26ad002e2d3
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
In glusterd_svc_start:
1) synctaskA gets attach_lock and then releases big_lock to execute runner_run.
2) synctaskB then gets big_lock but cannot get attach_lock, so it waits.
3) After executing runner_run, synctaskA tries to get big_lock, but synctaskB holds it, so synctaskA waits.
This leads to a deadlock.
This patch uses runner_run_nowait to avoid the deadlock.
fixes: #2117
Signed-off-by: Zhang Xianwei <zhang.xianwei8@zte.com.cn>
|
Wrong function name was mentioned in API doc
for `glfs_get_volfile`.
Change-Id: Id2251837f53270f1f03b8a5501ea335b7995873b
Updates: #1000
Signed-off-by: Aravinda Vishwanathapura <aravinda@kadalu.io>
|
* dict: avoid hash calculation when hash_size=1 (link list imp)
Currently dict_t is always constructed with dict::hash_size = 1.
With this initialization the dict is implemented as a linked list,
and searching for a key is done by iterating and comparing keys.
Therefore we can avoid the hash calculation done on each set()/get() in this case.
Fixes: #2013
Change-Id: Id93286a8036064d43142bc2b2f8d5a3be4e97fc4
Signed-off-by: Tamar Shacked <tshacked@redhat.com>
* dict: avoid hash calculation when hash_size=1 (list imp)
Currently dict_t is always constructed with dict::hash_size = 1.
With this initialization the dict is implemented as a linked list,
and searching for a key is done by iterating and comparing keys.
Therefore we can avoid the hash calculation done on each set()/get() in this case.
Fix:
Use a new macro to delimit and skip the blocks related to the hash implementation.
Fixes: #2013
Change-Id: I31180b434a6e9e7bbb456c7ad888c147c4ce3308
Signed-off-by: Tamar Shacked <tshacked@redhat.com>
|
Fixes coverity issues 1447029 and 1447028.
Updates: #2161
Change-Id: I6a564231d6aeb76de20675b7ced5d45eed8c377f
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
Change-Id: I97e73c0aae74fc5d80c975f56f2f7a64e3e1ae95
Updates: #2169
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
* cluster/afr: Fix race in lockinfo (f)getxattr
A shared dictionary was updated outside the lock after having updated
the number of remaining answers. This means that one thread may be
processing the last answer and unwinding the request before another
thread completes updating the dict.
    Thread 1                    Thread 2

    LOCK()
    call_cnt-- (=1)
    UNLOCK()
                                LOCK()
                                call_cnt-- (=0)
                                UNLOCK()
                                update_dict(dict)
                                if (call_cnt == 0) {
                                    STACK_UNWIND(dict);
                                }
    update_dict(dict)
    if (call_cnt == 0) {
        STACK_UNWIND(dict);
    }
The updates from thread 1 are lost.
This patch also reduces the work done inside the locked region and
reduces code duplication.
Fixes: #2161
Change-Id: Idc0d34ab19ea6031de0641f7b05c624d90fac8fa
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
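One way to close the window shown in the diagram is to update the shared state and decrement the counter in the same critical section, and to decide about unwinding from a value captured under the lock. The sketch below assumes POSIX threads and uses hypothetical names; an integer sum stands in for the shared dictionary, and it is not the actual AFR callback.

```c
#include <pthread.h>

struct request {
    pthread_mutex_t lock;
    int call_cnt; /* answers still expected */
    int dict;     /* stand-in for the shared dict: sum of all answers */
    int unwound;  /* set once the request has been completed */
    int result;   /* value handed to the caller on completion */
};

/* Called once per answer, possibly from different threads. */
static void on_answer(struct request *req, int value)
{
    int last;

    pthread_mutex_lock(&req->lock);
    req->dict += value;            /* shared update happens under the lock */
    last = (--req->call_cnt == 0); /* decided while still holding the lock */
    pthread_mutex_unlock(&req->lock);

    if (last) {
        /* Every other answer finished its update before its decrement,
         * so the shared state is complete here. */
        req->result = req->dict;
        req->unwound = 1;
    }
}
```

Only the thread that observes the counter reaching zero completes the request, so no thread reads `call_cnt` outside the lock.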
|
When parallel-readdir is enabled, readdir(p) requests sent by DHT can be
immediately processed and answered in the same thread before the call to
STACK_WIND_COOKIE() completes.
This means that the readdir(p) cbk is processed synchronously. In some
cases it may decide to send another readdir(p) request, which causes a
recursive call.
When some special conditions happen and the directories are big, it's
possible that the number of nested calls is so high that the process
crashes because of a stack overflow.
This patch fixes this by not allowing nested readdir(p) calls. When a
nested call is detected, it's queued instead of sending it. The queued
request is processed when the current call finishes by the top level
stack function.
Fixes: #2169
Change-Id: Id763a8a51fb3c3314588ec7c162f649babf33099
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
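The queue-instead-of-recurse idea can be sketched with a re-entrancy flag and a pending counter. All names below are hypothetical, and the synchronous callback is simulated by calling `request_done()` directly from `send_request()`; the depth bookkeeping exists only to show that the stack no longer grows.

```c
struct readdir_ctx {
    int active;    /* 1 while the top-level caller is sending requests */
    int queued;    /* nested requests waiting to be sent */
    int remaining; /* follow-up requests the callback still wants */
    int sent;      /* total requests sent */
    int depth;     /* current nesting depth of send_request() */
    int max_depth; /* deepest nesting observed */
};

static void issue_request(struct readdir_ctx *ctx);

/* Simulated callback: may ask for another request right away. */
static void request_done(struct readdir_ctx *ctx)
{
    if (ctx->remaining > 0) {
        ctx->remaining--;
        issue_request(ctx);
    }
}

/* Simulated wind that is answered synchronously in the same thread. */
static void send_request(struct readdir_ctx *ctx)
{
    ctx->sent++;
    ctx->depth++;
    if (ctx->depth > ctx->max_depth)
        ctx->max_depth = ctx->depth;
    request_done(ctx);
    ctx->depth--;
}

static void issue_request(struct readdir_ctx *ctx)
{
    if (ctx->active) {  /* nested call: queue it instead of recursing */
        ctx->queued++;
        return;
    }
    ctx->active = 1;
    send_request(ctx);
    while (ctx->queued > 0) { /* the top-level caller drains the queue */
        ctx->queued--;
        send_request(ctx);
    }
    ctx->active = 0;
}
```

However many follow-up requests the callback asks for, `send_request()` is only ever entered from the top-level loop, so the nesting depth stays at one.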
|
Commit c878174 introduced a check to avoid a stale layout
issue. To avoid a stale layout, dht sets a key along with
the layout when winding a create fop, and posix validates
the parent layout based on the key value. If the layout does
not match, it throws an error. When a volume is shrunk, the
layout is changed by the rebalance daemon, and if the layout
does not match, dht is not able to wind a create fop
successfully.
Solution: To avoid the issue, populate the key only when
dht winds the fop for the first time. After getting an
error, dht takes a lock in the second attempt and then
winds the fop again.
Fixes: #2187
Change-Id: Ie018386e7823a11eea415496bb226ca032453a55
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
The test case (tests/basic/glusterd-restart-shd-mux.t) was
introduced as part of the shd mux feature, but we observed that the
feature is not stable and we already plan to revert it.
For the time being I am moving the test case to flaky to
avoid frequent regression failures.
Fixes: #2190
Change-Id: I4a06a5d9212fb952a864d0f26db8323690978bfc
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
* fuse: add an option to specify the mount display name
There are two things this PR is fixing.
1. When a mount is specified with the volfile (-f) option, today you can't tell that it is from glusterfs because only the volfile is added as 'fsname', so we add it as 'glusterfs:/<volname>'.
2. Provide an option for admins who want to show a mount source other than the default (useful when one is not using 'mount.glusterfs' but their own scripts).
Updates: #1000
Change-Id: I19e78f309a33807dc5f1d1608a300d93c9996a2f
Signed-off-by: Amar Tumballi <amar@kadalu.io>
|
When a wrong volume name that doesn't exist is passed, the process gets stuck.
The errno is initialized to 0 in glusterd-handshake.c. After the errno is
initialized, the process blocks in gf_fuse_umount.
|
Commit 61ae58e67567ea4de8f8efc6b70a9b1f8e0f1bea
introduced a coverity bug: an object is used after it has
been cleaned up.
Clean up the memory after leaving the critical section.
Fixes: #2180
Change-Id: Iee2050c4883a0dd44b8523bb822b664462ab6041
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
A couple of methods are not being used, removing them.
Change-Id: I5bb4b7f04bae9486cf9b7960cf5ed91d0b59c8c7
updates: #1000
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
|
The rebalance CLI does not show the correct status after a reboot.
The status is wrong because the defrag object is not valid at the
time the rpc connection used to show the status is created.
The defrag object is not valid because, when glusterd starts,
glusterd_restart_rebalance can be called at almost the same time by two
different synctasks; glusterd then gets a disconnect on the rpc object
and cleans up the defrag object.
Solution: To keep the defrag object from being cleaned up, take a
reference on it before creating the defrag rpc object.
Fixes: #1339
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Change-Id: Ia284015d79beaa3d703ebabb92f26870a5aaafba
|