| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: dict_get_with_ref throws the message "dict or key is NULL"
if dict or key is NULL.
Solution: Before accessing a key, check that the dictionary is valid.
> Fixes: #1909
> Change-Id: I50911679142b52f854baf20c187962a2a3698f2d
> Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
> Cherry picked from commit de1b26d68e31b029a59e59a47b51a7e3e6fbfe22
> Reviewed on upstream link https://github.com/gluster/glusterfs/pull/1910
Fixes: #1909
Change-Id: I50911679142b52f854baf20c187962a2a3698f2d
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* cluster/afr: Fix race in lockinfo (f)getxattr
A shared dictionary was updated outside the lock after having updated
the number of remaining answers. This means that one thread may be
processing the last answer and unwinding the request before another
thread completes updating the dict.
    Thread 1                      Thread 2

    LOCK()
    call_cnt-- (=1)
    UNLOCK()
                                  LOCK()
                                  call_cnt-- (=0)
                                  UNLOCK()
                                  update_dict(dict)
                                  if (call_cnt == 0) {
                                      STACK_UNWIND(dict);
                                  }
    update_dict(dict)
    if (call_cnt == 0) {
        STACK_UNWIND(dict);
    }
The updates from thread 1 are lost.
This patch also reduces the work done inside the locked region and
reduces code duplication.
Fixes: #2161
Change-Id: Idc0d34ab19ea6031de0641f7b05c624d90fac8fa
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
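The ordering problem described in this commit can be sketched in Python. This is an illustration only, not the AFR code: `Request`, `answer`, `run` and `unwound_with` are hypothetical names, and this is just one way to close such a race, not necessarily what the patch does. Each thread prepares its update outside the lock, then merges it into the shared dict and decrements the counter inside the same locked region, so the thread that sees zero always unwinds with a complete dict.

```python
import threading


class Request:
    """Hypothetical stand-in for a fop waiting on several child answers."""

    def __init__(self, expected_answers):
        self.lock = threading.Lock()
        self.call_cnt = expected_answers
        self.shared = {}          # plays the role of the shared dict
        self.unwound_with = None  # set once, by the thread that finishes last

    def answer(self, key, value):
        # Prepare the update outside the lock to keep the locked region small.
        update = {key: value}
        with self.lock:
            # Merge first, then decrement, both under the same lock.
            self.shared.update(update)
            self.call_cnt -= 1
            last = self.call_cnt == 0
        if last:
            # Only the last thread gets here, after every update was merged.
            self.unwound_with = dict(self.shared)


def run(n):
    """Simulate n children answering concurrently; return the unwound dict."""
    req = Request(n)
    threads = [
        threading.Thread(target=req.answer, args=(f"child-{i}", i))
        for i in range(n)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return req.unwound_with
```

Because the merge happens before the decrement inside one critical section, no thread can observe `call_cnt == 0` while another update is still pending.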
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lookup-optimize doesn't provide any benefit for virtualized
environments and gluster-block workloads, but it's known to cause
corruption in some cases when sharding is also enabled and the volume
is expanded or shrunk.
For this reason, we disable lookup-optimize by default on those
environments.
Fixes: #2253
Change-Id: I25861aa50b335556a995a9c33318dd3afb41bf71
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AFR may hide some existing entries from a directory when reading it
because they are generated internally for private management. However
the returned number of entries from readdir() function is not updated
accordingly. So it may return a number higher than the real entries
present in the gf_dirent list.
This may cause unexpected behavior of clients, including gfapi which
incorrectly assumes that there was an entry when the list was actually
empty.
This patch also makes the check in gfapi more robust to avoid similar
issues that could appear in the future.
Fixes: #2232
Change-Id: I81ba3699248a53ebb0ee4e6e6231a4301436f763
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
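A minimal Python sketch of the idea in this commit, under stated assumptions: `INTERNAL_NAMES`, `filter_entries` and `has_entries` are hypothetical names, and the real code works on a gf_dirent list in C. It shows the count being adjusted together with the list, plus a caller-side check that does not rely on the count alone.

```python
INTERNAL_NAMES = {".internal-entry"}  # hypothetical private entry names


def filter_entries(entries, reported_count):
    """Drop hidden entries and return (visible_entries, adjusted_count)."""
    visible = [name for name in entries if name not in INTERNAL_NAMES]
    hidden = len(entries) - len(visible)
    # Keep the returned count in step with what is actually in the list.
    return visible, reported_count - hidden


def has_entries(entries, count):
    """Robust caller-side check: trust the list, not only the count."""
    return count > 0 and len(entries) > 0
```

With this shape, a directory that contains only hidden entries yields an empty list and a count of 0, and a caller using `has_entries` stays correct even if a lower layer reports a stale count.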
|
|
|
|
|
|
|
|
|
|
| |
Added
* provided option to enable/disable storage.linux-io_uring during compilation
* Healing data in 1MB chunks instead of 128KB for improving healing performance
Updates: #2301
Change-Id: Iae49287cca00681426b4ecac85f1122912492ed5
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Since commit bd540db1e, eager-locking was enabled for fsync. But on
certain VM workloads with sharding enabled, the shard xlator keeps sending
fsync on the base shard. This can cause blocked inodelks from other
clients (including shd) to time out due to call bail.
Fix:
Make afr fsync aware of inodelk count and not delay post-op + unlock
when inodelk count > 1, just like writev.
Code is restructured so that any fd based AFR_DATA_TRANSACTION can be made
aware by setting GLUSTERFS_INODELK_DOM_COUNT in xdata request.
Note: We do not know yet why VMs go into a paused state because of the
blocked inodelks, but this patch should be a first step in reducing the
occurrence.
Updates: #2198
Change-Id: Ib91ebdd3101d590c326e69c829cf9335003e260b
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By default, if liburing is not present on the machine where gluster rpms are
being built, then the built rpm won't have the feature present in posix.so.
While this is obviously displayed in the ./configure's summary, it means the
feature won't work on a target machine where the rpm is installed, even if the
target has Linux kernel >=5.1 and liburing installed.
I think it is better to have a configure option `--enable-linux-io_uring` which
is on by default. That way, the build machines will error out by default and
will need to `./configure --disable-linux-io_uring` to compile or install the
library and headers on the build machine.
Fixes: #2063
Change-Id: Ide1daa11b3513210d12be8d2cb683a4084d41e18
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
commit 8e7bfd6a58b444b26cb50fb98870e77302f3b9eb changed the syntax for
arbiter volume creation to 'replica 2 arbiter 1', while still allowing
the old syntax of 'replica 3 arbiter 1'. But while doing so, it also
removed a conditional check, thereby allowing replica count > 3. This
patch fixes it.
Updates: #2192
Change-Id: Ie109325adb6d78e287e658fd5f59c26ad002e2d3
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
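The restored check can be illustrated with a small Python sketch. It is a simplification under assumptions: `validate_arbiter_replica` is a hypothetical name, the real validation lives in the CLI parser, and only the two syntaxes named in this commit message are modelled.

```python
def validate_arbiter_replica(replica_count):
    """For a volume created with 'arbiter 1', return None if the replica
    count is accepted, otherwise an error string."""
    # 'replica 2 arbiter 1' is the new syntax, 'replica 3 arbiter 1' the old one.
    if replica_count in (2, 3):
        return None
    return "arbiter volumes accept only 'replica 2' or 'replica 3'"
```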
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
priv->root_inode seems to be a remnant of the pump xlator and was getting
populated in the discover code path. The thin-arbiter code used it to populate
loc info, but it seems that in case of some daemons like quotad, the
discover path for root gfid is not hit, causing it to crash.
Fix:
The root inode can be accessed via this->itable->root, so use that and
remove priv->root_inode instances from the afr code.
Fixes: #2234
Change-Id: Iec59c157f963a4dc455652a5c85a797d00cba52a
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Issue:
Mounting the shared storage volume was failing in an IPv6 environment if the hostnames were FQDNs.
The brick name for the volume was being cut off and, as a result, volume creation was failing.
Change-Id: Ib38993724c709b35b603f9ac666630c50c932c3e
Updates: #1406
Signed-off-by: nik-redhat <nladha@redhat.com>
|
|
|
|
|
|
|
|
|
| |
After glibc 2.32, lchmod() returns EOPNOTSUPP instead of ENOSYS when
called on symlinks. The man page says that the returned code is ENOTSUP.
They are the same on Linux, but this patch correctly handles all errors.
Fixes: #2154
Change-Id: Ib3bb3d86d421cba3d7ec8d66b6beb131ef6e0925
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
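The error handling described in this commit can be sketched in Python with the standard `errno` module. The helper name `is_unsupported` is hypothetical, and the actual patch is C code in the posix xlator. The point is to treat every "not supported" code the same way rather than testing for a single value.

```python
import errno

# Errors that all mean "this operation is not available here".
# ENOTSUP and EOPNOTSUPP share one value on Linux but may differ elsewhere.
UNSUPPORTED_ERRNOS = {errno.ENOSYS, errno.ENOTSUP, errno.EOPNOTSUPP}


def is_unsupported(err):
    """True when an OSError says the call is not supported."""
    return err.errno in UNSUPPORTED_ERRNOS
```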
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current block size used for self-heal by default is 128 KiB. This
requires a significant amount of management requests for a very small
portion of data healed.
With this patch the block size is increased to 4 MiB. For a standard
EC volume configuration of 4+2, this means that each healed block of
a file will update 1 MiB on each brick.
Change-Id: Ifeec4a2d54988017d038085720513c121b03445b
Updates: #2067
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
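The arithmetic behind the 4+2 example in this commit, as a short Python sketch (`per_brick_bytes` is a hypothetical helper used only for illustration): a healed block is split across the data bricks, so each brick receives the block size divided by the number of data bricks.

```python
KIB = 1024
MIB = 1024 * KIB


def per_brick_bytes(heal_block_size, data_bricks):
    """Bytes updated on each brick per healed block when the block is
    split evenly across the data bricks of an EC volume."""
    return heal_block_size // data_bricks
```

For a 4+2 volume, `per_brick_bytes(4 * MIB, 4)` gives 1 MiB per brick, while the previous 128 KiB default gives only 32 KiB per brick for the same number of management requests.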
|
|
|
|
|
|
|
|
|
|
|
| |
DHT was passing NULL for xdata in fgetxattr() request, ignoring any
data sent by upper xlators.
This patch fixes the issue by sending the received xdata to lower
xlators, as it's currently done for getxattr().
Fixes: #1991
Change-Id: If3d3f1f2ce6215f3b1acc46480e133cb4294eaec
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
| |
At the moment self-heal-window-size is 128KB. This leads to healing data
in 128KB chunks. With the growth of data and the avg file sizes
nowadays, 1MB seems like a better default.
Change-Id: I70c42c83b16c7adb53d6b5762969e878477efb5c
Fixes: #2067
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
The fix-layout operation assumes that the path passed is a directory, i.e.
layout->cnt == conf->subvolume_cnt. This leads to a crash when
fix-layout is attempted on a file.
Fix:
Disallow fix-layout on files
fixes: #2107
Change-Id: I2116b8773059f67e3260e9207e20eab3de711417
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
|
|
|
|
|
|
|
|
|
|
| |
Added
* io_uring requires kernel support
* 5k volume was tested on 3 nodes with brick mux enabled
Updates: #1868
Change-Id: Ib76548398ca6099f5c7c68a091aa1a4fcb5de536
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
|
|
|
|
|
|
|
|
| |
LTO isn't added to the build when it is
configured with "--enable-debug"
Fixes: #1772
Change-Id: I87300d950871bdda6542d9bbfb6bdffd500585cc
Signed-off-by: Tamar Shacked <tshacked@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* doc: Added release 9.0 notes
Updates: #1868
Change-Id: I8a49ab7ccfd45bd3b469ce34b6625b99cbfbb329
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
* doc: Added release 9.0 notes
Updated the highlights and features section according to the
review comments.
Updates: #1868
Change-Id: I3fa89e9f186e15a074cf13a14e711e572df10886
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
* doc: Added release 9.0 notes
Updated the dates of maintenance update.
Updates: #1868
Change-Id: I4049e019312d951eb68f7b68914b265bfe74e2e4
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
In non distribute volumes (plain replicate, ec), DHT uses pass-through
FOPs (dht_pt_getxattr) instead of the usual FOPS (dht_getxattr). The
pass through FOP was not handling the DHT_SUBVOL_STATUS_KEY virtual
xattr because of which geo-rep session was going into a faulty state.
Fixing it now.
updates: #1925
Change-Id: I766b5b5c047c954a9957ab78aca680eedef1ff1f
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
* configure: enabling LTO with gcc 10 or above
Adding LTO to the build when gcc_version >= 10
Applicable for source and RPMs build
To disable: ./configure --disable-lto
Fixes: #1772
Change-Id: Ia50210af2e88a5cc188c47b4e61a66397e179257
Signed-off-by: Tamar Shacked <tshacked@redhat.com>
|
|
|
|
|
|
| |
updates: #1868
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During the glusterd handshake, glusterd receives a volume dictionary
from the peer and compares it with its own volume dictionary data. If the
options differ, it sets a key to record that the volume options have changed
and calls the import synctask to delete/start the volume. In a brick_mux
environment with a high number of volumes (5k), the dict API calls in
glusterd_compare_friend_volume take a long time because the function
glusterd_handle_friend_req saves all peer volume data in a single dictionary.
Because of the time taken by the function glusterd_handle_friend, RPC requests
receive a call_bail from the peer, and gluster (CLI) is not able to show the
volume status.
Solution: To optimize the code, the following changes were made:
1) Populate a new, dedicated dictionary with the peer's version-specific
data so that the function does not take much time to decide whether the
peer has volume updates.
2) If a volume has a different version, set the key in status_arr instead
of saving it in a dictionary, to make the operation faster.
Note: The changes were validated with the following procedure:
1) Set up 5100 distributed volumes 3x1
2) Enable brick_mux
3) Start all the volumes
4) Kill all gluster processes on the 3rd node
5) Run a loop to update a volume option on the 1st node
for i in {1..5100}; do gluster v set vol$i performance.open-behind off; done
6) Start the glusterd process on the 3rd node
7) Wait for the handshake to finish and check that there is no call_bail
message in the logs
Change-Id: Ibad7c23988539cc369ecc39dea2ea6985470bee1
Fixes: #1613
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
| |
These were the only offensive language occurrences in the code (.c) after
making the changes for geo-rep (which is tracked in issue 1415).
Change-Id: I21cd558fdcf8098e988617991bd3673ef86e120d
Updates: #1000
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
| |
Reason: from RHEL 8.3, tar is not bundled by default
Fixes: #1849
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
Change-Id: Ic1424e0550cef6a78e3e9e7b42665ab01016436f
|
|
|
|
|
|
|
| |
When compiling GlusterFS without a git repository, a git error causes make to fail.
Avoid executing git commands when there is no git repository.
Fixes: #1855
Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
entry-self-heal-anon-dir-off.t was failing occasionally because
afr_gfid_split_brain_source() returned -1 instead of -EIO for
split-brains, causing the code to proceed to afr_lookup_done(), which
in turn succeeded the lookup if there was a parallel client side heal
going on.
Fix:
Return -EIO instead of -1 so that the lookup fails.
Also, afr_selfheal_name() was using the same dict to get and set values. This
could be problematic if the caller passed local->xdata_req, since
setting a response in a request dict can lead to bugs. So changed it to use
separate request and response dicts.
Fixes: #1739
Credits Pranith Karampuri <pranith.karampuri@phonepe.com>
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Change-Id: I5cb4c547fb25e6bfc8bec1740f7eb64e1a5ad443
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DHT/Rebalance - Ensure Rebalance reports status only once upon stopping
Upon issuing the rebalance stop command, the status of rebalance is
logged twice to the log file, which can sometimes result in
inconsistent reports (one report states status stopped, while the other
may report something else).
This fix ensures rebalance reports its status only once and that the
correct status is reported.
fixes: #1782
Change-Id: Id3206edfad33b3db60e9df8e95a519928dc7cb37
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
The ret value update was missed in the function posix_get_gfid2path if
GF_MALLOC fails.
Solution: Update the ret value to -1 if GF_MALLOC fails.
Fixes: #1836
Change-Id: I510ebf0605ee49b84ff3570948771319f283b10e
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* core: tcmu-runner process continuous growing logs lru_size showing -1
When inode_table_prune is called, it checks whether the current lru_size
is greater than lru_limit; if it is but lru_list is empty, it logs the message
"Empty inode lru list found but with (%d) lru_size". From reading the code,
it seems lru_size is out of sync with the actual number of inodes in
lru_list. Because the error message is logged continuously, the entire disk
fills up and the user has to restart the tcmu-runner process to use
the volumes. The log message was introduced by the patch
https://review.gluster.org/#/c/glusterfs/+/15087/.
Solution: Introduce a flag in_lru_list to track whether an inode is
part of lru_list or not.
Fixes: #1775
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Change-Id: I4b836bebf4b5db65fbf88ff41c6c88f4a7ac55c1
* core: tcmu-runner process continuous growing logs lru_size showing -1
Update the in_lru_list flag only when modifying lru_size
Fixes: #1775
Change-Id: I3bea1c6e748b4f50437999bae59edeb3d7677f47
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
* core: tcmu-runner process continuous growing logs lru_size showing -1
Resolve comments in inode_table_destroy and inode_table_prune
Fixes: #1775
Change-Id: I5aa4d8c254f0fe374daa5ec604f643dea8dd56ff
Signed-off-by: Mohit Agrawal moagrawa@redhat.com
* core: tcmu-runner process continuous growing logs lru_size showing -1
Update in_lru_list only when updating lru_size
Fixes: #1775
Change-Id: I950eb1f0010c3d4bcc44a33225a502d2291d1a83
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
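The in_lru_list idea from these commits can be sketched in Python. This is a simplified model, not the libglusterfs inode table: `Inode`, `InodeTable`, `lru_add` and `lru_remove` are hypothetical names. The flag and the counter change together, so repeated adds or removes of the same inode cannot push lru_size out of sync with the list.

```python
class Inode:
    def __init__(self, name):
        self.name = name
        self.in_lru_list = False  # tracks membership in the table's lru list


class InodeTable:
    def __init__(self):
        self.lru = []
        self.lru_size = 0

    def lru_add(self, inode):
        # Touch lru_size only when the membership flag actually changes.
        if inode.in_lru_list:
            return
        self.lru.append(inode)
        inode.in_lru_list = True
        self.lru_size += 1

    def lru_remove(self, inode):
        if not inode.in_lru_list:
            return
        self.lru.remove(inode)
        inode.in_lru_list = False
        self.lru_size -= 1
```

With this guard, removing an inode that is not on the list is a no-op, so the counter never reaches -1.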
|
|
|
|
| |
Change-Id: I65d488674763160b06d8f248ff74ea4d144ecf8b
Signed-off-by: Shwetha K Acharya <sacharya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#1814)
Comments and idea proposed by: Xavi Hernandez(jahernan@redhat.com):
On production systems sometimes we see a log message saying that an assertion
has failed. But it's hard to track why it failed without additional information
(on debug builds, a GF_ASSERT() generates a core dump and kills the process,
so it can be used to debug the issue, but many times we are only able to
reproduce assertion failures on production systems, where GF_ASSERT() only logs
a message and continues).
In other cases we may have a core dump caused by a bug, but the core dump doesn't
necessarily happen when the bug has happened. Sometimes the crash happens so much
later that the causes that triggered the bug are lost. In these cases we can add
more assertions to the places that touch the potential candidates to cause the bug,
but the only thing we'll get is a log message, which may not be enough.
One solution would be to always generate a core dump in case of assertion failure,
but this was already discussed and it was decided that it was too drastic. If a
core dump was really needed, a new macro was created to do so: GF_ABORT(),
but GF_ASSERT() would continue to not kill the process on production systems.
I'm proposing to modify GF_ASSERT() on production builds so that it conditionally
triggers a signal when a debugger is attached. When this happens, the debugger
will generate a core dump and continue the process as if nothing had happened.
If there's no debugger attached, GF_ASSERT() will behave as always.
The idea I have is to use SIGCONT to do that. This signal is harmless, so we can
unmask it (we currently mask all unneeded signals) and raise it inside a GF_ASSERT()
when some global variable is set to true.
To produce the core dump, run the script extras/debug/gfcore.py in another
terminal. gdb breaks and produces a core dump when GF_ASSERT is hit.
The script is copied from #1810 and was written by Xavi Hernandez(jahernan@redhat.com)
Fixes: #1810
Change-Id: I6566ca2cae15501d8835c36f56be4c6950cb2a53
Signed-off-by: Vinayakswami Hariharmath <vharihar@redhat.com>
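The mechanism proposed in this commit message can be illustrated with a POSIX-only Python sketch. All names here (`DEBUG`, `soft_assert`) are hypothetical, and the real change is a C macro. A failed assertion is logged and execution continues; when a global switch is on, the process also sends itself SIGCONT, which does nothing to a running process by default but gives an attached debugger a point to stop at.

```python
import logging
import os
import signal

# Hypothetical global switch, e.g. flipped by a debugger script.
DEBUG = {"signal_on_assert": False}


def soft_assert(condition, message="assertion failed"):
    """Log a failed assertion and keep running, like a production build."""
    if condition:
        return True
    logging.error(message)
    if DEBUG["signal_on_assert"]:
        # SIGCONT's default action on a running process is to do nothing,
        # but an attached debugger can catch it and dump a core.
        os.kill(os.getpid(), signal.SIGCONT)
    return False
```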
|
|
|
|
|
|
|
|
|
|
|
| |
As detailed in the GitHub issue, `gluster volume set Svolname ganesha.enable on`
is currently broken due to a minor typo in the commit e081ac683b6a5bda548913.
Fixing it now.
Updates: #1778
Change-Id: I99276fedc43f40e8a439e545bd2b8d1698aa03ee
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Strahil Nikolov <hunter86_bg@yahoo.com>
|
|
|
|
|
|
|
|
|
| |
Call posix_io_uring_fini only if it was initialized to begin with.
Fixes: #1794
Reported-by: Mohit Agrawal <moagrawa@redhat.com>
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Change-Id: I0e840b6b1d1f26b104b30c8c4b88c14ce4aaac0d
|
|
|
|
|
|
|
|
|
| |
afr_is_lock_mode_mandatory logs a warning message when xdata
is not valid. To avoid the message, call the function only when xdata
is valid.
Fixes: #1796
Change-Id: I32d37960ea4e936ba87e65811c1792a2f1158c0d
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issue:
iobref was not freed before exiting the function
if all the checks were OK, which caused the resource
leak.
Fix:
Modified the code a bit to avoid use of an extra reference
to the label, and to free the iobref and iobuf if not NULL,
and then exit the function.
CID: 1430118
Updates: #1060
|
|
|
|
|
|
|
|
|
|
| |
better whitespace in regex
This has worked for years, but somehow no longer works on rhel8
Updates: #1000
Change-Id: I2c1a3537573d125608334772ba1a263c55407dd4
Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently the posix xlator spawns posix_disk_space_threads per brick. In a
brick_mux environment where glusterd attaches the maximum number of bricks
(250) to a single brick process, 250 threads are spawned for all bricks,
and the brick process memory size also increases.
Solution: Attach a posix_disk_space thread to glusterfs_ctx so that a
thread is spawned per process instead of per brick.
Fixes: #1482
Change-Id: I8dd88f252a950495b71742e2a7588bd5bb019ec7
Signed-off-by: Mohit Agrawal moagrawa@redhat.com
|
|
|
|
|
|
|
|
| |
Add more sections to get more information from the user and avoid back-and-forth queries.
This will help the community provide faster resolution.
Change-Id: I88be0214cea7cfa979bedb4aab7c312e0ff8d5f3
Updates: #1743
|
|
|
|
|
|
| |
fixes: #1302
Change-Id: If0e21f016155276a953c64a8dd13ff3eb281d09d
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* tests: Fix issues in CentOS 8
Due to some configuration changes in CentOS 8/RHEL 8, ssl-ciphers.t
and bug-1053579.t were failing.
The first one was failing because TLS v1.0 is disabled by default. The
test has been updated to check that at least one of TLS v1.0, v1.1 or
v1.2 succeeds.
For the second case, the issue is that the test assumed that the
latest added group to a user should always be listed last, but
this is not always true because nsswitch.conf now uses 'sss' before
'files', which means that data comes from a db that might not be
sorted.
Updates: #1009
Change-Id: I4ca01a099854ec25926c3d76b3a98072175bab06
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
* tests: Fix TLS version detection
The old test didn't correctly determine which version of TLS should
be allowed by openssl.
Change-Id: Ic081c329d5ed1842fa9f5fd23742ae007738aec0
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit f5e1eb87d4af44be3b317b7f99ab88f89c2f0b1a meant to enable the
volume option only for replica volumes but inadvertently enabled
it for all volume types. Fixing it now.
Also found a bug in glusterd where disabling the option on plain
distribute was succeeding even though setting it in the first place
fails. Fixed that too.
Fixes: #1483
Change-Id: Icb6c169a8eec44cc4fb4dd636405d3b3485e91b4
Reported-by: Sheetal Pamecha <spamecha@redhat.com>
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Added an error message in CLI when there are volumes present in cluster
but timeout happens on fetching them.
This PR fixes #1738
Signed-off-by: nik-redhat <nladha@redhat.com>
|
|
|
|
|
|
|
|
|
| |
-Removed the occurrences of 'master' in api.
-Some changes threw up clang-format errors, so fixed them as well.
-Renamed api/src/{glfs-master.c => glfs-primary.c}
Fixes: #1733
Change-Id: I57aea9d93e219305e87985bc2f81ac47cdebb72f
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
If a file has huge xattrs on the backend, the brick process crashes
when the alloca(size) allocation crosses 256k, because the iot_worker
stack size is 256k.
Use MALLOC to allocate memory instead of using alloca.
Fixes: #1699
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Change-Id: I100468234f83329a7d65b43cbe4e10450c1ccecd
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch helps generate an appropriate error message
when gfapi tries to write data of size equal to or
greater than 1 GB, due to a limitation at the
socket layer.
fixes: #1518
Change-Id: I1234a0b5a6e675a0b20c6b1afe0f4390fd721f6f
Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pass-through option was implemented for all performance xlators
in commit 549b547, but write-behind was missed.
This patch implements the functionality for write-behind.
Given that it's not safe to enable or disable this option while the
volume is mounted, it cannot be reconfigured online. A change will
only take effect after a remount.
Fixes: #1565
Change-Id: I189a48e0044b292e1d6c3b77751ff25045531883
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
core: change xlator_t->ctx->master to xlator_t->ctx->primary
afr: just changed comments.
meta: change .meta/master to .meta/primary. Might break scripts.
changelog: variable/function name changes only.
These are unrelated to geo-rep.
Fixes: #1713
Change-Id: I58eb5fcd75d65fc8269633acc41313503dccf5ff
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issue: It seems that the initialization of the rpc to
connect with quotad is done in every glusterfs cli command,
irrespective of whether the quota feature is enabled or disabled.
This seems to be overkill.
Code change: The presence of the file /var/run/quotad/quotad.pid
signals that quotad is enabled. Hence we can add a conditional
check for whether this file exists and, if it doesn't, we
just skip the initialization of the global quotad rpc.
This reduces the extra rpc calls and operations
being performed in the kernel space.
Fixes: #1577
Change-Id: Icb69d35330f76ce95626f59af75a12726eb620ff
Signed-off-by: srijan-sivakumar <ssivakumar@redhat.com>
Co-authored-by: srijan-sivakumar <ssivakumar@redhat.com>
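The conditional check described in this commit can be sketched in Python. `maybe_init_quotad_rpc` and its `init_rpc` callback are hypothetical names, and the real change is in the C CLI code. Only the pid file path is taken from the commit message.

```python
import os

QUOTAD_PIDFILE = "/var/run/quotad/quotad.pid"  # path from the commit message


def maybe_init_quotad_rpc(init_rpc, pidfile=QUOTAD_PIDFILE):
    """Call init_rpc() only when the pid file exists.

    Returns whatever init_rpc() returns, or None when quotad appears to be
    disabled and the initialization is skipped.
    """
    if not os.path.exists(pidfile):
        return None
    return init_rpc()
```

Commands that never touch quota then pay only for one file existence check instead of a full rpc setup.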
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* extras/rebalance: Script to perform directory rebalance
How should the script be executed?
$ /path/to/directory-rebalance.py <dir-to-rebalance>
will do rebalance just for that directory. The script assumes that fix-layout
operation is completed for all the directories present inside the
<dir-to-rebalance>
How does it work?
For the given directory path that needs to be rebalanced, full crawl is
performed and the files that need to be healed and the size of each file
is first written to the index. Once building the index is completed, the
index is read and for each file the script executes equivalent of
setfattr -n trusted.distribute.migrate-data -v 1 <path/to/file>
Why does the script take two passes?
Printing a sensible ETA has been a primary goal of the script. Without
knowing the approximate size that will be rebalanced, it is difficult to
find ETA. Hence the script does one pass to find files, sizes which it
writes to the index file and then the next pass is done on the
index file. It takes a minute or two for the ETA to converge but in our
testing it has been giving a reasonable ETA
What versions does the script support?
For the script to work correctly, dht should handle
"trusted.distribute.migrate-data" setxattr correctly.
fixes: #1654
Change-Id: Ie5070127bd45f1a1b9cd18ed029e364420c971c1
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
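The two-pass approach described in this commit can be sketched as follows. This is an assumption-laden illustration, not the script's code: `build_index` and `eta_seconds` are hypothetical names, and the ETA formula (remaining bytes divided by average throughput so far) is an assumption, since the commit message does not say how the script computes it.

```python
def build_index(files):
    """Pass 1: record (path, size) for each file and the total size."""
    index = [(path, size) for path, size in files]
    total = sum(size for _, size in index)
    return index, total


def eta_seconds(total_bytes, done_bytes, elapsed_seconds):
    """Pass 2 helper: estimate remaining time from average throughput."""
    if done_bytes <= 0 or elapsed_seconds <= 0:
        return None  # not enough data yet for an estimate
    rate = done_bytes / elapsed_seconds
    return (total_bytes - done_bytes) / rate
```

Knowing the total size up front is what makes the estimate possible; without the first pass there is no denominator to measure progress against.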
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issuing a stop command for an ongoing rebalance process results in an error.
This issue was brought up in https://bugzilla.redhat.com/1286171 and a patch
(https://review.gluster.org/24103/) was submitted to resolve the issue.
However, the submitted patch resolved only part of the
problem by reducing the number of log messages that were printed (since
rebalance is currently a recursive process, an error message was printed
for every directory) but didn't fully resolve the root cause of the
failure.
This patch fixes the issue by modifying the code-path which handles the
termination of the rebalance process by issuing a stop command.
fixes: #1627
Change-Id: I604f2b0f8b1ccb1026b8425a14200bbd1dc5bd03
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
|