| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
The final test doesn't test what it is meant to test. It still fails as
expected, but only because at this point `THROTTLE_LEVEL` is still set
to `garbage`.
Easily fixed by correcting the typos in the variable names, and thus
fixes https://github.com/gluster/glusterfs/issues/2315
Signed-off-by: Jamie Nguyen <j@jamielinux.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At the moment dht rebalance doesn't give any option to disable fsync
after data migration. Making this an option lets admins take
responsibility for their data in a way that is suitable for their cluster.
Default value is still 'on', so that the behavior is intact for people
who don't care about this.
For example: If the data that is going to be migrated is already backed
up or snapshotted, there is no need for fsync to happen right after
migration which can affect active I/O on the volume from applications.
fixes: #2258
Change-Id: I7a50b8d3a2f270d79920ef306ceb6ba6451150c4
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* features/index: Optimize link-count fetching code path
Problem:
AFR requests 'link-count' in lookup to check if there are any pending
heals. Based on this information, afr will set dirent->inode to NULL in
readdirp when heals are ongoing to prevent serving bad data. When heals
are completed, fetching the link-count xattr leads to an opendir of the
xattrop directory and a read of its contents on every lookup, just to
figure out that no healing is needed. This was not detected until this
github issue because ZFS in some cases can lead to very slow readdir()
calls. Since Glusterfs does a lot of lookups, this was slowing down
all operations and increasing load on the system.
Code problem:
The index xlator, on any xattrop operation, adds an index to the relevant dirs
and, after the xattrop operation is done, deletes or keeps the index in
that directory based on the value fetched in xattrop from posix. AFR
sends all-zero xattrop for changelog xattrs. This leads to
priv->pending_count manipulation which sets the count back to -1. The next
lookup operation then triggers opendir/readdir to find the actual link-count
because the in-memory priv->pending_count is negative.
Fix:
1) Don't add to index on all-zero xattrop for a key.
2) Set pending-count to -1 when the first gfid is added into xattrop
directory, so that the next lookup can compute the link-count.
fixes: #1764
Change-Id: I8a02c7e811a72c46d78ddb2d9d4fdc2222a444e9
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
* addressed comments
Change-Id: Ide42bb1c1237b525d168bf1a9b82eb1bdc3bc283
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
* tests: Handle base index absence
Change-Id: I3cf11a8644ccf23e01537228766f864b63c49556
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
* Addressed LOCK based comments, .t comments
Change-Id: I5f53e40820cade3a44259c1ac1a7f3c5f2f0f310
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
|
|
|
|
|
|
|
|
|
|
|
| |
commit 8e7bfd6a58b444b26cb50fb98870e77302f3b9eb changed the syntax for
arbiter volume creation to 'replica 2 arbiter 1', while still allowing
the old syntax of 'replica 3 arbiter 1'. But while doing so, it also
removed a conditional check, thereby allowing replica count > 3. This
patch fixes it.
Fixes: #2192
Change-Id: Ie109325adb6d78e287e658fd5f59c26ad002e2d3
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
The test case ( tests/basic/glusterd-restart-shd-mux.t ) was
introduced as a part of the shd mux feature, but we observed that the
feature is not stable and we already plan to revert it.
For the time being I am moving the test case to flaky to
avoid frequent regression failures.
Fixes: #2190
Change-Id: I4a06a5d9212fb952a864d0f26db8323690978bfc
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
|
|
|
|
|
| |
fixes: #2159
Change-Id: Ibaaebc48b803ca6ad4335c11818c0c71a13e9f07
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* glusterd-volgen: Add functionality to accept any custom xlator
Add a new function which allows users to insert any custom xlator.
It provides a way to add any processing into file operations.
Users can deploy the plugin (xlator shared object) and integrate it into glusterfsd.
To enable a custom xlator, do the following:
1. Put the xlator object (.so file) into "XLATOR_DIR/user/".
2. Set the option user.xlator.<xlator> to an existing xlator name to specify its position in the graph.
3. Restart the gluster volume.
Options for a custom xlator can be set via "user.xlator.<xlator>.<optkey>".
Fixes: #1943
Signed-off-by: Ryo Furuhashi <ryo.furuhashi.nh@hitachi.com>
Co-authored-by: Yaniv Kaul <ykaul@redhat.com>
Co-authored-by: Xavi Hernandez <xhernandez@users.noreply.github.com>
|
|
|
|
|
|
|
|
|
|
| |
TODO:
Remove 'slave-timeout' and 'slave-gluster-command-dir'.
These variables are defined in geo-replication/gsyncd.conf.in.
So I will remove them when I change that folder.
Change-Id: Ib9167ca586d83e01f8ec755cdf58b3438184c9dd
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As a part of offensive language removal, we changed 'master' to 'primary' in
some parts of the code that are *not* related to geo-replication via
commits e4c9a14429c51d8d059287c2a2c7a76a5116a362 and
0fd92465333be674485b984e54b08df3e431bb0d.
But it is better to use 'root' in some places to distinguish it from the
geo-rep changes which use 'primary/secondary' instead of 'master/slave'.
This patch mainly changes glusterfs_ctx_t->primary to
glusterfs_ctx_t->root. Other places like the meta xlator are also changed.
gf-changelog.c is not changed since it is related to geo-rep.
Updates: #1000
Change-Id: I3cd610f7bea06c7a28ae2c0104f34291023d1daf
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit f5e1eb87d4af44be3b317b7f99ab88f89c2f0b1a meant to enable the
volume option only for replica volumes but inadvertently enabled
it for all volume types. Fixing it now.
Also found a bug in glusterd where disabling the option on plain
distribute was succeeding even though setting it in the first place
fails. Fixed that too.
Fixes: #1483
Change-Id: Icb6c169a8eec44cc4fb4dd636405d3b3485e91b4
Reported-by: Sheetal Pamecha <spamecha@redhat.com>
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
core:change xlator_t->ctx->master to xlator_t->ctx->primary
afr: just changed comments.
meta: change .meta/master to .meta/primary. Might break scripts.
changelog: variable/function name changes only.
These are unrelated to geo-rep.
Fixes: #1713
Change-Id: I58eb5fcd75d65fc8269633acc41313503dccf5ff
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* extras/rebalance: Script to perform directory rebalance
How should the script be executed?
$ /path/to/directory-rebalance.py <dir-to-rebalance>
will do rebalance just for that directory. The script assumes that fix-layout
operation is completed for all the directories present inside the
<dir-to-rebalance>
How does it work?
For the given directory path that needs to be rebalanced, full crawl is
performed and the files that need to be healed and the size of each file
is first written to the index. Once building the index is completed, the
index is read and for each file the script executes equivalent of
setfattr -n trusted.distribute.migrate-data -v 1 <path/to/file>
Why does the script take two passes?
Printing a sensible ETA has been a primary goal of the script. Without
knowing the approximate size that will be rebalanced, it is difficult to
find ETA. Hence the script does one pass to find files, sizes which it
writes to the index file and then the next pass is done on the
index file. It takes a minute or two for the ETA to converge but in our
testing it has been giving a reasonable ETA
What versions does the script support?
For the script to work correctly, dht should handle
"trusted.distribute.migrate-data" setxattr correctly.
fixes: #1654
Change-Id: Ie5070127bd45f1a1b9cd18ed029e364420c971c1
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
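The two-pass approach described in this message can be sketched roughly as follows. This is an illustrative sketch only, not the code in directory-rebalance.py: the names `build_index`, `trigger_migration` and `migrate_from_index` are hypothetical, the index is kept in memory here (the script writes it to a file), and the ETA formula (remaining bytes divided by observed throughput) is an assumption about what a "sensible ETA" could look like.

```python
import os
import stat
import time


def build_index(root):
    """Pass 1: crawl root and record (path, size) for every regular file."""
    index = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISREG(st.st_mode):
                index.append((path, st.st_size))
    return index


def trigger_migration(path):
    # Equivalent of: setfattr -n trusted.distribute.migrate-data -v 1 <path>
    # Needs root and a glusterfs mount whose dht handles this xattr.
    os.setxattr(path, "trusted.distribute.migrate-data", b"1")


def migrate_from_index(index, migrate=trigger_migration, clock=time.monotonic):
    """Pass 2: migrate each indexed file and yield (path, eta_seconds)."""
    total = sum(size for _path, size in index)
    done = 0
    start = clock()
    for path, size in index:
        migrate(path)
        done += size
        elapsed = clock() - start
        # Assumed ETA model: remaining bytes / throughput observed so far.
        rate = done / elapsed if elapsed > 0 else 0.0
        eta = (total - done) / rate if rate > 0 else None
        yield path, eta
```

Knowing the total size up front (from the first pass) is what makes the ETA computable in the second pass; the `migrate` and `clock` parameters exist only so the sketch can be exercised without a gluster mount.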
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* cluster/dht: Perform migrate-file with lk-owner
1) Added GF_ASSERT() calls in client-xlator to find these
issues sooner.
2) Fuse is setting zero-lkowner with len as 8 when the fop
doesn't have any lk-owner. Changed this to have len as 0
just as we have in fops triggered from xlators lower to
fuse.
* syncop: Avoid frame allocation if we can
* cluster/dht: Set lkowner in daemon rebalance code path
* cluster/afr: Set lkowner for ta-selfheal
* cluster/ec: Destroy frame after heal is done
* Don't assert for lk-owner in lk call
* set lkowner for mandatory lock heal tests
fixes: #1529
Change-Id: Ia803db6b00869316893abb1cf435b898eec31228
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
|
|
|
|
|
|
|
| |
Exclude more contrib/fuse-lib objects to avoid
silly tests/basic/0symbol-check.t breakage.
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1692
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. The option has been enabled and tested for quite some time now in RHHI-V
downstream and I think it is safe to make it 'on' by default. Since it
is not possible to simply change it from 'off' to 'on' without breaking
rolling upgrades, old clients etc., I have made it default only for new volumes
starting from op-version GD_OP_VERSION_9_0.
Note: If you do a volume reset, the option will be turned back off.
This is okay as the dir's gfid will be captured in the 'xattrop' folder and heals
will proceed. There might be stale entries inside the 'entry-changes' folder,
which will be removed when we enable the option again.
2. I encountered a customer issue where entry heal was pending on a directory with
236436 files in it and the glustershd.log output was just stuck at
"performing entry selfheal", so I have added logs to give us
more info in DEBUG level about whether entry heal and data heal are
progressing (metadata heal doesn't take much time). That way, we have a
quick visual indication to say things are not 'stuck' if we briefly
enable debug logs, instead of taking statedumps or checking profile info
etc.
Fixes: #1483
Change-Id: I4f116f8c92f8cd33f209b758ff14f3c7e1981422
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
The current graph-switch code sets priv->handle_graph_switch to false even
when graph-switch is in progress which leads to crashes in some cases
Fix:
priv->handle_graph_switch should be set to false only when graph-switch
completes.
fixes: #1539
Change-Id: I5b04f7220a0a6e65c5f5afa3e28d1afe9efcdc31
Signed-off-by: Pranith Kumar K <pranith.karampuri@phonepe.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem1:
When a directory is renamed while a brick
is down entry-heal always did an rm -rf on that directory on
the sink on old location and did mkdir and created the directory
hierarchy again in the new location. This is inefficient.
Problem2:
Renamedir heal order may lead to a scenario where directory in
the new location could be created before deleting it from old
location leading to 2 directories with same gfid in posix.
Fix:
As part of heal, if oldlocation is healed first and is not present in
source-brick always rename it into a hidden directory inside the
sink-brick so that when heal is triggered in new-location shd can
rename it from this hidden directory to the new-location.
If new-location heal is triggered first and it detects that the
directory already exists in the brick, then it should skip healing the
directory until it appears in the hidden directory.
Credits: Ravi for rename-data-loss.t script
Fixes: #1211
Change-Id: I0cba2006f35cd03d314d18211ce0bd530e254843
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
| |
glfs_mgmt_init is only called for glfs_set_volfile_server, but
secure_mgmt is also required to use glfs_set_volfile with SSL.
fixes: #829
Change-Id: Ibc769fe634d805e085232f85ce6e1c48bf4acc66
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
feature/metadisp is an xlator for performing "metadata dispersal" across
multiple children. It does this by flattening the complex
POSIX paths into /$GFID style paths, then forwarding the
metadata operations to its first child and forwarding the
data operations to its second child.
The purpose of this xlator is to allow separation of data and metadata,
in cases where metadata might be stored in another format (embedded kv?),
on another disk (ssd), on another host (dht2).
Change-Id: I392c8bd0c867a3237d144aea327323f700a2728d
Updates: #816
Signed-Off-By: Sheena Artrip <sheenobu@fb.com>
Tested-By: Amar Tumballi <amar@kadalu.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* also add some time gap in other tests to see if we get things properly
* create a directory 'tests/000/', which can host any tests, which are flaky.
* move all the tests mentioned in the issue to above directory.
* as the above dir gets tested first, all flaky tests would be reported quickly.
* change `run-tests.sh` to continue tests even if flaky tests fail.
Reference: gluster/project-infrastructure#72
Updates: #1000
Change-Id: Ifdafa38d083ebd80f7ae3cbbc9aa3b68b6d21d0e
Signed-off-by: Amar Tumballi <amar@kadalu.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
If we set favourite child policy, then automatic split-brain resolution
should work in all cases. This was failing when quorum count was set to
a non-zero value. The initial lookup before the read txn was failing
with ENOTCONN. Since we don't have a readable subvol, we were failing it.
We were only looking at the split brain resolution choice set through the
cli command.
Fix:
We will now consider the favourite child policy if split-brain choice
has not been set via cli command.
Change-Id: Id2016c3a90d0763ac6f1a0131571053f595576f0
Fixes: #1404
Signed-off-by: Mohammed Rafi KC <rafi.kavungal@iternity.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
AFR doesn't delay post-op for fsync fop. For fsync heavy workloads
this leads to unnecessary fxattrop/finodelk for every fsync, leading
to bad performance.
Fix:
Have delayed post-op for fsync. Add a special flag in xdata to indicate
that afr shouldn't delay post-op in cases where either the
process will terminate or a graph-switch would happen. Otherwise it leads
to unnecessary heals when the graph-switch/process-termination
happens before the delayed post-op completes.
Fixes: #1253
Change-Id: I531940d13269a111c49e0510d49514dc169f4577
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was a critical flaw in the previous implementation of open-behind.
When an open is done in the background, it's necessary to take a
reference on the fd_t object because once we "fake" the open answer,
the fd could be destroyed. However as long as there's a reference,
the release function won't be called. So, if the application closes
the file descriptor without having actually opened it, there will
always remain at least 1 reference, causing a leak.
To avoid this problem, the previous implementation didn't take a
reference on the fd_t, so there were races where the fd could be
destroyed while it was still in use.
To fix this, I've implemented a new xlator cbk that gets called from
fuse when the application closes a file descriptor.
The whole logic of handling background opens has been simplified and
it's more efficient now. A stub is created only if the fop needs to be
delayed until an open completes. Otherwise no memory allocations are
needed.
Correctly handling the close request while the open is still pending
has added a bit of complexity, but overall normal operation is simpler.
Change-Id: I6376a5491368e0e1c283cc452849032636261592
Fixes: #1225
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- performance.cache-size has flawed semantics, as it's
dispatched on two independent translators, io-cache
and quick-read.
- performance.qr-cache-timeout has a confusing name, as
other options affecting quick-read have an unabbreviated
"quick-read-..." prefix in their names.
We keep these options with unchanged operation, but in the
help output we indicate their deprecation.
The following better alternatives are introduced:
- performance.io-cache-size to tune cache-size option of io-cache
- performance.quick-read-cache-size to tune cache-size option of
quick-read
- performance.quick-read-cache-timeout as a preferred synonym for
performance.qr-cache-timeout
Fixes: #952
Change-Id: Ibd04fb638de8cac450ba992ad8a415154f9f4281
Signed-off-by: Csaba Henk <csaba@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently data migration in rebalance reads sparse files sequentially,
disregarding which segments are holes and which are data. This can lead
to extremely long migration times for large sparse files.
The data migration mechanism needs to be enhanced so that only data segments
are read and migrated. This can be achieved using lseek to seek for holes
and data in the file.
This enhancement is a consequence of
https://bugzilla.redhat.com/show_bug.cgi?id=1823703
fixes: #1222
Change-Id: If5f448a0c532926464e1f34f504c5c94749b08c3
Signed-off-by: Barak Sason Rofman <bsasonro@redhat.com>
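As a rough illustration of the lseek technique mentioned in this message, the sketch below walks the data segments of a file using SEEK_DATA and SEEK_HOLE. It is not the dht migration code (which is C); the function name `data_segments` is hypothetical, and `os.SEEK_DATA` / `os.SEEK_HOLE` are assumed to be available (Linux and some other platforms). On filesystems that do not report holes, the whole file simply comes back as one data segment.

```python
import errno
import os


def data_segments(fd):
    """Yield (offset, length) for each data region of the open file fd.

    Note: os.lseek moves the file offset of fd as a side effect.
    """
    end = os.fstat(fd).st_size
    offset = 0
    while offset < end:
        try:
            # Find the start of the next data region at or after offset.
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as err:
            if err.errno == errno.ENXIO:
                break  # only a hole remains up to end of file
            raise
        # Find where that data region ends (next hole or end of file).
        hole = os.lseek(fd, start, os.SEEK_HOLE)
        yield start, hole - start
        offset = hole
```

A migration loop would then read and copy only the yielded ranges instead of reading the file from offset 0 to the end.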
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In an add-brick that increases the replica count,
SHD was restarted after setting pending xattrs on the new bricks and
adding the bricks. But before SHD is restarted, there is a possibility that
the old SHD scans the root directory, sees that no heal is needed and
deletes the index for root-dir, leading to no heals until a lookup is executed
on the mount.
Fix:
Stop shd, perform pending-xattr setting/adding new bricks and
then restart shd
Fixes: #1240
Change-Id: I94fd7c6c909211b597185dfe097a559db6c0d00f
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
ok 32 [ 11/ 9] < 46> 'gf_rm_file_and_gfid_link /d/backends/patchy0 del-file'
not ok 33 [ 13/ 131] < 48> '! dd if=/dev/zero of=/mnt/glusterfs/0/del-file bs=1M count=1 oflag=direct' -> ''
The assumption in the test above is that the file wouldn't exist when dd
happens. But heal can lead to creation of the file in some cases leading to
spurious failures.
Fix:
Disable client side heal.
Fixes: #1245
Change-Id: I96b2b45528f9dfb3199d503a467cafafba9b387f
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In brick-mux tests, all bricks of the volume have the same pid.
"generate_brick_statedump" cleans up the older statedumps
with the same brick pid. So successive calls to this function
will delete the previous brick's statedump as all bricks share
the same pid. So grep calls to the statedump were failing, leading
to failure of the .t
To fix this, store the result we need from the statedump before calling
the next brick's statedump
Fixes: #1234
Change-Id: I824ed4dff79e7242b3e980364836b9af0e87a6ee
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
| |
Fixes: #1223
Change-Id: I36cb72d920ffd77405051546615c5262c392daef
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The general idea of the changes is to prevent resetting event generation
to zero in the inode ctx, since event gen is something that should
follow 'causal order'.
Change #1:
For a read txn, in inode refresh cbk, if event_generation is
found zero, we are failing the read fop. This is not needed
because change in event gen is only a marker for the next inode refresh to
happen and should not be taken into account by the current read txn.
Change #2:
The event gen being zero above can happen if there is a racing lookup,
which resets event gen (in afr_lookup_done) if there are non zero afr
xattrs. The resetting is done only to trigger an inode refresh and a
possible client side heal on the next lookup. That can be achieved by
setting the need_refresh flag in the inode ctx. So replaced all
occurrences of resetting event gen to zero with a call to
afr_inode_need_refresh_set().
Change #3:
In both lookup and discover path, we are doing an inode refresh which is
not required since all 3 essentially do the same thing- update the inode
ctx with the good/bad copies from the brick replies. Inode refresh also
triggers background heals, but I think it is okay to do it when we call
refresh during the read and write txns and not in the lookup path.
The .ts which relied on inode refresh in lookup path to trigger heals are
now changed to do read txn so that inode refresh and the heal happens.
Change-Id: Iebf39a9be6ffd7ffd6e4046c96b0fa78ade6c5ec
Fixes: #1179
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reported-by: Erik Jacobson <erik.jacobson at hpe.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
The test is failing at
14:56:41 ok 13, LINENUM:38
14:56:41 not ok 14 Got "test-message0" instead of "test-message1", LINENUM:41
14:56:41 FAILED COMMAND: test-message1 cat /mnt/glusterfs/1/test.txt
This happens because fuse sometimes doesn't send the 'read' fop to glusterfs
and the read is served from cache.
Fix:
Mount with direct-io-mode=yes so that read is always received by
gluster
Fixes: #1190
Change-Id: I369e2024a85dc492dc24c7579b161fb965f55d19
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Do not truncate file offsets and sizes to 32-bit to
prevent tests from spurious failures on >2GB files.
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Change-Id: I2a77ea5f9f415249b23035eecf07129f19194ac2
Fixes: #1161
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Before executing a fop, the POSIX xlator builds an internal
path based on GFID. To validate the path it calls the (l)stat
system call, and while .glusterfs is heavily loaded the kernel takes
time to look up the inode, and due to that performance drops.
Solution: In this patch we followed two ways to improve the performance.
1) Keep an open fd specific to each first level directory (gfid[0])
in .glusterfs. This forces the kernel to keep the inodes
of all those files in cache. In case of memory pressure
the kernel won't uncache first level inodes. We need to open
256 fds per brick to access the entries faster.
2) Use at-based calls to access relative paths to reduce
path based lookup time.
Note: To verify the patch we have executed kernel untar 100 times on 6
different clients after enabling metadata group-cache and some
other options. We were getting more than 20 percent improvement in
kernel untar after applying the patch.
Credits: Xavi Hernandez <xhernandez@redhat.com>
Change-Id: I1643e6b01ed669b2bb148d02f4e6a8e08da45343
updates: #891
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
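The two ideas in this message (cached first-level directory fds plus at-based calls) can be sketched as below. This is only an illustration in Python, not the posix xlator code: the function names are hypothetical, the on-disk layout `.glusterfs/<gfid[0:2]>/<gfid[2:4]>/<gfid>` is an assumption of the sketch, and unlike the patch (256 fds per brick) it opens only the first-level directories that already exist. `os.stat(..., dir_fd=...)` is Python's wrapper around fstatat-style lookups.

```python
import os


def open_first_level_dirs(glusterfs_dir):
    """Open and keep one directory fd per existing first-level directory."""
    fds = {}
    for i in range(256):
        name = "%02x" % i
        try:
            fds[name] = os.open(os.path.join(glusterfs_dir, name),
                                os.O_RDONLY | os.O_DIRECTORY)
        except FileNotFoundError:
            continue  # sketch only: skip directories that do not exist yet
    return fds


def lstat_gfid(fds, gfid):
    """lstat a GFID entry relative to the cached first-level directory fd."""
    first, second = gfid[0:2], gfid[2:4]
    # Resolved relative to the open fd, so the kernel does not have to walk
    # the brick path and .glusterfs again for every lookup.
    return os.stat(os.path.join(second, gfid), dir_fd=fds[first],
                   follow_symlinks=False)
```

The caller is responsible for closing the fds (for example with `os.close`) when the brick goes away.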
|
|
|
|
|
|
|
|
|
|
|
| |
Following tests are done -
1 - After finishing reset-brick all the bricks should be up.
2 - Heal should be completed.
3 - Check number of entries present on brick which was reset.
Change-Id: I9314bed180293a99d400d94bb8cc7ece999da29e
Updates: #1144
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
"snap_scheduler.py init" command failing with the below traceback:
[root@dhcp43-104 ~]# snap_scheduler.py init
Traceback (most recent call last):
  File "/usr/sbin/snap_scheduler.py", line 941, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/usr/sbin/snap_scheduler.py", line 851, in main
    initLogger()
  File "/usr/sbin/snap_scheduler.py", line 153, in initLogger
    logfile = os.path.join(process.stdout.read()[:-1], SCRIPT_NAME + ".log")
  File "/usr/lib64/python3.6/posixpath.py", line 94, in join
    genericpath._check_arg_types('join', a, *p)
  File "/usr/lib64/python3.6/genericpath.py", line 151, in _check_arg_types
    raise TypeError("Can't mix strings and bytes in path components") from None
TypeError: Can't mix strings and bytes in path components
Solution:
Added the 'universal_newlines' flag to Popen to support backward compatibility.
Added a basic test for snapshot scheduler.
Change-Id: I78e8fabd866fd96638747ecd21d292f5ca074a4e
Fixes: #1134
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
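The failure and the fix can be reproduced in isolation with a small sketch. This is not the snap_scheduler.py code: the child command and the directory `/var/log/glusterfs` are placeholders chosen for illustration, and `communicate()` is used here instead of `process.stdout.read()`.

```python
import os
import subprocess
import sys

# Placeholder child process that prints a directory name (an assumption,
# standing in for whatever command the script runs to find its log dir).
cmd = [sys.executable, "-c", "print('/var/log/glusterfs')"]

# Without universal_newlines, Python 3 pipes return bytes.
raw = subprocess.Popen(cmd, stdout=subprocess.PIPE)
out_bytes = raw.communicate()[0]

# With universal_newlines=True the pipe is opened in text mode and
# returns str on Python 3.
text = subprocess.Popen(cmd, stdout=subprocess.PIPE, universal_newlines=True)
out_str = text.communicate()[0]

try:
    os.path.join(out_bytes[:-1], "snap_scheduler" + ".log")
except TypeError as err:
    print("bytes output fails:", err)

logfile = os.path.join(out_str[:-1], "snap_scheduler" + ".log")
print(logfile)
```

Joining the bytes output with a str component raises the TypeError from the traceback, while the text-mode output joins cleanly.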
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When brick-mux is enabled:
i) brick statedumps seem to be listing the same lock information multiple times.
While that is getting fixed, make changes to the .ts to check for unique values.
ii) detecting a brick as online via brick_up_status() seems to be taking
a longer time when delaygen is enabled. Hence bump up PROCESS_UP_TIMEOUT to
90 for afr-lock-heal-advanced.t
Updates: #1042
Change-Id: Ife76008f7a99dd1f1fe5791a32577366baaab4b3
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
The current implementation assumes that the ping event will come after the connect event,
but that may not be the case when fds need to be re-opened after the socket
connection, which consumes more time. So handle any order of the
ping/child-up events.
fixes: bz#1800583
Change-Id: I6bcdc0caa503bdc039ef2b4739fbf4afae121f05
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Description of problem:
server.statedump-path is the path where statedumps are stored.
By default it is /var/run/gluster, and it can be set to any valid
directory path. It was observed that server.statedump-path was
also accepting files, non-existent files and non-existent paths.
The statedump command was also successful even when the
configured path was invalid, for example:
a. A file
b. A non-existent path
Solution:
Added a validation function in gluster-volume-set.c which will
allow volume set to succeed only if the value is a valid directory;
in all other cases, volume set should fail.
Fixes: bz#1787122
Change-Id: Ia66e2b3d35f23efc5444c829928779a79d827b42
Signed-off-by: yatipadia <ypadia@redhat.com>
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
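The rule described in this message (accept only an existing directory, reject files and non-existent paths) amounts to a check like the one below. This is a hedged sketch in Python rather than the actual glusterd C code; the function name `validate_statedump_path` and the error strings are hypothetical.

```python
import os


def validate_statedump_path(path):
    """Return (ok, message); ok is True only for an existing directory."""
    if not os.path.exists(path):
        return False, "%s: path does not exist" % path
    if not os.path.isdir(path):
        return False, "%s: not a directory" % path
    return True, ""
```

With such a check in the volume-set path, both invalid cases listed above (a file, a non-existent path) are rejected before the option is stored.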
|
|
|
|
|
|
|
|
|
|
| |
Adding disperse-data to gluster manual under
volume create command
Change-Id: Ic9eb47c9e71a1d7a11af9394c615c8e90f8d1d69
Fixes: bz#1668239
Signed-off-by: Rishubh Jain <risjain@redhat.com>
Signed-off-by: Sheetal Pamecha <spamecha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Changelog creates threads even if the changelog is not enabled
Background:
Changelog xlator broadly does two things
1. Journalling - Consumers are geo-rep and glusterfind
2. Event Notification for registered events like (open, release etc) -
Consumers are bitrot, geo-rep
The existing option "changelog.changelog" controls journalling, but
there is no option to control event notification, which is enabled by
default. So when bitrot/geo-rep is not enabled on the volume, threads
and resources (rpc and rbuf) related to event notifications consume
resources and cpu cycles unnecessarily.
Solution:
The solution is to have two different options as below.
1. changelog-notification : Event notifications
2. changelog : Journalling
This patch introduces the option "changelog-notification" which is
not exposed to the user. When either bitrot or changelog (journalling)
is enabled, it internally enables 'changelog-notification'. But
once 'changelog-notification' is enabled, it will not be disabled
for the lifetime of the brick process, even after bitrot and changelog
are disabled. As of now, rpc resource cleanup has a lot of races and is
difficult to do cleanly. If disabling were allowed, it would lead to memory leaks
and crashes on enable/disable of bitrot or changelog (journal) in a
loop. Hence to be safer, the event notification is not disabled within
the lifetime of the process once enabled.
Change-Id: Ifd00286e0966049e8eb9f21567fe407cf11bb02a
Updates: #475
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When touch is used to create a file, the ctime does not match
atime and mtime, which ideally should match. There is a difference
in nanoseconds.
Cause:
When touch is used to modify atime or mtime to current time (UTIME_NOW),
the current time is taken from kernel. The ctime gets updated to current
time when atime or mtime is updated. But the current time to update
ctime is taken from utime xlator. Hence the difference in nanoseconds.
Fix:
When utimensat uses UTIME_NOW, use the current time from kernel.
fixes: bz#1773530
Change-Id: I9ccfa47dcd39df23396852b4216f1773c49250ce
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Both read and write tests required writing first. Either just
writing (write test) or write and then read (read test).
So the code is now unified.
2. There's no reason to read zeros from /dev/zero. Just use a
CALLOC'ed buffer.
I don't think we should read and write zeros, but I did not change
the code yet (I think compression and/or dedup will offset results)
It appears neither read-perf nor write-perf were tested, so added
basic tests for them.
Change-Id: I24b1f249fa0335ed652a8982e99c0687d940230e
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implements lock healing for gluster-block fencing use case.
If mandatory lock is enabled:
- Add domain lock/unlock to afr_lk fop.
- Maintain a list of locks to be healed in afr_private_t.
- Add lock to the list if afr_lk(F_SETLK or F_SETLKW) was successful.
- Remove it from the list during afr_lk(F_UNLCK).
- On child_down, mark lock as needing heal on that child. If lock is
lost on quorum no. of bricks, remove it from the list and mark fd bad.
- For fds marked as bad, fail the subsequent fd based fops.
- On parent up, traverse the list and heal the locks IFF the client is
the lk owner and has quorum. (shd does not heal any locks).
updates: #613
Change-Id: I03c46ceaea30f5e6236d5ec13f71d843d827f1bc
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
| |
updates: bz#1193929
Change-Id: I517fa29e57bde970c2c22ebc2de80fec1509cd2d
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Total blocks of XFS partition are dependent on XFS parameters used
for formatting. So hardcoded lower-bound may not be the lower bound
for different default parameters.
Fewer blocks are used on my machine for a freshly formatted XFS partition
compared to where the test fails.
On my machine:
Filesystem     1024-blocks  Used Available Capacity Mounted on
/dev/loop0           98980  5472     93508       6% /d/backends/patchy1
/dev/loop1           98980  5472     93508       6% /d/backends/patchy2
On a machine where this test fails:
Filesystem     1024-blocks  Used Available Capacity Mounted on
/dev/loop0           96928  6112     90816       7% /d/backends/patchy1
/dev/loop1           96928  6112     90816       7% /d/backends/patchy2
Fix:
Make lower bound 2% less than the brick-blocks available
fixes: bz#1761759
Change-Id: I974d5e75766f7ff44780a2e4c2a19cd5d1d14a79
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
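The arithmetic behind a "2% less" lower bound can be illustrated with a short sketch. Which figure the test actually uses as its base is not shown in this message; the sketch below takes the 1024-blocks totals from the two df outputs above purely as sample input, and the helper name `lower_bound` is hypothetical.

```python
def lower_bound(total_blocks, margin_percent=2):
    """Return total_blocks reduced by margin_percent, rounded down."""
    return total_blocks * (100 - margin_percent) // 100


# Sample inputs taken from the df outputs quoted in the message.
for total in (98980, 96928):
    print(total, lower_bound(total))
```

Deriving the bound from the measured size, instead of hardcoding it, keeps the check valid whatever the XFS formatting defaults are on the test machine.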
|
|
|
|
|
|
| |
fixes: bz#1760189
Change-Id: Iffbf8d6f4c50b8e2de8364658697bdbe96549f5d
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
| |
fixes: #725
Change-Id: Iaaefe6f49c8193c476b987b92df6bab3e2f62601
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've found perf xlators io-cache and read-ahead not adding any
performance improvement. At best read-ahead is redundant due to kernel
read-ahead and at worst io-cache is degrading the performance for
workloads that don't involve re-read. Given that VFS already has
both these functionalities, this patch turns these two
translators off by default for native fuse mounts.
For non-native fuse mounts like gfapi (NFS-ganesha/samba) we can have
these xlators on by having custom profiles.
Change-Id: Ie7535788909d4c741844473696f001274dc0bb60
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
fixes: bz#1676479
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Right now we have two separate APIs, one
- 'glfs_h_creat_handle' to create handle & another
- 'glfs_h_open' to create a glfd to return to application
Having two separate routines can result in access errors
while trying to create and write into a read-only file.
Since a fd is opened even during file/directory creation,
introduce a new API that makes these two operations atomic, i.e.
one which creates both the handle & the fd and passes them to the application.
Change-Id: Ibf513fcfcdad175f4d7eb6fa7a61b8feec6d33b5
Fixes: bz#1753569
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After add-brick and rebalance, the ctime xattr is not present
on rebalanced directories on new brick. This patch fixes the
same.
Note that ctime still doesn't support consistent time across
distribute sub-volume.
This patch also fixes the in-memory inconsistency of time attributes
when metadata is self healed.
Change-Id: Ia20506f1839021bf61d4753191e7dc34b31bb2df
fixes: bz#1734026
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|