<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/tests/bugs, branch devel</title>
<subtitle>GlusterFS is a distributed file-system capable of scaling to several petabytes. It aggregates various storage bricks over InfiniBand RDMA or TCP/IP interconnects into one large parallel network file system.</subtitle>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/'/>
<entry>
<title>protocol/client: Fix lock memory leak (#2338)</title>
<updated>2021-04-22T05:53:05+00:00</updated>
<author>
<name>Pranith Kumar Karampuri</name>
<email>pranith.karampuri@phonepe.com</email>
</author>
<published>2021-04-22T05:53:05+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=4230e21f7c60387f818bdbbd3008484ca61640b2'/>
<id>4230e21f7c60387f818bdbbd3008484ca61640b2</id>
<content type='text'>
Problem-1:
When an overlapping lock is issued, the merged lock is not assigned an
owner. When a flush is issued on the fd, this particular lock is not
freed, leading to a memory leak.

Fix-1:
Assign the owner while merging the locks.

Problem-2:
On fd-destroy, lock structs could still be present in the fdctx. For
reasons not yet understood, running flock -x and then closing the bash fd
leads to this code path, which leaks the lock structs.

Fix-2:
When the fdctx is being destroyed in the client, make sure to clean up
any lock structs.

fixes: #2337
Change-Id: I298124213ce5a1cf2b1f1756d5e8a9745d9c0a1c
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem-1:
When an overlapping lock is issued, the merged lock is not assigned an
owner. When a flush is issued on the fd, this particular lock is not
freed, leading to a memory leak.

Fix-1:
Assign the owner while merging the locks.

Problem-2:
On fd-destroy, lock structs could still be present in the fdctx. For
reasons not yet understood, running flock -x and then closing the bash fd
leads to this code path, which leaks the lock structs.

Fix-2:
When the fdctx is being destroyed in the client, make sure to clean up
any lock structs.

fixes: #2337
Change-Id: I298124213ce5a1cf2b1f1756d5e8a9745d9c0a1c
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</pre>
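<p>For context, a minimal C sketch of the access pattern Problem-2 refers to (an exclusive lock taken on an fd that is then closed without an explicit unlock, as the flock -x shell command does); the mount path is hypothetical and this is not code from the patch:</p>
<pre>
#include &lt;fcntl.h&gt;
#include &lt;stdio.h&gt;
#include &lt;sys/file.h&gt;
#include &lt;unistd.h&gt;

int main(void)
{
    /* Hypothetical file on a glusterfs mount. */
    int fd = open("/mnt/glusterfs/testfile", O_CREAT | O_RDWR, 0644);
    if (fd &lt; 0) {
        perror("open");
        return 1;
    }

    /* Take an exclusive lock, as 'flock -x' would. */
    if (flock(fd, LOCK_EX) &lt; 0) {
        perror("flock");
        close(fd);
        return 1;
    }

    /* Close without an explicit unlock: on fd-destroy the client has to
     * clean up any lock structs still attached to the fd context,
     * otherwise they are leaked (Problem-2 above). */
    close(fd);
    return 0;
}
</pre>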
</div>
</content>
</entry>
<entry>
<title>dht: fix rebalance of sparse files (#2318)</title>
<updated>2021-04-09T16:13:30+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@gmail.com</email>
</author>
<published>2021-04-09T16:13:30+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=a62c5666e2b58668d0ddec7f3331446313539e72'/>
<id>a62c5666e2b58668d0ddec7f3331446313539e72</id>
<content type='text'>
The current implementation of rebalance for sparse files has a bug that,
in some cases, causes a read of 0 bytes from the source subvolume. The
posix xlator doesn't allow 0-byte reads and fails them with EINVAL,
which causes rebalance to abort the migration.

This patch implements a more robust way of finding data segments in
a sparse file that avoids 0-byte reads, allowing the file to be
migrated successfully.

Fixes: #2317
Change-Id: Iff168dda2fb0f2edf716b21eb04cc2cc8ac3915c
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The current implementation of rebalance for sparse files has a bug that,
in some cases, causes a read of 0 bytes from the source subvolume. The
posix xlator doesn't allow 0-byte reads and fails them with EINVAL,
which causes rebalance to abort the migration.

This patch implements a more robust way of finding data segments in
a sparse file that avoids 0-byte reads, allowing the file to be
migrated successfully.

Fixes: #2317
Change-Id: Iff168dda2fb0f2edf716b21eb04cc2cc8ac3915c
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</pre>
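<p>One common way to locate the data segments of a sparse file without ever issuing a 0-byte read is lseek() with SEEK_DATA/SEEK_HOLE. The sketch below only illustrates that general idea and is not necessarily the approach taken by the patch:</p>
<pre>
#define _GNU_SOURCE
#include &lt;errno.h&gt;
#include &lt;stdio.h&gt;
#include &lt;unistd.h&gt;

/* Walk the data segments of a (possibly sparse) file and report each
 * non-empty [start, end) range.  Every reported segment has end &gt; start,
 * so a copier driven by this loop never issues a 0-byte read. */
int walk_data_segments(int fd)
{
    off_t start = 0;

    for (;;) {
        start = lseek(fd, start, SEEK_DATA);
        if (start &lt; 0) {
            if (errno == ENXIO)      /* nothing but holes until EOF */
                return 0;
            return -1;
        }

        off_t end = lseek(fd, start, SEEK_HOLE);
        if (end &lt; 0)
            return -1;

        printf("data segment: %lld..%lld\n",
               (long long)start, (long long)end);
        start = end;                 /* continue after this segment */
    }
}
</pre>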
</div>
</content>
</entry>
<entry>
<title>Removal of force option in snapshot create (#2110)</title>
<updated>2021-04-06T14:10:50+00:00</updated>
<author>
<name>nishith-vihar</name>
<email>77044911+nishith-vihar@users.noreply.github.com</email>
</author>
<published>2021-04-06T14:10:50+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=5fd3d4175fc2ba16fe92129ff43fb7fb23d8615a'/>
<id>5fd3d4175fc2ba16fe92129ff43fb7fb23d8615a</id>
<content type='text'>
The force option for the snapshot create command can fail even though
quorum is satisfied, and the option is redundant.

This change deprecates the force option for the snapshot create command
and, instead of checking for quorum, checks whether all bricks are online
before creating a snapshot.

Fixes: #2099
Change-Id: I45d866e67052fef982a60aebe8dec069e78015bd
Signed-off-by: Nishith Vihar Sakinala &lt;nsakinal@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The force option for the snapshot create command can fail even though
quorum is satisfied, and the option is redundant.

This change deprecates the force option for the snapshot create command
and, instead of checking for quorum, checks whether all bricks are online
before creating a snapshot.

Fixes: #2099
Change-Id: I45d866e67052fef982a60aebe8dec069e78015bd
Signed-off-by: Nishith Vihar Sakinala &lt;nsakinal@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>afr: don't reopen fds on which POSIX locks are held (#1980)</title>
<updated>2021-03-27T10:14:04+00:00</updated>
<author>
<name>Karthik Subrahmanya</name>
<email>ksubrahm@redhat.com</email>
</author>
<published>2021-03-27T10:14:04+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=2a524ad0738be491dcac3cc96db1411320168c72'/>
<id>2a524ad0738be491dcac3cc96db1411320168c72</id>
<content type='text'>
When client.strict-locks is enabled on a volume and POSIX locks are
held on files, do not re-open such fds after clients disconnect and
reconnect. Re-opening them might lead to multiple clients acquiring
the locks and cause data corruption.

Change-Id: I8777ffbc2cc8d15ab57b58b72b56eb67521787c5
Fixes: #1977
Signed-off-by: karthik-us &lt;ksubrahm@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When client.strict-locks is enabled on a volume and POSIX locks are
held on files, do not re-open such fds after clients disconnect and
reconnect. Re-opening them might lead to multiple clients acquiring
the locks and cause data corruption.

Change-Id: I8777ffbc2cc8d15ab57b58b72b56eb67521787c5
Fixes: #1977
Signed-off-by: karthik-us &lt;ksubrahm@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/dht: use readdir for fix-layout in rebalance (#2243)</title>
<updated>2021-03-22T04:49:27+00:00</updated>
<author>
<name>Pranith Kumar Karampuri</name>
<email>pranith.karampuri@phonepe.com</email>
</author>
<published>2021-03-22T04:49:27+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=ec189a499d85c2aad1d54e55e47df6b95ba02922'/>
<id>ec189a499d85c2aad1d54e55e47df6b95ba02922</id>
<content type='text'>
Problem:
On a cluster with 15 million files, when fix-layout was started, it was
not progressing at all. So we tried an os.walk() + os.stat() on the
backend filesystem directly. It took 2.5 days. We removed os.stat() and
re-ran it on another brick with a similar data set. It took 15 minutes.
We realized that readdirp is extremely costly compared to readdir when
the stat is not useful. The fix-layout operation only needs to know that
an entry is a directory so that fix-layout can be triggered on it. Most
modern filesystems provide this information in the readdir operation, so
readdirp (i.e. readdir + stat) is not needed.

Fix:
Use the readdir operation in fix-layout. Fall back to readdir +
stat/lookup for filesystems that don't provide d_type in readdir.

fixes: #2241
Change-Id: I5fe2ecea25a399ad58e31a2e322caf69fc7f49eb
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
On a cluster with 15 million files, when fix-layout was started, it was
not progressing at all. So we tried an os.walk() + os.stat() on the
backend filesystem directly. It took 2.5 days. We removed os.stat() and
re-ran it on another brick with a similar data set. It took 15 minutes.
We realized that readdirp is extremely costly compared to readdir when
the stat is not useful. The fix-layout operation only needs to know that
an entry is a directory so that fix-layout can be triggered on it. Most
modern filesystems provide this information in the readdir operation, so
readdirp (i.e. readdir + stat) is not needed.

Fix:
Use the readdir operation in fix-layout. Fall back to readdir +
stat/lookup for filesystems that don't provide d_type in readdir.

fixes: #2241
Change-Id: I5fe2ecea25a399ad58e31a2e322caf69fc7f49eb
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</pre>
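<p>A minimal sketch of the general technique (plain readdir, trusting d_type when the filesystem supplies it and falling back to a stat only when it doesn't). The function name is hypothetical and this is not the patch's actual code:</p>
<pre>
#include &lt;dirent.h&gt;
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;sys/stat.h&gt;

/* List the sub-directories of 'path' using readdir instead of
 * readdir+stat.  Only entries with an unknown type need a stat. */
int list_subdirs(const char *path)
{
    DIR *dir = opendir(path);
    if (!dir)
        return -1;

    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (!strcmp(entry-&gt;d_name, ".") || !strcmp(entry-&gt;d_name, ".."))
            continue;

        int is_dir;
        if (entry-&gt;d_type != DT_UNKNOWN) {
            /* Most modern filesystems fill in d_type. */
            is_dir = (entry-&gt;d_type == DT_DIR);
        } else {
            /* Fallback for filesystems that don't provide d_type. */
            struct stat st;
            is_dir = (fstatat(dirfd(dir), entry-&gt;d_name, &amp;st, 0) == 0
                      &amp;&amp; S_ISDIR(st.st_mode));
        }

        if (is_dir)
            printf("directory: %s\n", entry-&gt;d_name);
    }

    closedir(dir);
    return 0;
}
</pre>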
</div>
</content>
</entry>
<entry>
<title>cluster/dht: Provide option to disable fsync in data migration (#2259)</title>
<updated>2021-03-17T05:32:21+00:00</updated>
<author>
<name>Pranith Kumar Karampuri</name>
<email>pranith.karampuri@phonepe.com</email>
</author>
<published>2021-03-17T05:32:21+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=088d8a575c59479defb5dbe8bf03f4211156df7f'/>
<id>088d8a575c59479defb5dbe8bf03f4211156df7f</id>
<content type='text'>
At the moment, dht rebalance doesn't provide any option to disable fsync
after data migration. Making this an option lets admins take
responsibility for their data in a way that suits their cluster. The
default value is still 'on', so the behavior is unchanged for people who
don't care about this.

For example, if the data that is going to be migrated is already backed
up or snapshotted, there is no need for an fsync right after migration,
which can affect active I/O on the volume from applications.

fixes: #2258
Change-Id: I7a50b8d3a2f270d79920ef306ceb6ba6451150c4
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
At the moment, dht rebalance doesn't provide any option to disable fsync
after data migration. Making this an option lets admins take
responsibility for their data in a way that suits their cluster. The
default value is still 'on', so the behavior is unchanged for people who
don't care about this.

For example, if the data that is going to be migrated is already backed
up or snapshotted, there is no need for an fsync right after migration,
which can affect active I/O on the volume from applications.

fixes: #2258
Change-Id: I7a50b8d3a2f270d79920ef306ceb6ba6451150c4
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>features/index: Optimize link-count fetching code path (#1789)</title>
<updated>2021-03-10T05:13:24+00:00</updated>
<author>
<name>Pranith Kumar Karampuri</name>
<email>pranith.karampuri@phonepe.com</email>
</author>
<published>2021-03-10T05:13:24+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=46949c4951eb1d2eb0a90c21db66c31e444bffe8'/>
<id>46949c4951eb1d2eb0a90c21db66c31e444bffe8</id>
<content type='text'>
* features/index: Optimize link-count fetching code path

Problem:
AFR requests 'link-count' in lookup to check if there are any pending
heals. Based on this information, afr sets dirent-&gt;inode to NULL in
readdirp while heals are ongoing to avoid serving bad data. Even after
heals are completed, fetching the link-count xattr leads to an opendir of
the xattrop directory and a read of its contents on every lookup, only to
find that no healing is needed. This was not detected until this GitHub
issue because ZFS can in some cases have very slow readdir() calls. Since
GlusterFS does a lot of lookups, this was slowing down all operations and
increasing load on the system.

Code problem:
The index xlator, on any xattrop operation, adds an index to the relevant
directories and, after the xattrop operation is done, deletes or keeps the
index in that directory based on the value fetched from posix in the
xattrop. AFR sends an all-zero xattrop for changelog xattrs. This leads to
priv-&gt;pending_count manipulation which sets the count back to -1. The
next lookup operation then triggers an opendir/readdir in lookup to find
the actual link-count, because the in-memory priv-&gt;pending_count is
negative.

Fix:
1) Don't add to index on all-zero xattrop for a key.
2) Set pending-count to -1 when the first gfid is added into xattrop
   directory, so that the next lookup can compute the link-count.

fixes: #1764
Change-Id: I8a02c7e811a72c46d78ddb2d9d4fdc2222a444e9
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;

* addressed comments

Change-Id: Ide42bb1c1237b525d168bf1a9b82eb1bdc3bc283
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;

* tests: Handle base index absence

Change-Id: I3cf11a8644ccf23e01537228766f864b63c49556
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;

* Addressed LOCK based comments, .t comments

Change-Id: I5f53e40820cade3a44259c1ac1a7f3c5f2f0f310
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* features/index: Optimize link-count fetching code path

Problem:
AFR requests 'link-count' in lookup to check if there are any pending
heals. Based on this information, afr sets dirent-&gt;inode to NULL in
readdirp while heals are ongoing to avoid serving bad data. Even after
heals are completed, fetching the link-count xattr leads to an opendir of
the xattrop directory and a read of its contents on every lookup, only to
find that no healing is needed. This was not detected until this GitHub
issue because ZFS can in some cases have very slow readdir() calls. Since
GlusterFS does a lot of lookups, this was slowing down all operations and
increasing load on the system.

Code problem:
The index xlator, on any xattrop operation, adds an index to the relevant
directories and, after the xattrop operation is done, deletes or keeps the
index in that directory based on the value fetched from posix in the
xattrop. AFR sends an all-zero xattrop for changelog xattrs. This leads to
priv-&gt;pending_count manipulation which sets the count back to -1. The
next lookup operation then triggers an opendir/readdir in lookup to find
the actual link-count, because the in-memory priv-&gt;pending_count is
negative.

Fix:
1) Don't add to index on all-zero xattrop for a key.
2) Set pending-count to -1 when the first gfid is added into xattrop
   directory, so that the next lookup can compute the link-count.

fixes: #1764
Change-Id: I8a02c7e811a72c46d78ddb2d9d4fdc2222a444e9
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;

* addressed comments

Change-Id: Ide42bb1c1237b525d168bf1a9b82eb1bdc3bc283
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;

* tests: Handle base index absence

Change-Id: I3cf11a8644ccf23e01537228766f864b63c49556
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;

* Addressed LOCK based comments, .t comments

Change-Id: I5f53e40820cade3a44259c1ac1a7f3c5f2f0f310
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>afr: fix directory entry count (#2233)</title>
<updated>2021-03-08T23:24:07+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@redhat.com</email>
</author>
<published>2021-03-08T23:24:07+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=dc9bab7959b068617ef00f355c63bdca060b9605'/>
<id>dc9bab7959b068617ef00f355c63bdca060b9605</id>
<content type='text'>
AFR may hide some existing entries from a directory when reading it
because they are generated internally for private management. However,
the number of entries returned by the readdir() function is not updated
accordingly, so it may report a number higher than the real number of
entries present in the gf_dirent list.

This may cause unexpected behavior in clients, including gfapi, which
incorrectly assumes that there was an entry when the list was actually
empty.

This patch also makes the check in gfapi more robust to avoid similar
issues that could appear in the future.

Fixes: #2232
Change-Id: I81ba3699248a53ebb0ee4e6e6231a4301436f763
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
AFR may hide some existing entries from a directory when reading it
because they are generated internally for private management. However,
the number of entries returned by the readdir() function is not updated
accordingly, so it may report a number higher than the real number of
entries present in the gf_dirent list.

This may cause unexpected behavior in clients, including gfapi, which
incorrectly assumes that there was an entry when the list was actually
empty.

This patch also makes the check in gfapi more robust to avoid similar
issues that could appear in the future.

Fixes: #2232
Change-Id: I81ba3699248a53ebb0ee4e6e6231a4301436f763
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</pre>
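<p>The accounting issue is generic: when entries are filtered out of a directory listing after the fact, the count handed back to the caller has to be decremented along with the list. A minimal illustration with a hypothetical entry list standing in for gf_dirent (not the actual AFR code):</p>
<pre>
#include &lt;stdbool.h&gt;
#include &lt;stddef.h&gt;
#include &lt;string.h&gt;

/* Hypothetical singly-linked directory entry list. */
struct entry {
    struct entry *next;
    char name[256];
};

/* Hypothetical filter: hide internal names starting with ".private-". */
static bool is_private(const struct entry *e)
{
    return strncmp(e-&gt;name, ".private-", 9) == 0;
}

/* Drop private entries and keep the reported count in step with the
 * list, so the caller never sees a count larger than the list itself.
 * Unlinked entries are assumed to be freed elsewhere. */
void filter_entries(struct entry **list, int *count)
{
    struct entry **pp = list;

    while (*pp != NULL) {
        if (is_private(*pp)) {
            *pp = (*pp)-&gt;next;     /* unlink the hidden entry */
            (*count)--;            /* ...and decrement the count */
        } else {
            pp = &amp;(*pp)-&gt;next;
        }
    }
}
</pre>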
</div>
</content>
</entry>
<entry>
<title>cluster/dht: Fix stack overflow in readdir(p) (#2170)</title>
<updated>2021-02-24T14:04:23+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@users.noreply.github.com</email>
</author>
<published>2021-02-24T14:04:23+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=70e6ee279e173651ac6c68c996914440ac4d8f0e'/>
<id>70e6ee279e173651ac6c68c996914440ac4d8f0e</id>
<content type='text'>
When parallel-readdir is enabled, readdir(p) requests sent by DHT can be
immediately processed and answered in the same thread before the call to
STACK_WIND_COOKIE() completes.

This means that the readdir(p) cbk is processed synchronously. In some
cases it may decide to send another readdir(p) request, which causes a
recursive call.

When some special conditions happen and the directories are big, it's
possible that the number of nested calls is so high that the process
crashes because of a stack overflow.

This patch fixes the problem by not allowing nested readdir(p) calls.
When a nested call is detected, it's queued instead of being sent. The
queued request is processed by the top-level stack function when the
current call finishes.

Fixes: #2169
Change-Id: Id763a8a51fb3c3314588ec7c162f649babf33099
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When parallel-readdir is enabled, readdir(p) requests sent by DHT can be
immediately processed and answered in the same thread before the call to
STACK_WIND_COOKIE() completes.

This means that the readdir(p) cbk is processed synchronously. In some
cases it may decide to send another readdir(p) request, which causes a
recursive call.

When some special conditions happen and the directories are big, it's
possible that the number of nested calls is so high that the process
crashes because of a stack overflow.

This patch fixes the problem by not allowing nested readdir(p) calls.
When a nested call is detected, it's queued instead of being sent. The
queued request is processed by the top-level stack function when the
current call finishes.

Fixes: #2169
Change-Id: Id763a8a51fb3c3314588ec7c162f649babf33099
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</pre>
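<p>Independently of the actual DHT code, the general pattern is: when a callback that may run synchronously wants to issue the next request, record it instead of calling back into the request function, and let the outermost invocation drain the pending work in a loop. A simplified, self-contained sketch with hypothetical names:</p>
<pre>
#include &lt;stdbool.h&gt;
#include &lt;stdio.h&gt;

/* Simplified model: issue_readdir() may invoke its callback synchronously
 * (as happens with parallel-readdir), and the callback may want another
 * readdir.  Instead of recursing, the nested request is only recorded,
 * and the top-level call drains it, keeping the stack depth constant. */

static bool in_flight = false;   /* a readdir call is on the stack */
static bool queued = false;      /* a nested request is pending    */
static int remaining = 100000;   /* pretend this many chunks are left */

static void readdir_cbk(void);

static void issue_readdir(void)
{
    if (in_flight) {
        /* Nested call detected: queue it instead of recursing. */
        queued = true;
        return;
    }

    in_flight = true;
    do {
        queued = false;
        readdir_cbk();           /* "synchronous" completion */
    } while (queued);            /* drain requests queued by the callback */
    in_flight = false;
}

static void readdir_cbk(void)
{
    if (--remaining &gt; 0)
        issue_readdir();         /* would recurse without the guard */
}

int main(void)
{
    issue_readdir();
    printf("done, remaining = %d\n", remaining);
    return 0;
}
</pre>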
</div>
</content>
</entry>
<entry>
<title>tests: Handle nanosecond duration in profile info (#2135)</title>
<updated>2021-02-08T08:59:08+00:00</updated>
<author>
<name>Pranith Kumar Karampuri</name>
<email>pranith.karampuri@phonepe.com</email>
</author>
<published>2021-02-08T08:59:08+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=00f141018f23d05703c8e3a730bb5134f95e6b60'/>
<id>00f141018f23d05703c8e3a730bb5134f95e6b60</id>
<content type='text'>
Problem:
volume profile info now prints durations in nanoseconds. The tests were
written when durations were printed in microseconds. This leads to
spurious failures.

Fix:
Change the tests to handle nanosecond durations.

fixes: #2134
Change-Id: I94722be87000a485d98c8b0f6d8b7e1a526b07e7
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
volume profile info now prints durations in nanoseconds. The tests were
written when durations were printed in microseconds. This leads to
spurious failures.

Fix:
Change the tests to handle nanosecond durations.

fixes: #2134
Change-Id: I94722be87000a485d98c8b0f6d8b7e1a526b07e7
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</pre>
</div>
</content>
</entry>
</feed>
