glusterfs.git/tests/features, branch devel

features/index: Optimize link-count fetching code path (#1789)

2021-03-10T05:13:24+00:00

* features/index: Optimize link-count fetching code path

Problem:
AFR requests 'link-count' in lookup to check if there are any pending
heals. Based on this information, afr will set dirent->inode to NULL in
readdirp when heals are ongoing to prevent serving bad data. When heals
are completed, link-count xattr is leading to doing an opendir of
xattrop directory and then reading the contents to figure out that there
is no healing needed for every lookup. This was not detected until this
github issue because ZFS in some cases can lead to very slow readdir()
calls. Since Glusterfs does lot of lookups, this was slowing down
all operations increasing load on the system.

Code problem:
index xlator on any xattrop operation adds index to the relevant dirs
and after the xattrop operation is done, will delete/keep the index in
that directory based on the value fetched in xattrop from posix. AFR
sends all-zero xattrop for changelog xattrs. This is leading to
priv->pending_count manipulation which sets the count back to -1. Next
Lookup operation triggers opendir/readdir to find the actual link-count in
lookup because in memory priv->pending_count is -ve.

Fix:
1) Don't add to index on all-zero xattrop for a key.
2) Set pending-count to -1 when the first gfid is added into xattrop
   directory, so that the next lookup can compute the link-count.

fixes: #1764
Change-Id: I8a02c7e811a72c46d78ddb2d9d4fdc2222a444e9
Signed-off-by: Pranith Kumar K 

* addressed comments

Change-Id: Ide42bb1c1237b525d168bf1a9b82eb1bdc3bc283
Signed-off-by: Pranith Kumar K 

* tests: Handle base index absence

Change-Id: I3cf11a8644ccf23e01537228766f864b63c49556
Signed-off-by: Pranith Kumar K 

* Addressed LOCK based comments, .t comments

Change-Id: I5f53e40820cade3a44259c1ac1a7f3c5f2f0f310
Signed-off-by: Pranith Kumar K

Remove tests from components that are no longer in the tree (#2160)

2021-02-13T12:14:49+00:00

fixes: #2159
Change-Id: Ibaaebc48b803ca6ad4335c11818c0c71a13e9f07
Signed-off-by: Pranith Kumar K

tests: Handle nanosecond duration in profile info (#2135)

2021-02-08T08:59:08+00:00

Problem:
volume profile info now prints duration in nano seconds. Tests were
written when the duration was printed in micro seconds. This leads to
spurious failures.

Fix:
Change tests to handle nano second durations

fixes: #2134
Change-Id: I94722be87000a485d98c8b0f6d8b7e1a526b07e7
Signed-off-by: Pranith Kumar K

tests: Fix issues in CentOS 8 (#1756)

2020-11-06T11:00:18+00:00

* tests: Fix issues in CentOS 8

Due to some configuration changes in CentOS 8/RHEL 8, ssl-ciphers.t
and bug-1053579.t were failing.

The first one was failing because TLS v1.0 is disabled by default. The
test hash been updated to check that at least one of TLS v1.0, v1.1 or
v1.2 succeeds.

For the second case, the issue is that the test assumed that the
latest added group to a user should always be listed the last, but
this is not always true because nsswitch.conf now uses 'sss' before
'files', which means that data comes from a db that could not be
sorted.

Updates: #1009
Change-Id: I4ca01a099854ec25926c3d76b3a98072175bab06
Signed-off-by: Xavi Hernandez 

* tests: Fix TLS version detection

The old test didn't correctly determine which version of TLS should
be allowed by openssl.

Change-Id: Ic081c329d5ed1842fa9f5fd23742ae007738aec0
Signed-off-by: Xavi Hernandez

cluster/afr: Heal directory rename without rmdir/mkdir

2020-04-13T14:01:51+00:00

Problem1:
When a directory is renamed while a brick
is down entry-heal always did an rm -rf on that directory on
the sink on old location and did mkdir and created the directory
hierarchy again in the new location. This is inefficient.

Problem2:
Renamedir heal order may lead to a scenario where directory in
the new location could be created before deleting it from old
location leading to 2 directories with same gfid in posix.

Fix:
As part of heal, if oldlocation is healed first and is not present in
source-brick always rename it into a hidden directory inside the
sink-brick so that when heal is triggered in new-location shd can
rename it from this hidden directory to the new-location.

If new-location heal is triggered first and it detects that the
directory already exists in the brick, then it should skip healing the
directory until it appears in the hidden directory.

Credits: Ravi for rename-data-loss.t script

Fixes: #1211
Change-Id: I0cba2006f35cd03d314d18211ce0bd530e254843
Signed-off-by: Pranith Kumar K

tests: provide an option to mark tests as 'flaky'

2020-08-18T08:38:20+00:00

* also add some time gap in other tests to see if we get things properly
* create a directory 'tests/000/', which can host any tests, which are flaky.
* move all the tests mentioned in the issue to above directory.
* as the above dir gets tested first, all flaky tests would be reported quickly.
* change `run-tests.sh` to continue tests even if flaky tests fail.

Reference: gluster/project-infrastructure#72
Updates: #1000
Change-Id: Ifdafa38d083ebd80f7ae3cbbc9aa3b68b6d21d0e
Signed-off-by: Amar Tumballi

test: ./tests/features/ssl-ciphers.t fail on centos 8

2020-07-30T04:31:28+00:00

Check the tlsv1 openssl connection based on openssl version.
If openssl version is 1.1 it supports tls1 protocol otherwise
it supports tlsv1_2 protocol.

Fixes: #1403
Change-Id: I3ca286492049e6f84de70e3b969fa41db10378ab
Signed-off-by: Mohit Agrawal

tests/features/interrupt.t: fixes

2020-07-08T19:48:46+00:00

- Modify the patterns for which we grep the logs so
  that they don't match themselves. The test runner
  inserts the invocation of the cases to the log, thus
  the patterns will occur in the logs verbatim. So if
  the pattern matches itself, the test case will be
  moot (always reporting success).

- Invoke the test utility (open-and-sleep) on
  unique paths so that the file at the passed
  path shall be created on each invocation.

  The kernel does not send an interrupt if
  the file is extant. (This was shadowed by
  the above mistske with result evaluation.)

- Modify the pattern for which we grep the log in
  the test case where interrupt handling is expected
  so that it asserts that the interrupt was handled.
  (So far we did not exclude the possibility of the
  interrupt triggered but not handled due to a race;
  however, it seems to be the case that this theoretic
  race does not have the potential to prevent interrupt
  handling. And if this ever changes in the future we'd
  rather be notified about that.)

Change-Id: I606da2b4064c1ecc4781c7dfdefed95a433478ce
Updates: #1374
Signed-off-by: Csaba Henk

mount/fuse: use cookies to get fuse-interrupt-record instead of xdata

2020-06-17T05:15:19+00:00

Problem:
On executing tests/features/flock_interrupt.t the following error log
appears
[2020-06-16 11:51:54.631072 +0000] E
[fuse-bridge.c:4791:fuse_setlk_interrupt_handler_cbk] 0-glusterfs-fuse:
interrupt record not found

This happens because fuse-interrupt-record is never sent on the wire by
getxattr fop and there is no guarantee that in the cbk it will be
available in case of failures.

Fix:
wind getxattr fop with fuse-interrupt-record as cookie and recover it
in the cbk

Fixes: #1310
Change-Id: I4cfff154321a449114fc26e9440db0f08e5c7daa
Signed-off-by: Pranith Kumar K

tests: Fix for spurious failure for some test cases

2020-04-10T08:15:50+00:00

Problem: Sometimes test case is failing at the time of creating files
         on mount point after mounting the volume

Solution: After started the volume need to wait to make sure all
          bricks instances are completely started so put a online_brick_count
          check after just started the volume

Change-Id: I5020e7e417539377277ca00189f9c51d2cf877a6
Fixes: #1162
Signed-off-by: Mohit Agrawal