<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators, branch release-9</title>
<subtitle>GlusterFS is a distributed file-system capable of scaling to several petabytes. It aggregates various storage bricks over Infiniband RDMA or TCP/IP interconnect into one large parallel network file system.</subtitle>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/'/>
<entry>
<title>core: Avoid several dict OR key is NULL message in brick logs (#2344)</title>
<updated>2021-04-22T13:26:28+00:00</updated>
<author>
<name>mohit84</name>
<email>moagrawa@redhat.com</email>
</author>
<published>2021-04-22T13:26:28+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=5cbf5d94c719d1c7674a59c8009660197fc56af2'/>
<id>5cbf5d94c719d1c7674a59c8009660197fc56af2</id>
<content type='text'>
Problem: dict_get_with_ref throws a "dict or key is NULL" message
whenever the dict or the key is NULL.

Solution: Before accessing a key, check that the dictionary is valid.

&gt; Fixes: #1909
&gt; Change-Id: I50911679142b52f854baf20c187962a2a3698f2d
&gt; Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;
&gt; Cherry picked from commit de1b26d68e31b029a59e59a47b51a7e3e6fbfe22
&gt; Reviewed on upstream link https://github.com/gluster/glusterfs/pull/1910

Fixes: #1909
Change-Id: I50911679142b52f854baf20c187962a2a3698f2d
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem: dict_get_with_ref throws a "dict or key is NULL" message
whenever the dict or the key is NULL.

Solution: Before accessing a key, check that the dictionary is valid.

&gt; Fixes: #1909
&gt; Change-Id: I50911679142b52f854baf20c187962a2a3698f2d
&gt; Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;
&gt; Cherry picked from commit de1b26d68e31b029a59e59a47b51a7e3e6fbfe22
&gt; Reviewed on upstream link https://github.com/gluster/glusterfs/pull/1910

Fixes: #1909
Change-Id: I50911679142b52f854baf20c187962a2a3698f2d
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Fix race in lockinfo (f)getxattr</title>
<updated>2021-04-12T13:03:39+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@users.noreply.github.com</email>
</author>
<published>2021-02-24T15:44:55+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=7feaeeabd3ad0b1410e78f584b7c5bbfb41ae0e6'/>
<id>7feaeeabd3ad0b1410e78f584b7c5bbfb41ae0e6</id>
<content type='text'>
* cluster/afr: Fix race in lockinfo (f)getxattr

A shared dictionary was updated outside the lock after having updated
the number of remaining answers. This means that one thread may be
processing the last answer and unwinding the request before another
thread completes updating the dict.

    Thread 1                           Thread 2

    LOCK()
    call_cnt-- (=1)
    UNLOCK()
                                       LOCK()
                                       call_cnt-- (=0)
                                       UNLOCK()
                                       update_dict(dict)
                                       if (call_cnt == 0) {
                                           STACK_UNWIND(dict);
                                       }
    update_dict(dict)
    if (call_cnt == 0) {
        STACK_UNWIND(dict);
    }

The updates from thread 1 are lost.

This patch also reduces the work done inside the locked region and
reduces code duplication.

Fixes: #2161
Change-Id: Idc0d34ab19ea6031de0641f7b05c624d90fac8fa
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* cluster/afr: Fix race in lockinfo (f)getxattr

A shared dictionary was updated outside the lock after having updated
the number of remaining answers. This means that one thread may be
processing the last answer and unwinding the request before another
thread completes updating the dict.

    Thread 1                           Thread 2

    LOCK()
    call_cnt-- (=1)
    UNLOCK()
                                       LOCK()
                                       call_cnt-- (=0)
                                       UNLOCK()
                                       update_dict(dict)
                                       if (call_cnt == 0) {
                                           STACK_UNWIND(dict);
                                       }
    update_dict(dict)
    if (call_cnt == 0) {
        STACK_UNWIND(dict);
    }

The updates from thread 1 are lost.

This patch also reduces the work done inside the locked region and
reduces code duplication.

Fixes: #2161
Change-Id: Idc0d34ab19ea6031de0641f7b05c624d90fac8fa
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>afr: fix directory entry count</title>
<updated>2021-04-09T16:30:14+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@redhat.com</email>
</author>
<published>2021-03-08T23:24:07+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=c3172d4883adf0bf277fd3a05825845978c7d7f2'/>
<id>c3172d4883adf0bf277fd3a05825845978c7d7f2</id>
<content type='text'>
AFR may hide some existing entries from a directory when reading it
because they are generated internally for private management. However,
the number of entries returned by the readdir() function is not updated
accordingly, so it may report more entries than are actually present
in the gf_dirent list.

This may cause unexpected behavior of clients, including gfapi which
incorrectly assumes that there was an entry when the list was actually
empty.

This patch also makes the check in gfapi more robust to avoid similar
issues that could appear in the future.

Fixes: #2232
Change-Id: I81ba3699248a53ebb0ee4e6e6231a4301436f763
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
AFR may hide some existing entries from a directory when reading it
because they are generated internally for private management. However,
the number of entries returned by the readdir() function is not updated
accordingly, so it may report more entries than are actually present
in the gf_dirent list.

This may cause unexpected behavior of clients, including gfapi which
incorrectly assumes that there was an entry when the list was actually
empty.

This patch also makes the check in gfapi more robust to avoid similar
issues that could appear in the future.

Fixes: #2232
Change-Id: I81ba3699248a53ebb0ee4e6e6231a4301436f763
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>afr: make fsync post-op aware of inodelk count (#2273) (#2297)</title>
<updated>2021-03-29T05:35:13+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2021-03-29T05:35:13+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=b313a20f342b74809972ed308c1415769a881fc5'/>
<id>b313a20f342b74809972ed308c1415769a881fc5</id>
<content type='text'>
Problem:
Since commit bd540db1e, eager-locking was enabled for fsync. But on
certain VM workloads with sharding enabled, the shard xlator keeps
sending fsync on the base shard. This can cause blocked inodelks from
other clients (including shd) to time out due to call bail.

Fix:
Make afr fsync aware of the inodelk count and not delay post-op + unlock
when the inodelk count &gt; 1, just like writev.

Code is restructured so that any fd-based AFR_DATA_TRANSACTION can be
made aware by setting GLUSTERFS_INODELK_DOM_COUNT in the xdata request.

Note: We do not know yet why VMs go into a paused state because of the
blocked inodelks, but this patch should be a first step in reducing the
occurrence.

Updates: #2198
Change-Id: Ib91ebdd3101d590c326e69c829cf9335003e260b
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Since commit bd540db1e, eager-locking was enabled for fsync. But on
certain VM workloads with sharding enabled, the shard xlator keeps
sending fsync on the base shard. This can cause blocked inodelks from
other clients (including shd) to time out due to call bail.

Fix:
Make afr fsync aware of the inodelk count and not delay post-op + unlock
when the inodelk count &gt; 1, just like writev.

Code is restructured so that any fd-based AFR_DATA_TRANSACTION can be
made aware by setting GLUSTERFS_INODELK_DOM_COUNT in the xdata request.

Note: We do not know yet why VMs go into a paused state because of the
blocked inodelks, but this patch should be a first step in reducing the
occurrence.

Updates: #2198
Change-Id: Ib91ebdd3101d590c326e69c829cf9335003e260b
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>afr: remove priv-&gt;root_inode (#2244) (#2279)</title>
<updated>2021-03-23T07:45:33+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2021-03-23T07:45:33+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=9d3bc96ed5842b43689a1f989255fac4f019c88f'/>
<id>9d3bc96ed5842b43689a1f989255fac4f019c88f</id>
<content type='text'>
priv-&gt;root_inode seems to be a remnant of the pump xlator and was
getting populated in the discover code path. The thin-arbiter code used
it to populate loc info, but in some daemons, such as quotad, the
discover path for the root gfid is not hit, causing a crash.

Fix:
The root inode can be accessed via this-&gt;itable-&gt;root, so use that and
remove the priv-&gt;root_inode instances from the afr code.

Fixes: #2234
Change-Id: Iec59c157f963a4dc455652a5c85a797d00cba52a
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
priv-&gt;root_inode seems to be a remnant of the pump xlator and was
getting populated in the discover code path. The thin-arbiter code used
it to populate loc info, but in some daemons, such as quotad, the
discover path for the root gfid is not hit, causing a crash.

Fix:
The root inode can be accessed via this-&gt;itable-&gt;root, so use that and
remove the priv-&gt;root_inode instances from the afr code.

Fixes: #2234
Change-Id: Iec59c157f963a4dc455652a5c85a797d00cba52a
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>posix: fix chmod error on symlinks (#2158)</title>
<updated>2021-02-12T14:59:36+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@users.noreply.github.com</email>
</author>
<published>2021-02-12T14:59:36+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=f03593e67f7131db54b8cfcd5a4be9586d77078a'/>
<id>f03593e67f7131db54b8cfcd5a4be9586d77078a</id>
<content type='text'>
After glibc 2.32, lchmod() returns EOPNOTSUPP instead of ENOSYS when
called on symlinks. The man page says that the returned code is ENOTSUP.
They are the same on Linux, but this patch correctly handles all errors.

Fixes: #2154
Change-Id: Ib3bb3d86d421cba3d7ec8d66b6beb131ef6e0925
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After glibc 2.32, lchmod() returns EOPNOTSUPP instead of ENOSYS when
called on symlinks. The man page says that the returned code is ENOTSUP.
They are the same on Linux, but this patch correctly handles all errors.

Fixes: #2154
Change-Id: Ib3bb3d86d421cba3d7ec8d66b6beb131ef6e0925
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Change self-heal-window-size to 4MiB by default (#2071)</title>
<updated>2021-02-11T16:02:32+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@users.noreply.github.com</email>
</author>
<published>2021-02-06T00:53:28+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=5ed9ab30676eb1dbdb233f0a066fe7c4b93d21a2'/>
<id>5ed9ab30676eb1dbdb233f0a066fe7c4b93d21a2</id>
<content type='text'>
The current block size used for self-heal by default is 128 KiB. This
requires a significant number of management requests for a very small
amount of healed data.

With this patch the block size is increased to 4 MiB. For a standard
EC volume configuration of 4+2, this means that each healed block of
a file will update 1 MiB on each brick.

Change-Id: Ifeec4a2d54988017d038085720513c121b03445b
Updates: #2067
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The current block size used for self-heal by default is 128 KiB. This
requires a significant number of management requests for a very small
amount of healed data.

With this patch the block size is increased to 4 MiB. For a standard
EC volume configuration of 4+2, this means that each healed block of
a file will update 1 MiB on each brick.

Change-Id: Ifeec4a2d54988017d038085720513c121b03445b
Updates: #2067
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>dht: don't ignore xdata in fgetxattr (#2020) (#2031)</title>
<updated>2021-02-06T00:46:42+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@users.noreply.github.com</email>
</author>
<published>2021-02-06T00:46:42+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=0742885610ef0159826d0310eca06ef162ba3422'/>
<id>0742885610ef0159826d0310eca06ef162ba3422</id>
<content type='text'>
DHT was passing NULL for xdata in fgetxattr() request, ignoring any
data sent by upper xlators.

This patch fixes the issue by sending the received xdata to lower
xlators, as it's currently done for getxattr().

Fixes: #1991
Change-Id: If3d3f1f2ce6215f3b1acc46480e133cb4294eaec
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
DHT was passing NULL for xdata in fgetxattr() request, ignoring any
data sent by upper xlators.

This patch fixes the issue by sending the received xdata to lower
xlators, as it's currently done for getxattr().

Fixes: #1991
Change-Id: If3d3f1f2ce6215f3b1acc46480e133cb4294eaec
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Change default self-heal-window-size to 1MB (#2113)</title>
<updated>2021-02-04T13:00:26+00:00</updated>
<author>
<name>Pranith Kumar Karampuri</name>
<email>pranith.karampuri@phonepe.com</email>
</author>
<published>2021-02-04T13:00:26+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=36d27890e1e0e17c29d4019a44029fd7b4533a23'/>
<id>36d27890e1e0e17c29d4019a44029fd7b4533a23</id>
<content type='text'>
At the moment, self-heal-window-size is 128KB, which leads to healing
data in 128KB chunks. With the growth of data and of average file sizes
nowadays, 1MB seems like a better default.

Change-Id: I70c42c83b16c7adb53d6b5762969e878477efb5c
Fixes: #2067
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
At the moment, self-heal-window-size is 128KB, which leads to healing
data in 128KB chunks. With the growth of data and of average file sizes
nowadays, 1MB seems like a better default.

Change-Id: I70c42c83b16c7adb53d6b5762969e878477efb5c
Fixes: #2067
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/dht: Allow fix-layout only on directories (#2109) (#2114)</title>
<updated>2021-02-04T12:59:33+00:00</updated>
<author>
<name>Pranith Kumar Karampuri</name>
<email>pranith.karampuri@phonepe.com</email>
</author>
<published>2021-02-04T12:59:33+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=228241e9a02d0a40170efd9a877e8476c0ee2dce'/>
<id>228241e9a02d0a40170efd9a877e8476c0ee2dce</id>
<content type='text'>
Problem:
The fix-layout operation assumes that the path passed is a directory,
i.e. layout-&gt;cnt == conf-&gt;subvolume_cnt. This will lead to a crash
when fix-layout is attempted on a file.

Fix:
Disallow fix-layout on files

fixes: #2107
Change-Id: I2116b8773059f67e3260e9207e20eab3de711417
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
The fix-layout operation assumes that the path passed is a directory,
i.e. layout-&gt;cnt == conf-&gt;subvolume_cnt. This will lead to a crash
when fix-layout is attempted on a file.

Fix:
Disallow fix-layout on files

fixes: #2107
Change-Id: I2116b8773059f67e3260e9207e20eab3de711417
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</pre>
</div>
</content>
</entry>
</feed>
