<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/tests, branch devel</title>
<subtitle>GlusterFS is a distributed file-system capable of scaling to several petabytes. It aggregates various storage bricks over Infiniband RDMA or TCP/IP interconnect into one large parallel network file system.</subtitle>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/'/>
<entry>
<title>tests: avoid empty paths in environment variables (#2349)</title>
<updated>2021-04-23T12:52:56+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@redhat.com</email>
</author>
<published>2021-04-23T12:52:56+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=6a43697a6c2f41336a4ce6b85ea2b50c449fc7d0'/>
<id>6a43697a6c2f41336a4ce6b85ea2b50c449fc7d0</id>
<content type='text'>
* tests: avoid empty paths in environment variables

Many variables containing paths in env.rc.in are defined in a way
that leave a trailing ':' in the variable when the previous value
was empty or undefined.

In the particular case of 'LD_PRELOAD_PATH' variable, this causes
that the system looks for dynamic libraries in the current working
directory. When this directory is inside a Gluster mount point, a
significant delay is caused each time a program is run (and testing
framework can run lots of programs for each test).

This patch prevents that variables containing paths could end with
a trailing ':'.

Fixes: #2348
Change-Id: I669f5a78e14f176c0a58824ba577330989d84769
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;

* Fix PYTHONPATH name and duplicity

Change-Id: Iaa0b092118bb86856bbe621eb03fef6fa7478971
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* tests: avoid empty paths in environment variables

Many variables containing paths in env.rc.in are defined in a way
that leave a trailing ':' in the variable when the previous value
was empty or undefined.

In the particular case of 'LD_PRELOAD_PATH' variable, this causes
that the system looks for dynamic libraries in the current working
directory. When this directory is inside a Gluster mount point, a
significant delay is caused each time a program is run (and testing
framework can run lots of programs for each test).

This patch prevents that variables containing paths could end with
a trailing ':'.

Fixes: #2348
Change-Id: I669f5a78e14f176c0a58824ba577330989d84769
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;

* Fix PYTHONPATH name and duplicity

Change-Id: Iaa0b092118bb86856bbe621eb03fef6fa7478971
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>protocol/client: Fix lock memory leak (#2338)</title>
<updated>2021-04-22T05:53:05+00:00</updated>
<author>
<name>Pranith Kumar Karampuri</name>
<email>pranith.karampuri@phonepe.com</email>
</author>
<published>2021-04-22T05:53:05+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=4230e21f7c60387f818bdbbd3008484ca61640b2'/>
<id>4230e21f7c60387f818bdbbd3008484ca61640b2</id>
<content type='text'>
Problem-1:
When an overlapping lock is issued the merged lock is not assigned the
owner. When flush is issued on the fd, this particular lock is not freed
leading to memory leak

Fix-1:
Assign the owner while merging the locks.

Problem-2:
On fd-destroy lock structs could be present in fdctx. For some reason
with flock -x command and closing of the bash fd, it leads to this code
path. Which leaks the lock structs.

Fix-2:
When fdctx is being destroyed in client, make sure to cleanup any lock
structs.

fixes: #2337
Change-Id: I298124213ce5a1cf2b1f1756d5e8a9745d9c0a1c
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem-1:
When an overlapping lock is issued the merged lock is not assigned the
owner. When flush is issued on the fd, this particular lock is not freed
leading to memory leak

Fix-1:
Assign the owner while merging the locks.

Problem-2:
On fd-destroy lock structs could be present in fdctx. For some reason
with flock -x command and closing of the bash fd, it leads to this code
path. Which leaks the lock structs.

Fix-2:
When fdctx is being destroyed in client, make sure to cleanup any lock
structs.

fixes: #2337
Change-Id: I298124213ce5a1cf2b1f1756d5e8a9745d9c0a1c
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>dht: fix rebalance of sparse files (#2318)</title>
<updated>2021-04-09T16:13:30+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@gmail.com</email>
</author>
<published>2021-04-09T16:13:30+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=a62c5666e2b58668d0ddec7f3331446313539e72'/>
<id>a62c5666e2b58668d0ddec7f3331446313539e72</id>
<content type='text'>
Current implementation of rebalance for sparse files has a bug that,
in some cases, causes a read of 0 bytes from the source subvolume.
Posix xlator doesn't allow 0 byte reads and fails them with EINVAL,
which causes rebalance to abort the migration.

This patch implements a more robust way of finding data segments in
a sparse file that avoids 0 byte reads, allowing the file to be
migrated successfully.

Fixes: #2317
Change-Id: Iff168dda2fb0f2edf716b21eb04cc2cc8ac3915c
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Current implementation of rebalance for sparse files has a bug that,
in some cases, causes a read of 0 bytes from the source subvolume.
Posix xlator doesn't allow 0 byte reads and fails them with EINVAL,
which causes rebalance to abort the migration.

This patch implements a more robust way of finding data segments in
a sparse file that avoids 0 byte reads, allowing the file to be
migrated successfully.

Fixes: #2317
Change-Id: Iff168dda2fb0f2edf716b21eb04cc2cc8ac3915c
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>tests: Fix incorrect variables in throttle-rebal.t (#2316)</title>
<updated>2021-04-08T02:23:44+00:00</updated>
<author>
<name>Jamie Nguyen</name>
<email>j@jamielinux.com</email>
</author>
<published>2021-04-08T02:23:44+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=7a9f7cb3f6929ed2d4af603f1aa64ca967140940'/>
<id>7a9f7cb3f6929ed2d4af603f1aa64ca967140940</id>
<content type='text'>
The final test doesn't test what it means to test. It still fails as
expected, but only because at this point `THROTTLE_LEVEL` is still set
to `garbage`.

Easily fixed by correcting the typos in the variable names, and thus
fixes https://github.com/gluster/glusterfs/issues/2315

Signed-off-by: Jamie Nguyen &lt;j@jamielinux.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The final test doesn't test what it means to test. It still fails as
expected, but only because at this point `THROTTLE_LEVEL` is still set
to `garbage`.

Easily fixed by correcting the typos in the variable names, and thus
fixes https://github.com/gluster/glusterfs/issues/2315

Signed-off-by: Jamie Nguyen &lt;j@jamielinux.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>Removal of force option in snapshot create (#2110)</title>
<updated>2021-04-06T14:10:50+00:00</updated>
<author>
<name>nishith-vihar</name>
<email>77044911+nishith-vihar@users.noreply.github.com</email>
</author>
<published>2021-04-06T14:10:50+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=5fd3d4175fc2ba16fe92129ff43fb7fb23d8615a'/>
<id>5fd3d4175fc2ba16fe92129ff43fb7fb23d8615a</id>
<content type='text'>
The force option does fail for snapshot create command even though
the quorum is satisfied and is redundant.

The change deprecates the force option for snapshot create command
and checks if all bricks are online instead of checking for quorum
for creating a snapshot.

Fixes: #2099
Change-Id: I45d866e67052fef982a60aebe8dec069e78015bd
Signed-off-by: Nishith Vihar Sakinala &lt;nsakinal@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The force option does fail for snapshot create command even though
the quorum is satisfied and is redundant.

The change deprecates the force option for snapshot create command
and checks if all bricks are online instead of checking for quorum
for creating a snapshot.

Fixes: #2099
Change-Id: I45d866e67052fef982a60aebe8dec069e78015bd
Signed-off-by: Nishith Vihar Sakinala &lt;nsakinal@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>afr: don't reopen fds on which POSIX locks are held (#1980)</title>
<updated>2021-03-27T10:14:04+00:00</updated>
<author>
<name>Karthik Subrahmanya</name>
<email>ksubrahm@redhat.com</email>
</author>
<published>2021-03-27T10:14:04+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=2a524ad0738be491dcac3cc96db1411320168c72'/>
<id>2a524ad0738be491dcac3cc96db1411320168c72</id>
<content type='text'>
When client.strict-locks is enabled on a volume and there are POSIX
locks held on the files, after disconnect and reconnection of the
clients do not re-open such fds which might lead to multiple clients
acquiring the locks and cause data corruption.

Change-Id: I8777ffbc2cc8d15ab57b58b72b56eb67521787c5
Fixes: #1977
Signed-off-by: karthik-us &lt;ksubrahm@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When client.strict-locks is enabled on a volume and there are POSIX
locks held on the files, after disconnect and reconnection of the
clients do not re-open such fds which might lead to multiple clients
acquiring the locks and cause data corruption.

Change-Id: I8777ffbc2cc8d15ab57b58b72b56eb67521787c5
Fixes: #1977
Signed-off-by: karthik-us &lt;ksubrahm@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/dht: use readdir for fix-layout in rebalance (#2243)</title>
<updated>2021-03-22T04:49:27+00:00</updated>
<author>
<name>Pranith Kumar Karampuri</name>
<email>pranith.karampuri@phonepe.com</email>
</author>
<published>2021-03-22T04:49:27+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=ec189a499d85c2aad1d54e55e47df6b95ba02922'/>
<id>ec189a499d85c2aad1d54e55e47df6b95ba02922</id>
<content type='text'>
Problem:
On a cluster with 15 million files, when fix-layout was started, it was
not progressing at all. So we tried to do a os.walk() + os.stat() on the
backend filesystem directly. It took 2.5 days. We removed os.stat() and
re-ran it on another brick with similar data-set. It took 15 minutes. We
realized that readdirp is extremely costly compared to readdir if the
stat is not useful. fix-layout operation only needs to know that the
entry is a directory so that fix-layout operation can be triggered on
it. Most of the modern filesystems provide this information in readdir
operation. We don't need readdirp i.e. readdir+stat.

Fix:
Use readdir operation in fix-layout. Do readdir+stat/lookup for
filesystems that don't provide d_type in readdir operation.

fixes: #2241
Change-Id: I5fe2ecea25a399ad58e31a2e322caf69fc7f49eb
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
On a cluster with 15 million files, when fix-layout was started, it was
not progressing at all. So we tried to do a os.walk() + os.stat() on the
backend filesystem directly. It took 2.5 days. We removed os.stat() and
re-ran it on another brick with similar data-set. It took 15 minutes. We
realized that readdirp is extremely costly compared to readdir if the
stat is not useful. fix-layout operation only needs to know that the
entry is a directory so that fix-layout operation can be triggered on
it. Most of the modern filesystems provide this information in readdir
operation. We don't need readdirp i.e. readdir+stat.

Fix:
Use readdir operation in fix-layout. Do readdir+stat/lookup for
filesystems that don't provide d_type in readdir operation.

fixes: #2241
Change-Id: I5fe2ecea25a399ad58e31a2e322caf69fc7f49eb
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/dht: Provide option to disable fsync in data migration (#2259)</title>
<updated>2021-03-17T05:32:21+00:00</updated>
<author>
<name>Pranith Kumar Karampuri</name>
<email>pranith.karampuri@phonepe.com</email>
</author>
<published>2021-03-17T05:32:21+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=088d8a575c59479defb5dbe8bf03f4211156df7f'/>
<id>088d8a575c59479defb5dbe8bf03f4211156df7f</id>
<content type='text'>
At the moment dht rebalance doesn't give any option to disable fsync
after data migration. Making this an option would give admins take
responsibility of data in a way that is suitable for their cluster.
Default value is still 'on', so that the behavior is intact for people
who don't care about this.

For example: If the data that is going to be migrated is already backed
up or snapshotted, there is no need for fsync to happen right after
migration which can affect active I/O on the volume from applications.

fixes: #2258
Change-Id: I7a50b8d3a2f270d79920ef306ceb6ba6451150c4
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
At the moment dht rebalance doesn't give any option to disable fsync
after data migration. Making this an option would give admins take
responsibility of data in a way that is suitable for their cluster.
Default value is still 'on', so that the behavior is intact for people
who don't care about this.

For example: If the data that is going to be migrated is already backed
up or snapshotted, there is no need for fsync to happen right after
migration which can affect active I/O on the volume from applications.

fixes: #2258
Change-Id: I7a50b8d3a2f270d79920ef306ceb6ba6451150c4
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>features/index: Optimize link-count fetching code path (#1789)</title>
<updated>2021-03-10T05:13:24+00:00</updated>
<author>
<name>Pranith Kumar Karampuri</name>
<email>pranith.karampuri@phonepe.com</email>
</author>
<published>2021-03-10T05:13:24+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=46949c4951eb1d2eb0a90c21db66c31e444bffe8'/>
<id>46949c4951eb1d2eb0a90c21db66c31e444bffe8</id>
<content type='text'>
* features/index: Optimize link-count fetching code path

Problem:
AFR requests 'link-count' in lookup to check if there are any pending
heals. Based on this information, afr will set dirent-&gt;inode to NULL in
readdirp when heals are ongoing to prevent serving bad data. When heals
are completed, link-count xattr is leading to doing an opendir of
xattrop directory and then reading the contents to figure out that there
is no healing needed for every lookup. This was not detected until this
github issue because ZFS in some cases can lead to very slow readdir()
calls. Since Glusterfs does lot of lookups, this was slowing down
all operations increasing load on the system.

Code problem:
index xlator on any xattrop operation adds index to the relevant dirs
and after the xattrop operation is done, will delete/keep the index in
that directory based on the value fetched in xattrop from posix. AFR
sends all-zero xattrop for changelog xattrs. This is leading to
priv-&gt;pending_count manipulation which sets the count back to -1. Next
Lookup operation triggers opendir/readdir to find the actual link-count in
lookup because in memory priv-&gt;pending_count is -ve.

Fix:
1) Don't add to index on all-zero xattrop for a key.
2) Set pending-count to -1 when the first gfid is added into xattrop
   directory, so that the next lookup can compute the link-count.

fixes: #1764
Change-Id: I8a02c7e811a72c46d78ddb2d9d4fdc2222a444e9
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;

* addressed comments

Change-Id: Ide42bb1c1237b525d168bf1a9b82eb1bdc3bc283
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;

* tests: Handle base index absence

Change-Id: I3cf11a8644ccf23e01537228766f864b63c49556
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;

* Addressed LOCK based comments, .t comments

Change-Id: I5f53e40820cade3a44259c1ac1a7f3c5f2f0f310
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* features/index: Optimize link-count fetching code path

Problem:
AFR requests 'link-count' in lookup to check if there are any pending
heals. Based on this information, afr will set dirent-&gt;inode to NULL in
readdirp when heals are ongoing to prevent serving bad data. When heals
are completed, link-count xattr is leading to doing an opendir of
xattrop directory and then reading the contents to figure out that there
is no healing needed for every lookup. This was not detected until this
github issue because ZFS in some cases can lead to very slow readdir()
calls. Since Glusterfs does lot of lookups, this was slowing down
all operations increasing load on the system.

Code problem:
index xlator on any xattrop operation adds index to the relevant dirs
and after the xattrop operation is done, will delete/keep the index in
that directory based on the value fetched in xattrop from posix. AFR
sends all-zero xattrop for changelog xattrs. This is leading to
priv-&gt;pending_count manipulation which sets the count back to -1. Next
Lookup operation triggers opendir/readdir to find the actual link-count in
lookup because in memory priv-&gt;pending_count is -ve.

Fix:
1) Don't add to index on all-zero xattrop for a key.
2) Set pending-count to -1 when the first gfid is added into xattrop
   directory, so that the next lookup can compute the link-count.

fixes: #1764
Change-Id: I8a02c7e811a72c46d78ddb2d9d4fdc2222a444e9
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;

* addressed comments

Change-Id: Ide42bb1c1237b525d168bf1a9b82eb1bdc3bc283
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;

* tests: Handle base index absence

Change-Id: I3cf11a8644ccf23e01537228766f864b63c49556
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;

* Addressed LOCK based comments, .t comments

Change-Id: I5f53e40820cade3a44259c1ac1a7f3c5f2f0f310
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>afr: fix directory entry count (#2233)</title>
<updated>2021-03-08T23:24:07+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@redhat.com</email>
</author>
<published>2021-03-08T23:24:07+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=dc9bab7959b068617ef00f355c63bdca060b9605'/>
<id>dc9bab7959b068617ef00f355c63bdca060b9605</id>
<content type='text'>
AFR may hide some existing entries from a directory when reading it
because they are generated internally for private management. However
the returned number of entries from readdir() function is not updated
accordingly. So it may return a number higher than the real entries
present in the gf_dirent list.

This may cause unexpected behavior of clients, including gfapi which
incorrectly assumes that there was an entry when the list was actually
empty.

This patch also makes the check in gfapi more robust to avoid similar
issues that could appear in the future.

Fixes: #2232
Change-Id: I81ba3699248a53ebb0ee4e6e6231a4301436f763
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
AFR may hide some existing entries from a directory when reading it
because they are generated internally for private management. However
the returned number of entries from readdir() function is not updated
accordingly. So it may return a number higher than the real entries
present in the gf_dirent list.

This may cause unexpected behavior of clients, including gfapi which
incorrectly assumes that there was an entry when the list was actually
empty.

This patch also makes the check in gfapi more robust to avoid similar
issues that could appear in the future.

Fixes: #2232
Change-Id: I81ba3699248a53ebb0ee4e6e6231a4301436f763
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;</pre>
</div>
</content>
</entry>
</feed>
