<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/tests/basic/afr, branch master</title>
<subtitle>GlusterFS is a distributed file-system capable of scaling to several petabytes. It aggregates various storage bricks over Infiniband RDMA or TCP/IP interconnect into one large parallel network file system.</subtitle>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/'/>
<entry>
<title>cluster/afr: Heal directory rename without rmdir/mkdir</title>
<updated>2020-10-01T12:03:25+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2020-04-13T14:01:51+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=9ecbd69127d373ac000e9e1be00c1829e49e64a4'/>
<id>9ecbd69127d373ac000e9e1be00c1829e49e64a4</id>
<content type='text'>
Problem1:
When a directory is renamed while a brick
is down entry-heal always did an rm -rf on that directory on
the sink on old location and did mkdir and created the directory
hierarchy again in the new location. This is inefficient.

Problem2:
Renamedir heal order may lead to a scenario where directory in
the new location could be created before deleting it from old
location leading to 2 directories with same gfid in posix.

Fix:
As part of heal, if oldlocation is healed first and is not present in
source-brick always rename it into a hidden directory inside the
sink-brick so that when heal is triggered in new-location shd can
rename it from this hidden directory to the new-location.

If new-location heal is triggered first and it detects that the
directory already exists in the brick, then it should skip healing the
directory until it appears in the hidden directory.

Credits: Ravi for rename-data-loss.t script

Fixes: #1211
Change-Id: I0cba2006f35cd03d314d18211ce0bd530e254843
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem1:
When a directory is renamed while a brick
is down entry-heal always did an rm -rf on that directory on
the sink on old location and did mkdir and created the directory
hierarchy again in the new location. This is inefficient.

Problem2:
Renamedir heal order may lead to a scenario where directory in
the new location could be created before deleting it from old
location leading to 2 directories with same gfid in posix.

Fix:
As part of heal, if oldlocation is healed first and is not present in
source-brick always rename it into a hidden directory inside the
sink-brick so that when heal is triggered in new-location shd can
rename it from this hidden directory to the new-location.

If new-location heal is triggered first and it detects that the
directory already exists in the brick, then it should skip healing the
directory until it appears in the hidden directory.

Credits: Ravi for rename-data-loss.t script

Fixes: #1211
Change-Id: I0cba2006f35cd03d314d18211ce0bd530e254843
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: provide an option to mark tests as 'flaky'</title>
<updated>2020-08-20T08:01:07+00:00</updated>
<author>
<name>Amar Tumballi</name>
<email>amar@kadalu.io</email>
</author>
<published>2020-08-18T08:38:20+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=097db13c11390174c5b9f11aa0fd87eca1735871'/>
<id>097db13c11390174c5b9f11aa0fd87eca1735871</id>
<content type='text'>
* also add some time gap in other tests to see if we get things properly
* create a directory 'tests/000/', which can host any tests, which are flaky.
* move all the tests mentioned in the issue to above directory.
* as the above dir gets tested first, all flaky tests would be reported quickly.
* change `run-tests.sh` to continue tests even if flaky tests fail.

Reference: gluster/project-infrastructure#72
Updates: #1000
Change-Id: Ifdafa38d083ebd80f7ae3cbbc9aa3b68b6d21d0e
Signed-off-by: Amar Tumballi &lt;amar@kadalu.io&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* also add some time gap in other tests to see if we get things properly
* create a directory 'tests/000/', which can host any tests, which are flaky.
* move all the tests mentioned in the issue to above directory.
* as the above dir gets tested first, all flaky tests would be reported quickly.
* change `run-tests.sh` to continue tests even if flaky tests fail.

Reference: gluster/project-infrastructure#72
Updates: #1000
Change-Id: Ifdafa38d083ebd80f7ae3cbbc9aa3b68b6d21d0e
Signed-off-by: Amar Tumballi &lt;amar@kadalu.io&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr/split-brain: fix client side split-brain resolution when quorum is enabled</title>
<updated>2020-08-13T04:20:36+00:00</updated>
<author>
<name>Mohammed Rafi KC</name>
<email>rafi.kavungal@iternity.com</email>
</author>
<published>2020-07-29T08:12:53+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=320bf93d36a1c20d661cb11968825fc055fcc168'/>
<id>320bf93d36a1c20d661cb11968825fc055fcc168</id>
<content type='text'>
Problem:
If we set favourite child policy, then automatic split-brain resolution
should work in all cases. This was failing when quorum count was set to
a non-zero value. The initial lookup before the read txn was failing
with ENOTCONN. Since we don't have a readable subvol, we were failing it.
We were only looking to the split brain resolution choice set through the
cli command.

Fix:
We will now consider the favourite child policy if split-brain choice
has not been set via cli command.

Change-Id: Id2016c3a90d0763ac6f1a0131571053f595576f0
Fixes: #1404
Signed-off-by: Mohammed Rafi KC &lt;rafi.kavungal@iternity.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
If we set favourite child policy, then automatic split-brain resolution
should work in all cases. This was failing when quorum count was set to
a non-zero value. The initial lookup before the read txn was failing
with ENOTCONN. Since we don't have a readable subvol, we were failing it.
We were only looking to the split brain resolution choice set through the
cli command.

Fix:
We will now consider the favourite child policy if split-brain choice
has not been set via cli command.

Change-Id: Id2016c3a90d0763ac6f1a0131571053f595576f0
Fixes: #1404
Signed-off-by: Mohammed Rafi KC &lt;rafi.kavungal@iternity.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Delay post-op for fsync</title>
<updated>2020-06-08T13:49:12+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2020-05-29T08:54:53+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=af89d9e623cd99d3cec55cd650304d7ff9fae7e5'/>
<id>af89d9e623cd99d3cec55cd650304d7ff9fae7e5</id>
<content type='text'>
Problem:
AFR doesn't delay post-op for fsync fop. For fsync heavy workloads
this leads to un-necessary fxattrop/finodelk for every fsync leading
to bad performance.

Fix:
Have delayed post-op for fsync. Add special flag in xdata to indicate
that afr shouldn't delay post-op in cases where either the
process will terminate or graph-switch would happen. Otherwise it leads
to un-necessary heals when the graph-switch/process-termination
happens before delayed-post-op completes.

Fixes: #1253
Change-Id: I531940d13269a111c49e0510d49514dc169f4577
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
AFR doesn't delay post-op for fsync fop. For fsync heavy workloads
this leads to un-necessary fxattrop/finodelk for every fsync leading
to bad performance.

Fix:
Have delayed post-op for fsync. Add special flag in xdata to indicate
that afr shouldn't delay post-op in cases where either the
process will terminate or graph-switch would happen. Otherwise it leads
to un-necessary heals when the graph-switch/process-termination
happens before delayed-post-op completes.

Fixes: #1253
Change-Id: I531940d13269a111c49e0510d49514dc169f4577
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mgmt/glusterd: Stop old shd before increasing replica count</title>
<updated>2020-05-16T11:22:09+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2020-05-13T16:23:23+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=dc1d549d0e776ecd4c8381577eeed990250a7f6a'/>
<id>dc1d549d0e776ecd4c8381577eeed990250a7f6a</id>
<content type='text'>
Problem:
In add-brick that increases replica count
SHD was restarted after pending xattrs are set on the new bricks and
adding bricks. But before restarting SHD there is a possibility that
old SHD would do a scan on root-directory see no heal is needed and
delete index for root-dir leading to no heals until lookup is executed
on the mount

Fix:
Stop shd, perform pending-xattr setting/adding new bricks and
then restart shd

Fixes: #1240
Change-Id: I94fd7c6c909211b597185dfe097a559db6c0d00f
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
In add-brick that increases replica count
SHD was restarted after pending xattrs are set on the new bricks and
adding bricks. But before restarting SHD there is a possibility that
old SHD would do a scan on root-directory see no heal is needed and
delete index for root-dir leading to no heals until lookup is executed
on the mount

Fix:
Stop shd, perform pending-xattr setting/adding new bricks and
then restart shd

Fixes: #1240
Change-Id: I94fd7c6c909211b597185dfe097a559db6c0d00f
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: event gen changes</title>
<updated>2020-04-24T12:47:14+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2020-04-10T09:00:57+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=f0fcd909ad4535b60c9208d4804ebe6afe421a09'/>
<id>f0fcd909ad4535b60c9208d4804ebe6afe421a09</id>
<content type='text'>
The general idea of the changes is to prevent resetting event generation
to zero in the inode ctx, since event gen is something that should
follow 'causal order'.

Change #1:
For a read txn, in inode refresh cbk, if event_generation is
found zero, we are failing the read fop. This is not needed
because change in event gen is only a marker for the next inode refresh to
happen and should not be taken into account by the current read txn.

Change #2:
The event gen being zero above can happen if there is a racing lookup,
which resets even get (in afr_lookup_done) if there are non zero afr
xattrs. The resetting is done only to trigger an inode refresh and a
possible client side heal on the next lookup. That can be acheived by
setting the need_refresh flag in the inode ctx. So replaced all
occurences of resetting even gen to zero with a call to
afr_inode_need_refresh_set().

Change #3:
In both lookup and discover path, we are doing an inode refresh which is
not required since all 3 essentially do the same thing- update the inode
ctx with the good/bad copies from the brick replies. Inode refresh also
triggers background heals, but I think it is okay to do it when we call
refresh during the read and write txns and not in the lookup path.

The .ts which relied on inode refresh in lookup path to trigger heals are
now changed to do read txn so that inode refresh and the heal happens.

Change-Id: Iebf39a9be6ffd7ffd6e4046c96b0fa78ade6c5ec
Fixes: #1179
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reported-by: Erik Jacobson &lt;erik.jacobson at hpe.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The general idea of the changes is to prevent resetting event generation
to zero in the inode ctx, since event gen is something that should
follow 'causal order'.

Change #1:
For a read txn, in inode refresh cbk, if event_generation is
found zero, we are failing the read fop. This is not needed
because change in event gen is only a marker for the next inode refresh to
happen and should not be taken into account by the current read txn.

Change #2:
The event gen being zero above can happen if there is a racing lookup,
which resets even get (in afr_lookup_done) if there are non zero afr
xattrs. The resetting is done only to trigger an inode refresh and a
possible client side heal on the next lookup. That can be acheived by
setting the need_refresh flag in the inode ctx. So replaced all
occurences of resetting even gen to zero with a call to
afr_inode_need_refresh_set().

Change #3:
In both lookup and discover path, we are doing an inode refresh which is
not required since all 3 essentially do the same thing- update the inode
ctx with the good/bad copies from the brick replies. Inode refresh also
triggers background heals, but I think it is okay to do it when we call
refresh during the read and write txns and not in the lookup path.

The .ts which relied on inode refresh in lookup path to trigger heals are
now changed to do read txn so that inode refresh and the heal happens.

Change-Id: Iebf39a9be6ffd7ffd6e4046c96b0fa78ade6c5ec
Fixes: #1179
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reported-by: Erik Jacobson &lt;erik.jacobson at hpe.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Fixes for halo</title>
<updated>2020-03-13T13:20:37+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2020-02-04T13:12:33+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=b164a74884becef281b57ef93428bb740e3e342e'/>
<id>b164a74884becef281b57ef93428bb740e3e342e</id>
<content type='text'>
Current implementation assumes that ping-event will come after connect event
but that may not be the case in the cases where after socket connection fds
need to be re-opened which would consume more time. So handle any order of the
ping/child-up events.

fixes: bz#1800583
Change-Id: I6bcdc0caa503bdc039ef2b4739fbf4afae121f05
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Current implementation assumes that ping-event will come after connect event
but that may not be the case in the cases where after socket connection fds
need to be re-opened which would consume more time. So handle any order of the
ping/child-up events.

fixes: bz#1800583
Change-Id: I6bcdc0caa503bdc039ef2b4739fbf4afae121f05
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Add afr_seek to fops table</title>
<updated>2019-10-14T11:26:56+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2019-10-10T04:07:24+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=2f43a1a4b567ec02ec03847b6eaec5ac59e68808'/>
<id>2f43a1a4b567ec02ec03847b6eaec5ac59e68808</id>
<content type='text'>
fixes: bz#1760189
Change-Id: Iffbf8d6f4c50b8e2de8364658697bdbe96549f5d
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
fixes: bz#1760189
Change-Id: Iffbf8d6f4c50b8e2de8364658697bdbe96549f5d
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ctime/rebalance: Heal ctime xattr on directory during rebalance</title>
<updated>2019-09-16T06:42:29+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2019-07-29T13:00:42+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=304640e55c0f3c6d15f4e230dc6376e4f5020fea'/>
<id>304640e55c0f3c6d15f4e230dc6376e4f5020fea</id>
<content type='text'>
After add-brick and rebalance, the ctime xattr is not present
on rebalanced directories on new brick. This patch fixes the
same.

Note that ctime still doesn't support consistent time across
distribute sub-volume.

This patch also fixes the in-memory inconsistency of time attributes
when metadata is self healed.

Change-Id: Ia20506f1839021bf61d4753191e7dc34b31bb2df
fixes: bz#1734026
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After add-brick and rebalance, the ctime xattr is not present
on rebalanced directories on new brick. This patch fixes the
same.

Note that ctime still doesn't support consistent time across
distribute sub-volume.

This patch also fixes the in-memory inconsistency of time attributes
when metadata is self healed.

Change-Id: Ia20506f1839021bf61d4753191e7dc34b31bb2df
fixes: bz#1734026
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>posix/ctime: Fix ctime upgrade issue</title>
<updated>2019-06-21T11:09:32+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2019-06-13T10:53:21+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=2959fbfbb77e2b1b2ccbc90610f8ff9362109ae3'/>
<id>2959fbfbb77e2b1b2ccbc90610f8ff9362109ae3</id>
<content type='text'>
Problem:
On a EC volume, during upgrade from the older version where
ctime feature is not enabled(or not present) to the newer
version where the ctime feature is available (enabled default),
the self heal hangs and doesn't complete.

Cause:
The ctime feature has both client side code (utime) and
server side code (posix). The feature is driven from client.
Only if the client side sets the time in the frame, should
the server side sets the time attributes in xattr. But posix
setattr/fseattr was not doing that. When one of the server
nodes is updated, since ctime is enabled by default, it
starts setting xattr on setattr/fseattr on the updated node/brick.

On a EC volume the first two updated nodes(bricks) are not a
problem because there are 4 other bricks with consistent data.
However once the third brick is updated, the new attribute(mdata xattr)
will cause an inconsistency on metadata on 3 bricks, which
prevents the file to be repaired.

Fix:
Don't create mdata xattr with utimes/utimensat system call.
Only update if already present.

Change-Id: Ieacedecb8a738bb437283ef3e0f042fd49dc4c8c
fixes: bz#1720201
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
On a EC volume, during upgrade from the older version where
ctime feature is not enabled(or not present) to the newer
version where the ctime feature is available (enabled default),
the self heal hangs and doesn't complete.

Cause:
The ctime feature has both client side code (utime) and
server side code (posix). The feature is driven from client.
Only if the client side sets the time in the frame, should
the server side sets the time attributes in xattr. But posix
setattr/fseattr was not doing that. When one of the server
nodes is updated, since ctime is enabled by default, it
starts setting xattr on setattr/fseattr on the updated node/brick.

On a EC volume the first two updated nodes(bricks) are not a
problem because there are 4 other bricks with consistent data.
However once the third brick is updated, the new attribute(mdata xattr)
will cause an inconsistency on metadata on 3 bricks, which
prevents the file to be repaired.

Fix:
Don't create mdata xattr with utimes/utimensat system call.
Only update if already present.

Change-Id: Ieacedecb8a738bb437283ef3e0f042fd49dc4c8c
fixes: bz#1720201
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
