<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/cluster, branch v3.10dev</title>
<subtitle>GlusterFS is a distributed file-system capable of scaling to several petabytes. It aggregates various storage bricks over Infiniband RDMA or TCP/IP interconnect into one large parallel network file system.</subtitle>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/'/>
<entry>
<title>dht, md-cache, upcall: Add invalidation of IATT when the layout changes</title>
<updated>2016-08-31T06:08:54+00:00</updated>
<author>
<name>Poornima G</name>
<email>pgurusid@redhat.com</email>
</author>
<published>2016-08-23T12:45:22+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=065a27948c4e0651f5bdac1703939adf34e5380e'/>
<id>065a27948c4e0651f5bdac1703939adf34e5380e</id>
<content type='text'>
Issue:
dht_layout is built as a part of lookup only. The layout can be
modified by rebalance process. Since every IO fop is preceded
by a lookup, there are very less issues of stale layout. But
with enhancements of aggressive caching of stats in md-cache,
the lookup will reduce and expose the stale layout issue often.

Solution:
Since stale layout is already an issue on dht, there is already
a plan to fix this at the dht layer, but this fix is not currently
planned for any release. Until this fix comes out, we can have
a workaround where, the upcall will send a notification to md-cache
when a layout xattr is changed. As a part of layout change notification
the existing cache is invalidated and the next lookup will fetch the
latest layout.

This is not a foolproof solution as the window between the layout change
and the next lookup(after invalidation of stat), where there will be stale
layout. But until the final fix comes in, this reduces the stale layout
window.

Change-Id: Iacf871a38b35880c1fc0bc68fe7ce291265e71d4
BUG: 1369638
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15300
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Issue:
dht_layout is built as a part of lookup only. The layout can be
modified by rebalance process. Since every IO fop is preceded
by a lookup, there are very less issues of stale layout. But
with enhancements of aggressive caching of stats in md-cache,
the lookup will reduce and expose the stale layout issue often.

Solution:
Since stale layout is already an issue on dht, there is already
a plan to fix this at the dht layer, but this fix is not currently
planned for any release. Until this fix comes out, we can have
a workaround where, the upcall will send a notification to md-cache
when a layout xattr is changed. As a part of layout change notification
the existing cache is invalidated and the next lookup will fetch the
latest layout.

This is not a foolproof solution as the window between the layout change
and the next lookup(after invalidation of stat), where there will be stale
layout. But until the final fix comes in, this reduces the stale layout
window.

Change-Id: Iacf871a38b35880c1fc0bc68fe7ce291265e71d4
BUG: 1369638
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15300
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dht/tiering: fix unused variable warnings/errors</title>
<updated>2016-08-30T18:45:03+00:00</updated>
<author>
<name>Kaleb S. KEITHLEY</name>
<email>kkeithle@redhat.com</email>
</author>
<published>2016-08-22T16:11:24+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=68cc90552b7109ee8d740decc33d65cfdfe0bf81'/>
<id>68cc90552b7109ee8d740decc33d65cfdfe0bf81</id>
<content type='text'>
http://review.gluster.org/14085 fixes a/the "leak" - via the
generated rpc/xdr headers - of pragmas that mask these warnings.

However 14085 won't pass the smoke test until all the warnings are
fixed.

Change-Id: I367a737570dd7d2f6cc25f4bf4299d31bb6826aa
BUG: 1369124
Signed-off-by: Kaleb S. KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15242
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Prashanth Pai &lt;ppai@redhat.com&gt;
Reviewed-by: Dan Lambright &lt;dlambrig@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
http://review.gluster.org/14085 fixes a/the "leak" - via the
generated rpc/xdr headers - of pragmas that mask these warnings.

However 14085 won't pass the smoke test until all the warnings are
fixed.

Change-Id: I367a737570dd7d2f6cc25f4bf4299d31bb6826aa
BUG: 1369124
Signed-off-by: Kaleb S. KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15242
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Prashanth Pai &lt;ppai@redhat.com&gt;
Reviewed-by: Dan Lambright &lt;dlambrig@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>glusterd : Introduce reset brick</title>
<updated>2016-08-30T02:55:53+00:00</updated>
<author>
<name>Anuradha Talur</name>
<email>atalur@redhat.com</email>
</author>
<published>2016-08-22T17:22:03+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=936f8aeac3252951e7fa0cdaa5d260fad3bd5ea0'/>
<id>936f8aeac3252951e7fa0cdaa5d260fad3bd5ea0</id>
<content type='text'>
The command basically allows replace brick with src and
dst bricks as same.

Usage:
gluster v reset-brick &lt;volname&gt; &lt;hostname:brick-path&gt; start
This command kills the brick to be reset. Once this command is run,
admin can do other manual operations that they need to do,
like configuring some options for the brick. Once this is done,
resetting the brick can be continued with the following options.

gluster v reset-brick &lt;vname&gt; &lt;hostname:brick&gt; &lt;hostname:brick&gt; commit {force}

Does the job of resetting the brick. 'force' option should be used
when the brick already contains volinfo id.

Problem: On doing a disk-replacement of a brick in a replicate volume
the following 2 scenarios may occur :

a) there is a chance that reads are served from this replaced-disk brick,
which leads to empty reads. b) potential data loss if next writes succeed
only on replaced brick, and heal is done to other bricks from this one.

Solution: After disk-replacement, make sure that reset-brick command is
run for that brick so that pending markers are set for the brick and it
is not chosen as source for reads and heal. But, as of now replace-brick
for the same brick-path is not allowed. In order to fix the above
mentioned problem, same brick-path replace-brick is needed.
With this patch reset-brick commit {force} will be allowed even when
source and destination &lt;hostname:brickpath&gt; are identical as long as
1) destination brick is not alive
2) source and destination brick have the same brick uuid and path.
Also, the destination brick after replace-brick will use the same port
as the source brick.

Change-Id: I440b9e892ffb781ea4b8563688c3f85c7a7c89de
BUG: 1266876
Signed-off-by: Anuradha Talur &lt;atalur@redhat.com&gt;
Reviewed-on: http://review.gluster.org/12250
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The command basically allows replace brick with src and
dst bricks as same.

Usage:
gluster v reset-brick &lt;volname&gt; &lt;hostname:brick-path&gt; start
This command kills the brick to be reset. Once this command is run,
admin can do other manual operations that they need to do,
like configuring some options for the brick. Once this is done,
resetting the brick can be continued with the following options.

gluster v reset-brick &lt;vname&gt; &lt;hostname:brick&gt; &lt;hostname:brick&gt; commit {force}

Does the job of resetting the brick. 'force' option should be used
when the brick already contains volinfo id.

Problem: On doing a disk-replacement of a brick in a replicate volume
the following 2 scenarios may occur :

a) there is a chance that reads are served from this replaced-disk brick,
which leads to empty reads. b) potential data loss if next writes succeed
only on replaced brick, and heal is done to other bricks from this one.

Solution: After disk-replacement, make sure that reset-brick command is
run for that brick so that pending markers are set for the brick and it
is not chosen as source for reads and heal. But, as of now replace-brick
for the same brick-path is not allowed. In order to fix the above
mentioned problem, same brick-path replace-brick is needed.
With this patch reset-brick commit {force} will be allowed even when
source and destination &lt;hostname:brickpath&gt; are identical as long as
1) destination brick is not alive
2) source and destination brick have the same brick uuid and path.
Also, the destination brick after replace-brick will use the same port
as the source brick.

Change-Id: I440b9e892ffb781ea4b8563688c3f85c7a7c89de
BUG: 1266876
Signed-off-by: Anuradha Talur &lt;atalur@redhat.com&gt;
Reviewed-on: http://review.gluster.org/12250
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: fix unused variable warnings/errors</title>
<updated>2016-08-29T16:21:23+00:00</updated>
<author>
<name>Kaleb S. KEITHLEY</name>
<email>kkeithle@redhat.com</email>
</author>
<published>2016-08-22T16:11:24+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=5751f294fc8c6ea256fc5a842c7a6eef14d35fd3'/>
<id>5751f294fc8c6ea256fc5a842c7a6eef14d35fd3</id>
<content type='text'>
http://review.gluster.org/14085 fixes a/the "leak" - via the
generated rpc/xdr headers - of pragmas that mask these warnings.

However 14085 won't pass the smoke test until all the warnings are
fixed.

Change-Id: I98e3308a2548ae095048caa99c86edec15b5e782
BUG: 1369124
Signed-off-by: Kaleb S. KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15241
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
http://review.gluster.org/14085 fixes a/the "leak" - via the
generated rpc/xdr headers - of pragmas that mask these warnings.

However 14085 won't pass the smoke test until all the warnings are
fixed.

Change-Id: I98e3308a2548ae095048caa99c86edec15b5e782
BUG: 1369124
Signed-off-by: Kaleb S. KEITHLEY &lt;kkeithle@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15241
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dht: Implement ipc fop</title>
<updated>2016-08-27T12:48:36+00:00</updated>
<author>
<name>Poornima G</name>
<email>pgurusid@redhat.com</email>
</author>
<published>2016-07-18T15:49:34+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=b85c648a6b236ca03494cb61b97e2e703be0950c'/>
<id>b85c648a6b236ca03494cb61b97e2e703be0950c</id>
<content type='text'>
ipc is used by md-cache to communicate the list of xattrs that
it is caching, to the upcall xlator. Hence implement this in
dht, such that it winds to all the bricks if the ipc op is
GF_IPC_MDC_TARGET_UPCALL. The ips should not fail if any of
the bricks is down, as md-cache will replay the ipc late when
the brick comes back up.

Change-Id: Ica551a550c04cbb1240c0d211fe831c2e5eb6017
BUG: 1211863
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15225
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ipc is used by md-cache to communicate the list of xattrs that
it is caching, to the upcall xlator. Hence implement this in
dht, such that it winds to all the bricks if the ipc op is
GF_IPC_MDC_TARGET_UPCALL. The ips should not fail if any of
the bricks is down, as md-cache will replay the ipc late when
the brick comes back up.

Change-Id: Ica551a550c04cbb1240c0d211fe831c2e5eb6017
BUG: 1211863
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15225
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Use locks for opendir</title>
<updated>2016-08-25T13:48:33+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-08-24T15:31:05+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=f013335400d033a9677797377b90b968803135f4'/>
<id>f013335400d033a9677797377b90b968803135f4</id>
<content type='text'>
Problem:
In some cases we see that readdir keeps winding to the brick that doesn't have
any blocked locks i.e. first brick. This is leading to the client assuming that
there are no blocking locks on the inode so it won't give away the lock. Other
clients end up blocked on the lock as if the command hung.

Fix:
Proper way to fix this issue is to use infra present in
http://review.gluster.org/14736 This is a stop gap fix where we start taking
inodelks in opendir which goes to all the bricks, this will detect if there is
any contention.

BUG: 1346719
Change-Id: I91109107a26f6535b945ac476338e9f21dc31eb9
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15309
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
In some cases we see that readdir keeps winding to the brick that doesn't have
any blocked locks i.e. first brick. This is leading to the client assuming that
there are no blocking locks on the inode so it won't give away the lock. Other
clients end up blocked on the lock as if the command hung.

Fix:
Proper way to fix this issue is to use infra present in
http://review.gluster.org/14736 This is a stop gap fix where we start taking
inodelks in opendir which goes to all the bricks, this will detect if there is
any contention.

BUG: 1346719
Change-Id: I91109107a26f6535b945ac476338e9f21dc31eb9
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15309
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>quotad: fix potential buffer overflows</title>
<updated>2016-08-25T12:18:09+00:00</updated>
<author>
<name>Raghavendra G</name>
<email>rgowdapp@redhat.com</email>
</author>
<published>2016-05-06T06:56:29+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=c68b561f048a02f479819b1c9cb3b5b896db18a6'/>
<id>c68b561f048a02f479819b1c9cb3b5b896db18a6</id>
<content type='text'>
This converts sprintf to gf_asprintf in following components:                                                                                                          * quotad.c
* dht
* afr
* protocol/client
* rpc/rpc-lib
* rpc/rpc-transport

Change-Id: If8a267bab3d91003bdef3a92664077a0136745ee
BUG: 1332073
Signed-off-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
Reviewed-on: http://review.gluster.org/14102
Tested-by: Manikandan Selvaganesh &lt;mselvaga@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Manikandan Selvaganesh &lt;mselvaga@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This converts sprintf to gf_asprintf in following components:                                                                                                          * quotad.c
* dht
* afr
* protocol/client
* rpc/rpc-lib
* rpc/rpc-transport

Change-Id: If8a267bab3d91003bdef3a92664077a0136745ee
BUG: 1332073
Signed-off-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
Reviewed-on: http://review.gluster.org/14102
Tested-by: Manikandan Selvaganesh &lt;mselvaga@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Manikandan Selvaganesh &lt;mselvaga@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Do multi-threaded self-heal</title>
<updated>2016-08-24T22:24:22+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-08-03T19:11:16+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=56a79b357e09d91305994fcc0b2d250cb9ac243d'/>
<id>56a79b357e09d91305994fcc0b2d250cb9ac243d</id>
<content type='text'>
BUG: 1368451
Change-Id: I5d6b91d714ad6906dc478a401e614115c89a8fbb
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15083
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
BUG: 1368451
Change-Id: I5d6b91d714ad6906dc478a401e614115c89a8fbb
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15083
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Give option to do consistent-io</title>
<updated>2016-08-22T20:55:42+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2016-08-16T10:34:37+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=413594ed647400f1b39e05d4f1b12ad846e48800'/>
<id>413594ed647400f1b39e05d4f1b12ad846e48800</id>
<content type='text'>
Problem:
When tiering/rebalance does migrations and afr with 2-way replica is in
picture, migration can read stale data if the source brick goes down and writes
to the destination. After this deletion of the file leads to permanent loss of
data after migration.

Fix:
Rebalance/tiering should migrate only when the data is definitely not stale. So
introduce an option in afr called consistent-io which will be enabled in
migration daemons.

BUG: 1306398
Change-Id: I750f65091cc70a3ed4bf3c12f83d0949af43920a
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/13425
Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
When tiering/rebalance does migrations and afr with 2-way replica is in
picture, migration can read stale data if the source brick goes down and writes
to the destination. After this deletion of the file leads to permanent loss of
data after migration.

Fix:
Rebalance/tiering should migrate only when the data is definitely not stale. So
introduce an option in afr called consistent-io which will be enabled in
migration daemons.

BUG: 1306398
Change-Id: I750f65091cc70a3ed4bf3c12f83d0949af43920a
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/13425
Reviewed-by: Anuradha Talur &lt;atalur@redhat.com&gt;
Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/afr: Prevent split-brain when bricks are brought off and on in cyclic order</title>
<updated>2016-08-22T09:38:36+00:00</updated>
<author>
<name>Krutika Dhananjay</name>
<email>kdhananj@redhat.com</email>
</author>
<published>2016-07-28T15:59:59+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=fcb5b70b1099d0379b40c81f35750df8bb9545a5'/>
<id>fcb5b70b1099d0379b40c81f35750df8bb9545a5</id>
<content type='text'>
When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
    in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.

This way, we still have one good copy left despite the split-brain situation
which when it is back online, will be chosen as source to do the heal.

Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a
BUG: 1363721
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15080
Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
    in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.

This way, we still have one good copy left despite the split-brain situation
which when it is back online, will be chosen as source to do the heal.

Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a
BUG: 1363721
Signed-off-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
Reviewed-on: http://review.gluster.org/15080
Tested-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
