<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/libglusterfs/src/syncop.c, branch v3.6.3beta1</title>
<subtitle>GlusterFS is a distributed file-system capable of scaling to several petabytes. It aggregates various storage bricks over Infiniband RDMA or TCP/IP interconnect into one large parallel network file system.</subtitle>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/'/>
<entry>
<title>synctask: add backtrace per waiting task</title>
<updated>2014-09-26T10:20:07+00:00</updated>
<author>
<name>Krishnan Parthasarathi</name>
<email>kparthas@redhat.com</email>
</author>
<published>2014-09-05T09:25:01+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=04edacc5a0cc0a4c8c6bf5188a5ca4afd2fa4965'/>
<id>04edacc5a0cc0a4c8c6bf5188a5ca4afd2fa4965</id>
<content type='text'>
The backtrace is 'saved' in a per-task buffer.
This would come handy while debugging code using
synctasks.

Change-Id: I732b275f6d15b31f31361f5ecf2ba47cacde9b54
BUG: 1145093
Signed-off-by: Krishnan Parthasarathi &lt;kparthas@redhat.com&gt;
Reviewed-on: http://review.gluster.org/8795
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The backtrace is 'saved' in a per-task buffer.
This would come handy while debugging code using
synctasks.

Change-Id: I732b275f6d15b31f31361f5ecf2ba47cacde9b54
BUG: 1145093
Signed-off-by: Krishnan Parthasarathi &lt;kparthas@redhat.com&gt;
Reviewed-on: http://review.gluster.org/8795
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Niels de Vos &lt;ndevos@redhat.com&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>syncop: Invoke dict_unref() in inodelk only if dictionary is not NULL</title>
<updated>2014-09-20T08:54:55+00:00</updated>
<author>
<name>Vijay Bellur</name>
<email>vbellur@redhat.com</email>
</author>
<published>2014-09-19T14:04:22+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=10d45b9783b751c1f78b670281fd1d57ae377642'/>
<id>10d45b9783b751c1f78b670281fd1d57ae377642</id>
<content type='text'>
In the absence of this check, logs can get flooded with messages like this when rebalance is run:

[2014-09-04 17:48:07.148262] W [dict.c:480:dict_unref] (--&gt;/lib64/libc.so.6()
[0x30daa47a00] (--&gt;/usr/local/lib/libglusterfs.so.0(synctask_wrap+0x12)
[0x7fa20b7c6ec2]
(--&gt;/usr/local/lib/glusterfs/3.7dev/xlator/cluster/distribute.so(dht_migrate_file+0x23f)
[0x7fa200fdb58f]))) 0-dict: dict is NULL

Change-Id: I4c93e4485293b35d86ba07df4d583d2758ec3f49
BUG: 1138395
Signed-off-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
Reviewed-on-master: http://review.gluster.org/8601
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Shyamsundar Ranganathan &lt;srangana@redhat.com&gt;
Reviewed-on: http://review.gluster.org/8782
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In the absence of this check, logs can get flooded with messages like this when rebalance is run:

[2014-09-04 17:48:07.148262] W [dict.c:480:dict_unref] (--&gt;/lib64/libc.so.6()
[0x30daa47a00] (--&gt;/usr/local/lib/libglusterfs.so.0(synctask_wrap+0x12)
[0x7fa20b7c6ec2]
(--&gt;/usr/local/lib/glusterfs/3.7dev/xlator/cluster/distribute.so(dht_migrate_file+0x23f)
[0x7fa200fdb58f]))) 0-dict: dict is NULL

Change-Id: I4c93e4485293b35d86ba07df4d583d2758ec3f49
BUG: 1138395
Signed-off-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
Reviewed-on-master: http://review.gluster.org/8601
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Shyamsundar Ranganathan &lt;srangana@redhat.com&gt;
Reviewed-on: http://review.gluster.org/8782
</pre>
</div>
</content>
</entry>
<entry>
<title>libglusterfs/syncop: implement inodelk</title>
<updated>2014-09-12T08:10:43+00:00</updated>
<author>
<name>Raghavendra G</name>
<email>rgowdapp@redhat.com</email>
</author>
<published>2014-09-04T18:12:39+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=b0dfe8da3fdda31b57177a08de8acec31ac749bf'/>
<id>b0dfe8da3fdda31b57177a08de8acec31ac749bf</id>
<content type='text'>
Change-Id: Iea489157490b70cb2bb03576b0d4943c6d8f052d
BUG: 1138395
Signed-off-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
Reviewed-on-master: http://review.gluster.org/8522
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
Reviewed-on: http://review.gluster.org/8610
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: Iea489157490b70cb2bb03576b0d4943c6d8f052d
BUG: 1138395
Signed-off-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
Reviewed-on-master: http://review.gluster.org/8522
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
Reviewed-on: http://review.gluster.org/8610
</pre>
</div>
</content>
</entry>
<entry>
<title>user servicable snapshots</title>
<updated>2014-05-29T16:25:46+00:00</updated>
<author>
<name>Raghavendra Bhat</name>
<email>raghavendra@redhat.com</email>
</author>
<published>2014-05-07T14:43:43+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=cc0378d39f4082f51d5ef6e02b3007fe9e78cb31'/>
<id>cc0378d39f4082f51d5ef6e02b3007fe9e78cb31</id>
<content type='text'>
Change-Id: Idbf27dbe088e646a8ab81cedc5818413795895ea
Signed-off-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
Signed-off-by: Anand Subramanian &lt;anands@redhat.com&gt;
Signed-off-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
Reviewed-on: http://review.gluster.org/7700
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: Idbf27dbe088e646a8ab81cedc5818413795895ea
Signed-off-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
Signed-off-by: Anand Subramanian &lt;anands@redhat.com&gt;
Signed-off-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
Reviewed-on: http://review.gluster.org/7700
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>syncops: add support for custom PID</title>
<updated>2014-02-13T19:19:33+00:00</updated>
<author>
<name>Anand Avati</name>
<email>avati@redhat.com</email>
</author>
<published>2014-01-27T08:58:45+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=3571066deedfe858ef37f09d6ad2160e5dd7b803'/>
<id>3571066deedfe858ef37f09d6ad2160e5dd7b803</id>
<content type='text'>
AFR self-heal needs to issue syncops with special PID. Extend
the custom UID/GID support to include custom PIDs

Change-Id: I736c0e177f862b029f203acc87f9eb46c8cb839b
BUG: 1021686
Signed-off-by: Anand Avati &lt;avati@redhat.com&gt;
Reviewed-on: http://review.gluster.org/6888
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
AFR self-heal needs to issue syncops with special PID. Extend
the custom UID/GID support to include custom PIDs

Change-Id: I736c0e177f862b029f203acc87f9eb46c8cb839b
BUG: 1021686
Signed-off-by: Anand Avati &lt;avati@redhat.com&gt;
Reviewed-on: http://review.gluster.org/6888
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>core: add @xdata parameter to syncop_[f]removexattr()</title>
<updated>2014-02-13T19:17:05+00:00</updated>
<author>
<name>Anand Avati</name>
<email>avati@redhat.com</email>
</author>
<published>2014-02-07T22:29:34+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=0cab34b3a5e94267bf6b39669b6e85af1fab8f3d'/>
<id>0cab34b3a5e94267bf6b39669b6e85af1fab8f3d</id>
<content type='text'>
To be used in afr metadata self-heal

Change-Id: I8dac4b19d61e331702427eeb5b606aab3d20b328
BUG: 1021686
Signed-off-by: Anand Avati &lt;avati@redhat.com&gt;
Reviewed-on: http://review.gluster.org/6941
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To be used in afr metadata self-heal

Change-Id: I8dac4b19d61e331702427eeb5b606aab3d20b328
BUG: 1021686
Signed-off-by: Anand Avati &lt;avati@redhat.com&gt;
Reviewed-on: http://review.gluster.org/6941
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Raghavendra Bhat &lt;raghavendra@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>syncop: Change return value of syncop</title>
<updated>2014-01-20T07:05:15+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2013-12-11T04:03:53+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=8d55c25f158921b508bff0e7f25158991913f922'/>
<id>8d55c25f158921b508bff0e7f25158991913f922</id>
<content type='text'>
Problem:
We found a day-1 bug when syncop_xxx() infra is used inside a synctask with
compilation optimization (CFLAGS -O2).

Detailed explanation of the Root cause:
We found the bug in 'gf_defrag_migrate_data' in rebalance operation:

Lets look at interesting parts of the function:

int
gf_defrag_migrate_data (xlator_t *this, gf_defrag_info_t *defrag, loc_t *loc,
                        dict_t *migrate_data)
{
.....
code section - [ Loop ]
        while ((ret = syncop_readdirp (this, fd, 131072, offset, NULL,
                                       &amp;entries)) != 0) {
.....
code section - [ ERRNO-1 ] (errno of readdirp is stored in readdir_operrno by a
thread)
                /* Need to keep track of ENOENT errno, that means, there is no
                   need to send more readdirp() */
                readdir_operrno = errno;
.....
code section - [ SYNCOP-1 ] (syncop_getxattr is called by a thread)
                        ret = syncop_getxattr (this, &amp;entry_loc, &amp;dict,
                                               GF_XATTR_LINKINFO_KEY);
code section - [ ERRNO-2]   (checking for failures of syncop_getxattr(). This
may not always be executed in same thread which executed [SYNCOP-1])
                        if (ret &lt; 0) {
                                if (errno != ENODATA) {
                                        loglevel = GF_LOG_ERROR;
                                        defrag-&gt;total_failures += 1;
.....
}

the function above could be executed by thread(t1) till [SYNCOP-1] and code
from [ERRNO-2] can be executed by a different thread(t2) because of the way
syncop-infra schedules the tasks.

when the code is compiled with -O2 optimization this is the assembly code that
is generated:
 [ERRNO-1]
1165                        readdir_operrno = errno; &lt;&lt;---- errno gets expanded
as *(__errno_location())
   0x00007fd149d48b60 &lt;+496&gt;:        callq  0x7fd149d410c0 &lt;address@hidden&gt;
   0x00007fd149d48b72 &lt;+514&gt;:        mov    %rax,0x50(%rsp) &lt;&lt;------ Address
returned by __errno_location() is stored in a special location in stack for
later use.
   0x00007fd149d48b77 &lt;+519&gt;:        mov    (%rax),%eax
   0x00007fd149d48b79 &lt;+521&gt;:        mov    %eax,0x78(%rsp)
....
 [ERRNO-2]
1281                                        if (errno != ENODATA) {
   0x00007fd149d492ae &lt;+2366&gt;:        mov    0x50(%rsp),%rax &lt;&lt;-----  Because
it already stored the address returned by __errno_location(), it just
dereferences the address to get the errno value. BUT THIS CODE NEED NOT BE
EXECUTED BY SAME THREAD!!!
   0x00007fd149d492b3 &lt;+2371&gt;:        mov    $0x9,%ebp
   0x00007fd149d492b8 &lt;+2376&gt;:        mov    (%rax),%edi
   0x00007fd149d492ba &lt;+2378&gt;:        cmp    $0x3d,%edi

The problem is that __errno_location() value of t1 and t2 are different. So
[ERRNO-2] ends up reading errno of t1 instead of errno of t2 even though t2 is
executing [ERRNO-2] code section.

When code is compiled without any optimization for [ERRNO-2]:
1281                                        if (errno != ENODATA) {
   0x00007fd58e7a326f &lt;+2237&gt;:        callq  0x7fd58e797300
&lt;address@hidden&gt;&lt;&lt;--- As it is calling __errno_location() again it gets the
location from t2 so it works as intended.
   0x00007fd58e7a3274 &lt;+2242&gt;:        mov    (%rax),%eax
   0x00007fd58e7a3276 &lt;+2244&gt;:        cmp    $0x3d,%eax
   0x00007fd58e7a3279 &lt;+2247&gt;:        je     0x7fd58e7a32a1
&lt;gf_defrag_migrate_data+2287&gt;

Fix:
Make syncop_xxx() return (-errno) value as the return value in
case of errors and all the functions which make syncop_xxx() will need to use
(-ret) to figure out the reason for failure in case of syncop_xxx() failures.

Change-Id: I314d20dabe55d3e62ff66f3b4adb1cac2eaebb57
BUG: 1040356
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/6475
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
We found a day-1 bug when syncop_xxx() infra is used inside a synctask with
compilation optimization (CFLAGS -O2).

Detailed explanation of the Root cause:
We found the bug in 'gf_defrag_migrate_data' in rebalance operation:

Lets look at interesting parts of the function:

int
gf_defrag_migrate_data (xlator_t *this, gf_defrag_info_t *defrag, loc_t *loc,
                        dict_t *migrate_data)
{
.....
code section - [ Loop ]
        while ((ret = syncop_readdirp (this, fd, 131072, offset, NULL,
                                       &amp;entries)) != 0) {
.....
code section - [ ERRNO-1 ] (errno of readdirp is stored in readdir_operrno by a
thread)
                /* Need to keep track of ENOENT errno, that means, there is no
                   need to send more readdirp() */
                readdir_operrno = errno;
.....
code section - [ SYNCOP-1 ] (syncop_getxattr is called by a thread)
                        ret = syncop_getxattr (this, &amp;entry_loc, &amp;dict,
                                               GF_XATTR_LINKINFO_KEY);
code section - [ ERRNO-2]   (checking for failures of syncop_getxattr(). This
may not always be executed in same thread which executed [SYNCOP-1])
                        if (ret &lt; 0) {
                                if (errno != ENODATA) {
                                        loglevel = GF_LOG_ERROR;
                                        defrag-&gt;total_failures += 1;
.....
}

the function above could be executed by thread(t1) till [SYNCOP-1] and code
from [ERRNO-2] can be executed by a different thread(t2) because of the way
syncop-infra schedules the tasks.

when the code is compiled with -O2 optimization this is the assembly code that
is generated:
 [ERRNO-1]
1165                        readdir_operrno = errno; &lt;&lt;---- errno gets expanded
as *(__errno_location())
   0x00007fd149d48b60 &lt;+496&gt;:        callq  0x7fd149d410c0 &lt;address@hidden&gt;
   0x00007fd149d48b72 &lt;+514&gt;:        mov    %rax,0x50(%rsp) &lt;&lt;------ Address
returned by __errno_location() is stored in a special location in stack for
later use.
   0x00007fd149d48b77 &lt;+519&gt;:        mov    (%rax),%eax
   0x00007fd149d48b79 &lt;+521&gt;:        mov    %eax,0x78(%rsp)
....
 [ERRNO-2]
1281                                        if (errno != ENODATA) {
   0x00007fd149d492ae &lt;+2366&gt;:        mov    0x50(%rsp),%rax &lt;&lt;-----  Because
it already stored the address returned by __errno_location(), it just
dereferences the address to get the errno value. BUT THIS CODE NEED NOT BE
EXECUTED BY SAME THREAD!!!
   0x00007fd149d492b3 &lt;+2371&gt;:        mov    $0x9,%ebp
   0x00007fd149d492b8 &lt;+2376&gt;:        mov    (%rax),%edi
   0x00007fd149d492ba &lt;+2378&gt;:        cmp    $0x3d,%edi

The problem is that __errno_location() value of t1 and t2 are different. So
[ERRNO-2] ends up reading errno of t1 instead of errno of t2 even though t2 is
executing [ERRNO-2] code section.

When code is compiled without any optimization for [ERRNO-2]:
1281                                        if (errno != ENODATA) {
   0x00007fd58e7a326f &lt;+2237&gt;:        callq  0x7fd58e797300
&lt;address@hidden&gt;&lt;&lt;--- As it is calling __errno_location() again it gets the
location from t2 so it works as intended.
   0x00007fd58e7a3274 &lt;+2242&gt;:        mov    (%rax),%eax
   0x00007fd58e7a3276 &lt;+2244&gt;:        cmp    $0x3d,%eax
   0x00007fd58e7a3279 &lt;+2247&gt;:        je     0x7fd58e7a32a1
&lt;gf_defrag_migrate_data+2287&gt;

Fix:
Make syncop_xxx() return (-errno) value as the return value in
case of errors and all the functions which make syncop_xxx() will need to use
(-ret) to figure out the reason for failure in case of syncop_xxx() failures.

Change-Id: I314d20dabe55d3e62ff66f3b4adb1cac2eaebb57
BUG: 1040356
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: http://review.gluster.org/6475
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>syncops: expose @flags in syncop_rmdir()</title>
<updated>2013-11-21T21:08:32+00:00</updated>
<author>
<name>Anand Avati</name>
<email>avati@redhat.com</email>
</author>
<published>2013-10-11T07:20:15+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=544dee895a43ec9bb98fc8ace3d124d44bb617f2'/>
<id>544dee895a43ec9bb98fc8ace3d124d44bb617f2</id>
<content type='text'>
Change-Id: I9b73c1db728e4cb3948fc118cceb292b21d48b96
BUG: 1021686
Signed-off-by: Anand Avati &lt;avati@redhat.com&gt;
Reviewed-on: http://review.gluster.org/6112
Reviewed-by: Amar Tumballi &lt;amarts@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: I9b73c1db728e4cb3948fc118cceb292b21d48b96
BUG: 1021686
Signed-off-by: Anand Avati &lt;avati@redhat.com&gt;
Reviewed-on: http://review.gluster.org/6112
Reviewed-by: Amar Tumballi &lt;amarts@redhat.com&gt;
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>zerofill: Change the type of len argument of glfs_zerofill() to off_t</title>
<updated>2013-11-15T07:29:48+00:00</updated>
<author>
<name>Bharata B Rao</name>
<email>bharata@linux.vnet.ibm.com</email>
</author>
<published>2013-11-15T04:41:58+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=884a668a9c3e12e17d64ebd5ccd9fbf3d203fd1e'/>
<id>884a668a9c3e12e17d64ebd5ccd9fbf3d203fd1e</id>
<content type='text'>
glfs_zerofill() can be potentially called to zero-out entire file and
hence allow for bigger value of length parameter.

Change-Id: I75f1d11af298915049a3f3a7cb3890a2d72fca63
BUG: 1028673
Signed-off-by: Bharata B Rao &lt;bharata@linux.vnet.ibm.com&gt;
Reviewed-on: http://review.gluster.org/6266
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: M. Mohan Kumar &lt;mohan@in.ibm.com&gt;
Tested-by: M. Mohan Kumar &lt;mohan@in.ibm.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
glfs_zerofill() can be potentially called to zero-out entire file and
hence allow for bigger value of length parameter.

Change-Id: I75f1d11af298915049a3f3a7cb3890a2d72fca63
BUG: 1028673
Signed-off-by: Bharata B Rao &lt;bharata@linux.vnet.ibm.com&gt;
Reviewed-on: http://review.gluster.org/6266
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: M. Mohan Kumar &lt;mohan@in.ibm.com&gt;
Tested-by: M. Mohan Kumar &lt;mohan@in.ibm.com&gt;
Reviewed-by: Anand Avati &lt;avati@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>glusterfs: zerofill support</title>
<updated>2013-11-11T05:25:49+00:00</updated>
<author>
<name>M. Mohan Kumar</name>
<email>mohan@in.ibm.com</email>
</author>
<published>2013-11-09T09:21:53+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=c8fef37c5d566c906728b5f6f27baaa9a8d2a20d'/>
<id>c8fef37c5d566c906728b5f6f27baaa9a8d2a20d</id>
<content type='text'>
Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in
the specified range. This fop will be useful when a whole file needs to
be initialized with zero (could be useful for zero filled VM disk image
provisioning or  during scrubbing of VM disk images).

Client/application can issue this FOP for zeroing out. Gluster server
will zero out required range of bytes ie server offloaded zeroing. In
the absence of this fop,  client/application has to repetitively issue
write (zero) fop to the server, which is very inefficient method because
of the overheads involved in RPC calls  and acknowledgements.

WRITESAME is a  SCSI T10 command that takes a block of data as input and
writes the same data to other blocks and this write is handled
completely within the storage and hence is known as offload . Linux ,now
has support for SCSI WRITESAME command which is exposed to the user in
the form of BLKZEROOUT ioctl.  BD Xlator can exploit BLKZEROOUT ioctl to
implement this fop. Thus zeroing out operations can be completely
offloaded to the storage device , making it highly efficient.

The fop takes two arguments offset and size. It zeroes out 'size' number
of bytes in an opened file starting from 'offset' position.

This patch adds zerofill support to the following areas:
	- libglusterfs
	- io-stats
	- performance/md-cache,open-behind
	- quota
	- cluster/afr,dht,stripe
	- rpc/xdr
	- protocol/client,server
	- io-threads
	- marker
	- storage/posix
	- libgfapi

Client applications can exloit this fop by using glfs_zerofill introduced in
libgfapi.FUSE support to this fop has not been added as there is no system call
for this fop.

Changes from previous version 3:
* Removed redundant memory failure log messages

Changes from previous version 2:
* Rebased and fixed build error

Changes from previous version 1:
* Rebased for latest master

TODO :
     * Add zerofill support to trace xlator
     * Expose zerofill capability as part of gluster volume info

Here is a performance comparison of server offloaded zeofill vs zeroing
out using repeated writes.

[root@llmvm02 remote]# time ./offloaded aakash-test log 20

real	3m34.155s
user	0m0.018s
sys	0m0.040s
[root@llmvm02 remote]# time ./manually aakash-test log 20

real	4m23.043s
user	0m2.197s
sys	0m14.457s
[root@llmvm02 remote]# time ./offloaded aakash-test log 25;

real	4m28.363s
user	0m0.021s
sys	0m0.025s
[root@llmvm02 remote]# time ./manually aakash-test log 25

real	5m34.278s
user	0m2.957s
sys	0m18.808s

The argument log is a file which we want to set for logging purpose and
the third argument is size in GB .

As we can see there is a performance improvement of around 20% with this
fop.

Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18
BUG: 1028673
Signed-off-by: Aakash Lal Das &lt;aakash@linux.vnet.ibm.com&gt;
Signed-off-by: M. Mohan Kumar &lt;mohan@in.ibm.com&gt;
Reviewed-on: http://review.gluster.org/5327
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in
the specified range. This fop will be useful when a whole file needs to
be initialized with zero (could be useful for zero filled VM disk image
provisioning or  during scrubbing of VM disk images).

Client/application can issue this FOP for zeroing out. Gluster server
will zero out required range of bytes ie server offloaded zeroing. In
the absence of this fop,  client/application has to repetitively issue
write (zero) fop to the server, which is very inefficient method because
of the overheads involved in RPC calls  and acknowledgements.

WRITESAME is a  SCSI T10 command that takes a block of data as input and
writes the same data to other blocks and this write is handled
completely within the storage and hence is known as offload . Linux ,now
has support for SCSI WRITESAME command which is exposed to the user in
the form of BLKZEROOUT ioctl.  BD Xlator can exploit BLKZEROOUT ioctl to
implement this fop. Thus zeroing out operations can be completely
offloaded to the storage device , making it highly efficient.

The fop takes two arguments offset and size. It zeroes out 'size' number
of bytes in an opened file starting from 'offset' position.

This patch adds zerofill support to the following areas:
	- libglusterfs
	- io-stats
	- performance/md-cache,open-behind
	- quota
	- cluster/afr,dht,stripe
	- rpc/xdr
	- protocol/client,server
	- io-threads
	- marker
	- storage/posix
	- libgfapi

Client applications can exloit this fop by using glfs_zerofill introduced in
libgfapi.FUSE support to this fop has not been added as there is no system call
for this fop.

Changes from previous version 3:
* Removed redundant memory failure log messages

Changes from previous version 2:
* Rebased and fixed build error

Changes from previous version 1:
* Rebased for latest master

TODO :
     * Add zerofill support to trace xlator
     * Expose zerofill capability as part of gluster volume info

Here is a performance comparison of server offloaded zeofill vs zeroing
out using repeated writes.

[root@llmvm02 remote]# time ./offloaded aakash-test log 20

real	3m34.155s
user	0m0.018s
sys	0m0.040s
[root@llmvm02 remote]# time ./manually aakash-test log 20

real	4m23.043s
user	0m2.197s
sys	0m14.457s
[root@llmvm02 remote]# time ./offloaded aakash-test log 25;

real	4m28.363s
user	0m0.021s
sys	0m0.025s
[root@llmvm02 remote]# time ./manually aakash-test log 25

real	5m34.278s
user	0m2.957s
sys	0m18.808s

The argument log is a file which we want to set for logging purpose and
the third argument is size in GB .

As we can see there is a performance improvement of around 20% with this
fop.

Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18
BUG: 1028673
Signed-off-by: Aakash Lal Das &lt;aakash@linux.vnet.ibm.com&gt;
Signed-off-by: M. Mohan Kumar &lt;mohan@in.ibm.com&gt;
Reviewed-on: http://review.gluster.org/5327
Tested-by: Gluster Build System &lt;jenkins@build.gluster.com&gt;
Reviewed-by: Vijay Bellur &lt;vbellur@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
