<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/tests/bugs/glusterd, branch devel</title>
<subtitle>GlusterFS is a distributed file-system capable of scaling to several petabytes. It aggregates various storage bricks over Infiniband RDMA or TCP/IP interconnect into one large parallel network file system.</subtitle>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/'/>
<entry>
<title>cluster/dht: use readdir for fix-layout in rebalance (#2243)</title>
<updated>2021-03-22T04:49:27+00:00</updated>
<author>
<name>Pranith Kumar Karampuri</name>
<email>pranith.karampuri@phonepe.com</email>
</author>
<published>2021-03-22T04:49:27+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=ec189a499d85c2aad1d54e55e47df6b95ba02922'/>
<id>ec189a499d85c2aad1d54e55e47df6b95ba02922</id>
<content type='text'>
Problem:
On a cluster with 15 million files, when fix-layout was started, it was
not progressing at all. So we tried an os.walk() + os.stat() on the
backend filesystem directly. It took 2.5 days. We removed os.stat() and
re-ran it on another brick with a similar dataset. It took 15 minutes. We
realized that readdirp is extremely costly compared to readdir when the
stat is not useful. The fix-layout operation only needs to know that an
entry is a directory so that fix-layout can be triggered on it. Most
modern filesystems provide this information in the readdir operation, so
we don't need readdirp, i.e., readdir+stat.

Fix:
Use the readdir operation in fix-layout. Fall back to readdir+stat/lookup
for filesystems that don't provide d_type in their readdir operation.
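
The idea can be sketched in Python (the language used for the os.walk()
comparison above). This is a minimal illustration, not the rebalance code:
os.scandir() surfaces readdir's d_type via DirEntry.is_dir(), and with
follow_symlinks=False it only falls back to a stat() on filesystems that
report DT_UNKNOWN.

```python
import os

def walk_dirs(root):
    """Yield directory paths under root without stat()-ing every entry.

    os.scandir() exposes readdir's d_type through DirEntry.is_dir();
    plain files are never stat()-ed on filesystems that fill d_type.
    """
    stack = [root]
    while stack:
        path = stack.pop()
        yield path
        with os.scandir(path) as it:
            for entry in it:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
```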

fixes: #2241
Change-Id: I5fe2ecea25a399ad58e31a2e322caf69fc7f49eb
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
On a cluster with 15 million files, when fix-layout was started, it was
not progressing at all. So we tried an os.walk() + os.stat() on the
backend filesystem directly. It took 2.5 days. We removed os.stat() and
re-ran it on another brick with a similar dataset. It took 15 minutes. We
realized that readdirp is extremely costly compared to readdir when the
stat is not useful. The fix-layout operation only needs to know that an
entry is a directory so that fix-layout can be triggered on it. Most
modern filesystems provide this information in the readdir operation, so
we don't need readdirp, i.e., readdir+stat.

Fix:
Use the readdir operation in fix-layout. Fall back to readdir+stat/lookup
for filesystems that don't provide d_type in their readdir operation.

fixes: #2241
Change-Id: I5fe2ecea25a399ad58e31a2e322caf69fc7f49eb
Signed-off-by: Pranith Kumar K &lt;pranith.karampuri@phonepe.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>core: Implement graceful shutdown for a brick process (#1751)</title>
<updated>2020-12-16T06:05:31+00:00</updated>
<author>
<name>mohit84</name>
<email>moagrawa@redhat.com</email>
</author>
<published>2020-12-16T06:05:31+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=a7761a483dd43e5973e00d79cd0947aab789179a'/>
<id>a7761a483dd43e5973e00d79cd0947aab789179a</id>
<content type='text'>
* core: Implement graceful shutdown for a brick process

glusterd sends a SIGTERM to the brick process at the time
of stopping a volume if brick_mux is not enabled. In the
brick_mux case, on getting a terminate signal for the last
brick, the brick process sends a SIGTERM to its own process
to stop itself. The current approach does not clean up
resources when either the last brick is detached or
brick_mux is not enabled.

Solution: glusterd sends a terminate notification to the
brick process at the time of stopping a volume for a graceful
shutdown.

Change-Id: I49b729e1205e75760f6eff9bf6803ed0dbf876ae
Fixes: #1749
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;

* core: Implement graceful shutdown for a brick process

Resolve some reviewer comments
Fixes: #1749
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;

Change-Id: I50e6a9e2ec86256b349aef5b127cc5bbf32d2561

* core: Implement graceful shutdown for a brick process

Implement a key, cluster.brick-graceful-cleanup, to enable graceful
shutdown for a brick process. If the key value is on, glusterd sends
a detach request to stop the brick.
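
The difference between the two shutdown paths can be sketched as follows;
Brick and its method names are illustrative stand-ins, not glusterd code:

```python
class Brick:
    """Toy model of a brick process's two shutdown paths."""

    def __init__(self):
        self.resources = ["fd-table", "inode-table", "mem-pools"]
        self.released = []

    def sigterm_stop(self):
        # Old path: the process simply exits; nothing is released.
        return "exited, %d resources leaked" % len(self.resources)

    def graceful_stop(self):
        # New path: glusterd's detach/terminate notification lets the
        # brick release every resource before exiting.
        while self.resources:
            self.released.append(self.resources.pop())
        return "exited cleanly"
```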

Fixes: #1749
Change-Id: Iba8fb27ba15cc37ecd3eb48f0ea8f981633465c3
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;

* core: Implement graceful shutdown for a brick process

Resolve reviewer comments
Fixes: #1749
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;

Change-Id: I2a8eb4cf25cd8fca98d099889e4cae3954c8579e

* core: Implement graceful shutdown for a brick process

Resolve reviewer comment specific to avoiding a memory leak

Fixes: #1749
Change-Id: Ic2f09efe6190fd3776f712afc2d49b4e63de7d1f
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;

* core: Implement graceful shutdown for a brick process

Resolve reviewer comment specific to avoiding a memory leak

Fixes: #1749
Change-Id: I68fbbb39160a4595fb8b1b19836f44b356e89716
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* core: Implement graceful shutdown for a brick process

glusterd sends a SIGTERM to the brick process at the time
of stopping a volume if brick_mux is not enabled. In the
brick_mux case, on getting a terminate signal for the last
brick, the brick process sends a SIGTERM to its own process
to stop itself. The current approach does not clean up
resources when either the last brick is detached or
brick_mux is not enabled.

Solution: glusterd sends a terminate notification to the
brick process at the time of stopping a volume for a graceful
shutdown.

Change-Id: I49b729e1205e75760f6eff9bf6803ed0dbf876ae
Fixes: #1749
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;

* core: Implement graceful shutdown for a brick process

Resolve some reviewer comments
Fixes: #1749
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;

Change-Id: I50e6a9e2ec86256b349aef5b127cc5bbf32d2561

* core: Implement graceful shutdown for a brick process

Implement a key, cluster.brick-graceful-cleanup, to enable graceful
shutdown for a brick process. If the key value is on, glusterd sends
a detach request to stop the brick.

Fixes: #1749
Change-Id: Iba8fb27ba15cc37ecd3eb48f0ea8f981633465c3
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;

* core: Implement graceful shutdown for a brick process

Resolve reviewer comments
Fixes: #1749
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;

Change-Id: I2a8eb4cf25cd8fca98d099889e4cae3954c8579e

* core: Implement graceful shutdown for a brick process

Resolve reviewer comment specific to avoiding a memory leak

Fixes: #1749
Change-Id: Ic2f09efe6190fd3776f712afc2d49b4e63de7d1f
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;

* core: Implement graceful shutdown for a brick process

Resolve reviewer comment specific to avoiding a memory leak

Fixes: #1749
Change-Id: I68fbbb39160a4595fb8b1b19836f44b356e89716
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>glusterd/cli: enhance rebalance-status after replace/reset-brick (#1869)</title>
<updated>2020-12-08T10:51:35+00:00</updated>
<author>
<name>Tamar Shacked</name>
<email>tshacked@redhat.com</email>
</author>
<published>2020-12-08T10:51:35+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=ae8cfe5baaff5b3e4c55f49ec71811e32a885271'/>
<id>ae8cfe5baaff5b3e4c55f49ec71811e32a885271</id>
<content type='text'>
* glusterd/cli: enhance rebalance-status after replace/reset-brick

Rebalance status is being reset during replace/reset-brick operations.
This causes 'volume status' to show rebalance as "not started".

Fix:
change rebalance-status to "reset due to (replace|reset)-brick"
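
The status change amounts to a simple mapping; reset_rebalance_status is a
hypothetical helper illustrating the CLI strings, not the glusterd code:

```python
def reset_rebalance_status(op):
    """Return the rebalance status string shown by 'volume status'
    after the given brick operation resets it."""
    if op not in ("replace-brick", "reset-brick"):
        raise ValueError("unexpected operation: " + op)
    return "reset due to " + op
```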

Change-Id: I6e3372d67355eb76c5965984a23f073289d4ff23
Signed-off-by: Tamar Shacked &lt;tshacked@redhat.com&gt;

* glusterd/cli: enhance rebalance-status after replace/reset-brick

Rebalance status is being reset during replace/reset-brick operations.
This causes 'volume status' to show rebalance as "not started".

Fix: change rebalance-status to "reset due to (replace|reset)-brick"

Fixes: #1717
Signed-off-by: Tamar Shacked &lt;tshacked@redhat.com&gt;

Change-Id: I1e3e373ca3b2007b5b7005b6c757fb43801fde33

* cli: changing rebal task ID to "None" in case status is being reset

Rebalance status is being reset during replace/reset-brick operations.
This causes 'volume status' to show rebalance as "not started".

Fix:
change rebalance-status to "reset due to (replace|reset)-brick"

Fixes: #1717

Change-Id: Ia73a8bea3dcd8e51acf4faa6434c3cb0d09856d0
Signed-off-by: Tamar Shacked &lt;tshacked@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* glusterd/cli: enhance rebalance-status after replace/reset-brick

Rebalance status is being reset during replace/reset-brick operations.
This causes 'volume status' to show rebalance as "not started".

Fix:
change rebalance-status to "reset due to (replace|reset)-brick"

Change-Id: I6e3372d67355eb76c5965984a23f073289d4ff23
Signed-off-by: Tamar Shacked &lt;tshacked@redhat.com&gt;

* glusterd/cli: enhance rebalance-status after replace/reset-brick

Rebalance status is being reset during replace/reset-brick operations.
This causes 'volume status' to show rebalance as "not started".

Fix: change rebalance-status to "reset due to (replace|reset)-brick"

Fixes: #1717
Signed-off-by: Tamar Shacked &lt;tshacked@redhat.com&gt;

Change-Id: I1e3e373ca3b2007b5b7005b6c757fb43801fde33

* cli: changing rebal task ID to "None" in case status is being reset

Rebalance status is being reset during replace/reset-brick operations.
This causes 'volume status' to show rebalance as "not started".

Fix:
change rebalance-status to "reset due to (replace|reset)-brick"

Fixes: #1717

Change-Id: Ia73a8bea3dcd8e51acf4faa6434c3cb0d09856d0
Signed-off-by: Tamar Shacked &lt;tshacked@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>glusterd: modify logic for checking hostname in add-brick (#1781)</title>
<updated>2020-12-07T05:54:43+00:00</updated>
<author>
<name>Sheetal Pamecha</name>
<email>spamecha@redhat.com</email>
</author>
<published>2020-12-07T05:54:43+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=f8bd04fb3ba0fff04abd0c6fff19f42c59376617'/>
<id>f8bd04fb3ba0fff04abd0c6fff19f42c59376617</id>
<content type='text'>
* glusterd: modify logic for checking hostname in add-brick

Problem: the add-brick command parses only the bricks provided
on the CLI for a subvolume. If bricks in the same subvolume are
increased, they are not checked against the volume's existing bricks.
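
The intended check can be sketched as follows, assuming bricks are
'host:/path' strings; check_subvol_hosts is a hypothetical helper, not
the glusterd function:

```python
def check_subvol_hosts(existing_subvol, new_bricks):
    """Compare the hosts of newly added bricks against the bricks
    already present in the same subvolume, not just against each
    other. Returns False when two bricks of one replica set would
    land on the same host."""
    seen = {b.split(":", 1)[0] for b in existing_subvol}
    for brick in new_bricks:
        host = brick.split(":", 1)[0]
        if host in seen:
            return False
        seen.add(host)
    return True
```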

Fixes: #1779
Change-Id: I768bcf7359a008f2d6baccef50e582536473a9dc
Signed-off-by: Sheetal Pamecha &lt;spamecha@redhat.com&gt;

* removed assignment of unused variable

Fixes: #1779
Change-Id: Id5ed776b28343e1225b9898e81502ce29fb480fa
Signed-off-by: Sheetal Pamecha &lt;spamecha@redhat.com&gt;

* few more changes

Change-Id: I7bacedb984f968939b214f9d13546f4bf92e9df7
Signed-off-by: Sheetal Pamecha &lt;spamecha@redhat.com&gt;

* few more changes

Change-Id: I7bacedb984f968939b214f9d13546f4bf92e9df7
Signed-off-by: Sheetal Pamecha &lt;spamecha@redhat.com&gt;

* correction in last commit
Signed-off-by: Sheetal Pamecha &lt;spamecha@redhat.com&gt;

Change-Id: I1fd0d941cf3f32aa6e8c7850def78e5af0d88782</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* glusterd: modify logic for checking hostname in add-brick

Problem: the add-brick command parses only the bricks provided
on the CLI for a subvolume. If bricks in the same subvolume are
increased, they are not checked against the volume's existing bricks.

Fixes: #1779
Change-Id: I768bcf7359a008f2d6baccef50e582536473a9dc
Signed-off-by: Sheetal Pamecha &lt;spamecha@redhat.com&gt;

* removed assignment of unused variable

Fixes: #1779
Change-Id: Id5ed776b28343e1225b9898e81502ce29fb480fa
Signed-off-by: Sheetal Pamecha &lt;spamecha@redhat.com&gt;

* few more changes

Change-Id: I7bacedb984f968939b214f9d13546f4bf92e9df7
Signed-off-by: Sheetal Pamecha &lt;spamecha@redhat.com&gt;

* few more changes

Change-Id: I7bacedb984f968939b214f9d13546f4bf92e9df7
Signed-off-by: Sheetal Pamecha &lt;spamecha@redhat.com&gt;

* correction in last commit
Signed-off-by: Sheetal Pamecha &lt;spamecha@redhat.com&gt;

Change-Id: I1fd0d941cf3f32aa6e8c7850def78e5af0d88782</pre>
</div>
</content>
</entry>
<entry>
<title>io-stats: Configure ios_sample_buf_size based on sample_interval value (#1574)</title>
<updated>2020-10-15T10:58:58+00:00</updated>
<author>
<name>mohit84</name>
<email>moagrawa@redhat.com</email>
</author>
<published>2020-10-15T10:58:58+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=f71660eb879a9cd5761e5adbf10c783e959a990a'/>
<id>f71660eb879a9cd5761e5adbf10c783e959a990a</id>
<content type='text'>
The io-stats xlator allocates an ios_sample_buf_size buffer of 64k
objects (10M) per xlator, but when sample_interval is 0 this big buffer
is not required, so allocate the default-sized buffer only while
sample_interval is not 0. The change helps reduce the RSS size of brick
and shd processes when the number of volumes is huge.
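
The allocation logic amounts to the following sketch; the constant and
function names are illustrative, not the io-stats code:

```python
DEFAULT_SAMPLE_BUF_SIZE = 65536  # 64k sample slots, roughly 10M per xlator

def sample_buf_size(sample_interval):
    """Size the sample buffer based on whether sampling is enabled:
    the full default buffer when sample_interval is non-zero, a
    single slot otherwise."""
    return DEFAULT_SAMPLE_BUF_SIZE if sample_interval != 0 else 1
```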

Change-Id: I3e82cca92e40549355edfac32580169f3ce51af8
Fixes: #1542
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The io-stats xlator allocates an ios_sample_buf_size buffer of 64k
objects (10M) per xlator, but when sample_interval is 0 this big buffer
is not required, so allocate the default-sized buffer only while
sample_interval is not 0. The change helps reduce the RSS size of brick
and shd processes when the number of volumes is huge.

Change-Id: I3e82cca92e40549355edfac32580169f3ce51af8
Fixes: #1542
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>glusterd: Fix Add-brick with increasing replica count failure</title>
<updated>2020-09-23T12:11:46+00:00</updated>
<author>
<name>Sheetal Pamecha</name>
<email>spamecha@redhat.com</email>
</author>
<published>2020-09-23T12:11:46+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=adbb679370810b1802ac1791f44f9232c8f15f65'/>
<id>adbb679370810b1802ac1791f44f9232c8f15f65</id>
<content type='text'>
Problem: the add-brick operation fails with a "multiple bricks on the
same server" error when the replica count is increased.

This was happening because of extra runs of the loop that compares
hostnames: if fewer bricks than the "replica" count were supplied, a
brick would get compared against itself, resulting in the above error.
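
The corrected loop bounds can be illustrated as follows; same_host_in_set
is a hypothetical helper, not the glusterd loop, and bricks are
'host:/path' strings:

```python
def same_host_in_set(bricks):
    """Pairwise hostname comparison over the bricks actually supplied.
    Starting j at i + 1 ensures a brick is never compared with itself,
    even when fewer bricks than the replica count are supplied."""
    for i in range(len(bricks)):
        for j in range(i + 1, len(bricks)):
            if bricks[i].split(":", 1)[0] == bricks[j].split(":", 1)[0]:
                return True
    return False
```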

Fixes: #1508
Change-Id: I8668e964340b7bf59728bb838525d2db062197ed
Signed-off-by: Sheetal Pamecha &lt;spamecha@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem: the add-brick operation fails with a "multiple bricks on the
same server" error when the replica count is increased.

This was happening because of extra runs of the loop that compares
hostnames: if fewer bricks than the "replica" count were supplied, a
brick would get compared against itself, resulting in the above error.

Fixes: #1508
Change-Id: I8668e964340b7bf59728bb838525d2db062197ed
Signed-off-by: Sheetal Pamecha &lt;spamecha@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: provide an option to mark tests as 'flaky'</title>
<updated>2020-08-18T08:38:20+00:00</updated>
<author>
<name>Amar Tumballi</name>
<email>amar@kadalu.io</email>
</author>
<published>2020-08-18T08:38:20+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=5731d25c9ff6a907fe68b99d1e79505b2331259d'/>
<id>5731d25c9ff6a907fe68b99d1e79505b2331259d</id>
<content type='text'>
* also add some time gap in other tests to see if we get things properly
* create a directory 'tests/000/', which can host any tests that are flaky.
* move all the tests mentioned in the issue to the above directory.
* as the above dir gets tested first, all flaky tests will be reported quickly.
* change `run-tests.sh` to continue tests even if flaky tests fail.

Reference: gluster/project-infrastructure#72
Updates: #1000
Change-Id: Ifdafa38d083ebd80f7ae3cbbc9aa3b68b6d21d0e
Signed-off-by: Amar Tumballi &lt;amar@kadalu.io&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* also add some time gap in other tests to see if we get things properly
* create a directory 'tests/000/', which can host any tests that are flaky.
* move all the tests mentioned in the issue to the above directory.
* as the above dir gets tested first, all flaky tests will be reported quickly.
* change `run-tests.sh` to continue tests even if flaky tests fail.

Reference: gluster/project-infrastructure#72
Updates: #1000
Change-Id: Ifdafa38d083ebd80f7ae3cbbc9aa3b68b6d21d0e
Signed-off-by: Amar Tumballi &lt;amar@kadalu.io&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>glusterd: getspec() returns wrong response when volfile not found</title>
<updated>2020-07-21T06:31:18+00:00</updated>
<author>
<name>Tamar Shacked</name>
<email>tshacked@redhat.com</email>
</author>
<published>2020-07-21T06:31:18+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=bb10f9f86fcdf6fb9b4c9dc0e4c7ef3a88ccd44b'/>
<id>bb10f9f86fcdf6fb9b4c9dc0e4c7ef3a88ccd44b</id>
<content type='text'>
In a cluster env, getspec() detects that the volfile was not found,
but further on this return code is overwritten by another call, so
the error is lost and not handled. As a result the server responds
with an ambiguous message, {op_ret = -1, op_errno = 0..}, which
causes the client to get stuck.

Fix:
Server side: don't override the failure error.
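
The fix amounts to not letting a later call overwrite the recorded
failure; find_volfile and build_response below are hypothetical stand-ins
for the server-side handlers, each returning an (op_ret, op_errno) pair:

```python
def getspec(find_volfile, build_response):
    """Preserve the first failure: once op_ret records an error,
    skip the follow-up call so its return code cannot clobber it."""
    op_ret, op_errno = find_volfile()
    if op_ret == 0:
        # Only run the follow-up work when the volfile was found.
        op_ret, op_errno = build_response()
    return op_ret, op_errno
```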

fixes: #1375
Change-Id: Id394954d4d0746570c1ee7d98969649c305c6b0d
Signed-off-by: Tamar Shacked &lt;tshacked@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In a cluster env, getspec() detects that the volfile was not found,
but further on this return code is overwritten by another call, so
the error is lost and not handled. As a result the server responds
with an ambiguous message, {op_ret = -1, op_errno = 0..}, which
causes the client to get stuck.

Fix:
Server side: don't override the failure error.

fixes: #1375
Change-Id: Id394954d4d0746570c1ee7d98969649c305c6b0d
Signed-off-by: Tamar Shacked &lt;tshacked@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: added volume operations to increase code coverage</title>
<updated>2020-05-26T13:15:46+00:00</updated>
<author>
<name>nik-redhat</name>
<email>nladha@redhat.com</email>
</author>
<published>2020-05-26T13:15:46+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=dbaa39f0e4704323bccf9e6ee31498ec2dc725ab'/>
<id>dbaa39f0e4704323bccf9e6ee31498ec2dc725ab</id>
<content type='text'>
Added tests for volume options like localtime-logging, fixed
enable-shared-storage to include function coverage, and added a
few negative tests for other volume options to increase the
code coverage of the glusterd component.

Change-Id: Ib1706c1fd5bc98a64dcb5c8b15a121d639a597d7
Updates: #1052
Signed-off-by: nik-redhat &lt;nladha@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Added tests for volume options like localtime-logging, fixed
enable-shared-storage to include function coverage, and added a
few negative tests for other volume options to increase the
code coverage of the glusterd component.

Change-Id: Ib1706c1fd5bc98a64dcb5c8b15a121d639a597d7
Updates: #1052
Signed-off-by: nik-redhat &lt;nladha@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>glusterd: add-brick command failure</title>
<updated>2020-06-16T12:33:21+00:00</updated>
<author>
<name>Sanju Rakonde</name>
<email>srakonde@redhat.com</email>
</author>
<published>2020-06-16T12:33:21+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/anoopcs/public_git/glusterfs.git/commit/?id=3ba1feb1608c94d34367fe903ec4bed4871db0ea'/>
<id>3ba1feb1608c94d34367fe903ec4bed4871db0ea</id>
<content type='text'>
Problem: the add-brick operation fails when the replica or disperse
count is not mentioned in the add-brick command.

Reason: with commit a113d93 we check the brick order while doing an
add-brick operation for replica and disperse volumes. If the replica
count or disperse count is not mentioned in the command, the dict get
fails, resulting in add-brick operation failure.
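
The tolerant lookup can be sketched with a plain dict standing in for
glusterd's dict_t; get_count is a hypothetical helper, not the actual fix:

```python
def get_count(volinfo, key, current_count):
    """Treat a missing replica/disperse count as 'unchanged' instead
    of treating the failed lookup as an error."""
    return volinfo.get(key, current_count)
```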

fixes: #1306

Change-Id: Ie957540e303bfb5f2d69015661a60d7e72557353
Signed-off-by: Sanju Rakonde &lt;srakonde@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem: the add-brick operation fails when the replica or disperse
count is not mentioned in the add-brick command.

Reason: with commit a113d93 we check the brick order while doing an
add-brick operation for replica and disperse volumes. If the replica
count or disperse count is not mentioned in the command, the dict get
fails, resulting in add-brick operation failure.

fixes: #1306

Change-Id: Ie957540e303bfb5f2d69015661a60d7e72557353
Signed-off-by: Sanju Rakonde &lt;srakonde@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
