glusterfs.git - GlusterFS is a distributed file-system capable of scaling to several petabytes. It aggregates various storage bricks over Infiniband RDMA or TCP/IP interconnect into one large parallel network file system.

	Commit message (Collapse)	Author	Age	Files	Lines
*	Bump up timeout for tests on AWS	Nigel Babu	2019-02-07	2	-2/+3
\| \| \| \| \| \|	Fixes: bz#1673267 Change-Id: I2b9be45f199f6436b858536c6f49be85902217f0 Signed-off-by: Nigel Babu <nigelb@redhat.com>
*	glusterd: manage upgrade to current master	Amar Tumballi	2019-02-04	3	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Scenarios tested: * Upgrade the node when there are stripe / tiering and regular type of volumes are present. - All volumes are started fine (as the change was not on brick volfile) - For tier, the functionality may not even work, as changetimerecorder is not present. - 'gluster volume info' properly shows as 'NOT SUPPORTED' for stripe and tier type of volume. * Upgrade in a rolling upgrade scenario, where an old version is able to connect to higher master. - on a normal volume, if the volfile-server was new, the newer client volfiles needed to have utime xlator conditionally. - with this one change, all other changes seem to work fine. Change-Id: Ib2d3b69dafa02b2c695a735b13c1aa70aba07cb8 updates: bz#1635688 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	readdir-ahead: do not zero-out iatt in fop cbk	Ravishankar N	2019-01-31	1	-0/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	...when ctime is zero. ia_type and ia_gfid always need to be non-zero for things to work correctly. Problem: Commit c9bde3021202f1d5c5a2d19ac05a510fc1f788ac zeroed out the iatt buffer in the cbks of modification fops before unwinding if the ctime in the buffer was zero. This was causing the fops to fail: noticeable when AFR's 'consistent-metadata' option was enabled. (AFR zeros out the ctime when the option is set. See commit 4c4624c9bad2edf27128cb122c64f15d7d63bbc8). Fixes: -Do not zero out the ia_type and ia_gfid of the iatt buff under any circumstance. -Also, fixed _rda_inode_ctx_update_iatts() to always update these values from the incoming buf when ctime is zero. Otherwise we end up with zero ia_type and ia_gfid the first time the function is called and the incoming buf has ctime set to zero. fixes: bz#1670253 Reported-By:Michael Hanselmann <public@hansmi.ch> Change-Id: Ib72228892d42c3513c19fc6dfb543f2aa3489eca Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	features/shard: Ref shard inode while adding to fsync list	Krutika Dhananjay	2019-01-24	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PROBLEM: Lot of the earlier changes in the management of shards in lru, fsync lists assumed that if a given shard exists in fsync list, it must be part of lru list as well. This was found to be not true. Consider this - a file is FALLOCATE'd to a size which would make the number of participant shards to be greater than the lru list size. In this case, some of the resolved shards that are to participate in this fop will be evicted from lru list to give way to the rest of the shards. And once FALLOCATE completes, these shards are added to fsync list but without a ref. After the fop completes, these shard inodes are unref'd and destroyed while their inode ctxs are still part of fsync list. Now when an FSYNC is called on the base file and the fsync-list traversed, the client crashes due to illegal memory access. FIX: Hold a ref on the shard inode when adding to fsync list as well. And unref under following conditions: 1. when the shard is evicted from lru list 2. when the base file is fsync'd 3. when the shards are deleted. Change-Id: Iab460667d091b8388322f59b6cb27ce69299b1b2 fixes: bz#1669077 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	tests: run nfs tests only if --enable-gnfs is provided	Amar Tumballi	2019-01-24	39	-0/+79
\| \| \| \| \| \|	Fixes: bz#1665358 Change-Id: Idbf88ec3ac683733b32c313377eeb72f2819bf0d Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	tests/bug-brick-mux-restart: add extra information	Amar Tumballi	2019-01-24	1	-1/+12
\| \| \| \| \| \| \| \| \| \|	so that we can understand more about process memory and thread consumptions With this, we will also be able to understand more about the process details with brick-mux. updates: bz#1193929 Change-Id: I147a3e3814fc37dfb635217d0a0f0184fae0994f Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	afr: not resolve splitbrains when copies are of same size	Iraj Jamali	2019-01-22	1	-0/+55
\| \| \| \| \| \| \| \| \| \| \| \|	Automatic Splitbrain with size as policy must not resolve splitbrains when the copies are of same size. Determining if the sizes of copies are same and returning -1 in that case. updates: bz#1655052 Change-Id: I3d8e8b4d7962b070ed16c3ee02a1e5a926fd5eab Signed-off-by: Iraj Jamali <ijamali@redhat.com>
*	cluster/dht: Delete invalid linkto files in rmdir	N Balachandran	2019-01-22	1	-0/+63
\| \| \| \| \| \| \| \| \| \| \| \|	rm -rf <dir> fails on dirs which contain linkto files that point to themselves because dht incorrectly thought that they were cached files after looking them up. The fix now treats them as invalid linkto files and deletes them. Change-Id: I376c72a5309714ee339c74485e02cfb4e29be643 fixes: bz#1667804 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	afr: Splitbrain with size as policy must not resolve for directory	Sheetal Pamecha	2019-01-21	1	-0/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In automatic Splitbrain resolution when favorite child policy is set as size, split brain resolution must not work for directories. Currently, if a directory is in split brain with both copies having same size, the source is selected arbitrarily and healed. fixes: bz#1655050 Change-Id: I5739498639c17c89874cc577362e543adab55f5d Signed-off-by: Sheetal Pamecha <sheetal.pamecha08@gmail.com>
*	core: Feature added to accept CidrIp in auth.allow	Rinku Kothiya	2019-01-18	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added functionality to gluster volume set auth.allow command to accept CIDR IP addresses. Modified few functions to isolate cidr feature so that it prevents other gluster commands such as peer probe to use cidr format ip. The functions are modified in such a way that they have an option to enable accepting of cidr format for other gluster commands if required in furture. updates: bz#1138841 Change-Id: Ie6734002a7078f1820e5df42d404411cce945e8b Credits: Mohit Agrawal Signed-off-by: Rinku Kothiya <rkothiya@redhat.com>
*	core: Resolve dict_leak at the time of destroying graph	Mohit Agrawal	2019-01-14	1	-0/+112
\| \| \| \| \| \| \| \| \| \| \| \|	Problem: In gluster code some of the places it call's get_new_dict to create a dictionary without taking reference so at the time of dict_unref it has become a leak Solution: To resolve the same call dict_new instead of get_new_dict updates bz#1650403 Change-Id: I3ccbbf5af07079a4fa09aad2cd0458c8625b2f06 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	libglusterfs/common-utils.c: Fix buffer size for checksum computation	Varsha Rao	2019-01-11	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When quorum count option is updated, the change is not reflected in the nfs-server.vol file. This is because in get_checksum_for_file(), when the last part of the file read has size less than buffer size, the read buffer stores old data value along with correct data value. Solution: Pass the bytes read instead of fixed buffer size, for calculating checksum. Change-Id: I4b641607c8a262961b3f3da0028a54e08c3f8589 fixes: bz#1657744 Signed-off-by: Varsha Rao <varao@redhat.com>
*	performance/md-cache: Fix a crash when statfs caching is enabled	Vijay Bellur	2019-01-11	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \|	mem_put() in STACK_UNWIND_STRICT causes a crash if frame->local is not null as md-cache obtains local from CALLOC. Changed two occurrences of STACK_UNWIND_STRICT to MDC_STACK_UNWIND as the latter macro does not rely on STACK_UNWIND_STRICT for cleaning up frame->local. fixes: bz#1632503 Change-Id: I1b3edcb9372a164ef73119e99a49e747765d7166 Signed-off-by: Vijay Bellur <vbellur@redhat.com>
*	tests: increase the timeout for distribute bug 1117851.t	Amar Tumballi	2019-01-11	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	The test is in borderline of 200seconds, and many a times, randomly takes little more time, and fails the whole regression. Better to keep timeout high, so we don't 'randomly' fail regression tests. updates: bz#1193929 Change-Id: Ib0d3a9f7a75ee44446ec6da5e0510cccf83eecaa Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	cluster/afr: Disable client side heals in AFR by default.	Sunil Kumar Acharya	2019-01-10	6	-0/+20
\| \| \| \| \| \| \| \| \|	With this changeset, default value for the AFR client side heal volume option is set to "off" fixes: bz#1663102 Change-Id: Ie4016932339c4896487e3e7cb5caca68739b7ba2 Signed-off-by: Sunil Kumar Acharya <sheggodu@redhat.com>
*	tests: Brick is getting OOM in ./tests/bugs/core/bug-1432542-mpx-restart-crash.t	Mohit Agrawal	2018-12-21	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This test "tests/bugs/core/bug-1432542-mpx-restart-crash.t" case creates 20 2x3 volumes after enabling brick_mux.At the time of creating last volume brick is getting OOM because brick consumption has increased from previous consumption due to these patches https://review.gluster.org/#/c/glusterfs/+/19997/, https://review.gluster.org/#/c/glusterfs/+/20362/ To avoid OOM reduce NUM_VOLS to 15 so that brick consumption has reduced Change-Id: Ib98b47a3db6b990ff22c7e57396d51e7fef5c7e8 fixes: bz#1661214 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	tests: Fix zero-flag.t script	Krutika Dhananjay	2018-12-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The default value of shard-block-size was changed from 4MB to 64MB sometime back. The script "fallocate"s a 6MB file and expects it to have 1 shard under .shard. This worked when the shard-block-size was 4MB. With the default value now at 64MB, file "file1" won't have any shards under .shard and the stat on the 1st shard's path fails with ENOENT. Changed the script to explicitly set shard-block-size to 4MB. Change-Id: I7f1785922287d16d74c95fa57cbbe12e6e66e4f7 fixes: bz#1656264 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	cluster/afr: Allow lookup on root if it is from ADD_REPLICA_MOUNT	karthik-us	2018-12-18	1	-0/+95
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When trying to convert a plain distribute volume to replica-3 or arbiter type it is failing with ENOTCONN error as the lookup on the root will fail as there is no quorum. Fix: Allow lookup on root if it is coming from the ADD_REPLICA_MOUNT which is used while adding bricks to a volume. It will try to set the pending xattrs for the newly added bricks to allow the heal to happen in the right direction and avoid data loss scenarios. Note: This fix will solve the problem of type conversion only in the case where the volume was mounted at least once. The conversion of non mounted volumes will still fail since the dht selfheal tries to set the directory layout will fail as they do that with the PID GF_CLIENT_PID_NO_ROOT_SQUASH set in the frame->root. Change-Id: Ic511939981dad118cc946754341318b164954b3b fixes: bz#1655854 Signed-off-by: karthik-us <ksubrahm@redhat.com>
*	iobuf: Get rid of pre allocated iobuf_pool and use per thread mem pool	Poornima G	2018-12-18	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current implementation of iobuf_pool has two problems: - prealloc of 12.5MB memory, this limits the scale factor of the gluster processes due to RAM requirements - lock contention, as the current implementation has one global iobuf_pool lock. Credits for debugging and addressing the same goes to Krutika Dhananjay <kdhananj@redhat.com>. Issue: #410 Hence changing the iobuf implementation to use per thread mem pool. This may theoritically appear to cause perf dip as there is no preallocation. But per thread mem pool will not have significant perf impact as the last allocated memory is kept alive for subsequent allocs, for some time. The worst case would be if iobufs requested are of random sizes each time. The best case is, if we get iobuf request of the same size. From the perf tests, this patch did not seem to cause any perf decrease. Note that, with this patch, the rdma performance is going to degrade drastically. In one of the previous patchsets we had fixes to not degrade rdma perf, but rdma is not supported and also not tested [1]. Hence the decision was to not have code in rdma that is not tested and not supported. [1] https://lists.gluster.org/pipermail/gluster-users.old/2018-July/034400.html Updates: #325 Change-Id: Ic2ef3bd498f9250dea25f25ba0c01fde19584b27 Signed-off-by: Poornima G <pgurusid@redhat.com>
*	[geo-rep]: Worker still ACTIVE after killing bricks	Mohit Agrawal	2018-12-13	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In changelog xlator after destroying listener it call's unlink to delete changelog socket file but socket file reference is not cleaned up from process memory Solution: 1) To cleanup reference completely from process memory serialize transport cleanup for changelog and then unlink socket file 2) Brick xlator will notify GF_EVENT_PARENT_DOWN to next xlator only after cleanup all xprts Test: To test the same run below steps 1) Setup some volume and enable brick mux 2) kill anyone brick with gf_attach 3) check changelog socket for specific to killed brick in lsof, it should cleanup completely fixes: bz#1600145 Change-Id: Iba06cbf77d8a87b34a60fce50f6d8c0d427fa491 Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	cluster/afr: Do not update read_subvol in inode_ctx after rename/link fop	karthik-us	2018-12-12	1	-0/+40
\| \| \| \| \| \| \| \| \| \| \|	Since rename/link fops on a file will not change any data in it, it should not update the read_subvol values in the inode_ctx, which interprets the data & metadata readable subvols for that file. The old read_subvol values should be retained even after the rename/link operations. Change-Id: I068044a426823a566f5bea8aa063cd689199d6dd fixes: bz#1657783 Signed-off-by: karthik-us <ksubrahm@redhat.com>
*	tests: Mark tests/bugs/shard/zero-flag.t bad	Atin Mukherjee	2018-12-05	1	-0/+1
\| \| \| \| \| \|	Change-Id: I2f4ca470c6666584e0feb129ab712f06772a86c2 Updates: bz#1656264 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	afr: assign gfid during name heal when no 'source' is present.	Ravishankar N	2018-12-03	1	-0/+149
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: If parent dir is in split-brain or has dirty xattrs set, and the file has gfid missing on one of the bricks, then name heal won't assign the gfid. Fix: Use the brick we select the gfid from as the 'source'. Note: Problem was found while trying to debug a split-brain issue on Cynthia Zhou's setup. updates: bz#1637249 Change-Id: Id088d4f0fb017aa35122de426654194e581ed742 Reported-by: Cynthia Zhou <cynthia.zhou@nokia-sbell.com> Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	cluster/dht: sync brick root perms on add brick	N Balachandran	2018-11-19	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a single brick is added to the volume and the newly added brick is the first to respond to a dht_revalidate call, its stbuf will not be merged into local->stbuf as the brick does not yet have a layout. The is_permission_different check therefore fails to detect that an attr heal is required as it only considers the stbuf values from existing bricks. To fix this, merge all stbuf values into local->stbuf and use local->prebuf to store the correct directory attributes. Change-Id: Ic9e8b04a1ab9ed1248b6b056e3450bbafe32e1bc fixes: bz#1648298 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	glusterd: coverity fixes	Atin Mukherjee	2018-11-03	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	Addresses CIDs : 1124769, 1124852, 1124864, 1134024, 1229876, 1382382 Also addressed a spurious failure in tests/bugs/glusterd/df-results-post-replace-brick-operations.t to ensure post replace brick operation and before triggering 'df' from mount, client has connection to the newly replaced bricks. Change-Id: Ie5d7e02f89400a661491d7fc2a120d6f6a83a1cc Updates: bz#789278 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	tiering: remove the translator from build and glusterd	Amar Tumballi	2018-11-02	5	-347/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	Based on the proposal to remove few features as they are not actively maintained [1], removing tier translator from the build. Also make sure there are no regression tests involving tiering feature are present. [1] https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html Change-Id: I2c177f711f9b54b7b24e1a13525ff3132bd9a9c5 updates: bz#1642807 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	glusterd: set fsid while performing replace brick	Sanju Rakonde	2018-11-02	1	-0/+58
\| \| \| \| \| \| \| \| \| \|	While performing the replace-brick operation, we should set fsid value to the new brick. fixes: bz#1637196 Change-Id: I9e9a4962fc0c2f5dff43e4ac11767814a0c0beaf Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	tests: brick-mux-fd-cleanup.t should be under core directory	Atin Mukherjee	2018-10-31	1	-0/+0
\| \| \| \| \| \|	Fixes: bz#1637934 Change-Id: I5f95beab62bd2bdde3bbee94c308b0ad03e94379 Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
*	stripe: remove the translator from build and glusterd	Amar Tumballi	2018-10-31	15	-259/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Based on the proposal to remove few features as they are not actively maintained [1], removing stripe translator from the build. Also make sure there are no regression tests involving stripe translator. [1] https://lists.gluster.org/pipermail/gluster-users/2018-July/034400.html Note that this patch aims at removing the translator from build, and a followup patch is needed to remove the code from repository. Updates: bz#1364707 Change-Id: I235b305338f138e29e9f30cba65bc0dadbebbbd5 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	glusterd: ensure volinfo->caps is set to correct value.	Sanju Rakonde	2018-10-25	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With the commit febf5ed4848, during the volume create op, we are setting volinfo->caps to 0, only if any of the bricks belong to the same node and brickinfo->vg[0] is null. Previously, we used to set volinfo->caps to 0, when either brick doesn't belong to the same node or brickinfo->vg[0] is null. With this patch, we set volinfo->caps to 0, when either brick doesn't belong to the same node or brickinfo->vg[0] is null. (as we do earlier without commit febf5ed4848). fixes: bz#1635820 Change-Id: I00a97415786b775fb088ac45566ad52b402f1a49 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	tests: correction in tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t	Sanju Rakonde	2018-10-25	1	-9/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patch https://review.gluster.org/#/c/glusterfs/+/19135/ has optimised glusterd test cases by clubbing the similar test cases into a single test case. https://review.gluster.org/#/c/glusterfs/+/19135/15/tests/bugs/glusterd/bug-1293414-import-brickinfo-uuid.t test case has been deleted and added as a part of tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t In the original test case, we create a volume with two bricks, each on a separate node(N1 & N2). From another node in cluster(N3), we try to detach a node which is hosting bricks. It fails. In the new test, we created volume with single brick on N1. and from another node in cluster, we tried to detach N1. we expect peer detach to fail, but peer detach was success as the node is hosting all the bricks of volume. Now, changing the new test case to cover the original test case scenario. Please refer https://bugzilla.redhat.com/show_bug.cgi?id=1642597#c1 to understand why the new test case is not failing in centos-regression. fixes: bz#1642597 Change-Id: Ifda12b5677143095f263fbb97a6808573f513234 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	tests: check for shd up status in bug-1637802-arbiter-stale-data-heal-lock.t	Ravishankar N	2018-10-22	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: https://review.gluster.org/#/c/glusterfs/+/21427/ seems to be failing this .t spuriously. On checking one of the failure logs, I see: 22:05:44 Launching heal operation to perform index self heal on volume patchy has been unsuccessful: 22:05:44 Self-heal daemon is not running. Check self-heal daemon log file. 22:05:44 not ok 20 , LINENUM:38 In glusterd log: [2018-10-18 22:05:44.298832] E [MSGID: 106301] [glusterd-syncop.c:1352:gd_stage_op_phase] 0-management: Staging of operation 'Volume Heal' failed on localhost : Self-heal daemon is not running. Check self-heal daemon log file But the tests which preceed this check whether via a statedump if the shd is conected to the bricks, and they have succeeded and even started healing. From glustershd.log: [2018-10-18 22:05:40.975268] I [MSGID: 108026] [afr-self-heal-common.c:1732:afr_log_selfheal] 0-patchy-replicate-0: Completed data selfheal on 3b83d2dd-4cf2-4ea3-a33e-4275be40f440. sources=[0] 1 sinks=2 So the only reason I can see launching heal via cli failing is a race where shd has been spawned but glusterd has not yet updated in-memory that it is up, and hence failing the CLI. Fix: Check for shd up status before launching heal via CLI Change-Id: Ic88abf14ad3d51c89cb438db601fae4df179e8f4 fixes: bz#1641344 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	api: fill out attribute information if not valid	Raghavendra Gowdappa	2018-10-17	2	-0/+137
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	translators like readdir-ahead selectively retain entry information of iatt (gfid and type) when rest of the iatt is invalidated (for write invalidating ia_size, (m)(c)times etc). Fuse-bridge uses this information and sends only entry information in readdirplus response. However such option doesn't exist in gfapi. This patch modifies gfapi to populate the stat by forcing an extra lookup. Thanks to Shyamsundar Ranganathan <srangana@redhat.com> and Prashanth Pai <ppai@redhat.com> for tests. Change-Id: Ieb5f8fc76359c327627b7d8420aaf20810e53000 Fixes: bz#1630804 Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com> Signed-off-by: Soumya Koduri <skoduri@redhat.com>
*	features/shard: Hold a ref on base inode when adding a shard to lru list	Krutika Dhananjay	2018-10-16	3	-1/+98
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In __shard_update_shards_inode_list(), previously shard translator was not holding a ref on the base inode whenever a shard was added to the lru list. But if the base shard is forgotten and destroyed either by fuse due to memory pressure or due to the file being deleted at some point by a different client with this client still containing stale shards in its lru list, the client would crash at the time of locking lru_base_inode->lock owing to illegal memory access. So now the base shard is ref'd into the inode ctx of every shard that is added to lru list until it gets lru'd out. The patch also handles the case where none of the shards associated with a file that is about to be deleted are part of the LRU list and where an unlink at the beginning of the operation destroys the base inode (because there are no refkeepers) and hence all of the shards that are about to be deleted will be resolved without the existence of a base shard in-memory. This, if not handled properly, could lead to a crash. Change-Id: Ic15ca41444dd04684a9458bd4a526b1d3e160499 updates: bz#1605056 Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
*	core: glusterfsd keeping fd open in index xlator	Mohit Agrawal	2018-10-12	1	-0/+78
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: At the time of processing GF_EVENT_PARENT_DOWN at brick xlator, it forwards the event to next xlator only while xlator ensures no stub is in progress. At io-thread xlator it decreases stub_cnt before the process a stub and notify EVENT to next xlator Solution: Introduce a new counter to save stub_cnt and decrease the counter after process the stub completely at io-thread xlator. To avoid brick crash at the time of call xlator_mem_cleanup move only brick xlator if detach brick name has found in the graph Note: Thanks to pranith for sharing a simple reproducer to reproduce the same fixes bz#1637934 Change-Id: I1a694a001f7a5417e8771e3adf92c518969b6baa Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
*	afr: prevent winding inodelks twice for arbiter volumes	Ravishankar N	2018-10-10	1	-0/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In an arbiter volume, if there is a pending data heal of a file only on arbiter brick, self-heal takes inodelks twice due to a code-bug but unlocks it only once, leaving behind a stale lock on the brick. This causes the next write to the file to hang. Fix: Fix the code-bug to take lock only once. This bug was introduced master with commit eb472d82a083883335bc494b87ea175ac43471ff Thanks to Pranith Kumar K <pkarampu@redhat.com> for finding the RCA. fixes: bz#1637802 Change-Id: I15ad969e10a6a3c4bd255e2948b6be6dcddc61e1 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	Reduce execution time of bug-1559004-EMLINK-handling.t	Xavi Hernandez	2018-10-04	1	-12/+51
\| \| \| \| \| \| \| \| \| \| \|	This patch reduces the execution time of bug-1559004-EMLINK-handling.t from ~14 minutes to ~90 seconds. To do so, it creates some fake hard links directly on the brick instead of creating them through the volume. Change-Id: I9715ff1a4eba47574c733d4f28e68f42f56a7d3f updates: bz#1193929 Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
*	python3: assume python3 unless building _packages_ on sys without py3	Kaleb S. KEITHLEY	2018-09-27	3	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The jenkins release-new job runs on a CentOS 7 box, which does not have python3. As a result it runs (autogen.sh and) configure before producing the dist tar file, converting all the python3 shebangs to python2 shebangs in the dist tar file. Then when that tar file is "carried" to, e.g. Fedora koji build system to build packages, the shebangs are incorrect, despite having originally been correct in the git repo. Change-Id: I5154baba3f6d29d3c4823bafc2b57abecbf90e5b updates: #411 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
*	afr: fix incorrect reporting of directory split-brain	Ravishankar N	2018-09-21	1	-0/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When a directory has dirty xattrs due to failed post-ops or when replace/reset brick is performed, AFR does a conservative merge as expected, but heal-info reports it as split-brain because there are no clear sources. Fix: Modify pending flag to contain information about pending heals and split-brains. For directories, if spit-brain flag is not set,just show them as needing heal and not being in split-brain. Fixes: bz#1626994 Change-Id: I09ef821f6887c87d315ae99e6b1de05103cd9383 Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	tests: fix test case failure	Sanju Rakonde	2018-09-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	tests/bugs/glusterd/bug-1595320.t is failing in downstream. In downstream repo, enabling the brick multiplexing made interactive, so it will throw an prompt for the user input. As no input is provided during the test case execution, the test is failing. Using macro CLI instead of using gluster command, will bypass the interacive commands. so replacing the gluster command with CLI macro will address the issue. Change-Id: I6b39052d8e415a8ed08de7c80a91dadce155146a updates: bz#1193929 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	dht: utilize the framework to pass-through xlator tasks	Amar Tumballi	2018-09-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Also fixes the issue caused due to not converting back the fn function to after getting its address. We wanted the value of the field, not the address of the pt_fop field. With this patch, DHT will always be started in pass-through mode if the number of subvols is just 1. Fixes some tests to make sure DHT is in full config (ie, subvols > 1). - increased timeout of brick-mux test as it was bordering on 300 seconds. - Also change the volume type to supported 'replica 3' from 'replica 2'. - also no DHT tests should assume presence of DHT when there is just 1 brick in volume Credits: Nithya B <nbalacha@redhat.com> fixes: #405 Change-Id: I8e55239ce58d6ac6ae1901e2e384be1ecbd33d6e Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	tests: fixes to bug-1015990-rep.t	Ravishankar N	2018-09-18	1	-14/+7
\| \| \| \| \| \| \| \| \| \| \|	- check that the shd is connected to brick before running statistics command - remove sleep statements - remove unneeded ($count-$value==0) test when it is known that both values will be same Fixes: bz#1625850 Change-Id: Ifcd4887f0238031e5bca803cd9bfdb75a6e6c01b Signed-off-by: Ravishankar N <ravishankar@redhat.com>
*	Land part 2 of clang-format changes	Gluster Ant	2018-09-12	32	-2308/+2366
\| \| \| \| \|	Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4 Signed-off-by: Nigel Babu <nigelb@redhat.com>
*	glusterd: avoid using glusterd's working directory as a brick	Sanju Rakonde	2018-09-08	1	-0/+9
\| \| \| \| \| \| \| \| \|	Adding checks for avoiding glusterd's working directory used as a brick for volume creation. fixes: bz#853601 Change-Id: I4b16a05f752e92216aa628f542a4fdbf59b3c669 Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
*	io-stats: dump io-stats info in /var/run/gluster	Amar Tumballi	2018-09-05	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It wouldn't make sense to allow iostats file to be written in any directory. While the formating makes sure we try to append io-stats-name for the file, so overwriting existing file is slim, but in any case it makes sense to restrict dumping to one directory. Below are the sample commands, and files created for the corresponding values: $ setfattr -n trusted.io-stats-dump -v file-for-dump $M0 In this case, the file would be in /var/run/gluster/file-for-dump $ setfattr -n trusted.io-stats-dump -v /dir1/dir2/file-for-dump $M0 In this case, then the dump file is in /var/run/gluster/dir1-dir2-file-for-dump Note that the value passed for this virtual xattr would be treated as a file, and even if the value has '/' in it, it would be changed to '-' for sanity. Fixes: bz#1625106 Change-Id: Id9ae6a40a190b8937c51662e6e1c2a0f6c86a0e0 Signed-off-by: Amar Tumballi <amarts@redhat.com>
*	multiple files: calloc -> malloc	Yaniv Kaul	2018-09-04	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	xlators/cluster/stripe/src/stripe-helpers.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/dht/src/tier.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/dht/src/dht-layout.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/dht/src/dht-helper.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/dht/src/dht-common.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/afr/src/afr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible xlators/cluster/afr/src/afr-inode-read.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible tests/bugs/replicate/bug-1250170-fsync.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible tests/basic/gfapi/gfapi-async-calls-test.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible tests/basic/ec/ec-fast-fgetxattr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible rpc/xdr/src/glusterfs3.h: Move to GF_MALLOC() instead of GF_CALLOC() when possible rpc/rpc-transport/socket/src/socket.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible rpc/rpc-lib/src/rpc-clnt.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible extras/geo-rep/gsync-sync-gfid.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-xml-output.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-rpc-ops.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-volume.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-system.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-snapshot.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-peer.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible cli/src/cli-cmd-global.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible It doesn't make sense to calloc (allocate and clear) memory when the code right away fills that memory with data. It may be optimized by the compiler, or have a microscopic performance improvement. In some cases, also changed allocation size to be sizeof some struct or type instead of a pointer - easier to read. In some cases, removed redundant strlen() calls by saving the result into a variable. 1. Only done for the straightforward cases. There's room for improvement. 2. Please review carefully, especially for string allocation, with the terminating NULL string. Only compile-tested! updates: bz#1193929 Original-Author: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Yaniv Kaul <ykaul@redhat.com> Signed-off-by: Amar Tumballi <amarts@redhat.com> Change-Id: I16274dca4078a1d06ae09a0daf027d734b631ac2
*	core: python3	Kaleb S. KEITHLEY	2018-09-03	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	see https://review.gluster.org/#/c/19788/, https://review.gluster.org/#/c/19871/, https://review.gluster.org/#/c/19952/, https://review.gluster.org/#/c/20104/, https://review.gluster.org/#/c/20162/, https://review.gluster.org/#/c/20185/, https://review.gluster.org/#/c/20207/, https://review.gluster.org/#/c/20227/, https://review.gluster.org/#/c/20307/, https://review.gluster.org/#/c/20320/, https://review.gluster.org/#/c/20332/, https://review.gluster.org/#/c/20364/, https://review.gluster.org/#/c/20441/, and https://review.gluster.org/#/c/20484 shebangs changed from /usr/bin/python2 to /usr/bin/python3. (Reminder, various distribution packaging guidelines require use of explicit python version and don't allow '#!/usr/bin/env python', regardless of how handy that idiom may be.) glusterfs.spec(.in) package python{2,3}-gluster and python2 or python3 dependencies as appropriate. configure(.ac): + test for and use python2 or python3 as appropriate. If build machine has python2 and python3, use python3. Override by setting PYTHON=/usr/bin/python2 when running configure. + PYTHONDEV_CPPFLAGS from python[23]-config --includes is a better match to the original python sysconfig.get_python_inc(). All those other extraneous flags breaks the build. + Only change the shebangs once. Changing them over and over again, e.g., during a `make glusterrpms` in extras/LinuxRPM just sends make (is it really make that's looping?) into an infinite loop. If you figure out why, let me know. + Oldest python2 is python2.6 on CentOS 6 and Debian 8 (Jessie). Everything else has 2.7 or 3.x + logic from https://review.gluster.org/c/glusterfs/+/21050, which needs to be removed/merged after that patch is merged. Builds on CentOS 6, CentOS 7, Fedora 28, Fedora rawhide, and the mysterious RHEL > 7. Change-Id: Idae21d3b6f58b32372e1daa0d234e491e563198f updates: #411 Signed-off-by: Kaleb S. KEITHLEY <kkeithle@redhat.com>
*	cluster/dht: In rename, unlink after creating linkto file	N Balachandran	2018-09-03	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The linkto file creation for the dst was done in parallel with the unlink of the old src linkto. If these operations reached the brick out of order, we end up with a dst linkto file without a .glusterfs handle. Fixed by the unlinking only after the linkto file creation has completed. Change-Id: I4246f7655f5bc180f5ded7fd34d263b7828a8110 fixes: bz#1621981 Signed-off-by: N Balachandran <nbalacha@redhat.com>
*	cluster/afr: Delegate metadata heal with pending xattrs to SHD	Pranith Kumar K	2018-08-28	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: When metadata-self-heal is triggered on the mount, it blocks lookup until metadata-self-heal completes. But that can lead to hangs when lot of clients are accessing a directory which needs metadata heal and all of them trigger heals waiting for other clients to complete heal. Fix: Only when the heal is needed but the pending xattrs are not set, trigger metadata heal that could block lookup. This is the only case where different clients may give different metadata to the clients without heals, which should be avoided. Updates bz#1622821 Change-Id: I6089e9fda0770a83fb287941b229c882711f4e66 Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
*	storage/posix: Increment trusted.pgfid in posix_mknod	N Balachandran	2018-08-23	1	-0/+57
\| \| \| \| \| \| \| \| \| \|	The value of trusted.pgfid.xx was always set to 1 in posix_mknod. This is incorrect if posix_mknod calls posix_create_link_if_gfid_exists. Change-Id: Ibe87ca6f155846b9a7c7abbfb1eb8b6a99a5eb68 fixes: bz#1619720 Signed-off-by: N Balachandran <nbalacha@redhat.com>