glusterfs.git - GlusterFS is a distributed file-system capable of scaling to several petabytes. It aggregates various storage bricks over Infiniband RDMA or TCP/IP interconnect into one large parallel network file system.

	Commit message	Author	Age	Files	Lines
*	cluster/tier: add watermarks and policy driver	Dan Lambright	2015-10-10	1	-0/+11
	This fix introduces infrastructure to support different policies for promotion and demotion. Currently the tier feature automatically promotes and demotes files periodically based on access. This is good for testing but too stringent for most real workloads. It makes it difficult to fully utilize a hot tier: data will be demoted before it is touched, since it's unlikely a 100GB hot SSD will have all its data touched in a window of time. A new parameter "mode" allows the user to pick promotion/demotion policies. The "test mode" will be used for *.t and other general testing. This is the current mechanism. The "cache mode" introduces watermarks. The watermarks represent levels of data residing on the hot tier. "cache mode" policy: The % the hot tier is full is called P. Do not promote or demote more than D MB or F files. A random number [0-100] is called R. Rules for migration: if (P < watermark_low) don't demote, always promote. if (P >= watermark_low) && (P < watermark_hi) demote if R < P; promote if R > P. if (P > watermark_hi) always demote, don't promote. gluster volume set {vol} cluster.watermark-hi % gluster volume set {vol} cluster.watermark-low % gluster volume set {vol} cluster.tier-max-mb {D} gluster volume set {vol} cluster.tier-max-files {F} gluster volume set {vol} cluster.tier-mode {test|cache} Change-Id: I157f19667ec95aa1d53406041c1e3b073be127c2 BUG: 1257911 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12039 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
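The "cache mode" rules in this commit message can be sketched as a small decision function. This is only an illustration of the rules as stated, not the tier translator's code: the function name `migration_decision` and its parameters are hypothetical, the D MB / F files caps are not modeled, and treating P equal to `watermark_hi` as part of the middle band is an assumption, since the stated rules leave that case open.

```python
import random


def migration_decision(percent_full, watermark_low, watermark_hi, rand=None):
    """Return (promote, demote) for one candidate file.

    percent_full is P, the percentage of the hot tier in use.
    rand is R, a random number in [0, 100]; drawn here if not supplied.
    """
    r = random.randint(0, 100) if rand is None else rand

    if percent_full < watermark_low:
        # Below the low watermark: always promote, never demote.
        return (True, False)
    if percent_full > watermark_hi:
        # Above the high watermark: always demote, never promote.
        return (False, True)
    # Between the watermarks (P == watermark_hi included by assumption):
    # promote if R > P, demote if R < P, do nothing if R == P.
    return (r > percent_full, r < percent_full)
```

With watermarks of 75 and 90, a hot tier that is 80% full demotes a candidate when R is 50 and promotes one when R is 85, so the fuller the tier, the more likely a demotion becomes.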
*	cluster/tier: fix transport endpoint not connected in tier.t (rare)	Dan Lambright	2015-10-09	1	-0/+4
	The script did not cleanly unmount/mount gluster and change the current working directory when stopping and starting the volume. Most of the time this problem would self-resolve before subsequent tests, but very occasionally races would lead to errors/failures. Change-Id: I128b913a71e2745512ee81c3d71852311e3b4a1b BUG: 1270328 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12327 Reviewed-by: Joseph Fernandes Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	cluster/tier re-enable tier.t in automatic tests	Dan Lambright	2015-09-29	1	-15/+29
	Re-enable tier.t in automatic tests. Disable check for BSD until recurring problem with SQLite on it is understood. Change-Id: Ib13b269ab841a59a0a41d8478c8627b180b16c61 BUG: 1231268 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/12208 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	tier/dht: unlink fails after lookup in a directory	Mohammed Rafi KC	2015-09-17	1	-4/+7
	unlink fails with invalid argument for files that were already present on the cold tier before attaching. All fops are hashed to hot_tier after attach-tier (unless the "rule" option is explicitly set). A lookup sent to a directory will eventually search the directory using readdirp and populate inode_ctx for the inodes based on the output, in the respective dht xlators. So readdirp populates inode_ctx for the files (those already present in the volume before attaching) in cold-dht only, because it got the entries from the cold tier. When an unlink comes on such an inode, the lookup associated with the unlink is sent as a revalidate request to the cold tier only, since a lookup was already performed on the inode, and the new lookup succeeds. In dht's unlink, the file hashes to the hot tier but the cached_subvol is cold; since hash and cache mismatch, dht chooses the hashed subvolume and sends the fop to hot dht, and the fop fails with EINVAL from hot-dht since it has no inode_ctx stored for that inode (because no lookup was performed from hot-dht). Change-Id: Ib7c14a9297a22d615f7a890a060be4809b5a745a BUG: 1236032 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/11675 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
*	cluster/tier: add gluster v tier <vol>	Dan Lambright	2015-09-09	1	-6/+6
	Currently the tier feature piggybacks on the rebalance command syntax to obtain status, and this is clumsy. Introduce a new tier command that can do tier-specific operations, starting with volume status to display counters. Old commands: gluster volume attach-tier <vol> [replica count] {bricklist..} gluster volume detach-tier <vol> {start|stop|commit} New commands: gluster volume tier <vol> attach [replica count] {bricklist} | detach {start|stop|commit} | status Change-Id: Ic07b3c6260588162de7d34380f8cbd3d8a7f35d3 BUG: 1255693 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/11984 Reviewed-by: Gaurav Kumar Garg <ggarg@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com>
*	cluster/dht: Don't set posix acls on linkto files	Nithya Balachandran	2015-08-31	1	-1/+1
	Posix acls on a linkto file change the file's permission bits and cause DHT to treat it as a non-linkto file. This happens on the migration failure of a file on which posix acls were set. The fix prevents posix acls from being set on a linkto file and copies them across only after a file has been successfully migrated. Change-Id: Iccf7ff6fba49fe05d691d9b83bf76a240848b212 BUG: 1247563 Signed-off-by: Nithya Balachandran <nbalacha@redhat.com> Signed-off-by: N Balachandran <nbalacha@redhat.com> Reviewed-on: http://review.gluster.org/12025 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Raghavendra G <rgowdapp@redhat.com> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
*	tiering/glusterd: start tier daemon during volume start	Mohammed Rafi KC	2015-08-17	1	-0/+5
	The tier daemon should always run with a tier volume. If the volume is stopped and started again, the tier daemon currently has to be started manually; this patch instead triggers the tier process automatically along with volume start. A snapshot-restored volume will not have node_state_info, so we need to create and store it dynamically. Change-Id: I659387c914bec7a1b6929ee5cb61f7b406402075 BUG: 1238593 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/11525 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
*	cluster/tier: fixed pattern matching error in tier.t	Pamela Ousley	2015-08-11	1	-1/+1
	The check_counters function contained "grep -o '[0-9]'", which was in error. This patch corrects it to "grep -o '[0-9]'". This fix was necessary to accommodate double-digit counters. Change-Id: Idaa09de4403bf66e741176a7377eba264819ca3b BUG: 1252121 Signed-off-by: Pamela Ousley <pousley@redhat.com> Reviewed-on: http://review.gluster.org/11877 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Dan Lambright <dlambrig@redhat.com>
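The double-digit problem described here can be illustrated with Python's `re.findall`, which, like `grep -o`, returns each match separately. The commit message as reproduced shows the same pattern twice, so the exact corrected pattern is not recoverable from it; the `[0-9]+` pattern below is an illustrative stand-in, and the sample status text and the helper name `extract_counters` are hypothetical.

```python
import re


def extract_counters(text, pattern):
    """Return every non-overlapping match of pattern in text, in order."""
    return re.findall(pattern, text)


# Hypothetical status output; not taken from the gluster CLI.
sample = "promoted: 12 demoted: 3"

# A single-digit character class splits the counter 12 into two matches.
single = extract_counters(sample, r"[0-9]")

# A repeated character class keeps each counter whole.
multi = extract_counters(sample, r"[0-9]+")

print(single)
print(multi)
```

A test comparing the first match against an expected count passes for single-digit counters with either pattern, which would explain why such an error only shows up once a counter reaches 10.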
*	cluster/tier : fix for logical bugs/timing errors in tier.t	Pamela Ousley	2015-07-13	1	-41/+96
	The md5sum fingerprints were not correctly compared after moving files between the hot and cold tiers. This version of tier.t uses a new function, "check_counters", to ensure that the number of promotions/demotions is as expected. This is intended to avoid spurious timing-related errors that were seen with the old script. Change-Id: I4a0ae7315493bfd307a0f68f21fa3ea33c88b08f BUG: 1231268 Signed-off-by: Pamela Ousley <pousley@redhat.com> Reviewed-on: http://review.gluster.org/11285 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
*	cluster/tier: stop tier migration after graph switch	Dan Lambright	2015-06-26	1	-3/+18
	On a graph switch, a new xlator and private structures are created. The tier migration daemon must stop using the old xlator and private structures and begin using the new ones. Otherwise, when RPCs arrive (such as counter queries from glusterd), the new xlator will be consulted but it will not have up-to-date information. The fix detects a graph switch and exits the daemon in this case. Typical graph switches for the tier case would be turning off performance translators. Change-Id: Ibfbd4720dc82ea179b77c81b8f534abced21e3c8 BUG: 1226005 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/11372
*	tiering/rebalance: tier daemon stopped without updating status	Mohammed Rafi KC	2015-06-25	1	-0/+3
	When a subvol goes down, the tier daemon stops immediately, but the status still shows "Progressing". With this change, when a subvol goes offline the tier xlator updates the status to failed. Change-Id: I9f722ed0d35cda8c7fc1a7e75af52222e2d0fdb7 BUG: 1227803 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/11068 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Dan Lambright <dlambrig@redhat.com>
*	tier/volume set: Validate volume set option for tier	Mohammed Rafi KC	2015-06-10	1	-6/+15
	Volume set options related to tier volumes can only be set on a tier volume; also, currently every volume set option for tier accepts a non-negative integer. This patch validates both conditions. Change-Id: I3611af048ff4ab193544058cace8db205ea92336 BUG: 1216960 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10751 Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Joseph Fernandes
*	Tests portability: umount(8)	Emmanuel Dreyfus	2015-06-09	1	-1/+1
	1) Avoid hangs on unmounting NFS on NetBSD. NetBSD umount(8) on an NFS mount whose server is gone will wait forever because umount(8) calls realpath(3) and tries to access the mount before it calls unmount(2). The non-portable, NetBSD-specific umount -R flag prevents that behavior. We therefore introduce UMOUNT_F, defined as "umount -f" on Linux and "umount -f -R" on NetBSD to take care of forced unmounts, especially in the NFS case. 2) Enforce usage of force_umount wrapper with timeout. Whenever umount is used it should be wrapped in force_umount with timeout handling. That saves us timing issues, and it handles the NetBSD NFS case. 3) Cleanup kernel cache flush. We used (cd $M0 && umount $M0 ) as a portable kernel cache flush trick, but it does not flush everything we need on Linux. Introduce a drop_cache() shell function that reverts to previously used echo 3 > /proc/sys/vm/drop_caches on Linux, and keeps (cd $M0 && umount $M0 ) on other systems. BUG: 1129939 Change-Id: Iab1f5a023405f1f7270c42b595573702ca1eb6f3 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/11114 Tested-by: NetBSD Build System <jenkins@build.gluster.org> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
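The per-platform choice behind UMOUNT_F can be sketched as follows. The real definition lives in the shell test framework; this Python function, its name `umount_f_args`, and the fallback to plain "umount -f" on systems other than Linux and NetBSD are assumptions made for illustration.

```python
import platform


def umount_f_args(system=None):
    """Return the forced-unmount command line as an argument list.

    system is an OS name as reported by platform.system(), for example
    "Linux" or "NetBSD"; the running system is used when it is omitted.
    """
    name = platform.system() if system is None else system
    if name == "NetBSD":
        # -R skips the realpath(3) access that hangs on a dead NFS server.
        return ["umount", "-f", "-R"]
    # Linux per the commit message; other systems by assumption.
    return ["umount", "-f"]
```

The mount point would be appended to the returned list before running it, ideally under a timeout in the spirit of the force_umount wrapper mentioned in point 2.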
*	tiering/nfs: duplication of nodes in client graph	Mohammed Rafi KC	2015-05-28	1	-1/+3
	When creating client volfiles, the xlator tier-dht is loaded for each volume. Services like nfs have one or more volumes in a single graph, so a tier-dht xlator is created for each volume in the graph, and the graph parser fails because of the redundant node in the graph. With this change tier-dht is renamed to volname-tier-dht. Change-Id: I3c9b9c23ddcb853773a8a02be7fd8a5d09a7f972 BUG: 1222840 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/10820 Reviewed-by: Atin Mukherjee <amukherj@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Kaushal M <kaushal@redhat.com>
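The name clash described in this commit can be modeled with a few lines of Python. This is a toy model of node naming only, not glusterd's volfile generator; the helper names `tier_xlator_names` and `duplicate_names` and the volume names in the test are hypothetical.

```python
def tier_xlator_names(volumes, per_volume_prefix):
    """Return the tier xlator node name generated for each volume."""
    if per_volume_prefix:
        # After the fix: one distinct name per volume.
        return [f"{vol}-tier-dht" for vol in volumes]
    # Before the fix: every volume gets the same node name.
    return ["tier-dht" for _ in volumes]


def duplicate_names(names):
    """Return the set of names that occur more than once."""
    seen = set()
    dups = set()
    for name in names:
        if name in seen:
            dups.add(name)
        seen.add(name)
    return dups
```

With two volumes in one graph, the unprefixed scheme yields a duplicate "tier-dht" node, while the prefixed scheme yields none.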
*	cli/tiering: Enhance cli output for tiering	Mohammed Rafi KC	2015-05-08	1	-1/+1
	Fix for handling cli output for attach-tier and detach-tier Change-Id: I4d17f4b09612754fe1b8cec6c2e14927029b9678 BUG: 1211562 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/10284 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Tested-by: NetBSD Build System Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	Tests: use a portable way to flush kernel cache	Emmanuel Dreyfus	2015-05-07	1	-1/+1
	On Linux, kernel cache can be flushed using echo 3 > /proc/sys/vm/drop_caches This non-portable approach can be replaced by an on-purpose failed attempt to unmount: if the mount point is the current directory and umount is called, the kernel will flush inodes until it realizes it cannot complete the operation because the root of the filesystem is busy: ( cd $M0 ; umount $M0 ) Unfortunately this does not flush everything. Entries may still be present in the kernel FUSE cache. Using $GFS to mount the filesystem ensures --entry-timeout=0 and clears this problem. Some stale information may also remain in glusterfs caches, and that may have to be addressed by an appropriate volume option. For instance tests/bugs/rpc/bug-954057.t needs to disable performance.stat-prefetch. Otherwise, root's new credentials are not evaluated after root-squash is enabled. The test could also be done with performance.stat-prefetch enabled using various tricks: copying the file to read, creating a hard link on it, or just waiting long enough for metadata cache to expire. BUG: 1129939 Change-Id: I54929e899d55c04dcd9d947809133549f01fd0e1 Signed-off-by: Emmanuel Dreyfus <manu@netbsd.org> Reviewed-on: http://review.gluster.org/10411 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
*	tiering: Send both attach-tier and tier-start together	Mohammed Rafi KC	2015-05-05	1	-8/+10
	After attaching a tier, we have to start the tier rebalance process. This patch triggers tier start along with attach-tier. Change-Id: I39380f95123f0087a82213ef263f9f33adcc5adc BUG: 1214222 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/10363 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
*	tiering/cli: Check replica count and bricks are proper or not	Mohammed Rafi KC	2015-05-04	1	-1/+2
	Right now, attach-tier calls the parsing function for add-brick. Add-brick does not have any check for brick count and replica count compatibility. Change-Id: I44ec13eadffc003a9ebf8c4eb0193df559933a68 BUG: 1215122 Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com> Reviewed-on: http://review.gluster.org/10428 Reviewed-by: Dan Lambright <dlambrig@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
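A brick count versus replica count check of the kind this commit refers to might look like the sketch below. The commit message does not spell out the exact rule it adds; the sketch assumes the usual requirement that the number of bricks be a multiple of the replica count, and the function name `bricks_match_replica` is hypothetical.

```python
def bricks_match_replica(brick_count, replica_count):
    """Return True if the bricks can be grouped into full replica sets.

    Assumes the rule is "brick count is a positive multiple of the
    replica count"; the actual CLI check may differ in detail.
    """
    if brick_count < 1 or replica_count < 1:
        return False
    return brick_count % replica_count == 0
```

Under this assumption, attaching four bricks with replica 2 would be accepted, while three bricks with replica 2 would be rejected at parse time.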
*	glusterd: support for tier volumes 'detach start' and 'detach commit'	Dan Lambright	2015-04-22	1	-10/+12
	These commands work in a manner analogous to rebalancing when removing a brick. The existing migration daemon detects "detach start" and switches to moving data off the hot tier. While in this state all lookups are directed to the cold tier. gluster v detach-tier <vol> start gluster v detach-tier <vol> commit The status and stop cli commands shall be submitted separately. Change-Id: I24fda5cc3ba74f5fb8aa9a3234ad51f18b80a8a0 BUG: 1205540 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Signed-off-by: root <root@localhost.localdomain> Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10108 Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com> Tested-by: NetBSD Build System
*	glusterd: Support distributed replicated volumes on hot tier	Dan Lambright	2015-04-08	1	-9/+13
	We did not set up the graph properly for hot tiers with replicated subvolumes. Also add a check that the file has not already been moved by another replicated brick on the same node. Change-Id: I9adef565ab60f6774810962d912168b77a6032fa BUG: 1206517 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/10054 Reviewed-by: Joseph Fernandes <josferna@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
*	cluster/dht: fix tier.c problems found prior to feature freeze	Dan Lambright	2015-04-06	1	-0/+2
	This patch resolves tiering translator issues taken from the list in bug 1203776. These issues have been selected to be fixed first. The rest will be fixed in a subsequent patch (or are not a problem). 3. Replace hardcoded #defines of promote/demote file names 6. Use loc_wipe() in migrate_using_query_file() 9. Only promote/demote files on the same node on which they reside. 14. Replace calloc with GF_CALLOC in tier.c and ensure freeing done properly. 15. Handle if parse_query_str fails 22. Only load gfdb library on server side, remove SQL references from client. Change-Id: I6563b11e58ab2e4c6b1ce44db755781ad6d930fb BUG: 1203776 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/9987 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: N Balachandran <nbalacha@redhat.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
*	cluster/dht: Fix spurious failure in tier test.	Dan Lambright	2015-03-23	1	-2/+11
	Need to wait for a few seconds for rebalancing to complete before stopping volume. Change-Id: Ib81c02645240e7d74ebfb3e31ccbc612fc77b119 BUG: 1194753 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/9966 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com>
*	cluster/dht: Add tier translator.	Dan Lambright	2015-03-21	1	-0/+116
	The tier translator shares most of DHT's code. It differs in how subvolumes are chosen for I/Os, and how file migration (cache promotion and demotion) is managed. That different functionality is split to either DHT or tier logic according to the "tier_methods" structure. A cache promotion and demotion thread is created in a manner similar to the rebalance daemon. The thread operates a timing wheel which periodically checks for promotion and demotion candidates (files). Candidates are queued and then migrated. Candidates must exist on the same node as the daemon and meet other criteria per caching policies. This patch has two authors (Dan Lambright and Joseph Fernandes). Dan did the DHT changes and Joe wrote the cache policies. The fix depends on DHT readdir changes and the database library which have been submitted separately. Header files in libglusterfs/src/gfdb should be reviewed in patch 9683. For more background and design see the feature page [1]. [1] http://www.gluster.org/community/documentation/index.php/Features/data-classification Change-Id: Icc26c517ccecf5c42aef039f5b9c6f7afe83e46c BUG: 1194753 Signed-off-by: Dan Lambright <dlambrig@redhat.com> Reviewed-on: http://review.gluster.org/9724 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>