| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before going into the lock, verify stub_cnt != 0.
Otherwise, let's skip this code.
Unrelated, switch a CALLOC to MALLOC, as we
initialize all members right away. This allocation
is also done under the lock, so it should help a bit as well.
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: Ie2fe6adff41ae4969abff95eff945b54e1a01d32
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In an arbiter volume configuration SHD will not send any writes to the arbiter
brick even if there is a data pending marker for the arbiter brick. If we have an
arbiter setup on the geo-rep master and there are data pending markers for the files
on the arbiter brick, SHD will not mark any data changelog during healing. While syncing
the data from master to slave, if the arbiter brick is considered ACTIVE, then
there is a chance that the slave will miss some data. If the arbiter brick is being
newly added or replaced, there is a chance of the slave missing all the data during sync.
Fix:
If there is a data pending marker for the arbiter brick, send a truncate to the arbiter
brick during heal, so that it records the truncate as the data transaction in the changelog.
Change-Id: I3242ba6cea6da495c418ef860d9c3359c5459dec
fixes: bz#1686568
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Discussion on thin arbiter volume -
https://github.com/gluster/glusterfs/issues/352#issuecomment-350981148
Main idea of having this rpm package is to deploy thin-arbiter
without glusterd and other commands on a node, and all we need
on that tie-breaker node is to run a single glusterfs command.
Also note that, no other glusterfs installation needs
thin-arbiter.so.
Make sure RPM contains sample vol file, which can work by default,
and a script to configure that volfile, along with translator image.
Change-Id: Ibace758373d8a991b6a19b2ecc60c93b2f8fc489
updates: bz#1674389
Signed-off-by: Amar Tumballi <amarts@redhat.com>
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Currently, even if open & opendir fail on a quorum number of bricks
but succeed on at least one brick, the operation is reported as a success. This leads
to inconsistent behaviour with the operations following the
open, which have quorum checks.
Fix:
Add quorum checks to open & opendir fops to avoid inconsistency.
Change-Id: If8fcb82072a6dc45ea6d4a6754b79763215eba2a
fixes: bz#1634664
Signed-off-by: karthik-us <ksubrahm@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This deadlock happens while processing dentry corresponding to current
directory (.) in rda_fill_readdirp. In this case following order is
followed:
LOCK(directory_fd_ctx->lock);
rda_inode_ctx_get_iatt -> LOCK(directory_inode->lock);
However, in rda_mark_inode_dirty following lock order is followed:
LOCK(directory_inode->lock);
LOCK(directory_fd_ctx->lock);
These two codepaths, when executed concurrently, resulted in a deadlock.
Current patch fixes this by removing the locking of the directory inode and
fd-ctx in rda_fill_readdirp. This is fine as the directory inode's stat
won't change due to writes to files within the directory.
Change-Id: Ic93a67a0dac8229bb0d79582e526a512e6f2569c
fixes: bz#1674412
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
Fixes:bz#1674412
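
For illustration only: the patch above resolves the deadlock by dropping the nested locking in rda_fill_readdirp. The other general remedy for this kind of AB-BA inversion is to make every code path take the two locks in the same order. The sketch below shows that alternative with hypothetical stand-in types (dir_inode, dir_fd_ctx) and helper names; none of it is readdir-ahead code.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the directory inode and the fd context. */
struct dir_inode {
    pthread_mutex_t lock;
    long stat_version;
};

struct dir_fd_ctx {
    pthread_mutex_t lock;
    int dirty;
};

/* Writer path: inode lock first, then fd-ctx lock. */
void mark_dirty(struct dir_inode *inode, struct dir_fd_ctx *ctx)
{
    assert(inode != NULL && ctx != NULL);
    pthread_mutex_lock(&inode->lock);
    pthread_mutex_lock(&ctx->lock);
    inode->stat_version++;
    ctx->dirty = 1;
    pthread_mutex_unlock(&ctx->lock);
    pthread_mutex_unlock(&inode->lock);
}

/* Reader path: takes the locks in the same order as mark_dirty(), so the
 * two paths can never each hold one lock while waiting for the other.
 * Returns the stat version if the context is dirty, otherwise -1. */
long read_version_if_dirty(struct dir_inode *inode, struct dir_fd_ctx *ctx)
{
    long version = -1;

    assert(inode != NULL && ctx != NULL);
    pthread_mutex_lock(&inode->lock);
    pthread_mutex_lock(&ctx->lock);
    if (ctx->dirty)
        version = inode->stat_version;
    pthread_mutex_unlock(&ctx->lock);
    pthread_mutex_unlock(&inode->lock);
    return version;
}
```

Removing a lock altogether, as the patch does, is preferable when the protected data does not actually need it; a fixed order is the fallback when both locks are required.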
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The structs worm_reten_state_t and read_only_priv_t from read-only.h are
using uint64_t values to store the retention and autocommit periods.
This seems to be dangerous since in worm-helper.c the function
worm_set_state computes in line 97:
stbuf->ia_atime = time(NULL) + retention_state->ret_period;
stbuf->ia_atime is an int64_t because of the definition of struct
iatt. So if a very high retention period is stored, an
integer overflow may occur.
What can be the solution? Using int64_t instead of uint64_t may reduce
the probability of the occurrence.
Change-Id: Id1e86c6b20edd53f171c4cfcb528804ba7881f65
fixes: bz#1685944
Signed-off-by: David Spisla <david.spisla@iternity.com>
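
As a side note, the type change only narrows the problem; a sum of two large int64_t values can still overflow. A minimal sketch of a clamped addition is shown below. The helper name atime_plus_retention is hypothetical and is not part of worm-helper.c or of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Add a retention period to a timestamp, clamping the result at INT64_MAX
 * instead of overflowing. Both inputs are assumed to be non-negative. */
int64_t atime_plus_retention(int64_t now, int64_t ret_period)
{
    assert(now >= 0 && ret_period >= 0);
    if (ret_period > INT64_MAX - now)
        return INT64_MAX; /* the sum would not fit in int64_t */
    return now + ret_period;
}
```

The check is done by subtraction before the addition, because signed overflow in C is undefined behaviour and cannot be detected reliably after the fact.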
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: commit 5a152a changed the mechanism of computing the
checksum. In a heterogeneous cluster, peers are running into
rejected state because we have different cksum computation
mechanisms in upgraded and non-upgraded nodes.
Solution: add a check for op-version so that all the nodes
in the cluster follow the same mechanism for computing the
cksum.
Change-Id: I1508f000e8c9895588b6011b8b6cc0eda7102193
fixes: bz#1685120
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
fops marked internal are used to maintain data integrity
and ideally do not interfere with application client leases.
Hence it seems safe for the lease xlator to ignore them.
Change-Id: I887b6f2da7ec0081442cc4b572a7a9e110f79eb2
updates: bz#1648768
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: glusterd leaks memory while running "gluster v profile"
in a loop
Solution: Fix the code path that leaks
Change-Id: Id608703ff6d0ad34ed8f921a5d25544e24cfadcd
fixes: bz#1685414
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was a 30% regression observed in the mkdir path with commit
b139bc58eb504adf5ef81658896c9283ae21f390. On analysis it was found
that the io-threads xlator deprioritizes all fops with a -ve pid.
Some context on the no-root-squash pid requirement:
DHT xlator does some of the internal fops with root privileges. This is
needed so that operations like layout healing are not abandoned
because a non-root user is operating. If the root-squash option is enabled,
the layout set operation loses its root privilege as the server xlator
converts the uid and pid to random numbers. Hence, the above mentioned
commit converted pid to GF_CLIENT_PID_NO_ROOT_SQUASH to continue fops
as root.
Combining the above, I am proposing not to deprioritize fops with the
no-root-squash pid.
Change-Id: I54d056c01b25729304a77f9242fbaff39c5672ba
fixes: bz#1676430
Signed-off-by: Susant Palai <spalai@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
As afr_changelog_fsync is used for internal operations, use
GLUSTERFS_INTERNAL_FOP_KEY so that the lease xlator does not treat
it as a conflicting fop and recall the lease.
Change-Id: I52cdc161002e840199d24439231a8bfa4f98b1b6
updates: bz#1648768
Signed-off-by: Soumya Koduri <skoduri@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
quotad prints many logs as,
[glusterfs3.h:752:dict_to_xdr] 0-dict: key 'trusted.glusterfs.quota.size' is not sent on wire [Invalid argument]
[glusterfs3.h:752:dict_to_xdr] 0-dict: key 'volume-uuid' is not sent on wire [Invalid argument]
For quota, there is a daemon named quotad which has an rpcsvc_program,
quotad_aggregator_prog, that only supports v3 right now.
Quotad has two actors (LOOKUP, GETLIMIT) that contain a dict in the request;
quotad just decodes the dict with dict_unserialize, so the dict data's type
is GF_DATA_TYPE_STR_OLD, a type that is not supported in glusterfs v4.
Change-Id: Ib649d7a2e3c68c32dc26bc0f88923a0ba967ebd7
Updates: bz#1596787
Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
|
|
|
|
|
|
|
|
|
| |
Dictionary object is not being unref'd when an error happens
in __glusterd_handle_cli_deprobe(). This patch addresses that problem.
Change-Id: I11e1f92d06dc9edd1260845256f435ea31ef1a87
fixes: bz#1683816
Signed-off-by: Vijay Bellur <vbellur@redhat.com>
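
The general pattern behind this kind of fix can be sketched as follows: every exit, including the error exits, funnels through one cleanup label that drops the reference. The refobj type and handle_request function are hypothetical stand-ins for dict_t and the handler; they are not glusterd code.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal reference-counted object standing in for a dict (hypothetical). */
struct refobj {
    int refs;
};

/* Number of objects currently allocated, used to observe leaks. */
static int live_objects = 0;

struct refobj *refobj_new(void)
{
    struct refobj *o = malloc(sizeof(*o));
    if (o != NULL) {
        o->refs = 1;
        live_objects++;
    }
    return o;
}

void refobj_unref(struct refobj *o)
{
    assert(o != NULL && o->refs > 0);
    if (--o->refs == 0) {
        free(o);
        live_objects--;
    }
}

/* Returns 0 on success and -1 on failure. The object is released on
 * every path because all exits go through the "out" label. */
int handle_request(int simulate_error)
{
    int ret = -1;
    struct refobj *dict = refobj_new();

    if (dict == NULL)
        goto out;
    if (simulate_error)
        goto out; /* the error path still reaches the cleanup below */

    ret = 0;
out:
    if (dict != NULL)
        refobj_unref(dict);
    return ret;
}
```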
|
|
|
|
|
|
|
|
|
|
| |
Experimental xlators have been removed from the codebase, but we
missed removing the options related to them.
This patch removes those options.
fixes: bz#1683352
Change-Id: I3fa7e14c6cd8ebde5cebc8d2b0cb2409bf37c1ae
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Minor changes to reduce work done under a lock.
Changed a few CALLOC() to MALLOC(), and moved some
time(NULL) outside the lock.
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I4683d0d6e0b653a6adefff87b43ae717fd46843a
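
A minimal sketch of the idea, assuming a simple mutex-protected list: the allocation and the time() call happen before the lock is taken, and malloc() replaces calloc() because every member is assigned explicitly. The types and the function entry_list_add are hypothetical and only illustrate the pattern; they are not taken from the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical types for illustration; not GlusterFS structures. */
struct entry {
    time_t created;
    int value;
    struct entry *next;
};

struct entry_list {
    pthread_mutex_t lock;
    struct entry *head;
    size_t count;
};

/* Prepend a new entry. Returns 0 on success, -1 if allocation fails. */
int entry_list_add(struct entry_list *list, int value)
{
    struct entry *e;

    assert(list != NULL);

    /* Allocation and time() run before the lock is taken. malloc() is
     * enough here because every member is assigned below. */
    e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->created = time(NULL);
    e->value = value;

    /* Only the pointer and counter updates need to be serialized. */
    pthread_mutex_lock(&list->lock);
    e->next = list->head;
    list->head = e;
    list->count++;
    pthread_mutex_unlock(&list->lock);

    return 0;
}
```

The trade-off is that the recorded timestamp is taken slightly before the entry becomes visible, which is usually acceptable at one-second resolution.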
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes a memory leak reported by ASan.
Tracebacks:
ERROR: LeakSanitizer: detected memory leaks
Direct leak of 712 byte(s) in 1 object(s) allocated from:
#0 0x7f35139dc848 in __interceptor_malloc (/lib64/libasan.so.5+0xef848)
#1 0x7f35136efb29 in __gf_malloc ../libglusterfs/src/mem-pool.c:136
#2 0x7f3510591ce9 in fuse_thread_proc ../xlators/mount/fuse/src/fuse-bridge.c:5929
#3 0x7f351336d58d in start_thread (/lib64/libpthread.so.0+0x858d)
SUMMARY: AddressSanitizer: 712 byte(s) leaked in 1 allocation(s).
updates: bz#1633930
Change-Id: Ie5b4da6b338d8e5fc770c5b2da1238e3462468ac
Signed-off-by: Sunny Kumar <sunkumar@redhat.com>
|
|
|
|
|
|
| |
Updates: bz#1193929
Change-Id: I95897fd4d3102b4fa2b8b2864116b1bf24491cf9
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "struct iatt" in iatt.h uses int64_t types for storing
the atime, mtime and ctime. Therefore 'struct md_cache' in
md-cache.c should also use these types to avoid an integer overflow.
This can happen e.g. if someone uses a very high default-retention-period
in the WORM-Xlator.
Change-Id: I605268d300ab622b9c8ab30e459dc00d9340aad1
fixes: bz#1678726
Signed-off-by: David Spisla <david.spisla@iternity.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Reduced the number of times we call time(). This may affect accuracy
of access time and so on - please review carefully. I think the resolution is OK'ish.
2. Removed dead code.
3. Changed from CALLOC() to MALLOC() where it made sense.
4. Moved some bits of work outside of a lock.
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I9fb8ca5d79b0e9126c1eb07e1a1ab5dbd8bf3f79
|
|
|
|
|
|
| |
Change-Id: I7be9a5f48dcad1b136c479c58b1dca1e0488166d
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
Fixes: bz#1674406
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two issues were found:
1. In wb_readdirp_cbk, the inode should be unrefed after wb_inode is
unlocked. Otherwise, the inode, and hence the context wb_inode, can be freed
by the time we try to unlock wb_inode.
2. wb_readdirp_mark_end iterates over a list of wb_inodes of children
of a directory. But inodes could've been freed and hence the list
might be corrupted. To fix this, take a reference on the inode before adding it
to the invalidate_list of the parent.
Change-Id: I911b0e0b2060f7f41ded0b05db11af6f9b7c09c5
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
Updates: bz#1674406
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements a thread pool that is wait-free for adding jobs to
the queue and uses a very small locked region to get jobs. This makes it
possible to decrease contention drastically. It's based on wfcqueue
structure provided by urcu library.
It automatically enables more threads when load demands it, and stops
them when not needed. There's a maximum number of threads that can be
used. This value can be configured.
Depending on the workload, the maximum number of threads plays an
important role. So it needs to be configured for optimal performance.
Currently the thread pool doesn't self adjust the maximum for the
workload, so this configuration needs to be changed manually.
For this reason, the global thread pool has been made optional, so that
volumes can still use the thread pool provided by io-threads.
To enable it for bricks, the following option needs to be set:
config.global-threading = on
This option has no effect if bricks are already running. A restart is
required to activate it. It's recommended to also enable the following
option when running bricks with the global thread pool:
performance.iot-pass-through = on
To enable it for a FUSE mount point, the option '--global-threading'
must be added to the mount command. To change it, an umount and remount
is needed. It's recommended to disable the following option when using
global threading on a mount point:
performance.client-io-threads = off
To enable it for services managed by glusterd, glusterd needs to be
started with option '--global-threading'. In this case all daemons, like
self-heal, will be using the global thread pool.
Currently it can only be enabled for bricks, FUSE mounts and glusterd
services.
The maximum number of threads for clients and bricks can be configured
using the following options:
config.client-threads
config.brick-threads
These options can be applied online and their effect is immediate most of
the time. If one of them is set to 0, the maximum number of threads
will be calculated as #cores * 2.
Some distributions use a very old userspace-rcu library (version 0.7).
For this reason, some header files from version 0.10 have been copied
into contrib/userspace-rcu and are used if the detected version is 0.7
or older.
An additional change has been made to io-threads to prevent threads
from being started when iot-pass-through is set.
Change-Id: I09d19e246b9e6d53c6247b29dfca6af6ee00a24b
updates: #532
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
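
The "0 means #cores * 2" rule described above can be sketched as below. Both helper names are hypothetical, and counting cores with sysconf(_SC_NPROCESSORS_ONLN) is an assumption; the commit message does not say how the patch determines the core count.

```c
#include <assert.h>
#include <unistd.h>

/* Resolve the configured thread limit. A value of 0 means
 * "derive the limit from the core count". */
long effective_max_threads(long configured, long cores)
{
    assert(configured >= 0 && cores > 0);
    return configured == 0 ? cores * 2 : configured;
}

/* Number of online cores, falling back to 1 if it cannot be determined. */
long online_cores(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}
```

For example, effective_max_threads(0, online_cores()) yields twice the number of online cores, while any non-zero setting is used as given.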
|
|
|
|
|
|
|
|
|
|
|
| |
Minor change to reduce work done under a lock.
Also, remove unused variable (unrelated to the above).
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I1dfb55823c3db7c638d8a34288423bd1faa37c32
|
|
|
|
|
|
|
|
|
|
|
| |
Changed to use the dict_() funcs which take the key length.
This happens to also reduce work under the lock in one case as well.
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I958fcc29e95286fe3c74178cae3f01a8b2db26f2
|
|
|
|
|
|
|
|
|
|
| |
Take the time before taking the lock, not under lock.
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I6cd05d8556a9bcc015e1be53f6ba46854e52a380
|
|
|
|
|
|
|
|
|
|
| |
Removed op_errno based SERVER_REQ_SET_ERROR() calls, which were
dead code. xdr_to_dict() calls have this check, which is used
in the 4.0 version of xdr-to-dict.
fixes: bz#1676797
Change-Id: I6f56907c85576f1263a6ec04ed7e37f723b01ac3
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Minor changes to reduce work done under a lock.
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: Ia58adfb5125129e5d1f3bbf2202f38520fdbc29f
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
posix converts incoming operations on files to operations on
corresponding gfid handles. While this in itself is not a problem,
logging of those gfid handles in place of actual file paths can
create confusion during debugging. The best way would be to
print both the actual file (received as an argument) for path
based operations and the gfid handle associated with it.
Change-Id: I408c36ca6456f2e3981b93151c19ef7f60085ad6
fixes: bz#1675076
Signed-off-by: Raghavendra Bhat <raghavendra@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If parallel-readdir is enabled, the rda xlator is loaded
below dht in the graph and proactively lists and caches
entries when an opendir is performed. dht_rmdir checks if
the directory being deleted contains stale linkto files by
performing a readdirp on its child subvols. However, as
the entries are actually read in during the opendir operation
which does not request the linkto xattr, no linkto xattrs are
present for the entries causing dht to incorrectly identify
them as data files and fail the rmdir operation with ENOTEMPTY.
DHT now always adds the linkto xattr in the list of xattrs
requested in the opendir.
Change-Id: I0711198e66c59146282eb8b88084170bedfb4018
fixes: bz#1672851
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
| |
The loc_wipe is done in the _out_ section, so inode_unref(loc.parent) here
causes an extra unref of loc.parent.
Change-Id: I2dc809328d3d34bf7b02c7df9a4f97788af511e6
updates: bz#1651439
Signed-off-by: Kinglong Mee <mijinlong@open-fs.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A race between the lookup selfheal and rmdir can cause
directories to be healed only on non-hashed subvols.
This can prevent the directory from being listed from
the mount point and in turn causes rm -rf to fail with
ENOTEMPTY.
Fix: Update the layout information correctly and reduce
the call count only after processing the response.
Change-Id: I812779aaf3d7bcf24aab1cb158cb6ed50d212451
fixes: bz#1676400
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
| |
We were not properly cleaning self-heal daemon resources
during afr fini. This patch will clean the same.
Change-Id: I597860be6f781b195449e695d871b8667a418d5a
updates: bz#1659708
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Since release-6 is not done yet, this option can be introduced with
GD_OP_VERSION_6_0.
Change-Id: I8a0867e5b8b23d0d485704a2fc7a3efc4a90f637
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
updates: bz#1664934
|
|
|
|
|
|
|
|
|
|
| |
During disconnect cleanup, we are not cancelling the reconnect
timer, which causes a ref leak each time a disconnect
happens.
Change-Id: I9d05d1f368d080e04836bf6a0bb018bf8f7b5b8a
updates: bz#1659708
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Explicit invalidation by calling inode_invalidate is necessary when the
same (meta)data is shared/accessed across multiple mounts. Without an
explicit inode_invalidate call, caches in the mount which didn't
witness writes wouldn't be aware of changes as writes wouldn't have
passed through them. However, if (meta)data is not shared, all
relevant I/O goes through the cache of single mount and hence is
coherent with (meta)data on bricks always. So, explicit inode
invalidation can be disabled for this case which gives a huge
performance boost for workloads that write data and then immediately
read the data they just wrote. Note that otherwise, local writes
(which pass through the cache) will change ctime and cause unnecessary
invalidations.
The name of the option that controls this behavior is
"performance.global-cache-invalidation". This option is global and it
purges caches both in glusterfs and kernel stack for native FUSE
mounts. For non-native FUSE mounts, it purges cache only from
glusterfs stack. This option is effective only when
performance.stat-prefetch is on.
Note that there is a similar option "performance.cache-invalidation",
but the scope of that option is limited to quick-read and md-cache.
Change-Id: I462bb4b65ff9aae1f6ba76f50b1f2f94fb10323b
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
updates: bz#1664934
|
|
|
|
|
|
|
|
|
|
|
|
| |
When "auto-invalidation" option was not specified for mount script,
glusterfs cmdline ended with "--auto-invalidation=" option. This patch
fixes that bug in mount script.
Thanks to Amar for reporting it.
Change-Id: Ie5cd4c6ffb3ac644d9d2b032035f914a935d05a8
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
updates: bz#1664934
|
|
|
|
|
|
|
|
|
| |
glusterd_resolve_all_bricks failure log should highlight the brick
identifier.
Updates: bz#1193929
Change-Id: I035b4650ef6a14bb1e1221d3bad1c40f9d43dbdd
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
The setxattr function receives a pointer to raw data, which may not be
null-terminated. When this data needs to be interpreted as a string, an
explicit null termination needs to be added before using the value.
Change-Id: Id110f9b215b22786da5782adec9449ce38d0d563
updates: bz#1193929
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
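
A minimal sketch of the technique, assuming the value arrives as a pointer plus a length: copy the bytes into a buffer one byte larger and terminate it explicitly before treating it as a string. The helper xattr_value_to_string is hypothetical and is not the function changed by this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copy a raw xattr value into a newly allocated, NUL-terminated string.
 * The caller frees the result. Returns NULL if allocation fails or the
 * size cannot be represented with room for the terminator. */
char *xattr_value_to_string(const void *value, size_t size)
{
    char *str;

    assert(value != NULL || size == 0);
    if (size == SIZE_MAX)
        return NULL; /* size + 1 would wrap around */

    str = malloc(size + 1);
    if (str == NULL)
        return NULL;
    if (size > 0)
        memcpy(str, value, size);
    str[size] = '\0'; /* the raw buffer carries no terminator of its own */
    return str;
}
```

Calling string functions such as strcmp() directly on the raw pointer would read past the end of the value whenever the sender did not include a terminator.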
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: get-state command will error out, if any of the underlying
brick(s) of volume(s) in the cluster go bad.
It is expected that get-state command should not error out, but
should generate an output successfully.
Solution: In glusterd_get_state(), a statfs call is made on the
brick path for every brick of the volumes to calculate the total
and free memory available. If the statfs call fails on any
brick, we should not error out and should report total memory and free
memory of that brick as 0.
This patch also handles a statfs failure scenario in
glusterd_store_retrieve_bricks().
fixes: bz#1672205
Change-Id: Ia9e8a1d8843b65949d72fd6809bd21d39b31ad83
Signed-off-by: Sanju Rakonde <srakonde@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Scenarios tested:
* Upgrade the node when stripe / tiering and regular
types of volumes are present.
- All volumes are started fine (as the change was not on brick volfile)
- For tier, the functionality may not even work, as changetimerecorder
is not present.
- 'gluster volume info' properly shows as 'NOT SUPPORTED' for stripe and
tier type of volume.
* Upgrade in a rolling upgrade scenario, where an old version is
able to connect to higher master.
- on a normal volume, if the volfile-server was new, the newer client
volfiles needed to have utime xlator conditionally.
- with this one change, all other changes seem to work fine.
Change-Id: Ib2d3b69dafa02b2c695a735b13c1aa70aba07cb8
updates: bz#1635688
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fuse sets a random gfid-req value for a fresh lookup. Posix
lookup will set this gfid on entries with missing gfids causing
a GFID mismatch for directories.
DHT will now ignore the Fuse provided gfid-req and use the GFID
returned from other subvols to heal the missing gfid.
Change-Id: I5f541978808f246ba4542564251e341ec490db14
fixes: bz#1670259
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Auto invalidation is necessary when the same (meta)data is shared/accessed
across multiple mounts. However, if (meta)data is not shared, all
relevant I/O goes through the cache of single mount and hence is
coherent with (meta)data on bricks always. So, fuse-auto-invalidation
can be disabled for this case which gives a huge performance boost for
workloads that write data and then immediately read the data they just
wrote.
From glusterfs --help,
<snip>
--auto-invalidation[=BOOL] controls whether fuse-kernel can
auto-invalidate attribute, dentry and page-cache.
Disable this only if same files/directories are
not accessed across two different mounts
concurrently [default: "on"]
</snip>
Details on how disabling auto-invalidation helped to reduce pgbench
init times can be found at [1]. Time taken for pgbench init of scale
8000 was 8340s. That is an improvement of 86% (59280s vs 8340s)
with auto-invalidations turned off along with other
optimizations. Disabling auto-invalidation alone contributed a 56%
improvement by reducing the total time taken by 33260s.
[1] https://www.spinics.net/lists/gluster-devel/msg25907.html
Change-Id: I0ed730dba9064bd9c576ad1800170a21e100e1ce
Signed-off-by: Raghavendra Gowdappa <rgowdapp@redhat.com>
updates: bz#1664934
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch creates a specific function to set the thread name using a
string format and a variable argument list, like printf().
This function is used to set the thread name from gf_thread_create(),
which now accepts a variable argument list to create the full name. It's
not necessary anymore to use a local array to build the name of the
thread. This is done automatically.
Change-Id: Idd8d01fd462c227359b96e98699f8c6d962dc17c
Updates: bz#1193929
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
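
The core of such a helper can be sketched with vsnprintf(), which formats into a fixed-size buffer and always terminates it. The function thread_name_format below is a hypothetical illustration and does not reproduce the signature of gf_thread_create() or of the function added by this patch. The 16-byte size reflects the Linux limit on thread names, including the terminator.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define THREAD_NAME_MAX 16 /* Linux thread name limit, including the NUL */

/* Build a thread name from a printf-style format. Returns the length the
 * full name would have had (as vsnprintf does), or a negative value on
 * error. The result in buf is truncated to fit and always terminated. */
int thread_name_format(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int len;

    assert(buf != NULL && size > 0 && fmt != NULL);
    va_start(args, fmt);
    len = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return len;
}
```

A caller can compare the return value with the buffer size to detect truncation, which removes the need for a local scratch array at each call site.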
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a fop to create an entry fails on one of the data bricks,
we mark the pending changelog on the entry on the brick for which
it was successful. This is done as part of the post op phase to
make sure that the entry gets healed even if it gets renamed to
some other path where its parent was not marked as bad.
As it happens as part of post op, we should consult the thin-arbiter
to check if the brick which was successful is the good brick or not.
This will avoid split brain and other issues.
Change-Id: I12686675be98f02f70a5186b3ed748c541514d53
updates: bz#1662264
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rebalance sets the sgid and t bits on a file
that is being migrated. These permissions are
not removed in dht_readdirp_cbk when listing files
causing them to show up on the mountpoint.
We now remove these permissions if a non-linkto
file has the linkto xattr set.
Change-Id: I5c69b2ecfe2df804fe50faea903b242d01729596
fixes: bz#1669937
Signed-off-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Avoid thread creation for bitrot-stub
for a volume if feature is not enabled
Solution: Before thread creation check the flag if feature
is enabled
Updates: #475
Change-Id: I2c6cc35bba142d4b418cc986ada588e558512c8e
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
Signed-off-by: Kotresh HR <khiremat@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: When rpc-transport-disconnect happens, server_connection_cleanup_flush_cbk()
is supposed to call rpc_transport_unref() after open-files on
that transport are flushed per transport. But open-fd-count is
maintained in bound_xl->fd_count, which can be incremented/decremented
cumulatively in server_connection_cleanup() by all transport
disconnect paths. So instead of rpc_transport_unref() happening
per transport, it ends up doing it only once after all the files
on all the transports for the brick are flushed, leading to
rpc-leaks.
Solution: To avoid races, maintain fd_cnt at the client instead of
on the brick
Credits: Pranith Kumar Karampuri
Change-Id: I6e8ea37a61f82d9aefb227c5b3ab57a7a36850e6
fixes: bz#1668190
Signed-off-by: Mohit Agrawal <moagrawal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
...when ctime is zero. ia_type and ia_gfid always need to be non-zero
for things to work correctly.
Problem:
Commit c9bde3021202f1d5c5a2d19ac05a510fc1f788ac zeroed out the iatt
buffer in the cbks of modification fops before unwinding if the ctime in
the buffer was zero. This was causing the fops to fail: noticeable when
AFR's 'consistent-metadata' option was enabled. (AFR zeros out the ctime
when the option is set. See commit
4c4624c9bad2edf27128cb122c64f15d7d63bbc8).
Fixes:
-Do not zero out the ia_type and ia_gfid of the iatt buff under any
circumstance.
-Also, fixed _rda_inode_ctx_update_iatts() to always update these values from
the incoming buf when ctime is zero. Otherwise we end up with zero
ia_type and ia_gfid the first time the function is called *and* the
incoming buf has ctime set to zero.
fixes: bz#1670253
Reported-By:Michael Hanselmann <public@hansmi.ch>
Change-Id: Ib72228892d42c3513c19fc6dfb543f2aa3489eca
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the feature enabled, some of the performance testing results,
especially those which create millions of small files, showed an approximately
4x regression compared to the version before enabling this.
On master without this patch: 765 creates/sec
On master with this patch : 3380 creates/sec
Also there seems to be a regression caused by this in the 'ls -l' workload.
On master without this patch: 3030 files/sec
On master with this patch : 16610 files/sec
This is a feature added to handle multiple clients operating in parallel
(especially those which race for file creates with the same name) on a single
namespace/directory. Considering that is < 3% of Gluster's use cases right
now, it makes sense to disable the feature by default, so we don't
penalize the default users who don't care about this use case.
Also note that the client side translators, especially distribute,
replicate and disperse, already handle the issue in up to 99.5% of the cases
without SDFS, so it makes sense to keep the feature disabled by default.
Credits: Shyamsunder <srangana@redhat.com> for running the tests and
getting the numbers.
Change-Id: Iec49ce1d82e621e9db25eb633fcb1d932e74f4fc
Updates: bz#1670031
Signed-off-by: Amar Tumballi <amarts@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mostly, unlock before logging.
In some cases, moved different code that was not needed
to be under lock (for example, taking time, or malloc'ing)
to be executed before taking the lock.
Note: logging might be slightly less accurate in order, since it may
not be done now under the lock, so order of logs is racy. I think
it's a reasonable compromise.
Compile-tested only!
updates: bz#1193929
Signed-off-by: Yaniv Kaul <ykaul@redhat.com>
Change-Id: I2438710016afc9f4f62a176ef1a0d3ed793b4f89
|