glusterfs.git/glusterfsd, branch v8dev

core: fedora 30 compiler warnings

2019-06-18T10:53:49+00:00

warning: ‘%s’ directive argument is null [-Wformat-overflow=]

Change-Id: I69b8d47f0002c58b00d1cc947fac6f1c64e0b295
updates: bz#1193929
Signed-off-by: SheetalPamecha

multiple files: another attempt to remove includes

2019-06-14T16:50:32+00:00

There are many include statements that are not needed.
A previous more ambitious attempt failed because of *BSD plafrom
(see https://review.gluster.org/#/c/glusterfs/+/21929/ )

Now trying a more conservative reduction.
It does not solve all circular deps that we have, but it
does reduce some of them. There is just too much to handle
reasonably (dht-common.h includes dht-lock.h which includes
dht-common.h ...), but it does reduce the overall number of lines
of include we need to look at in the future to understand and fix
the mess later one.

Change-Id: I550cd001bdefb8be0fe67632f783c0ef6bee3f9f
updates: bz#1193929
Signed-off-by: Yaniv Kaul

tests: add tests for different signal handling

2019-05-30T07:32:34+00:00

Also some cleanup:
* old-protocol.t was actually added to make sure we have line-coverage
* first-test.t should have been removed as per the comment. It doesn't do anything.
* add statvfs to rpc-coverage so we can cover statvfs in few xlators.

updates: bz#1693692
Change-Id: Ie8651ce007de484c4abced16b4de765aa5e517be
Signed-off-by: Amar Tumballi

Fix some "Null pointer dereference" coverity issues

2019-05-26T13:59:13+00:00

This patch fixes the following CID's:

  * 1124829
  * 1274075
  * 1274083
  * 1274128
  * 1274135
  * 1274141
  * 1274143
  * 1274197
  * 1274205
  * 1274210
  * 1274211
  * 1288801
  * 1398629

Change-Id: Ia7c86cfab3245b20777ffa296e1a59748040f558
Updates: bz#789278
Signed-off-by: Xavi Hernandez

core: avoid dynamic TLS allocation when possible

2019-04-24T03:26:48+00:00

Some interdependencies between logging and memory management functions
make it impossible to use the logging framework before initializing
memory subsystem because they both depend on Thread Local Storage
allocated through pthread_key_create() during initialization.

This causes a crash when we try to log something very early in the
initialization phase.

To prevent this, several dynamically allocated TLS structures have
been replaced by static TLS reserved at compile time using '__thread'
keyword. This also reduces the number of error sources, making
initialization simpler.

Updates: bz#1193929
Change-Id: I8ea2e072411e30790d50084b6b7e909c7bb01d50
Signed-off-by: Xavi Hernandez

core: Log level changes do not effect on running client process

2019-04-15T04:30:43+00:00

Problem: commit c34e4161f3cb6539ec83a9020f3d27eb4759a975 set log-level
         per xlator during reconfigure only for a brick process not for
         the client process.

Solution: 1) Change per xlator log-level only if brick_mux is enabled.To make sure
             about brick multiplex introudce a flag brick_mux at ctx->cmd_args.

Note: There are two other changes done with this patch
      1) Ignore client-log-level option to attach a brick with
         already running brick if brick_mux is enabled
      2) Add a log to print pid of the running process to make easier
         debugging

Change-Id: I39e85de778e150d0685cd9a79425ce8b4783f9c9
Signed-off-by: Mohit Agrawal 
Fixes: bz#1696046

mgmt/shd: Implement multiplexing in self heal daemon

2019-04-01T03:44:23+00:00

Problem:

Shd daemon is per node, which means they create a graph
with all volumes on it. While this is a great for utilizing
resources, it is so good in terms of performance and managebility.

Because self-heal daemons doesn't have capability to automatically
reconfigure their graphs. So each time when any configurations
changes happens to the volumes(replicate/disperse), we need to restart
shd to bring the changes into the graph.

Because of this all on going heal for all other volumes has to be
stopped in the middle, and need to restart all over again.

Solution:

This changes makes shd as a per volume daemon, so that the graph
will be generated for each volumes.

When we want to start/reconfigure shd for a volume, we first search
for an existing shd running on the node, if there is none, we will
start a new process. If already a daemon is running for shd, then
we will simply detach a graph for a volume and reatach the updated
graph for the volume. This won't touch any of the on going operations
for any other volumes on the shd daemon.

Example of an shd graph when it is per volume

                           graph
                     -----------------------
                     |     debug-iostat    |
                     -----------------------
                    /         |             \
                   /          |              \
              ---------    ---------      ----------
              | AFR-1 |    | AFR-2 |      |  AFR-3 |
              --------     ---------      ----------

A running shd daemon with 3 volumes will be like-->

                           graph
                     -----------------------
                     |     debug-iostat    |
                     -----------------------
                    /           |           \
                   /            |            \
              ------------   ------------  ------------
              | volume-1 |   | volume-2 |  | volume-3 |
              ------------   ------------  ------------

Change-Id: Idcb2698be3eeb95beaac47125565c93370afbd99
fixes: bz#1659708
Signed-off-by: Mohammed Rafi KC

rpc/transport: Missing a ref on dict while creating transport object

2019-03-20T13:24:44+00:00

while creating rpc_tranpsort object, we store a dictionary without
taking a ref on dict but it does an unref during the cleaning of the
transport object.

So the rpc layer expect the caller to take a ref on the dictionary
before passing dict to rpc layer. This leads to a lot of confusion
across the code base and leads to ref leaks.

Semantically, this is not correct. It is the rpc layer responsibility
to take a ref when storing it, and free during the cleanup.

I'm listing down the total issues or leaks across the code base because
of this confusion. These issues are currently present in the upstream
master.

1) changelog_rpc_client_init

2) quota_enforcer_init

3) rpcsvc_create_listeners : when there are two transport, like tcp,rdma.

4) quotad_aggregator_init

5) glusterd: init

6) nfs3_init_state

7) server: init

8) client:init

This patch does the cleanup according to the semantics.

Change-Id: I46373af9630373eb375ee6de0e6f2bbe2a677425
updates: bz#1659708
Signed-off-by: Mohammed Rafi KC

glusterfsd: Multiple shd processes are spawned on brick_mux environment

2019-03-12T04:54:50+00:00

Problem: Multiple shd processes are spawned while starting volumes
         in the loop on brick_mux environment.glusterd spawn a process
         based on a pidfile and shd daemon is taking some time to
         update pid in pidfile due to that glusterd is not able to
         get shd pid

Solution: Commit cd249f4cb783f8d79e79468c455732669e835a4f changed
          the code to update pidfile in parent for any gluster daemon
          after getting the status of forking child in parent.To resolve
          the same correct the condition update pidfile in parent only
          for glusterd and for rest of the daemon pidfile is updated in
          child

Change-Id: Ifd14797fa949562594a285ec82d58384ad717e81
fixes: bz#1684404
Signed-off-by: Mohit Agrawal

glusterfsd: Do not process PROFILE_NFS_INFO if graph is not ready

2019-02-19T16:32:19+00:00

Otherwise, gnfs will crash in following situation.
Also see commit 2f9e555f.

Reproducible Steps:
1. kill gnfs process
2. service glusterd restart;gluster volume profile [vol] info nfs

dump trace info:
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xc2)[0x7fcf5cb6a872]
/lib64/libglusterfs.so.0(gf_print_trace+0x324)[0x7fcf5cb743a4]
/lib64/libc.so.6(+0x35670)[0x7fcf5b1d5670]
/usr/sbin/glusterfs(glusterfs_handle_nfs_profile+0x114)[0x7fcf5d066474]
/lib64/libglusterfs.so.0(synctask_wrap+0x12)[0x7fcf5cba1502]
/lib64/libc.so.6(+0x47110)[0x7fcf5b1e7110]

Fixes: bz#1677559

Change-Id: Id68edb3e4646c39544e0b4c90b5e0a9083b37b0d
Signed-off-by: hujianfei