glusterfs.git/xlators/nfs/server/src/acl3.c, branch v3.8dev

Avoid conflict between contrib/uuid and system uuid

2015-04-04T17:48:35+00:00

glusterfs relies on the Linux uuid implementation, whose
API is incompatible with most other systems' uuid. As
a result, libglusterfs has to embed contrib/uuid,
which is the Linux implementation, on non-Linux systems.
This implementation is incompatible with the system's
built-in one, but the symbols have the same names.

Usually this is not a problem because when we link
with -lglusterfs, libc's symbols are trumped. However,
there is a problem when a program not linked with
-lglusterfs calls dlopen() on a glusterfs component. In
such a case, libc's uuid implementation is already
loaded in the calling program, and it will be used
instead of libglusterfs's implementation, causing
crashes.

A possible workaround is to pre-load libglusterfs
in the calling program (using LD_PRELOAD on NetBSD for
instance), but such a mechanism is not portable, nor
is it flexible. A much better approach is to rename
libglusterfs's uuid_* functions to gf_uuid_* to avoid
any possible conflict. This is what this change attempts.
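
A minimal sketch of the renaming idea, assuming a 16-byte uuid type. The
type name and the function bodies below are illustrative stand-ins, not
the contrib/uuid code. The point is only that gf_-prefixed names cannot
collide with a uuid_* symbol already loaded from the system library.

```c
#include <string.h>

/* Illustrative 16-byte UUID type (hypothetical name for this sketch). */
typedef unsigned char gf_example_uuid_t[16];

/* Prefixed names: no clash with a system-provided uuid_copy() etc. */
static void
gf_uuid_copy (gf_example_uuid_t dst, const gf_example_uuid_t src)
{
        memcpy (dst, src, sizeof (gf_example_uuid_t));
}

static int
gf_uuid_compare (const gf_example_uuid_t a, const gf_example_uuid_t b)
{
        return memcmp (a, b, sizeof (gf_example_uuid_t));
}

static int
gf_uuid_is_null (const gf_example_uuid_t u)
{
        static const gf_example_uuid_t zero = {0};

        return memcmp (u, zero, sizeof (zero)) == 0;
}
```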

BUG: 1206587
Change-Id: I9ccd3e13afed1c7fc18508e92c7beb0f5d49f31a
Signed-off-by: Emmanuel Dreyfus 
Reviewed-on: http://review.gluster.org/10017
Tested-by: Gluster Build System 
Reviewed-by: Niels de Vos

nfs: prevent logging missing 'system.posix_acl_access' xattrs

2015-03-09T04:31:31+00:00

Change http://review.gluster.org/9773 addresses the majority of the
logging, but it seems it is still possible to trigger the excessive
logging by requesting the ACL on files directly. Let's squash those too.

BUG: 1197253
Change-Id: I9e90ddd45f1a39641478f34c69c64dfe1c11c727
Signed-off-by: Niels de Vos 
Reviewed-on: http://review.gluster.org/9781
Tested-by: Gluster Build System 
Reviewed-by: Meghana M

nfs: prevent logging missing 'system.posix_acl_*' xattrs

2015-03-02T07:49:45+00:00

The nfs.log gets spammed with messages that the system.posix_acl_access
and system.posix_acl_default xattrs are not set. The logging happens
because the dictionary that contains the xattrs is empty/NULL in case
the getxattr() did not return any contents for the ACLs.
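
A rough sketch of the guard, under the assumption that the lookup
helper logs whenever it is handed a NULL dictionary. The struct and
function names are hypothetical and much simpler than the real
dictionary code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a one-entry xattr dictionary. */
struct example_dict {
        const char *key;
        const char *value;
};

static int example_log_count; /* counts "missing xattr" messages */

/* Lookup that complains when called with a NULL dictionary. */
static const char *
example_dict_get (const struct example_dict *d, const char *key)
{
        if (d == NULL) {
                example_log_count++;
                return NULL;
        }
        return (strcmp (d->key, key) == 0) ? d->value : NULL;
}

/* Guarded caller: skip the lookup when there is no dictionary at all. */
static const char *
example_get_acl (const struct example_dict *d, const char *key)
{
        if (d == NULL)
                return NULL; /* no ACL set; nothing worth logging */
        return example_dict_get (d, key);
}
```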

Change-Id: Id31e30635146599915c6d8674a2dde065f348adc
BUG: 1197253
Signed-off-by: Niels de Vos 
Reviewed-on: http://review.gluster.org/9773
Tested-by: Gluster Build System 
Reviewed-by: Meghana M

nfs: nfs3_stat_to_fattr3() improvement

2015-02-28T17:30:03+00:00

During a review of backport http://review.gluster.org/9170, Kaleb points
out:

    ick, return-by-value. About 50% slower than passing a pointer to the
    target struct.
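
The two calling conventions can be compared with a small sketch. The
structs are hypothetical and trimmed down; the real iatt and fattr3
carry many more fields, which is what makes the copy costly.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down source and target structs. */
struct example_stat   { uint64_t size; uint32_t mode; uint32_t uid; };
struct example_fattr3 { uint64_t size; uint32_t mode; uint32_t uid; };

/* Return-by-value: the struct may be copied back to the caller. */
static struct example_fattr3
example_stat_to_fattr3_by_value (const struct example_stat *st)
{
        struct example_fattr3 fa = { st->size, st->mode, st->uid };

        return fa;
}

/* Out-parameter: the caller's struct is filled in place. */
static void
example_stat_to_fattr3 (const struct example_stat *st,
                        struct example_fattr3 *fa)
{
        fa->size = st->size;
        fa->mode = st->mode;
        fa->uid  = st->uid;
}
```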

Change-Id: I4464e6a4e50d82d446a834892d0308332b7c32d0
BUG: 1197142
Reported-by: Kaleb KEITHLEY 
Signed-off-by: Niels de Vos 
Reviewed-on: http://review.gluster.org/9772
Reviewed-by: Kaleb KEITHLEY 
Tested-by: Gluster Build System

gNFS: Allow reading ACLs even without read permissions on the file.

2014-11-13T19:58:00+00:00

When root-squash is enabled or when no permissions are given to
a file, NFS threw permission errors. According to the kernel-nfs
behaviour, no permissions are required to read ACLs.

When no ACLs are set, the system call sys_lgetxattr fails and
returns an ENODATA error. This translates to an ESERVERFAULT error
in NFS. Fuse makes an exception for this error and returns success.
Similar changes are made here to achieve the expected behaviour.
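
A minimal sketch of the exception, assuming a callback that receives
op_ret and op_errno. The helper name is hypothetical.

```c
#include <errno.h>

/* Hypothetical helper: should a getxattr result for an ACL be reported
 * to the client as an error? */
static int
example_acl_getxattr_failed (int op_ret, int op_errno)
{
        if (op_ret >= 0)
                return 0;
        if (op_errno == ENODATA)
                return 0; /* no ACL set: treat as success, empty ACL */
        return 1;
}
```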

Change-Id: I46b8f5911114eb087a3f8ca4e921b6b41e83f3b3
BUG: 1161092
Signed-off-by: Meghana Madhusudhan 
Signed-off-by: Niels de Vos 
Reviewed-on: http://review.gluster.org/9085
Tested-by: Gluster Build System

gNFS: Fix memory leak in setacl code path

2014-09-08T13:01:24+00:00

If an ACL is set on a file in a Gluster NFS mount (setfacl command)
and the call succeeds, the NFS call state data is leaked, even though
all the failure code paths free the memory.

Impact: There is an OOM kill, i.e. vdsm invoked oom-killer during
rebalance and Killed process 4305, UID 0, (glusterfs nfs process)

FIX:
Make sure to deallocate the memory for call state in acl3_setacl_cbk()
using nfs3_call_state_wipe();
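
The shape of the fix can be sketched as below. All names are
hypothetical stand-ins for the call state and its wipe routine; the
point is that success and failure share one exit that releases the
state.

```c
#include <stdlib.h>

/* Hypothetical per-call state; stands in for the NFS call state. */
struct example_call_state { int op_ret; };

static int example_live_states; /* allocated minus released */

static struct example_call_state *
example_call_state_new (void)
{
        struct example_call_state *cs = calloc (1, sizeof (*cs));

        if (cs)
                example_live_states++;
        return cs;
}

static void
example_call_state_wipe (struct example_call_state *cs)
{
        if (cs) {
                free (cs);
                example_live_states--;
        }
}

/* Callback sketch: success and failure both reach the wipe. */
static int
example_setacl_cbk (struct example_call_state *cs, int op_ret)
{
        int status = (op_ret < 0) ? -1 : 0;

        /* ... the reply carrying 'status' would be submitted here ... */

        example_call_state_wipe (cs);
        return status;
}
```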

Signed-off-by: Santosh Kumar Pradhan 

Change-Id: I9caa3f851e49daaba15be3eec626f1f2dd8e45b3
BUG: 1139195
Signed-off-by: Santosh Kumar Pradhan 
Reviewed-on: http://review.gluster.org/8651
Tested-by: Gluster Build System 
Reviewed-by: Niels de Vos

rpcsvc: Validate RPC procedure number before fetch

2014-05-17T18:56:01+00:00

While accessing the procedures of a given RPC program in
rpcsvc_get_program_vector_sizer(), it was not checking boundary
conditions, which could cause a buffer overflow and subsequently a SEGV.

Make sure rpcsvc_actor_t arrays have numactors number of actors.

FIX:
Validate the RPC procedure number before fetching the actor.
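
A minimal sketch of the bounds check, assuming a program table that
records its own numactors. The struct and function names are
hypothetical simplifications.

```c
#include <stddef.h>

/* Hypothetical, simplified actor and program tables. */
struct example_actor { const char *name; };

struct example_program {
        const struct example_actor *actors;
        int                         numactors;
};

/* Return the actor for procnum, or NULL when procnum is out of range. */
static const struct example_actor *
example_get_actor (const struct example_program *prog, int procnum)
{
        if (prog == NULL || procnum < 0 || procnum >= prog->numactors)
                return NULL;
        return &prog->actors[procnum];
}
```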

Special Thanks to: Murray Ketchion, Grant Byers

Change-Id: I8b5abd406d47fab8fca65b3beb73cdfe8cd85b72
BUG: 1096020
Signed-off-by: Santosh Kumar Pradhan 
Reviewed-on: http://review.gluster.org/7726
Tested-by: Gluster Build System 
Reviewed-by: Rajesh Joseph 
Reviewed-by: Anand Avati

gNFS: Possible NULL pointer dereference

2014-02-07T16:43:59+00:00

In NFS-ACL code (acl3.c), i.e. acl3svc_setacl(), control can
go to "acl3err" block from setaclargs.mask validation or
acl3_validate_gluster_fh() and acl3_map_fh_to_volume() macros.
But at this point of time "cs" is yet to be init'd (the macro
acl3_handle_call_state_init() is not yet invoked) which can
cause a NULL ptr deref.

FIX:
Refactor the acl3 code.
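
One way to picture the hazard and a safe error path, as a sketch with
hypothetical names: the state pointer starts out NULL, and the error
label only dereferences it once it has been set up.

```c
#include <stddef.h>

/* Hypothetical call state for this sketch. */
struct example_acl_state { int status; };

static int
example_setacl (unsigned int mask, struct example_acl_state *state,
                int later_step_fails)
{
        struct example_acl_state *cs = NULL;
        int ret = -1;

        if (mask == 0)
                goto err;       /* early exit: cs is not set up yet */

        cs = state;             /* stands in for call-state init */
        if (cs == NULL || later_step_fails)
                goto err;

        cs->status = 0;
        return 0;
err:
        if (cs != NULL)         /* dereference only when initialised */
                cs->status = ret;
        return ret;
}
```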

Coverity ID (CID): 1124491

Change-Id: I3aca38770e03ce59d1705653b6d8349e6cc153b2
BUG: 789278
Signed-off-by: Santosh Kumar Pradhan 
Reviewed-on: http://review.gluster.org/6890
Reviewed-by: Rajesh Joseph 
Reviewed-by: Niels de Vos 
Tested-by: Gluster Build System 
Reviewed-by: Vijay Bellur

gNFS: Server sets ACL mask wrongly in GETACL reply

2014-01-13T07:48:15+00:00

FIX:
1. Set the ACL mask to what was requested by the client
2. Validate the ACL mask in SETACL routine
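
Both points can be sketched as follows. The bit values are assumed to
mirror the usual NFSACL mask definitions, and the EXAMPLE_ names and
helpers are hypothetical.

```c
/* Bit values assumed to mirror the common NFSACL definitions. */
enum {
        EXAMPLE_NFS_ACL      = 0x0001, /* access ACL entries */
        EXAMPLE_NFS_ACLCNT   = 0x0002, /* access ACL entry count */
        EXAMPLE_NFS_DFACL    = 0x0004, /* default ACL entries */
        EXAMPLE_NFS_DFACLCNT = 0x0008, /* default ACL entry count */
        EXAMPLE_NFS_ACL_MASK = 0x000f
};

/* SETACL side: reject masks that carry bits outside the known set. */
static int
example_mask_is_valid (unsigned int mask)
{
        return (mask & ~(unsigned int) EXAMPLE_NFS_ACL_MASK) == 0;
}

/* GETACL side: the reply carries the mask the client asked for. */
static unsigned int
example_reply_mask (unsigned int requested_mask)
{
        return requested_mask;
}
```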

Change-Id: Icb8576a8fe2684e0beaf94e8db6a92bc70bbfe7f
BUG: 1051865
Signed-off-by: Santosh Kumar Pradhan 
Reviewed-on: http://review.gluster.org/6683
Tested-by: Gluster Build System 
Reviewed-by: Vijay Bellur

gNFS: Client cache invalidation with bad fsid

2013-12-17T11:24:15+00:00

1. Problem:
A couple of issues are seen when NFS-ACL is turned ON, i.e.
i) NFS directory access is too slow, impacting customer workflows
   with ACL
ii) dbench fails with 100 directories.

2. Root cause: Frequent cache invalidation in the client side when ACL
is turned ON with NFS because NFS server getacl() code returns the
wrong fsid to the client.

3. This attr-cache invalidation triggers frequent LOOKUP ops for
each file instead of relying on the readdir or readdirp data. As
a result performance gets impacted.

4. In case of dbench workload, the problem is more severe. e.g.

Client side rpcdebug output:
===========================

Dec 16 10:16:53 santosh-3 kernel: NFS:
         nfs_update_inode(0:1b/12061953567282551806 ct=2 info=0x7e7f)
Dec 16 10:16:53 santosh-3 kernel: NFS:
         nfs_fhget(0:1b/12061953567282551806 ct=2)
Dec 16 10:16:53 santosh-3 kernel: <-- nfs_xdev_get_sb() = -116 [splat]
Dec 16 10:16:53 santosh-3 kernel: nfs_do_submount: done
Dec 16 10:16:53 santosh-3 kernel: <-- nfs_do_submount() = ffffffffffffff8c
Dec 16 10:16:53 santosh-3 kernel: <-- nfs_follow_mountpoint() = ffffffffffffff8c
Dec 16 10:16:53 santosh-3 kernel: NFS: dentry_delete(clients/client77, 20008)

As per Jeff Layton, this occurs when the client detects that the fsid on
a filehandle is different from its parent. At that point, it tries to
do a new submount of the new filesystem onto the correct point. It means
the client got a superblock reference for the new fs and is now looking to set
up the root of the mount. It calls nfs_get_root to do that, which basically
takes the superblock and a filehandle and returns a dentry.  The problem
here is that the dentry->d_inode you're getting back looks wrong. It's not
a directory as expected -- it's something else. So the client gives up and
tosses back an ESTALE.

This clearly indicates that in the getacl() code, while it does the stat()
call to get the attrs, it forgets to populate the deviceid or fsid before
going ahead with the getxattr().

FIX:
1. Fill the deviceid in iatt.
2. Do a bit more clean up of the confusing part of the code.
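
A sketch of the first point, with a hypothetical attribute struct and
helper: the stat result is stamped with the volume's device id before
it is handed on, so a file and its parent report the same fsid.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down attribute struct for this sketch. */
struct example_iatt {
        uint64_t ia_dev; /* reported to the client as the fsid */
        uint64_t ia_ino;
};

/* Stamp the attributes with the device id of the exported volume. */
static void
example_fill_deviceid (struct example_iatt *buf, uint64_t volume_devid)
{
        buf->ia_dev = volume_devid;
}
```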

NB: Many many thanks to Niels de Vos and Jeff Layton for their
help to debug the issue.

Change-Id: I8d3c2a844c9d1761051a883b5ebaeb84062a11c8
BUG: 1043737
Signed-off-by: Santosh Kumar Pradhan 
Reviewed-on: http://review.gluster.org/6523
Reviewed-by: Rajesh Joseph 
Reviewed-by: Niels de Vos 
Tested-by: Gluster Build System 
Reviewed-by: Vijay Bellur