<feed xmlns='http://www.w3.org/2005/Atom'>
<title>nfs-utils.git/utils/gssd, branch gss-fixes</title>
<subtitle>NFS utils related patches</subtitle>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/simo/public_git/nfs-utils.git/'/>
<entry>
<title>Remove unused arguments</title>
<updated>2014-01-17T16:51:44+00:00</updated>
<author>
<name>Simo Sorce</name>
<email>simo@redhat.com</email>
</author>
<published>2014-01-17T04:01:37+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/simo/public_git/nfs-utils.git/commit/?id=94bc961961096bd614d37a5ecd6bee474340aa79'/>
<id>94bc961961096bd614d37a5ecd6bee474340aa79</id>
<content type='text'>
The name variable is always set to NULL now in all callers, so just
stop passing it around needlessly.
The uid_t variable is not used at all, so chuck it out too.

Signed-off-by: Simo Sorce &lt;simo@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The name variable is always set to NULL now in all callers, so just
stop passing it around needlessly.
The uid_t variable is not used at all, so chuck it out too.

Signed-off-by: Simo Sorce &lt;simo@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Improve first attempt at acquiring GSS credentials</title>
<updated>2014-01-17T16:51:10+00:00</updated>
<author>
<name>Simo Sorce</name>
<email>simo@redhat.com</email>
</author>
<published>2014-01-15T21:01:49+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/simo/public_git/nfs-utils.git/commit/?id=1c3089d57036f276976273a01d03d3cea6eab0bb'/>
<id>1c3089d57036f276976273a01d03d3cea6eab0bb</id>
<content type='text'>
Since rpc.gssd is now switching uid before attempting to acquire
credentials, we do not need to pass in the special uid-as-a-string name
to gssapi, because the process is already running under the user's
credentials.

By removing this code we can fix a class of false negatives where the
user name does not match the actual ccache credentials and the ccache
type used is not one of the only 2 explicitly supported by rpc.gssd in the
fallback trawling done later.

Signed-off-by: Simo Sorce &lt;simo@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since rpc.gssd is now switching uid before attempting to acquire
credentials, we do not need to pass in the special uid-as-a-string name
to gssapi, because the process is already running under the user's
credentials.

By removing this code we can fix a class of false negatives where the
user name does not match the actual ccache credentials and the ccache
type used is not one of the only 2 explicitly supported by rpc.gssd in the
fallback trawling done later.

Signed-off-by: Simo Sorce &lt;simo@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gssd: don't let parent exit until child has a chance to scan directory once</title>
<updated>2013-11-20T20:04:47+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@redhat.com</email>
</author>
<published>2013-11-20T18:55:50+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/simo/public_git/nfs-utils.git/commit/?id=250dbae681c3bd589f2fe52871b0c3611f72b87f'/>
<id>250dbae681c3bd589f2fe52871b0c3611f72b87f</id>
<content type='text'>
With some proposed kernel changes, the kernel won't even attempt to upcall
sometimes if it doesn't appear that gssd is running. This means that
we have a theoretical race between gssd starting up at boot time and
the init process attempting to mount kerberized filesystems.

Fix this by switching gssd to use mydaemon() and having the child
only release the parent after it has processed the directory once.

Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: Steve Dickson &lt;steved@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With some proposed kernel changes, the kernel won't even attempt to upcall
sometimes if it doesn't appear that gssd is running. This means that
we have a theoretical race between gssd starting up at boot time and
the init process attempting to mount kerberized filesystems.

Fix this by switching gssd to use mydaemon() and having the child
only release the parent after it has processed the directory once.

Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: Steve Dickson &lt;steved@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nfs-utils: consolidate mydaemon() and release_parent() implementations</title>
<updated>2013-11-20T20:04:47+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@redhat.com</email>
</author>
<published>2013-11-20T20:00:41+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/simo/public_git/nfs-utils.git/commit/?id=6a46d870c61433c8dea0270d9c10702b7b4b3d99'/>
<id>6a46d870c61433c8dea0270d9c10702b7b4b3d99</id>
<content type='text'>
We currently have 2 cut-and-paste versions of this code. One for idmapd
and one for svcgssd.[1]

The two are basically equivalent but there are some small differences,
mostly related to how errors in that function are logged. svcgssd uses
printerr() with a priority of 1, which only prints errors if -v was
specified. That doesn't seem to be quite right. Daemonizing errors are
necessarily fatal and should be logged as such. The one for idmapd uses
err(), which always prints to stderr even though we have the xlog
facility set up. Since both have xlog configured at this point, log the
errors using xlog_err() instead.

The only other significant difference I see is that the idmapd version
will open "/" if it's unable to open "/dev/null". However, I believe that
was a holdover from an earlier version of that function that did not
error out when we were unable to open a file descriptor. Since the
function does that now, I don't believe we need that fallback anymore.

[1]: technically, we have a third in statd too, but it's different
     enough that I don't want to touch it here.

Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: Steve Dickson &lt;steved@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We currently have 2 cut-and-paste versions of this code. One for idmapd
and one for svcgssd.[1]

The two are basically equivalent but there are some small differences,
mostly related to how errors in that function are logged. svcgssd uses
printerr() with a priority of 1, which only prints errors if -v was
specified. That doesn't seem to be quite right. Daemonizing errors are
necessarily fatal and should be logged as such. The one for idmapd uses
err(), which always prints to stderr even though we have the xlog
facility set up. Since both have xlog configured at this point, log the
errors using xlog_err() instead.

The only other significant difference I see is that the idmapd version
will open "/" if it's unable to open "/dev/null". However, I believe that
was a holdover from an earlier version of that function that did not
error out when we were unable to open a file descriptor. Since the
function does that now, I don't believe we need that fallback anymore.

[1]: technically, we have a third in statd too, but it's different
     enough that I don't want to touch it here.

Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: Steve Dickson &lt;steved@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gssd: don't let spurious signals interrupt the wait after forking</title>
<updated>2013-11-20T20:04:47+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@redhat.com</email>
</author>
<published>2013-11-20T17:59:39+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/simo/public_git/nfs-utils.git/commit/?id=95af6be7a7039282243118447d6d1895671504da'/>
<id>95af6be7a7039282243118447d6d1895671504da</id>
<content type='text'>
Because gssd uses dnotify under the hood, it's easily possible that the
parent process can catch a signal while processing an upcall. If that
happens, then we'll currently exit the wait for the child task to exit,
and it'll end up as a zombie.

Fix this by ensuring that we only wait for the child to actually exit.

Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: Steve Dickson &lt;steved@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Because gssd uses dnotify under the hood, it's easily possible that the
parent process can catch a signal while processing an upcall. If that
happens, then we'll currently exit the wait for the child task to exit,
and it'll end up as a zombie.

Fix this by ensuring that we only wait for the child to actually exit.

Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: Steve Dickson &lt;steved@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gssd: Fix file descriptor leak of old pipe dirs</title>
<updated>2013-11-20T20:04:47+00:00</updated>
<author>
<name>Weston Andros Adamson</name>
<email>dros@netapp.com</email>
</author>
<published>2013-11-20T17:46:20+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/simo/public_git/nfs-utils.git/commit/?id=d3eac1e59e068cf033f850ab8be016beccf9726d'/>
<id>d3eac1e59e068cf033f850ab8be016beccf9726d</id>
<content type='text'>
gssd doesn't properly clean up internal state for old pipes and never
closes the (since deleted) clnt_info directory. This leads to eventual
fd exhaustion.

To reproduce, run a lot of mount / umounts in a loop and watch the
output of 'ls /proc/$PID/fdinfo | wc -l' (where PID is the pid of gssd)
steadily grow until gssd eventually crashes with "Too many open files".

This regression was introduced by 841e83c1, which was trying to fix a
similar bug in the skip matching logic of update_old_clients. The
problem with that patch is that pdir will never match dirname,
because dirname is "&lt;pname&gt;/clntXXX".

Signed-off-by: Weston Andros Adamson &lt;dros@netapp.com&gt;
Signed-off-by: Steve Dickson &lt;steved@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
gssd doesn't properly clean up internal state for old pipes and never
closes the (since deleted) clnt_info directory. This leads to eventual
fd exhaustion.

To reproduce, run a lot of mount / umounts in a loop and watch the
output of 'ls /proc/$PID/fdinfo | wc -l' (where PID is the pid of gssd)
steadily grow until gssd eventually crashes with "Too many open files".

This regression was introduced by 841e83c1, which was trying to fix a
similar bug in the skip matching logic of update_old_clients. The
problem with that patch is that pdir will never match dirname,
because dirname is "&lt;pname&gt;/clntXXX".

Signed-off-by: Weston Andros Adamson &lt;dros@netapp.com&gt;
Signed-off-by: Steve Dickson &lt;steved@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gssd: always reply to rpc-pipe requests from kernel.</title>
<updated>2013-11-20T20:04:47+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2013-11-20T17:43:29+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/simo/public_git/nfs-utils.git/commit/?id=f4b43e2ff3db1b12a60c9b5087ac2bcf6ba4bee5'/>
<id>f4b43e2ff3db1b12a60c9b5087ac2bcf6ba4bee5</id>
<content type='text'>
Sometimes gssd will open a new rpc-pipe but never read requests from it
or reply to them.  This causes the kernel to wait forever for a reply.

In particular, if a filesystem is mounted by IP, and the IP has no
hostname recorded in /etc/hosts or DNS, then gssd will not listen to
requests and the mount will hang indefinitely.

The comment in process_clnt_dir() for the "fail_keep_client:" branch
suggests that it is for the case where we couldn't open some
subdirectories.  However it is currently also taken if reverse DNS
lookup fails (as well as some other lookup failures).  Those failures
should not be treated the same as failure-to-open directories.

So this patch causes a failure from read_service_info() to *not* be
reported by process_clnt_dir_files.  This ensures that
insert_clnt_poll() will be called and requests will be handled.

In handle_gssd_upcall, the current error path (taken when the mech is
not "krb5") does not reply to the upcall.  This is wrong.  A reply is
always appropriate.  The only replies which aren't treated as
transient errors are EACCES and EKEYEXPIRED, so we return the former.

If read_service_info() fails then -&gt;servicename will be NULL which will
cause process_krb5_upcall() (quite reasonably) to become confused.  So
in that case we don't even try to process the up-call but just reply
with EACCES.

As clp-&gt;servicename==NULL is no longer treated as fatal, it is not
appropriate to use it to test if read_service_info() has been already
called on a client.  Instead, test clp-&gt;prog.

Finally, the error path of read_service_info() will close 'fd' if it
isn't -1, so when we close it, we should set fd to -1.

Acked-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Steve Dickson &lt;steved@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Sometimes gssd will open a new rpc-pipe but never read requests from it
or reply to them.  This causes the kernel to wait forever for a reply.

In particular, if a filesystem is mounted by IP, and the IP has no
hostname recorded in /etc/hosts or DNS, then gssd will not listen to
requests and the mount will hang indefinitely.

The comment in process_clnt_dir() for the "fail_keep_client:" branch
suggests that it is for the case where we couldn't open some
subdirectories.  However it is currently also taken if reverse DNS
lookup fails (as well as some other lookup failures).  Those failures
should not be treated the same as failure-to-open directories.

So this patch causes a failure from read_service_info() to *not* be
reported by process_clnt_dir_files.  This ensures that
insert_clnt_poll() will be called and requests will be handled.

In handle_gssd_upcall, the current error path (taken when the mech is
not "krb5") does not reply to the upcall.  This is wrong.  A reply is
always appropriate.  The only replies which aren't treated as
transient errors are EACCES and EKEYEXPIRED, so we return the former.

If read_service_info() fails then -&gt;servicename will be NULL which will
cause process_krb5_upcall() (quite reasonably) to become confused.  So
in that case we don't even try to process the up-call but just reply
with EACCES.

As clp-&gt;servicename==NULL is no longer treated as fatal, it is not
appropriate to use it to test if read_service_info() has been already
called on a client.  Instead, test clp-&gt;prog.

Finally, the error path of read_service_info() will close 'fd' if it
isn't -1, so when we close it, we should set fd to -1.

Acked-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Steve Dickson &lt;steved@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gssd: validate cred in gssd_acquire_user_cred</title>
<updated>2013-10-28T12:26:37+00:00</updated>
<author>
<name>Weston Andros Adamson</name>
<email>dros@netapp.com</email>
</author>
<published>2013-10-28T12:26:37+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/simo/public_git/nfs-utils.git/commit/?id=74de1431adebb7780a2c4b9c122050e2fb7608b8'/>
<id>74de1431adebb7780a2c4b9c122050e2fb7608b8</id>
<content type='text'>
Call gss_inquire_cred after gssd_acquire_krb5_cred to check for expired
credentials.

This fixes a recent regression (since 302de786930a2c533068f9d8909a)
that causes the user's ticket cache to grow unbounded with expired
service tickets when the user's credentials expire.

To reproduce this issue:

 - mount kerberos nfs export
 - kinit for a short lifetime (e.g. "kinit -l 1m")
 - run a job that opens a file and writes for more than the lifetime
 - run klist a few times after expiry and see the list grow, e.g.:

Ticket cache: DIR::/run/user/1749600001/krb5cc/tktYmpGlX
Default principal: dros@APIKIA.FAKE

Valid starting       Expires              Service principal
10/21/2013 15:39:38  10/21/2013 15:40:35  krbtgt/APIKIA.FAKE@APIKIA.FAKE
10/21/2013 15:39:40  10/21/2013 15:40:35  nfs/zero.apikia.fake@APIKIA.FAKE

Signed-off-by: Weston Andros Adamson &lt;dros@netapp.com&gt;
Reviewed-by: Simo Sorce &lt;simo@redhat.com&gt;
Signed-off-by: Steve Dickson &lt;steved@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Call gss_inquire_cred after gssd_acquire_krb5_cred to check for expired
credentials.

This fixes a recent regression (since 302de786930a2c533068f9d8909a)
that causes the user's ticket cache to grow unbounded with expired
service tickets when the user's credentials expire.

To reproduce this issue:

 - mount kerberos nfs export
 - kinit for a short lifetime (e.g. "kinit -l 1m")
 - run a job that opens a file and writes for more than the lifetime
 - run klist a few times after expiry and see the list grow, e.g.:

Ticket cache: DIR::/run/user/1749600001/krb5cc/tktYmpGlX
Default principal: dros@APIKIA.FAKE

Valid starting       Expires              Service principal
10/21/2013 15:39:38  10/21/2013 15:40:35  krbtgt/APIKIA.FAKE@APIKIA.FAKE
10/21/2013 15:39:40  10/21/2013 15:40:35  nfs/zero.apikia.fake@APIKIA.FAKE

Signed-off-by: Weston Andros Adamson &lt;dros@netapp.com&gt;
Reviewed-by: Simo Sorce &lt;simo@redhat.com&gt;
Signed-off-by: Steve Dickson &lt;steved@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gssd: do a more thorough change of identity after forking</title>
<updated>2013-10-21T17:28:06+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@redhat.com</email>
</author>
<published>2013-10-21T17:28:06+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/simo/public_git/nfs-utils.git/commit/?id=6b53fc9ce38ba6fff2fd5c2f6ed143747067a39d'/>
<id>6b53fc9ce38ba6fff2fd5c2f6ed143747067a39d</id>
<content type='text'>
The part of process_krb5_upcall that handles non-machine user creds
first tries to query GSSAPI for credentials. If that fails, it then
falls back to trawling through likely credcache locations to find them
and then points $KRB5CCNAME at it before proceeding. There are a number
of bugs in this code that this patch attempts to address.

The code that queries GSSAPI for credentials does it as root which
almost universally fails to do anything useful unless we happen to be
looking for non-machine root creds. Because of this, gssd almost always
falls back to having to search for credcaches "manually". The code that
handles credential switching is in create_auth_rpc_client, so it's too
late to be of any use here.

Worse yet, for historical reasons the MIT krb5 authors used %{uid} in
the default credcache locations which translates to the real uid. Thus
switching the fsuid or even euid is insufficient. You must switch the
real uid in order to be able to find the proper credcache in most cases.

This patch moves the credential switching to occur much earlier in the
process and has it do a much more thorough job of it. It first drops all
supplementary groups, then determines a gid to use and switches the gids
and uids to the correct ones. If it can't determine the correct gid to
use, it then tries to look up the one for "nobody" and uses that.

Once the credentials are switched, the forked child now no longer tries
to change them back. It does the downcall with the new credentials and
just exits when it's done.

Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: Steve Dickson &lt;steved@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The part of process_krb5_upcall that handles non-machine user creds
first tries to query GSSAPI for credentials. If that fails, it then
falls back to trawling through likely credcache locations to find them
and then points $KRB5CCNAME at it before proceeding. There are a number
of bugs in this code that this patch attempts to address.

The code that queries GSSAPI for credentials does it as root which
almost universally fails to do anything useful unless we happen to be
looking for non-machine root creds. Because of this, gssd almost always
falls back to having to search for credcaches "manually". The code that
handles credential switching is in create_auth_rpc_client, so it's too
late to be of any use here.

Worse yet, for historical reasons the MIT krb5 authors used %{uid} in
the default credcache locations which translates to the real uid. Thus
switching the fsuid or even euid is insufficient. You must switch the
real uid in order to be able to find the proper credcache in most cases.

This patch moves the credential switching to occur much earlier in the
process and has it do a much more thorough job of it. It first drops all
supplementary groups, then determines a gid to use and switches the gids
and uids to the correct ones. If it can't determine the correct gid to
use, it then tries to look up the one for "nobody" and uses that.

Once the credentials are switched, the forked child now no longer tries
to change them back. It does the downcall with the new credentials and
just exits when it's done.

Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: Steve Dickson &lt;steved@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gssd: have process_krb5_upcall fork before handling upcall</title>
<updated>2013-10-21T17:27:22+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@redhat.com</email>
</author>
<published>2013-10-21T17:27:22+00:00</published>
<link rel='alternate' type='text/html' href='https://fedorapeople.org/cgit/simo/public_git/nfs-utils.git/commit/?id=f9cac65972da588d5218236de60a7be11247a8aa'/>
<id>f9cac65972da588d5218236de60a7be11247a8aa</id>
<content type='text'>
Most krb5 installations use credcache locations that contain %{uid},
which expands to the real UID of the current process. In order for
GSSAPI to find those properly, we need to be able to switch the real UID
of the process to the designated one. That, however, opens the door to
allowing gssd to be killed or reniced during the window where we've
switched credentials.

To combat this, change gssd to fork before trying to handle each upcall.
The child will do the work to establish the context and the parent task
will just wait for it to exit. It's still possible for the child to be
killed or reniced, but that would only affect a single upcall instead of
the entire daemon. Also, if the process is killed prematurely, then log
an error to tip off the admin that there was a problem.

Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: Steve Dickson &lt;steved@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Most krb5 installations use credcache locations that contain %{uid},
which expands to the real UID of the current process. In order for
GSSAPI to find those properly, we need to be able to switch the real UID
of the process to the designated one. That, however, opens the door to
allowing gssd to be killed or reniced during the window where we've
switched credentials.

To combat this, change gssd to fork before trying to handle each upcall.
The child will do the work to establish the context and the parent task
will just wait for it to exit. It's still possible for the child to be
killed or reniced, but that would only affect a single upcall instead of
the entire daemon. Also, if the process is killed prematurely, then log
an error to tip off the admin that there was a problem.

Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: Steve Dickson &lt;steved@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
