path: root/src/providers/fail_over.c
Commit message | Author | Age | Files | Lines
* Fix debug messages - trailing '.' | Pavel Reichl | 2014-09-29 | 1 | -1/+1
  Fix debug messages where '\n' was wrongly followed by '.'.
  Reviewed-by: Lukáš Slebodník <lslebodn@redhat.com>
* failover: set port status to not working if previous srv lookup failed | Pavel Březina | 2014-07-31 | 1 | -0/+5
  The meta server status consists of two parts:
    A) port status - managed by the failover mechanism
    B) SRV lookup status - managed by the SRV resolver
  Both parts are reset to "neutral" after some time, with the B timeout
  being greater than the A timeout.
  We were hitting the following issue:
    1. SRV lookup fails (DNS is not reachable); this sets A to "not working"
       and B to "resolve error". Then the next server is tried but fails as
       well.
    2. If SSSD tries to go back online, the failover sets A to "neutral" and
       tries to resolve SRV again. But B is still set to "resolve error"
       since the timeout has not been reached yet, so SRV resolution fails
       immediately. The next server is not tried because the port status (A)
       remains "neutral".
  This patch sets the port status to "not working", so that the failover
  continues with the next server as expected.
  https://fedorahosted.org/sssd/ticket/2390
  Reviewed-by: Pavel Reichl <preichl@redhat.com>
  Reviewed-by: Simo Sorce <simo@redhat.com>
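The two-part status described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from fail_over.c: the enum, struct and function names are hypothetical, and only the rule "a pending SRV resolve error forces the port to not working" is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; not the identifiers used in fail_over.c. */
enum port_status { PORT_NEUTRAL, PORT_WORKING, PORT_NOT_WORKING };
enum srv_status  { SRV_NEUTRAL, SRV_RESOLVED, SRV_RESOLVE_ERROR };

struct meta_server {
    enum port_status port;  /* part A: managed by the failover code */
    enum srv_status  srv;   /* part B: managed by the SRV resolver  */
};

/*
 * Decide whether the failover should skip this meta server.
 * If the previous SRV lookup failed and that status has not been reset
 * yet, mark the port as not working so the caller moves on to the next
 * server instead of stalling on a "neutral" port.
 */
bool skip_meta_server(struct meta_server *meta)
{
    assert(meta != NULL);

    if (meta->srv == SRV_RESOLVE_ERROR) {
        meta->port = PORT_NOT_WORKING;
        return true;
    }
    return meta->port == PORT_NOT_WORKING;
}
```

With A at "neutral" and B still at "resolve error", the helper flips A to "not working" and reports that the next server should be tried.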
* failover: Shorter retry time for failed SRV | Pavel Reichl | 2014-04-14 | 1 | -2/+12
  Until now there was only one timeout used to re-resolve SRV queries. This
  patch adds a new (shorter) timeout that is used for queries that
  previously failed.
  Resolves: https://fedorahosted.org/sssd/ticket/1885
  Reviewed-by: Jakub Hrozek <jhrozek@redhat.com>
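A rough sketch of the two-timeout idea, assuming a resolver that remembers when it last updated the SRV status. All names here (srv_timeouts, srv_needs_refresh, the field names) are made up for illustration and are not taken from the SSSD sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical names, for illustration only. */
enum srv_state { SRV_NEUTRAL, SRV_RESOLVED, SRV_RESOLVE_ERROR };

struct srv_timeouts {
    time_t ttl;            /* re-resolve interval after a successful lookup */
    time_t retry_timeout;  /* shorter interval used after a failed lookup   */
};

/* Return true when the SRV query should be sent again. */
bool srv_needs_refresh(enum srv_state state, time_t last_update, time_t now,
                       const struct srv_timeouts *t)
{
    assert(t != NULL);

    switch (state) {
    case SRV_RESOLVE_ERROR:
        return now - last_update >= t->retry_timeout;
    case SRV_RESOLVED:
        return now - last_update >= t->ttl;
    case SRV_NEUTRAL:
    default:
        return true;  /* never resolved, so resolve now */
    }
}
```

A failed query becomes eligible for a retry after the short interval, while a successful one is kept until the longer interval expires.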
* Remove unused structures. | Lukas Slebodnik | 2014-02-26 | 1 | -5/+0
  Reported by: cppcheck
  'struct py_sss_transaction', 'struct resolve_get_domain_stat' and
  'struct sync_op_res' were defined in implementation modules, but they
  were not used anywhere.
  Reviewed-by: Michal Žídek <mzidek@redhat.com>
* Update DEBUG* invocations to use new levels | Nikolai Kondrashov | 2014-02-12 | 1 | -22/+33
  Use a script to update DEBUG* macro invocations, which use literal numbers
  for levels, to use bitmask macros instead:

    grep -rl --include '*.[hc]' DEBUG . |
        while read f; do
            mv "$f"{,.orig}
            perl -e 'use strict;
                     use File::Slurp;
                     my @map=qw"
                        SSSDBG_FATAL_FAILURE
                        SSSDBG_CRIT_FAILURE
                        SSSDBG_OP_FAILURE
                        SSSDBG_MINOR_FAILURE
                        SSSDBG_CONF_SETTINGS
                        SSSDBG_FUNC_DATA
                        SSSDBG_TRACE_FUNC
                        SSSDBG_TRACE_LIBS
                        SSSDBG_TRACE_INTERNAL
                        SSSDBG_TRACE_ALL
                     ";
                     my $text=read_file(\*STDIN);
                     my $repl;
                     $text=~s/
                              ^
                              (
                                .*
                                \b
                                (DEBUG|DEBUG_PAM_DATA|DEBUG_GR_MEM)
                                \s*
                                \(\s*
                              )(
                                 [0-9]
                              )(
                                \s*,
                              )
                              (
                                \s*
                              )
                              (
                                .*
                              )
                              $
                             /
                              $repl = $1.$map[$3].$4.$5.$6,
                              length($repl) <= 80
                                ? $repl
                                : $1.$map[$3].$4."\n".(" " x length($1)).$6
                             /xmge;
                     print $text;
                    ' < "$f.orig" > "$f"
            rm "$f.orig"
        done

  Reviewed-by: Jakub Hrozek <jhrozek@redhat.com>
  Reviewed-by: Stephen Gallagher <sgallagh@redhat.com>
  Reviewed-by: Simo Sorce <simo@redhat.com>
* Make DEBUG macro invocations variadic | Nikolai Kondrashov | 2014-02-12 | 1 | -60/+60
  Use a script to update DEBUG macro invocations to use it as a variadic
  macro, supplying format string and its arguments directly, instead of
  wrapping them in parens.
  This script was used to update the code:

    grep -rwl --include '*.[hc]' DEBUG . |
        while read f; do
            mv "$f"{,.orig}
            perl -e \
                'use strict;
                 use File::Slurp;
                 my $text=read_file(\*STDIN);
                 $text=~s#(\bDEBUG\s*\([^(]+)\((.*?)\)\s*\)\s*;#$1$2);#gs;
                 print $text;' < "$f.orig" > "$f"
            rm "$f.orig"
        done

  Reviewed-by: Jakub Hrozek <jhrozek@redhat.com>
  Reviewed-by: Stephen Gallagher <sgallagh@redhat.com>
  Reviewed-by: Simo Sorce <simo@redhat.com>
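To show what the variadic form buys the caller, here is a small self-contained sketch. It is not SSSD's real DEBUG macro: debug_fn() is a hypothetical helper, the numeric value given to SSSDBG_OP_FAILURE is assumed for the example, and unlike the real macro this one is an expression that returns the character count so the result can be checked.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Illustrative stand-in; the value is an assumption, not SSSD's header. */
#define SSSDBG_OP_FAILURE 0x0040

/* Hypothetical backend: print to stderr, return characters written. */
int debug_fn(int level, const char *fmt, ...)
{
    va_list ap;
    int ret;

    assert(fmt != NULL);
    (void)level;  /* a real logger would filter on the level here */

    va_start(ap, fmt);
    ret = vfprintf(stderr, fmt, ap);
    va_end(ap);
    return ret;
}

/*
 * Old style:  DEBUG(1, ("Server %s failed\n", name));
 * New style:  DEBUG(SSSDBG_OP_FAILURE, "Server %s failed\n", name);
 * The format string and its arguments are passed straight through.
 */
#define DEBUG(level, ...) debug_fn((level), __VA_ARGS__)
```

Passing the format string directly also lets the compiler's printf-format checking see the arguments, if the backend function is annotated for it.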
* Fix formating of variables with type: time_t | Lukas Slebodnik | 2013-09-11 | 1 | -2/+3
* Always set port status to neutral when resetting service. | Michal Zidek | 2013-07-11 | 1 | -1/+2
  We did not set port status for metaservers (srv servers) in
  fo_reset_services().
  Fixes: https://fedorahosted.org/sssd/ticket/1933
* failover: if expanded server is marked as neutral, invoke srv collapse | Pavel Březina | 2013-06-21 | 1 | -0/+7
  https://fedorahosted.org/sssd/ticket/1947
  Otherwise we will do the SRV expansion once again:
    1. leaving the old servers in server list
    2. meta server is not inserted back in the list, the newly found servers
       are inserted behind meta server, meta server is orphaned and the new
       servers are forgotten
* collapse_srv_lookup may free the server, make it clear from the API | Pavel Březina | 2013-06-21 | 1 | -6/+9
  https://fedorahosted.org/sssd/ticket/1947
* failover: return error when SRV lookup returned only duplicates | Pavel Březina | 2013-06-21 | 1 | -2/+21
  https://fedorahosted.org/sssd/ticket/1947
  Otherwise we risk that the meta server is removed from the server list
  without a chance to return, because there may be no fo_server with
  srv_data = meta.
  Also, if state->meta->next is NULL (it is still orphaned because we
  erroneously try to expand it without invoking collapse first),
  state->out will be NULL and SSSD will crash.
  New error code: ERR_SRV_DUPLICATES
* failover: do not return invalid pointer when server is already present | Pavel Březina | 2013-06-21 | 1 | -2/+6
  https://fedorahosted.org/sssd/ticket/1947
* FO: Check the return value of send_fn | Jakub Hrozek | 2013-06-21 | 1 | -0/+4
* failover: set state->out when meta server remains in SRV_RESOLVE_ERROR | Pavel Březina | 2013-06-14 | 1 | -0/+1
  https://fedorahosted.org/sssd/ticket/1886
* Use deep copy for dns_domain and discovery_domain | Lukas Slebodnik | 2013-06-03 | 1 | -2/+4
  https://fedorahosted.org/sssd/ticket/1929
* FO: Fix setting status of duplicates | Jakub Hrozek | 2013-05-28 | 1 | -9/+18
* DNS sites support - replace SRV lookup code with a plugin call | Pavel Březina | 2013-04-10 | 1 | -258/+73
  https://fedorahosted.org/sssd/ticket/1032
  Replaces the hard-coded SRV lookup code with a plugin call. This patch
  breaks SRV lookups because no plugin is in use yet. This is fixed in the
  next patch.
* fail over - add function to insert multiple servers to the list | Pavel Březina | 2013-04-10 | 1 | -10/+101
* DNS sites support - SRV lookup plugin interface | Pavel Březina | 2013-04-10 | 1 | -0/+26
  https://fedorahosted.org/sssd/ticket/1032
  Introduces two new error codes:
    - ERR_SRV_NOT_FOUND
    - ERR_SRV_LOOKUP_ERROR
  Since id_provider is authoritative for the choice of SRV plugin, the
  ability to override the selected plugin at runtime is not desirable. We
  rely on the fact that id_provider is initialized before all other
  providers, thus the plugin is set correctly.
* try primary server after retry_timeout + 1 seconds when switching to backup | Pavel Březina | 2012-12-18 | 1 | -0/+9
  https://fedorahosted.org/sssd/ticket/1679
  The problem is that when we are about to reset the server status, we have
  not yet passed the timeout (30 seconds), because the "switch to primary
  server" task is scheduled 30 seconds after falling back to a backup
  server. Thus the server status remains "not working" and is reset only
  after another 30 seconds.
  We need to make sure that the server is tried after the timeout period.
  retry_timeout is currently hardcoded to 30, thus the change in the man
  page.
* Bad debug message when no dns_discovery_domain specified. | Michal Zidek | 2012-09-24 | 1 | -3/+11
  https://fedorahosted.org/sssd/ticket/920
* FO: Check server validity before setting status | Jakub Hrozek | 2012-09-13 | 1 | -5/+8
  The list of resolved servers is allocated on the back end context and
  kept in the fo_service structure. However, a single request often
  resolves a server and keeps a pointer until the end of the request, and
  only then gives feedback about the server based on the request result.
  This presents a big race condition when SRV resolution is used. When
  requests come in in parallel, it is possible that an incoming request
  invalidates a server before another request that holds a pointer to the
  original server is able to give feedback.
  This patch simply checks if a server is in the list of servers maintained
  by a service before reading its status.
  https://fedorahosted.org/sssd/ticket/1364
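The membership check described above might look roughly like this. The struct layouts and function names are simplified assumptions made for the sketch; the real fo_service and fo_server structures carry more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the failover structures. */
enum server_status { SERVER_NAME_NOT_RESOLVED, SERVER_WORKING, SERVER_NOT_WORKING };

struct server {
    struct server *next;
    enum server_status status;
};

struct service {
    struct server *server_list;
};

/* Walk the service's list and report whether `candidate` is still in it. */
bool service_has_server(const struct service *svc, const struct server *candidate)
{
    const struct server *s;

    assert(svc != NULL);
    for (s = svc->server_list; s != NULL; s = s->next) {
        if (s == candidate) {
            return true;
        }
    }
    return false;
}

/* Apply feedback only after membership in the list has been confirmed;
 * the server is not dereferenced before that check. */
bool set_server_status(struct service *svc, struct server *srv,
                       enum server_status status)
{
    if (!service_has_server(svc, srv)) {
        return false;
    }
    srv->status = status;
    return true;
}
```

A request that still holds a pointer to a server dropped by a later SRV expansion gets `false` back instead of touching the stale entry.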
* FO: Return EAGAIN if there are more servers to try | Jakub Hrozek | 2012-08-15 | 1 | -0/+9
  The caller should issue another request, which would just shortcut with
  ENOENT.
* FO: Don't retry the same server if it's not working | Jakub Hrozek | 2012-08-15 | 1 | -2/+3
* Duplicate detection in fail over did not work. | Michal Zidek | 2012-08-15 | 1 | -3/+27
  https://fedorahosted.org/sssd/ticket/1472
* Don't use server after SRV data collapsed | Jakub Hrozek | 2012-08-09 | 1 | -5/+8
* Always mark SRV servers as primary | Jakub Hrozek | 2012-08-07 | 1 | -0/+1
  https://fedorahosted.org/sssd/ticket/1459
* Failover: Return last tried server if it's still being tried | Jakub Hrozek | 2012-08-07 | 1 | -2/+6
  In the failover, we treat both KDC and LDAP on the IPA server as a single
  "port", numbered 0. This was done in order to make sure that the SSSD
  always talks to the same server for both LDAP and Kerberos.
  However, this clever hack breaks when the IPA provider needs to establish
  a GSSAPI encrypted LDAP connection, because we're asking the fail over
  code to yield a server while no server has yet been marked as tried. This
  triggers a fail over for the KDC, so in effect, the TGT is received from
  the second server. If the second server is not available for some reason,
  the whole provider goes offline.
  The fail over needs to detect that the server asked for is still being
  resolved and return the same pointer.
* Don't call fo_set_{server,port}_status for SRV servers | Jakub Hrozek | 2012-08-03 | 1 | -2/+3
  This bug was producing harmless, but annoying error messages.
* Primary server support: basic support in failover code | Jan Zeleny | 2012-08-01 | 1 | -15/+60
  Now there are two lists of servers for each service. If the currently
  selected server is only a backup, an event is scheduled that tries to
  connect to one of the primary servers; if it succeeds, SSSD starts using
  that server instead of the one it is currently connected to.
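The primary/backup preference can be sketched as below. For brevity the sketch keeps a single array with a `primary` flag rather than the two separate lists the commit describes, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified server record. */
struct server {
    const char *name;
    bool primary;   /* false means the server is only a backup */
    bool working;
};

/* Prefer the first working primary; otherwise the first working backup. */
const struct server *pick_server(const struct server *servers, size_t count)
{
    const struct server *backup = NULL;
    size_t i;

    assert(servers != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (!servers[i].working) {
            continue;
        }
        if (servers[i].primary) {
            return &servers[i];
        }
        if (backup == NULL) {
            backup = &servers[i];
        }
    }
    return backup;
}

/* A reconnection attempt to a primary server is only worth scheduling
 * while a backup server is in use. */
bool should_schedule_primary_retry(const struct server *current)
{
    return current != NULL && !current->primary;
}
```

When the scheduled retry finds a working primary, the caller would switch to it and stop rescheduling.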
* Move some debug lines to new debug log levels | Stef Walter | 2012-06-20 | 1 | -6/+6
  * These are common lines of debug output when starting up sssd
  https://bugzilla.redhat.com/show_bug.cgi?id=811113
* Return correct resolv_status on resolver timeout | Jakub Hrozek | 2012-03-29 | 1 | -11/+11
  https://fedorahosted.org/sssd/ticket/1274
* Only do one cycle when resolving a server | Jakub Hrozek | 2012-03-06 | 1 | -0/+7
  https://fedorahosted.org/sssd/ticket/1214
* Failover: Introduce a per-service timeout | Jakub Hrozek | 2011-12-20 | 1 | -0/+46
  https://fedorahosted.org/sssd/ticket/976
* Do not touch resolve_service_state in fo_resolve_service_done | Jakub Hrozek | 2011-12-20 | 1 | -14/+11
* Multiline macro cleanup | Jakub Hrozek | 2011-09-28 | 1 | -1/+2
  This is mostly a cosmetic patch.
  The purpose of wrapping a multi-line macro in a do { } while(0) is to make
  the macro usable as a regular statement, not a compound statement. When
  the while(0) is terminated with a semicolon, the do { } while(0); block
  becomes a compound statement again.
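A short example of why the trailing semicolon matters. SWAP_INT and order_pair are made-up names for this illustration and do not come from the SSSD sources.

```c
#include <assert.h>
#include <stddef.h>

/* No semicolon after while (0): the caller supplies it, so the macro
 * behaves like a single ordinary statement. */
#define SWAP_INT(a, b) do { \
    int tmp_ = (a);         \
    (a) = (b);              \
    (b) = tmp_;             \
} while (0)

/*
 * Put two ints in ascending order. The macro sits between `if` and
 * `else`; if the macro definition ended in `while (0);`, the caller's
 * own semicolon would add an empty statement that terminates the `if`,
 * and the `else` would no longer compile.
 */
void order_pair(int *lo, int *hi)
{
    assert(lo != NULL && hi != NULL);

    if (*lo > *hi)
        SWAP_INT(*lo, *hi);
    else
        (void)0;  /* already ordered */
}
```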
* fo_get_server_name() getter for a server name | Jakub Hrozek | 2011-07-21 | 1 | -0/+9
  Allows tests to be more concise and resolve callbacks to be more
  defensive.
* Rename fo_get_server_name to fo_get_server_str_name | Jakub Hrozek | 2011-07-21 | 1 | -1/+1
* Switch resolver to using resolv_hostent and honor TTL | Jakub Hrozek | 2011-06-15 | 1 | -18/+18
* Fix minor typo in error message | Stephen Gallagher | 2011-05-02 | 1 | -1/+1
  https://fedorahosted.org/sssd/ticket/825
* Set same status for duplicate servers | Jakub Hrozek | 2011-04-15 | 1 | -0/+21
* Remove detection of duplicates from SRV result processing | Jakub Hrozek | 2011-04-11 | 1 | -9/+0
* Do not attempt to resolve nameless servers | Jakub Hrozek | 2011-04-01 | 1 | -1/+1
  The failover code is not strictly in charge of resolving. Its main
  function is to provide a server to connect to for a service. It is legal,
  although not currently used, to have a server that has no name
  (server->common == NULL). In this case, no resolving should be done, and
  it is assumed that the failover users, which are the SSSD back ends in
  our case, would perform any resolving out of band, perhaps using the
  user_data attribute of the fo_server structure.
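A minimal sketch of the guard this commit describes. The struct is a cut-down, hypothetical version that keeps only the two members named in the commit message (common and user_data); the helper name and the extra check on the name field are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced versions of the failover structures. */
struct server_common_sketch {
    const char *name;
};

struct fo_server_sketch {
    struct server_common_sketch *common;  /* NULL for a nameless server */
    void *user_data;                      /* opaque data owned by the back end */
};

/* Only servers that carry a name are handed to the resolver. */
bool server_needs_resolving(const struct fo_server_sketch *server)
{
    assert(server != NULL);
    return server->common != NULL && server->common->name != NULL;
}
```

For a nameless server the failover code would hand the entry back as is and leave any lookup to the back end.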
* Run callbacks if server IP changes | Jakub Hrozek | 2011-03-09 | 1 | -0/+9
* Always expire host name resolution | Jakub Hrozek | 2011-03-08 | 1 | -8/+7
  The previous version of the patch only expired a resolved host name if
  the port was being reset. We want to always expire it so we notice IP
  address changes even if the previous server is still up.
* Prevent segfault in failover code | Jakub Hrozek | 2011-03-07 | 1 | -2/+3
* Reset server status after timeout | Jakub Hrozek | 2011-02-28 | 1 | -1/+11
  https://fedorahosted.org/sssd/ticket/809
* Rename dns_domain to discovery domain for fo_add_srv_server() | Stephen Gallagher | 2011-01-21 | 1 | -7/+11
* Allow fallback to SSSD domain | Stephen Gallagher | 2011-01-21 | 1 | -4/+44
  For backwards-compatibility with older versions of the SSSD (such as
  1.2.x), we need to be able to have our DNS SRV record lookup be capable
  of falling back to using the SSSD domain name as the DNS discovery domain.
  This patch modifies our DNS lookups so that they behave as follows:
  If dns_discovery_domain is specified, it is considered authoritative. No
  other discovery domains will be attempted.
  If dns_discovery_domain is not specified, we first attempt to look up the
  SRV records using the domain portion of the machine's hostname. If this
  returns "NOTFOUND", we will try performing an SRV record query using the
  SSSD domain name as the DNS discovery domain.
  https://fedorahosted.org/sssd/ticket/754
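The lookup order above can be sketched as follows. The callback type, the result enum, and demo_lookup() are all hypothetical; demo_lookup() stands in for a real DNS query so the example stays self-contained. Treating errors other than "NOTFOUND" as final is an assumption based on the commit message mentioning only that case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum srv_result { SRV_FOUND, SRV_NOTFOUND, SRV_ERROR };

/* Hypothetical lookup callback: queries SRV records in `domain`. */
typedef enum srv_result (*srv_lookup_fn)(const char *domain);

/* Return the discovery domain whose SRV lookup succeeded, or NULL. */
const char *discover_domain(const char *dns_discovery_domain,
                            const char *hostname_domain,
                            const char *sssd_domain,
                            srv_lookup_fn lookup)
{
    assert(lookup != NULL);

    if (dns_discovery_domain != NULL) {
        /* Authoritative: no other discovery domain is attempted. */
        return lookup(dns_discovery_domain) == SRV_FOUND
               ? dns_discovery_domain : NULL;
    }

    switch (lookup(hostname_domain)) {
    case SRV_FOUND:
        return hostname_domain;
    case SRV_NOTFOUND:
        /* Fall back to the SSSD domain name. */
        return lookup(sssd_domain) == SRV_FOUND ? sssd_domain : NULL;
    case SRV_ERROR:
    default:
        return NULL;  /* assumed: other errors do not trigger the fallback */
    }
}

/* Stub resolver for the example: only "example.com" has SRV records. */
enum srv_result demo_lookup(const char *domain)
{
    if (domain == NULL) {
        return SRV_NOTFOUND;
    }
    return strcmp(domain, "example.com") == 0 ? SRV_FOUND : SRV_NOTFOUND;
}
```

With no dns_discovery_domain set and a hostname domain that has no SRV records, the sketch returns the SSSD domain; with dns_discovery_domain set to a domain without records, it returns NULL and never tries the others.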
* Rename SRV_NOT_RESOLVED to SRV_RESOLVE_ERROR | Sumit Bose | 2011-01-05 | 1 | -5/+5