From 5490fa3ea4f6a4118a5188acb0e05daa302ed9d6 Mon Sep 17 00:00:00 2001
From: Pavel Březina
Date: Mon, 28 Jul 2014 11:48:31 +0200
Subject: failover: set port status to not working if previous srv lookup
 failed

The meta server status consists of two parts:
A) port status - managed by the failover mechanism
B) SRV lookup status - managed by the SRV resolver

Both parts are reset to "neutral" after some time, with the B timeout
greater than the A timeout.

We were hitting the following issue:
1. SRV lookup fails (DNS is not reachable). This sets A to
   "not working" and B to "resolve error". Then the next server is
   tried but fails as well.
2. If SSSD tries to go back online, the failover sets A to "neutral"
   and tries to resolve SRV again. But B is still set to
   "resolve error" since we haven't reached its timeout yet, so SRV
   resolution fails immediately. The next server is not tried, because
   the port status (A) remains "neutral".

This patch sets the port status to "not working" in that case, so that
the failover continues with the next server as expected.

https://fedorahosted.org/sssd/ticket/2390

Reviewed-by: Pavel Reichl
Reviewed-by: Simo Sorce
---
 src/providers/fail_over.c | 5 +++++
 1 file changed, 5 insertions(+)

(limited to 'src/providers/fail_over.c')

diff --git a/src/providers/fail_over.c b/src/providers/fail_over.c
index c45fe9c49..466e3f4de 100644
--- a/src/providers/fail_over.c
+++ b/src/providers/fail_over.c
@@ -1244,6 +1244,11 @@ resolve_srv_send(TALLOC_CTX *mem_ctx, struct tevent_context *ev,
     case SRV_RESOLVE_ERROR: /* query could not be resolved but don't retry yet */
         ret = EIO;
         state->out = server;
+
+        /* The port status was reset to neutral but we still haven't reached
+         * timeout to try to resolve SRV record again. We will set the port
+         * status back to not working. */
+        fo_set_port_status(state->meta, PORT_NOT_WORKING);
         goto done;
     case SRV_RESOLVED: /* The query is resolved and valid. Return. */
         state->out = server;
-- 
cgit
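
The interaction between the two timeouts described in the commit message can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not SSSD code: the type and function names (`meta_model`, `record_srv_failure`, `retry_moves_on`), the timeout values, and the use of plain seconds for time are all hypothetical and chosen for illustration. The only property taken from the commit message is that the SRV status (B) expires later than the port status (A).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two-part meta server status; not the SSSD API. */
enum model_port { MODEL_PORT_NEUTRAL, MODEL_PORT_NOT_WORKING };
enum model_srv  { MODEL_SRV_NEUTRAL, MODEL_SRV_RESOLVE_ERROR };

/* Assumed timeouts in seconds; the only requirement is B > A. */
enum { MODEL_PORT_TIMEOUT = 30, MODEL_SRV_TIMEOUT = 120 };

struct meta_model {
    enum model_port port;   /* part A, owned by the failover code */
    long port_changed;      /* time of the last port status change */
    enum model_srv srv;     /* part B, owned by the SRV resolver */
    long srv_changed;       /* time of the last SRV status change */
};

/* Step 1 of the scenario: an SRV lookup fails and both parts are marked. */
static void record_srv_failure(struct meta_model *m, long now)
{
    m->port = MODEL_PORT_NOT_WORKING;
    m->port_changed = now;
    m->srv = MODEL_SRV_RESOLVE_ERROR;
    m->srv_changed = now;
}

/* Reset each part to neutral once its own timeout has elapsed. */
static void expire_statuses(struct meta_model *m, long now)
{
    if (m->port != MODEL_PORT_NEUTRAL &&
        now - m->port_changed >= MODEL_PORT_TIMEOUT) {
        m->port = MODEL_PORT_NEUTRAL;
    }
    if (m->srv != MODEL_SRV_NEUTRAL &&
        now - m->srv_changed >= MODEL_SRV_TIMEOUT) {
        m->srv = MODEL_SRV_NEUTRAL;
    }
}

/* Step 2 of the scenario: a retry at time `now`. Returns true when the
 * failover would skip this meta server and move on to the next one.
 * `with_fix` toggles the behaviour the patch adds. */
static bool retry_moves_on(struct meta_model *m, long now, bool with_fix)
{
    expire_statuses(m, now);

    if (m->port == MODEL_PORT_NOT_WORKING) {
        return true;                    /* already marked, skip it */
    }
    if (m->srv == MODEL_SRV_RESOLVE_ERROR) {
        /* Resolution fails immediately because B has not expired yet. */
        if (with_fix) {
            m->port = MODEL_PORT_NOT_WORKING;
            m->port_changed = now;
        }
        return m->port == MODEL_PORT_NOT_WORKING;
    }
    return false;                       /* a fresh SRV lookup would run */
}
```

With a failure recorded at time 0 and a retry at time 60, A has already expired while B has not. Without the fix the model stays on the meta server; with the fix it marks the port as not working and moves on, which is the behaviour the patch aims for.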