| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The tunable variables defined in CTDB configuration file are currently
set up from init script as well as part of "setup" event in 00.ctdb
eventscript. Remove the duplication of this code and set tunable
variables only from setup event. During the "setup" event, it's possible
that ctdb tool commands can timeout if CTDB daemon is not ready. To guard
against such eventuality, wait till "ctdb ping" command succeeds before
executing any other ctdb tool commands.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 632c1b9c1cc2e242376358ce49fd2022b3f27aa2)
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 08dbd9c7958f9a0ee3de314d49523d32e4be135c)
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit bd4ff176387372b1c233373c0bc8ced523fc9670)
|
| |
| |
| |
| |
| |
| |
| |
| | |
Less copying and pasting is a good thing...
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 7d4b8cce96f33fff647a0c9d259c121dfc8403e9)
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This rebuilds all policy routes and can be used if the configuration
changes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit c185ffd2822fcee26d07398464c59b66c61f53fa)
|
| |
| |
| |
| |
| |
| |
| | |
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 9550c497e6d6ef5ee44826c4bd9ed5ad65174263)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Disable for TakeoverTimeout seconds.
Otherwise the the recovery daemon can get overzealous and start trying
to add/delete addresses that it thinks are missing but where the
eventscript just hasn't finished. This didn't used to matter so much
but it is more important now that concurrent takeip/releaseip/updateip
generate error - we want to avoid spamming the log.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 56fcee3c7730cb12fa666072d5400949af6e5f7c)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There's a race here where release and takeover events for an IP can
run at the same time. For example, a "ctdb deleteip" and a takeover
initiated by the recovery daemon. The timeline is as follows:
1. The release code registers a callback to update the VNN. The
callback is executed *after* the eventscripts run the releaseip
event.
2. The release code calls the eventscripts for the releaseip event,
removing IP from its interface.
The takeover code "updates" the VNN saying that IP is on some
iface.... even if/though the address is already there.
3. The release callback runs, removing the iface associated with IP in
the VNN.
The takeover code calls the eventscripts for the takeip event,
adding IP to an interface.
As a result, CTDB doesn't think it should be hosting IP but IP is on
an interface. The recovery daemon fixes this later... but it
shouldn't happen.
This patch can cause some additional noise in the logs:
Release of IP 10.0.2.133/24 on interface eth2 node:2
recoverd:We are still serving a public address '10.0.2.133' that we should not be serving. Removing it.
Release of IP 10.0.2.133/24 rejected update for this IP already in flight
recoverd:client/ctdb_client.c:2455 ctdb_control for release_ip failed
recoverd:Failed to release local ip address
In this case the node has started releasing an IP when the recovery
daemon notices the addresses is still hosted and initiates another
release. This noise is harmless but annoying.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit bfe16cf69bf2eee93c0d831f76d88bba0c2b96c2)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Stops the behaviour where unhealthy nodes can host IPs when there are
no healthy nodes. Set this to 1 when an immediate complete outage is
preferred when all nodes are unhealthy. The alternative
(i.e. default) can lead to undefined behaviour when the shared
filesystem is unavailable.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a555940fb5c914b7581667a05153256ad7d17774)
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit be4ad110ede9981b181ac28f31ffd855a879d5df)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The existing code makes one fatally bad assumption:
vnn->iface->references can never be -1 (or max-unit32_t in this case).
Right now the reference counting is broken so a reference count of -1
is possible and causes a spurious updateip when vnn->iface is the same
as best_face. This can occur frequently because we get a lot of
redundant takeovers, especially when each IP can only be hosted on one
interface.
This makes the code much more defensive by noting that when best_iface
is the same as vnn->iface there is never a need for an updateip event.
This effectively neuters the updateip code path when IPs can only be
hosted by a single interface.
This should obsolete 6a74515f0a1e24d97cee3ba05d89133aac7ad2b7.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 7054e4ded59c6b8f254dcfefaef64da05f25aecd)
|
| |
| |
| |
| |
| |
| |
| |
| | |
With an old ctdb_protocol.h installed under /usr/local, ctdb will
not compile because the <> form of include will find the header
under /usr/local
(This used to be ctdb commit c4f5a58471b206e2287c7958c7f29c1f1c0626ac)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
the first interface specified"
This reverts commit 4308935ba48ac7a29e7523315acf580019715f0f.
This fixes 16_ctdb_config_add_ip.sh test when run against local daemons. When
running against local daemons, if the interface is assigned as soon as an IP is
added, then takeover would never assign this IP address.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 06dfd13604d08910e07cbf927c338d7b9fce9a2f)
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Do some other hosuekeeping including stopping tevent.
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 212298279557a2833ef0f81809b4a5cdac72ca02)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If $CTDB_SERVICE_AUTOSTARTSTOP="yes" then service start/stop is done
in the background with logging.
Fix some unit tests for samba and winbind.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 3a3dae4cb5ec8b4b8381a4013adda25b87641f3a)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
winbind and samba can be separately managed. This makes the service
starting and stopping code way too complicated, and even adds a small
amount of complexity to the monitoring code. The sensible option is
to split this eventscript in two.
There are two potentially backward incompatible changes here:
* Functionality has been removed that allowed 50.samba to manage
winbind when CTDB_MANAGES_WINBIND was unset but the smb.conf
"security" parameter was set to "ADS" or "DOMAIN".
Maintaining this functionality would have required moving the
testparm-related code to the functions file, deciding where the
cache file should go, and then calling it from both 49.winbind and
50.samba. This feature wasn't of great value and asking
administrators to set an extra variable in exchange for code
simplicity seems like a reasonable deal.
* External code will need to be changed if it calls 50.samba directly
with winbind-related expectations. This is fairly obvious!
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 34535ae64420926b9a3bf7d453fed4e6f4c90115)
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Initialising a new ctdbd will destroy the Unix domain socket so
existing processes will be useless anyway.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 043ef77086797a703aec436a26a05c56a1bcbf2b)
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit dc2a8c638bd74b9f1dd75339cd2ae2f32ffa18a8)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
These controls include:
CTDB_CONTROL_GET_RECMODE
CTDB_CONTROL_GET_RECMASTER
CTDB_CONTROL_GET_PID
CTDB_CONTROL_GET_PNN
CTDB_CONTROL_PING
CTDB_CONTROL_GET_DB_PRIORITY
In these cases the data field is empty.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit b89e959904d7d1b0e5525abd7789f5101537a46a)
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 6bd4feff7039138d435428eeded51975c44e567c)
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 0f0aef21a1bb2d88a8c184ef70c718e0c91acdc3)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* Factor out repeated code into new function find_natgw()
* Support both machine and human readable output
* Use libctdb
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a56ec75edd1705b0539513d396d311f0e80a3bf5)
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
control_getcapabilities(), control_lvs(), control_lvsmaster() updated
to use ctdb_getcapabilities(), ctdb_getnodemap() as appropriate.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit c30ec02615183ecf9b412ad415bf1abd859aec45)
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 81af67c6959fdbe0566e3f1a00e2be58dd268dc6)
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Some code (e.g. NAT gateway code) modifies the returned result so was
modifying the original.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a3f15d2828325bbfba5bc5c0a30429e2ce572a44)
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 140fafef23050d40d66f5b5558c7efcb78f80cd2)
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This used to catch trailing blank lines. However, these are caught
just as effectively by the whitespace filtering in the loop below.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 7b75a3bb722dc86139b1a07a0100d08c34620b91)
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The first line is currently human readable and the rest is machine
readable. This doesn't make sense. Do one or the other...
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit b29d5bbaa7048291c4b3a39bf12e04f0436f67da)
|
| |
| |
| |
| |
| |
| |
| |
| | |
It is already in 2 places and we might use it in another.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 12a0a7a208d1c8fa8991894200d1dc133f3a2d1a)
|
| |
| |
| |
| |
| |
| |
| |
| | |
... not NATGW_NODES.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 2da7730dc06153173778ab14e228960e72ff8a86)
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 93c97c3ba3ff714dfa0d056a91ff45010a6e2d66)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This involves refactoring ip_route_check_table() into a new function
ip_check_table() which tables the operation type (i.e. rule/route) as
an argument.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit acdaa04079a9827885f32a7bc078d3365c89b474)
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This puts it under the umbrella of the previous warning that should
also have been printed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 5c3be8f26dcde0b1b3d86928953e74d4a8b35958)
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 6d41208074f0e9b56c585bca7eb39aaed653c4ca)
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit d0d0a6f19960f233224970b8d5d19b0e37222616)
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 0ce5b079f327aba55b62800ccb22d79976fac665)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
del_routing_for_ip() currently fails silently, which could hide real
errors.
In add_routing_for_ip() we don't want to see any error when calling
del_routing_for_ip(), since we don't expect the rule to be there.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 30d69defa7e97ab5e3ba0492a27868dde2616494)
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 49dd755fcd077c84eaf3d2fe5dd7757f5588d49c)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Not just stopped nodes. In reality, this means that banned nodes will
also yield, since nodes in the other inactive states won't be running
a daemon.
This seems sensible since if another node notices that an inactive
node is the recovery master then it will force an election anyway.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit fc18188b7b63eb0dafbc47e3abf80e306e1dfc31)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
An inactive node can't become the recovery master. So if an inactive
node notices that the recovery master is inactive, it shouldn't force
an election for recovery master and nominate itself as a candidate.
This can cause the recovery master to flip-flop between nodes when all
nodes are inactive.
If there is actually an active node then it will trigger the election.
This is fairly cosmetic but is a step along the way towards ironing
out weirdness when all nodes are stopped.
Also, fix a related comment.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit e7dc10da3ced54ea9d719ad167ee42dcca8dce75)
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Doing these checks is pointless and potentially causes unnecessary log
messages.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a0c30c820fd47d4f8620dc060c825be10754f5d1)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If CTDB starts in STOPPED state then it thinks it is in the middle of
a recovery. rec->ifaces is also NULL and an early exit further down
(that checks to see if a recovery is in process) means that it stays
that way.
However, each time this function is entered the need for a takeover
run is re-flagged. The takeover run never happens due to the the
early exit, causing a couple of unneeded messages to be logged each
time.
This is avoided by moving the code that sets rec->ifaces so that it is
executed earlier and, in this case, in the middle of a recovery.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit f586e8a2911fc6e7f6698f516653145d8fd45dad)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This message used to be correct because the ipreallocated event only
handled updating the NAT gateway. However, that has changed so the
message needs to be updated.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit cc9d96f4248e45ea99c5f00db1526426ac26fbc2)
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 9119a568c2b4601318f7751f537dca2f92a7230b)
|
| |
| |
| |
| |
| |
| |
| |
| | |
Test the startup and monitor events too.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit c29a943f9bbcfecb861e71d007c7698a53dc8773)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Currently, if the configuration file is specified by
$CTDB_PER_IP_ROUTING_CONF but is missing, takeip fails but (the
absent) monitor event "succeeds", so the state of a node will
flip-flop.
Instead of this, if the configuration file is missing then fail early
on for all events.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit c64c6c77c3f6aa2898e5a575547b587bea868c76)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
missing"
When the configuration file is missing this causes the node to
flip-flop betwen unhealthy (when takeip fails) and healthy (no monitor
event here).
Will reimplement this properly.
This reverts commit 351ca413eec460330571ca8b01ad269728fe15df.
(This used to be ctdb commit 5277d749c9111716fd723647d5421907476422bf)
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 076282622fcb2663d378e0c90ed0d9c19f73c005)
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit fa0f3cba5adaa38bed37dd8b121ad53e962a010d)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
A list of files is given rather than a command. These files are
pushed to the specified nodes.
Quoting is fragile/broken so filenames with spaces won't work - you
win some, you lose some. :-)
All of the other onnode options should work together with this option.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit aed9b98ddbbf3e81de4f7257a10676565f7d7507)
|