| Commit message (Collapse) | Author | Age | Files | Lines |
| |
We should never enter swap; if we do, show the memory state of the machine and the process list. This will help us diagnose what caused the condition before it's too late and the box starts OOM-killing processes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 627a6d67a0e9e61f8713e62695b3518c51909230)
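A minimal sketch of the kind of check this message describes, not the committed code. The function names swap_used_kb and report_swap and the optional file argument are hypothetical; SwapTotal and SwapFree are the field names used by /proc/meminfo on Linux.

```shell
#!/bin/sh
# Hypothetical helper: print the amount of swap in use, in kB,
# computed from a meminfo-style file (default: /proc/meminfo).
swap_used_kb ()
{
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { avail = $2 }
         END           { print total - avail }' "${1:-/proc/meminfo}"
}

# Hypothetical helper: if any swap is in use, dump the memory state
# and the process list so the cause can be diagnosed from the logs.
report_swap ()
{
    _meminfo="${1:-/proc/meminfo}"
    if [ "$(swap_used_kb "$_meminfo")" -gt 0 ] ; then
        echo "WARNING: swap is in use"
        cat "$_meminfo"
        ps aux
    fi
}
```

Calling report_swap with no argument from a monitor-style event would log nothing on a machine that has not touched swap.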
| |
path
This is very useful for testing. I use a script like this:
cat ~/bin/ethtool
#!/bin/sh
IFACE=$1
case "$IFACE" in
Neth2)
;;
Neth3)
;;
Neth4)
;;
Neth5)
;;
*)
exec /usr/sbin/ethtool "$@"
;;
esac
# Listed interfaces fall through to here: take the link down first.
ip link set down "$IFACE"
exec /usr/sbin/ethtool "$@"
metze
(This used to be ctdb commit 3bab985cf615720eded4d47b4f9f37a9c28840aa)
| |
metze
(This used to be ctdb commit 829150e814a5e6c85d0f21421f46f41e81d74c53)
| |
metze
(This used to be ctdb commit a5cdf1277387f8c6292153c37fa9ceb64707d04f)
| |
metze
(This used to be ctdb commit 712ed0c4c0bff1be9e96a54b62512787a4aa6259)
| |
With this option set to "yes", we don't become unhealthy
as long as at least one interface is still available.
metze
(This used to be ctdb commit d054eb33c6ae92560cddb40732e5dcf622591a3c)
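The rule described above can be sketched as a small predicate. This is an illustration under stated assumptions, not the committed code: the function name interfaces_healthy and its three arguments are hypothetical, and the real option name is not shown in this log.

```shell
#!/bin/sh
# Hypothetical predicate.
#   $1: value of the option ("yes" enables the relaxed rule)
#   $2: number of interfaces that were checked
#   $3: number of those interfaces that failed the check
# Succeeds if the node should still be considered healthy.
interfaces_healthy ()
{
    _opt="$1"
    _total="$2"
    _failed="$3"

    # No failures at all: always healthy.
    [ "$_failed" -eq 0 ] && return 0

    # Some failures: healthy only if the option is "yes" and at
    # least one interface is still available.
    [ "$_opt" = "yes" ] && [ "$_failed" -lt "$_total" ]
}
```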
| |
metze
(This used to be ctdb commit 615dec051c26aac628f120e96bf12fb39fc6d28a)
| |
metze
(This used to be ctdb commit c465f63585c419ba59a6b04cbbf78ae615a7259d)
| |
metze
(This used to be ctdb commit b5ba56dea57db97d6c6ba3e7582e74fe0e3041fc)
| |
metze
(This used to be ctdb commit f9837f8b6f887d28f29aeb3eeffe8cfb423b40b4)
| |
With this script it's possible to generate routing tables
per public ip address.
metze
(This used to be ctdb commit ff5678fbec2daef461143acf00cef3f94d7655fc)
| |
on interfaces atomic
When two releaseip events run in parallel, it's possible that the second
script re-adds a secondary IP that was removed by the first script.
metze
(This used to be ctdb commit e02417b2a55c45ac2c125b1b3463c9c39e7bc07a)
| |
metze
(This used to be ctdb commit 24cd42769d8f32b90a8876a6a08a36ab23076cd1)
| |
This is needed because the "startup" event runs after the initial recovery,
but we need to do some actions before the initial recovery.
metze
(This used to be ctdb commit e953808449c102258abb6cba6f4abf486dda3b82)
| |
(This used to be ctdb commit 085d1bea78fabf754ef6dd6d323f74a1d361e45c)
| |
reliable way to determine if winbindd is in a useful state."
This reverts commit 7c95e56ba871a4e0cb893a5cb5d821e7ff6e6dd6.
wbinfo --ping-dc is proving too unreliable.
(This used to be ctdb commit b70021856e76df1ba407c83cfc19bf332fbfc869)
| |
This reverts commit 7b73834ba3ac197cc8a3020c111f9bb2c567e70b.
wbinfo --ping-dc is proving too unreliable.
(This used to be ctdb commit 178f429a7b6d1008d35e857b6ca1df6adb60d255)
| |
When checking link status for an interface, first
check if this interface is in fact a bond device
(by the presence of a /proc/net/bonding/IFACE file)
and use that file for checking status.
Otherwise assume ib* is an InfiniBand interface, which we don't know how
to check; anything else is an Ethernet interface, where ethtool should
hopefully work.
(This used to be ctdb commit 8cc6c5de3d7abb0b72eaa6e769e70963b02d84cb)
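The decision order in this message can be sketched as follows. The function name link_check_method and its optional second argument are hypothetical, added only so the logic can be exercised without a real /proc/net/bonding directory; this is not the committed eventscript code.

```shell
#!/bin/sh
# Hypothetical helper: decide how to check link status for an interface.
#   $1: interface name
#   $2: optional bonding directory (default: /proc/net/bonding)
link_check_method ()
{
    _iface="$1"
    _bonddir="${2:-/proc/net/bonding}"

    # A file named after the interface means it is a bond device.
    if [ -f "${_bonddir}/${_iface}" ] ; then
        echo "bond"
        return 0
    fi

    case "$_iface" in
    ib*) echo "infiniband" ;;  # no known way to check link status
    *)   echo "ethtool" ;;     # assume Ethernet and ask ethtool
    esac
}
```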
| |
and have a link.
If not, the node should become unhealthy.
(This used to be ctdb commit 03b5bbaae1b53830a4cd20d3079ab8f45ffce923)
| |
metze
(This used to be ctdb commit 7b73834ba3ac197cc8a3020c111f9bb2c567e70b)
| |
to determine if winbindd is in a useful state.
(This used to be ctdb commit 7c95e56ba871a4e0cb893a5cb5d821e7ff6e6dd6)
| |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 9ea261f791ab919eb1ce5b37073b4f1d30694bb8)
| |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 4c6e68ae942c05224c5f8b683fbc2dc1adced8ee)
| |
The functions file no longer causes a side-effect by doing a shift.
It also doesn't set a convenience variable for $1.
All eventscripts now explicitly use "$1" in their case statement, as
does the initscript. The absence of a shift means that the
takeip/releaseip events now explicitly reference $2-$4 rather than
$1-$3.
New function ctdb_standard_event_handler handles the status and
setstatus events, and exits for either of those events. It is called
via a default case in each eventscript, replacing an explicit status
case where applicable.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 3d55408cbbb3bb71670b80f3dad5639ea0be5b5b)
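A rough sketch of the eventscript shape this message describes. The handler below is a stand-in with an assumed body, the eventscript is wrapped in a hypothetical function example_event for illustration, and the meaning given to $2-$4 (interface, IP address, netmask bits) is an assumption rather than something stated in the message.

```shell
#!/bin/sh
# Stand-in for the handler provided by the functions file. As described
# above, it handles "status" and "setstatus" and exits for either of
# those events; for any other event it simply returns.
ctdb_standard_event_handler ()
{
    case "$1" in
    status|setstatus) exit 0 ;;
    esac
}

# Hypothetical eventscript body. No shift has been done, so the event
# name is "$1" and the takeip/releaseip arguments are $2-$4.
example_event ()
{
    case "$1" in
    takeip|releaseip)
        echo "$1: iface=$2 ip=$3 maskbits=$4"
        ;;
    monitor)
        echo "monitor"
        ;;
    *)
        # Default case replaces an explicit status case.
        ctdb_standard_event_handler "$@"
        ;;
    esac
}
```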
| |
Apart from lots of cleanup work, this also fixes a bug where the share
checks did not cope with directory names containing spaces.
The previous commit also loaded the config incorrectly.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 35a60a63a9b5c7d98dde514ae552239506b691c9)
| |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit f1e350f9edb74cc44b6c5be4c062fd93e98ba8c4)
| |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit ac655b0a65b32d809d47fec9821f7f31bb2fe2a7)
| |
This is the first stage of an experimental change to eventscripts.
Ronnie and I did a few hours of factorisation of 40.vsftpd and applied
many of the changes to 41.httpd. Other eventscripts were also
modified.
At this stage this is completely untested.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1)
| |
Signed-off-by: Michael Adam <obnox@samba.org>
(This used to be ctdb commit 6ccb495d1110157c06596763c7e252f3182c251e)
| |
This just sleeps for twice the value of EventScriptTimeout
in the monitor action. It is not run by default, but
can be activated by setting CTDB_RUN_TIMEOUT_MONITOR
in /etc/sysconfig/ctdb .
Michael
(This used to be ctdb commit 1a3ecdee85b82bb3234a92ae6bcdeb92238eb7ee)
| |
this to handle the case where all links do have a physical layer, but where all slaves have been disabled using ifdown
(This used to be ctdb commit bf50709630df000583f2b0ef0edc177c01d60eaf)
| |
(This used to be ctdb commit 9e1b99221c8f257129641f6eda2795537b7ce9de)
| |
more scary and easier to spot in the logs
(This used to be ctdb commit 0c9b0466fd87b3f1e5d53f867c863217802ac43b)
| |
time since the last recovery OR failover."
This reverts commit 3b0d44497800a16400d05a30bdaf6e6c285d4b36.
(This used to be ctdb commit cb36bbb5418290e8e5b770d2d836285b15da2a6f)
| |
since the last recovery OR failover.
(This used to be ctdb commit 3b0d44497800a16400d05a30bdaf6e6c285d4b36)
| |
and refuse to start up if it can not
(This used to be ctdb commit 4037b6e73a819a8e2463dfe0959b42875e05e106)
| |
(This used to be ctdb commit 3997d7e5471810e9a2f145ce2e795073dfc5eded)
| |
print the filename of the missing file when we turn unhealthy and also
a 'df'
(This used to be ctdb commit 97ded8a629ec762f71bad28515e4fbc810790b1d)
| |
This reverts commit f5e9f3007c10a937158bc8cdfabf33c984cf9c50.
(This used to be ctdb commit 02f68dc60e0b7bf26d631850b12834d5c71a88f2)
| |
Leave the node as UNHEALTHY. This stops clients from accessing the node until
the reclock file can be accessed again.
(This used to be ctdb commit f5e9f3007c10a937158bc8cdfabf33c984cf9c50)
| |
this allows us to configure and enable nfs at runtime without having to restart ctdbd
(This used to be ctdb commit f6e39d35713475defaa08a623e194f3f2f8f7d53)
| | |
(This used to be ctdb commit 5f14874c5c705dd637f88a77f30c930fea1201d2)
| | |
Each recovery that involves IP reassignments results in a restart of
vsftpd in the "recovered" event. Currently, we can have several
recoveries in quick succession and the "monitor" event following each
can fail because vsftpd isn't ready yet. This results in cumulative
failures, so the node is marked unhealthy, even though vsftpd has
never had a proper opportunity to become ready.
This resets the fail count after each recovery.
While we're here, also move the delete of the restart flag file into
the body of the conditional.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 318abeb4b913a8d846e7eaf4cf5c2a67b61ce974)
| |
remove the configuration at runtime
(This used to be ctdb commit deed52b7e4aac94b4d11a8d89d08739e1dfd4ed7)
| |
test -z really needs its argument to be quoted. Simplified a status
test.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit fe26da7780545b1ecc0a7da5bc1cf8beaeea94cc)
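A small demonstration of why the quoting matters, using a hypothetical variable named value. When the value contains whitespace, the unquoted form is split into several words and, in common shells, test reports a usage error instead of answering the question.

```shell
#!/bin/sh
value="two words"

# Unquoted: the shell splits $value, so test sees "-z two words" and
# fails with an error status rather than a clean true/false answer.
[ -z $value ] 2>/dev/null && unquoted_rc=0 || unquoted_rc=$?

# Quoted: test sees exactly one operand and returns 1 (not empty).
[ -z "$value" ] && quoted_rc=0 || quoted_rc=$?

echo "unquoted status: $unquoted_rc, quoted status: $quoted_rc"
```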
| |
Change the monitor event in 40.vsftpd so it only fails if there are 2
successive failures connecting to port 21. This reduces the
likelihood of unhealthy nodes due to vsftpd being restarted for
reconfiguration due to node failover or system reconfiguration.
New eventscript functions ctdb_counter_init, ctdb_counter_incr,
ctdb_counter_limit. These are used to count arbitrary things in
eventscripts, depending on the eventscript name and a tag that is
passed, and determine if a specified limit has been hit. They're good
for counting failures!
These functions are used in 40.vsftpd and also in 01.reclock - the
latter used to do the counting without these functions.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit cfe63636a163730ae9ad3554b78519b3c07d8896)
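One way such counters could work is sketched below. This is not the ctdb implementation: the function names are taken from the message, but their bodies, the argument conventions, and the COUNTER_DIR and SCRIPT_NAME variables are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical state location. The message says the real functions key
# the counter on the eventscript name and a tag that is passed.
COUNTER_DIR="${COUNTER_DIR:-/tmp/ctdb-counter-sketch}"
SCRIPT_NAME="${SCRIPT_NAME:-40.vsftpd}"

# Print the path of the file holding the counter for tag $1.
_counter_file ()
{
    echo "${COUNTER_DIR}/${SCRIPT_NAME}.${1}"
}

# Reset the counter for tag $1 to zero.
ctdb_counter_init ()
{
    mkdir -p "$COUNTER_DIR"
    echo 0 > "$(_counter_file "$1")"
}

# Add one to the counter for tag $1, creating it if necessary.
ctdb_counter_incr ()
{
    _f=$(_counter_file "$1")
    [ -f "$_f" ] || ctdb_counter_init "$1"
    _n=$(cat "$_f")
    echo $((_n + 1)) > "$_f"
}

# Succeed if the counter for tag $1 has reached the limit $2.
ctdb_counter_limit ()
{
    _f=$(_counter_file "$1")
    [ -f "$_f" ] || return 1
    [ "$(cat "$_f")" -ge "$2" ]
}
```

With these, a monitor event could call ctdb_counter_init after a successful connection, ctdb_counter_incr after a failed one, and fail only when ctdb_counter_limit reports that 2 has been reached.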
| |
Remove the explicit vacuum/repack commands from the 00.ctdb eventscript
and implement this in the ctdb daemon.
Combine vacuuming and repacking into one cheap read traverse, to
enumerate all candidate records, and one write traverse that both
repacks the database and deletes the records locally where we are
lmaster and where the records have already been deleted remotely.
This code also adds initial autotuning heuristics for the vacuum
intervals and for how many records to delete in each iteration.
Minor stylistic changes made by ronnie s.
(This used to be ctdb commit 95a3ee551241aa164967991fe5efe078e1714bde)
| |
(This used to be ctdb commit 6e35feb06ec036b9036c5d1cdd94f7cef140d8a6)
| |
If the reclock file has been set, then this script will test that the
reclock file can actually be accessed.
If the file does not exist, or if the attempt to stat the file hangs,
the node will be marked unhealthy after the third failed monitoring event,
and after the tenth failure ctdb itself will shut down.
(This used to be ctdb commit 2cb04747887674def299e574fccb827c1c3194e7)
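The escalation described above can be sketched as a mapping from a failure count to an action. Both function names are hypothetical, the 15-second limit is an arbitrary illustrative value, and the use of the coreutils timeout command to guard against a hanging stat is an assumption, not necessarily how the script does it.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the reclock file $1 can be
# stat'ed within 15 seconds. Fails if the file is missing or if stat
# does not finish in time. Requires the coreutils "timeout" command.
reclock_accessible ()
{
    timeout 15 stat "$1" >/dev/null 2>&1
}

# Hypothetical helper: map the number of failed monitor events ($1)
# to the action described in the message.
reclock_action ()
{
    if [ "$1" -ge 10 ] ; then
        echo "shutdown"     # tenth failure: ctdb itself shuts down
    elif [ "$1" -ge 3 ] ; then
        echo "unhealthy"    # third failure: node marked unhealthy
    else
        echo "ok"
    fi
}
```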
| |
(This used to be ctdb commit 03b0d913ae009284e2fadda1b9246ec77d19db29)