Commit message  Author  Age  Files  Lines
...
| * web: Add the links to ftp/http ctdb download area  Amitay Isaacs  2012-10-22  1  -4/+7
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 12e4a3e2953842b4c3842bf920fe2086df4fe46c)
| * web: Remove reference to non-existent config files  Amitay Isaacs  2012-10-22  1  -3/+1
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 4250c7ebe369e73cf29ff910bb9bfc929735408c)
| * doc: getlog and clearlog changes for recovery daemon logs  Martin Schwenke  2012-10-22  3  -340/+555
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit c18ec8ec234cb71da6cc77b1aadc398f57187947)
| * tests: Local daemons should use the logging ringbuffer  Martin Schwenke  2012-10-22  1  -1/+1
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 7547e011005f0dd5bd38e67572280126cf16e229)
| * tools/ctdb: Merge recoverd log handling into getlog/clearlog  Martin Schwenke  2012-10-22  1  -102/+63
| | We don't need extra commands for these. Also, allow a default value of NOTICE for the getlog level.
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 7197e600f46f2d1638f6c45c0149f109ea25a47c)
| * tools/ctdb: Add log ringbuffer handling for recoverd  Martin Schwenke  2012-10-22  1  -0/+73
| | This adds commands rdgetlog and rdclearlog.
| | These are analogous to getlog and clearlog but operate on the logs for the recovery daemon.
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit ef55e06192819d840c09b65741bab737223ac34c)
| * recoverd: Add CTDB_SRVID_GETLOG and CTDB_SRVID_CLEARLOG  Martin Schwenke  2012-10-22  4  -4/+63
| | These support getting and clearing logs from the ring-buffer in the recovery daemon.
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit cbca233d1e03b2410e0bb63b936328d4a8b3c7b4)
| * build: Set CTDB_PATH to /tmp/ctdb.socket if SOCKPATH is not defined  Amitay Isaacs  2012-10-22  1  -1/+5
| | When building samba with CTDB, if samba configure/waf does not support setting of SOCKPATH, fall back to /tmp/ctdb.socket.
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit a9511cf5ecd5bc39b0070f0afa8ac4d4926c6cab)
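The fallback described in this commit can be sketched with a preprocessor guard. This is a minimal sketch rather than the actual CTDB header: `CTDB_PATH` and `SOCKPATH` are the names used in the commit message, while `ctdb_socket_path()` and its `override` parameter are hypothetical and only stand in for a runtime option such as `--socket`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* SOCKPATH is expected to be supplied by the build system (for example
 * via -DSOCKPATH=...). If the build does not define it, fall back to
 * the default path named in the commit message. */
#ifdef SOCKPATH
#define CTDB_PATH SOCKPATH
#else
#define CTDB_PATH "/tmp/ctdb.socket"
#endif

/* Hypothetical helper: a non-empty runtime override wins over the
 * compiled-in default. */
static const char *ctdb_socket_path(const char *override)
{
	if (override != NULL && override[0] != '\0') {
		return override;
	}
	return CTDB_PATH;
}
```

Keeping the default in a single macro means a build system that knows nothing about `SOCKPATH` still produces a working binary.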
| * Build: Set the default ctdb socket path at configure time  David Disseldorp  2012-10-21  3  -2/+15
| | The ctdb socket path currently defaults to /tmp/ctdb.socket and can be modified at runtime using the --socket=filename option, common to both ctdb and ctdbd binaries.
| | This change allows the default path to be set at configure time using the --with-socketpath=FILE argument. When not specified, the default path remains /tmp/ctdb.socket; documentation remains unchanged as a result.
| | Signed-off-by: David Disseldorp <ddiss@samba.org>
| | (This used to be ctdb commit f92b9c83a2f39fba9a141417a88de96fc8c592ff)
| * locking: Do not use ctdb_kill() to kill smbd processes  Amitay Isaacs  2012-10-20  1  -1/+1
| | ctdb_kill() is used to terminate processes spawned by CTDB.
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 7d025281ee70c91ebcd4d9a908de1045a689786b)
| * locking: Add database priority handling for older versions of samba  Amitay Isaacs  2012-10-20  1  -0/+61
| | In Samba versions 3.6.x and older, database priorities are not set. The later_db() function implements a higher database priority (locking order) for these databases: brlock, g_lock, notify_onelevel, serverid, xattr_tdb.
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit edbc8a6669b594d3c413d603e1c9fada9244c2ee)
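A name-based check of the kind this commit describes could look like the sketch below. The database names come from the commit message; the function names `later_db_sketch()` and `db_lock_order_cmp()`, the use of substring matching, and the choice to sort such databases last are assumptions made for illustration, not the code from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Databases that the commit message lists as needing a higher
 * priority, which is taken here to mean "lock later". */
static const char *const later_db_names[] = {
	"brlock", "g_lock", "notify_onelevel", "serverid", "xattr_tdb",
};

/* Return true if the database name contains one of the listed names.
 * Substring matching is an assumption; it lets "brlock.tdb" match. */
static bool later_db_sketch(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(later_db_names) / sizeof(later_db_names[0]); i++) {
		if (strstr(name, later_db_names[i]) != NULL) {
			return true;
		}
	}
	return false;
}

/* Comparator: ordinary databases order before "later" ones, so every
 * locker walks the databases in the same order. */
static int db_lock_order_cmp(const char *a, const char *b)
{
	return (int)later_db_sketch(a) - (int)later_db_sketch(b);
}
```

Taking locks in one agreed order is the usual way to keep two lockers from deadlocking on each other.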
| * locking: Schedule a new lock request every time a lock is released  Amitay Isaacs  2012-10-20  1  -0/+4
| | Since the number of active lock requests is limited to MAX_LOCK_PROCESSES_PER_DB (= 100), any new requests won't get scheduled when they are created. So schedule a pending request once the current active request is done.
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit c8eb4a3170ab8524e638047053831ba547e9cce8)
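The scheduling rule in this commit can be illustrated with plain counters. This is a sketch under simplifying assumptions: `struct lock_sched` and the three functions are hypothetical, and only the limit `MAX_LOCK_PROCESSES_PER_DB` (100) is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOCK_PROCESSES_PER_DB 100	/* limit quoted in the commit message */

/* Hypothetical per-database bookkeeping; not the real CTDB structures. */
struct lock_sched {
	unsigned int active;	/* requests that currently own a slot */
	unsigned int pending;	/* requests waiting for a free slot */
};

/* Start pending requests while a slot is free; return how many started. */
static unsigned int lock_schedule(struct lock_sched *s)
{
	unsigned int started = 0;

	while (s->pending > 0 && s->active < MAX_LOCK_PROCESSES_PER_DB) {
		s->pending--;
		s->active++;
		started++;
	}
	return started;
}

/* New request: queue it, then try to schedule it immediately. */
static void lock_request(struct lock_sched *s)
{
	s->pending++;
	lock_schedule(s);
}

/* Lock released: free the slot and schedule again, so a request that
 * arrived while the limit was reached is not left waiting forever. */
static void lock_release(struct lock_sched *s)
{
	assert(s->active > 0);
	s->active--;
	lock_schedule(s);
}
```

Without the `lock_schedule()` call in `lock_release()`, the 101st request would stay pending until some unrelated new request happened to trigger scheduling.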
| * ctdbd: Replace lockwait with locking API and remove ctdb_lockwait.c  Amitay Isaacs  2012-10-20  7  -262/+7
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 2126795153dacb255e441abcb36ee05107b6282a)
| * ctdb_recover: Replace static locking functions with locking API  Amitay Isaacs  2012-10-20  1  -98/+8
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 4456a01d8f54ca6c771d7488048de5f638477d21)
| * ctdb_freeze: Replace locking functions with locking API  Amitay Isaacs  2012-10-20  1  -140/+18
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 01ee86d2aafbcda658ef6acc2bba6d6781ae4047)
| * ctdbd_test: Include ctdb_lock.c code for test stubs  Amitay Isaacs  2012-10-20  1  -0/+1
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit caff197edf6f928494028ac6c993901954aaa36f)
| * tests: Fix statistics test for new output lines from locking API  Amitay Isaacs  2012-10-20  1  -1/+1
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 1ee55c511b99e9f8a6fa4e34207267e953f09bae)
| * tools/ctdb: Display the locking statistics  Amitay Isaacs  2012-10-20  1  -14/+44
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit e24b5bf283736624b387b0364d7200212bb3054b)
| * ctdbd: locking: Provide non-blocking API for locking of TDB record/db/alldb  Amitay Isaacs  2012-10-20  8  -82/+1239
| | This introduces a consistent API for handling locks on a single record, a complete db or all dbs. The locks are taken out in a child process. In cases of timeout, find the processes that currently hold the lock and log them.
| | Callback functions for locking requests take a locked boolean to indicate whether the lock was successfully obtained or not.
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 1af99cf0de9919dd89af1feab6d1bd18b95d82ff)
| * common: Add routines to get process and lock information  Amitay Isaacs  2012-10-20  6  -0/+262
| | Currently these functions are implemented only for Linux.
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit be4051326b0c6a0fd301561af10fd15a0e90023b)
| * header: Added DB statistics update macros  Amitay Isaacs  2012-10-20  1  -16/+49
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit a0cdfae7438092f5c605f0608daa536be860b7fe)
| * scripts: Refactor logging code in initscript and functions file  Martin Schwenke  2012-10-18  2  -23/+24
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 5ee242c949a98bb7397e0f7368b20d44c06fe772)
| * tools/ctdb_diagnostics: Add "ctdb listvars" output  Martin Schwenke  2012-10-18  1  -0/+1
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 2d75a04ba9a2e87a0dcb9bf778c58e335af1871c)
| * initscript: Check that rc.ctdb is executable before running it  Martin Schwenke  2012-10-18  1  -1/+1
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 59a47c0674bacfebc17a1b44f0244727bf2fa7a4)
| * ctdbd: Remove references to forcing running of eventscripts from log messages  Martin Schwenke  2012-10-18  1  -2/+2
| | Running of eventscripts can be initiated from many places, including the recovery daemon.
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 440892d75ef73c0aca22f47c0c01712be00cf5b7)
| * recoverd: Clarify some misleading log messages  Martin Schwenke  2012-10-18  1  -2/+2
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 14589bf7c16ba017fe00d4e8bea8cc501546c60f)
| * tools/ctdb: Remove extra header from natgwlist -Y output  Martin Schwenke  2012-10-18  1  -4/+0
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 59520c9785d113ad5063eb5fbe42a9efc7e30076)
| * recoverd: Verifying local IPs should only check for unhosted available IPs  Martin Schwenke  2012-10-18  1  -17/+34
| | Currently it checks for unhosted IPs among the known IPs rather than available IPs. This means that a takeover run can be flagged even when that takeover run will be unable to assign a known, unhosted IP.
| | Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 3cc878bc97fdac764a60ed805f64d649eaab06e8)
| * Revert "Eventscripts - add facility to 10.interface to delete unmanaged IPs"  Martin Schwenke  2012-10-18  2  -51/+0
| | This reverts commit 88f88d86b0d08240f749fb721b8c401c2eeb1099.
| | This is dangerous and, on reflection, I can't see it being useful. There are often permanent IPs on interfaces that CTDB shares with its public IPs.
| | (This used to be ctdb commit 16aba4eb620844626a1c71c58b51658caf44dea6)
| * Eventscripts: "recovered" event should not fail on NATGW failure  Martin Schwenke  2012-10-18  1  -5/+25
| | The recovery process has no protection against the "recovered" event failing, so this can cause a recovery loop.
| | Instead of failing the "recovered" event, add a "monitor" event and fail that instead. In this case the failure semantics are well defined.
| | A separate patch should ban nodes if the "recovered" event fails for an unknown reason.
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit eaa7c165f58abd7e259c37d76b7dd37c91e13d9f)
| * Logging: Map TEVENT_DEBUG_FATAL to DEBUG_CRIT  Martin Schwenke  2012-10-18  1  -2/+2
| | This is currently mapped to DEBUG_EMERG. CTDB really has no business logging anything at EMERG level since the whole system is not about to abort or catch fire. EMERG causes the message to appear on the console and on every terminal. That's a bit overzealous!
| | There would be very few situations where logs are being filtered at a level below ERROR, so CRIT should certainly suffice.
| | The trigger for this was curious messages saying "No event for <n> seconds!" logged in a user's terminal.
| | Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 0e56e2dad1861892aa8ba59494ad244f2498314e)
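The remapping can be pictured as a small switch. This sketch is self-contained, so the two enums below are local stand-ins for the tevent and CTDB definitions; their numeric values, the function name `map_tevent_level()`, and the mapping of every level other than FATAL are assumptions. Only the FATAL to CRIT case is taken from the commit message.

```c
#include <assert.h>

/* Stand-in for the tevent debug levels (values are assumptions). */
enum tevent_debug_level {
	TEVENT_DEBUG_FATAL,
	TEVENT_DEBUG_ERROR,
	TEVENT_DEBUG_WARNING,
	TEVENT_DEBUG_TRACE
};

/* Stand-in for the CTDB debug levels (values are assumptions). */
enum debug_level {
	DEBUG_EMERG = -3,
	DEBUG_ALERT = -2,
	DEBUG_CRIT = -1,
	DEBUG_ERR = 0,
	DEBUG_WARNING = 1,
	DEBUG_NOTICE = 2,
	DEBUG_INFO = 3,
	DEBUG_DEBUG = 4
};

/* Hypothetical mapping function. */
static enum debug_level map_tevent_level(enum tevent_debug_level level)
{
	switch (level) {
	case TEVENT_DEBUG_FATAL:
		return DEBUG_CRIT;	/* was DEBUG_EMERG before this commit */
	case TEVENT_DEBUG_ERROR:
		return DEBUG_ERR;	/* assumed */
	case TEVENT_DEBUG_WARNING:
		return DEBUG_WARNING;	/* assumed */
	case TEVENT_DEBUG_TRACE:
		return DEBUG_DEBUG;	/* assumed */
	}
	return DEBUG_ERR;		/* unknown level: assumed fallback */
}
```

Because CRIT is still more severe than ERROR, a filter set at ERROR or anything more verbose keeps showing these messages, but they no longer go to every terminal.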
| * common: Debug ctdb_addr_to_str() using new function ctdb_external_trace()  Martin Schwenke  2012-10-18  5  -0/+31
| | We've seen this function report "Unknown family, 0" and then CTDB disappeared without a trace. If we can reproduce it then this might help us to debug it.
| | The idea is that you do something like the following in /etc/sysconfig/ctdb:
| |   export CTDB_EXTERNAL_TRACE="/etc/ctdb/config/gcore_trace.sh"
| | When we hit this error then we call out to gcore to get a core file so we can do forensics. This might block CTDB for a few seconds.
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 7895bc003f087ab2f3181df3c464386f59bfcc39)
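An environment-driven trace hook of this kind might be structured as below. The variable name `CTDB_EXTERNAL_TRACE` comes from the commit message; the function names, the buffer size, and passing the daemon's PID as the script's only argument are assumptions for this sketch, not the body of the real `ctdb_external_trace()`.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Build "<script> <pid>" into buf. Returns 0 on success, -1 if
 * CTDB_EXTERNAL_TRACE is unset or empty, or if buf is too small. */
static int external_trace_command(char *buf, size_t len)
{
	const char *script = getenv("CTDB_EXTERNAL_TRACE");
	int n;

	if (script == NULL || script[0] == '\0') {
		return -1;
	}
	n = snprintf(buf, len, "%s %lu", script, (unsigned long)getpid());
	if (n < 0 || (size_t)n >= len) {
		return -1;
	}
	return 0;
}

/* Run the hook and wait for it to finish. Returns -1 when no hook is
 * configured, otherwise whatever system() returns. */
static int external_trace_sketch(void)
{
	char cmd[4096];

	if (external_trace_command(cmd, sizeof(cmd)) != 0) {
		return -1;
	}
	return system(cmd);
}
```

Since `system()` waits for the script, the caller is blocked for as long as the script runs, which matches the commit's warning about blocking CTDB for a few seconds.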
| * config/functions: fix a comment  Michael Adam  2012-10-17  1  -1/+1
| | ctdb_check_counter_limits does not fail but succeeds if count >= limit
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | (This used to be ctdb commit af540ef728303b4a0a188b17c695e9aefab34489)
| * doc: Add info about execute permissions on event scripts  Amitay Isaacs  2012-10-17  1  -0/+2
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 25d886060b138bc5e78fe93d7bebe3990264f29d)
| * doc: Fix documentation for setup event  Amitay Isaacs  2012-10-17  1  -5/+3
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 36d25e96a2f8ae1461c5a708a2922f0475a39900)
| * scripts: Remove duplicate code from init script to set tunables  Amitay Isaacs  2012-10-17  2  -21/+30
| | The tunable variables defined in the CTDB configuration file are currently set up from the init script as well as part of the "setup" event in the 00.ctdb eventscript. Remove the duplication of this code and set tunable variables only from the setup event.
| | During the "setup" event, it's possible that ctdb tool commands can time out if the CTDB daemon is not ready. To guard against such an eventuality, wait till the "ctdb ping" command succeeds before executing any other ctdb tool commands.
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 632c1b9c1cc2e242376358ce49fd2022b3f27aa2)
| * doc: Fix the hyperlink for "Testing CTDB" page  Amitay Isaacs  2012-10-17  1  -1/+1
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 08dbd9c7958f9a0ee3de314d49523d32e4be135c)
| * tests/eventscripts: add unit tests for policy routing reconfigure  Martin Schwenke  2012-10-11  4  -0/+77
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit bd4ff176387372b1c233373c0bc8ced523fc9670)
| * tests/eventscripts: add extra infrastructure for policy routing tests  Martin Schwenke  2012-10-11  16  -317/+170
| | Less copying and pasting is a good thing...
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 7d4b8cce96f33fff647a0c9d259c121dfc8403e9)
| * Eventscripts: Add support for "reconfigure" pseudo-event for policy routing  Martin Schwenke  2012-10-11  1  -2/+17
| | This rebuilds all policy routes and can be used if the configuration changes.
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit c185ffd2822fcee26d07398464c59b66c61f53fa)
| * recoverd: Track failure of "recovered" event, banning culprits  Martin Schwenke  2012-10-11  1  -29/+42
| | Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 9550c497e6d6ef5ee44826c4bd9ed5ad65174263)
| * recoverd: When starting a takeover run disable IP verification  Martin Schwenke  2012-10-11  2  -0/+20
| | Disable for TakeoverTimeout seconds. Otherwise the recovery daemon can get overzealous and start trying to add/delete addresses that it thinks are missing but where the eventscript just hasn't finished.
| | This didn't use to matter so much but it is more important now that concurrent takeip/releaseip/updateip generate errors - we want to avoid spamming the log.
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 56fcee3c7730cb12fa666072d5400949af6e5f7c)
| * ctdbd: Stop takeovers and releases from colliding in mid-air  Martin Schwenke  2012-10-11  2  -7/+74
| | There's a race here where release and takeover events for an IP can run at the same time. For example, a "ctdb deleteip" and a takeover initiated by the recovery daemon. The timeline is as follows:
| |   1. The release code registers a callback to update the VNN. The callback is executed *after* the eventscripts run the releaseip event.
| |   2. The release code calls the eventscripts for the releaseip event, removing IP from its interface. The takeover code "updates" the VNN saying that IP is on some iface.... even if/though the address is already there.
| |   3. The release callback runs, removing the iface associated with IP in the VNN. The takeover code calls the eventscripts for the takeip event, adding IP to an interface.
| | As a result, CTDB doesn't think it should be hosting IP but IP is on an interface. The recovery daemon fixes this later... but it shouldn't happen.
| | This patch can cause some additional noise in the logs:
| |   Release of IP 10.0.2.133/24 on interface eth2 node:2
| |   recoverd:We are still serving a public address '10.0.2.133' that we should not be serving. Removing it.
| |   Release of IP 10.0.2.133/24 rejected update for this IP already in flight
| |   recoverd:client/ctdb_client.c:2455 ctdb_control for release_ip failed
| |   recoverd:Failed to release local ip address
| | In this case the node has started releasing an IP when the recovery daemon notices the address is still hosted and initiates another release. This noise is harmless but annoying.
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit bfe16cf69bf2eee93c0d831f76d88bba0c2b96c2)
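One common way to close a race like the one in this commit is an "update in flight" flag per address, which also explains the "rejected update for this IP already in flight" log line. The sketch below shows that idea only; `struct vnn_sketch` and both functions are hypothetical and much simpler than CTDB's real VNN handling.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, pared-down stand-in for the per-IP state. */
struct vnn_sketch {
	bool update_in_flight;	/* a takeip/releaseip/updateip event is running */
	bool hosted;		/* whether this node believes it hosts the IP */
};

/* Try to start an event for this IP. Returns false, meaning the
 * request is rejected, if another event for the IP is still running. */
static bool vnn_begin_update(struct vnn_sketch *vnn)
{
	if (vnn->update_in_flight) {
		return false;
	}
	vnn->update_in_flight = true;
	return true;
}

/* Called from the event's completion callback: record the outcome and
 * allow the next event for this IP to start. */
static void vnn_finish_update(struct vnn_sketch *vnn, bool hosted)
{
	assert(vnn->update_in_flight);
	vnn->hosted = hosted;
	vnn->update_in_flight = false;
}
```

With the flag set for the whole lifetime of the release, a takeover that arrives in the middle is rejected instead of interleaving with the release callback.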
| * ctdbd: New tunable NoIPTakeoverOnDisabled  Martin Schwenke  2012-10-11  6  -89/+115
| | Stops the behaviour where unhealthy nodes can host IPs when there are no healthy nodes. Set this to 1 when an immediate complete outage is preferred when all nodes are unhealthy. The alternative (i.e. default) can lead to undefined behaviour when the shared filesystem is unavailable.
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit a555940fb5c914b7581667a05153256ad7d17774)
| * Eventscripts: Add service-start and service-stop pseudo-events  Martin Schwenke  2012-10-10  1  -2/+28
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit be4ad110ede9981b181ac28f31ffd855a879d5df)
| * ctdbd: Avoid unnecessary updateip event  Martin Schwenke  2012-10-10  1  -5/+5
| | The existing code makes one fatally bad assumption: vnn->iface->references can never be -1 (or max uint32_t in this case).
| | Right now the reference counting is broken so a reference count of -1 is possible and causes a spurious updateip when vnn->iface is the same as best_iface. This can occur frequently because we get a lot of redundant takeovers, especially when each IP can only be hosted on one interface.
| | This makes the code much more defensive by noting that when best_iface is the same as vnn->iface there is never a need for an updateip event.
| | This effectively neuters the updateip code path when IPs can only be hosted by a single interface.
| | This should obsolete 6a74515f0a1e24d97cee3ba05d89133aac7ad2b7.
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 7054e4ded59c6b8f254dcfefaef64da05f25aecd)
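The failure mode in this commit can be reproduced in isolation. In the sketch below, `struct iface_sketch` and both functions are hypothetical, and the "more than one reference apart" comparison is an assumed shape for the original check rather than a quote from the CTDB source. It shows how an unsigned count of "-1" makes an interface look busier than itself, and how an identity check avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an interface with an unsigned refcount. */
struct iface_sketch {
	uint32_t references;
};

/* Assumed shape of the fragile check: move the IP if the current
 * interface holds more than one reference more than the best one.
 * When references is UINT32_MAX, references + 1 wraps to 0, so the
 * test is true even if both pointers name the same interface. */
static bool naive_needs_updateip(const struct iface_sketch *cur,
				 const struct iface_sketch *best)
{
	return cur->references > (uint32_t)(best->references + 1);
}

/* Defensive version: the same interface never needs an updateip,
 * whatever state the reference counts are in. */
static bool needs_updateip(const struct iface_sketch *cur,
			   const struct iface_sketch *best)
{
	if (cur == NULL || best == NULL || cur == best) {
		return false;
	}
	return naive_needs_updateip(cur, best);
}
```

The identity check costs nothing and stays correct even while the reference counting itself is still broken.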
| * Correct include for ctdb_protocol.h  Volker Lendecke  2012-10-09  1  -1/+1
| | With an old ctdb_protocol.h installed under /usr/local, ctdb will not compile because the <> form of include will find the header under /usr/local.
| | (This used to be ctdb commit c4f5a58471b206e2287c7958c7f29c1f1c0626ac)
| * Revert "when creating/adding a public ip, set the initial interface to be the first interface specified"  Amitay Isaacs  2012-10-07  1  -3/+0
| | This reverts commit 4308935ba48ac7a29e7523315acf580019715f0f.
| | This fixes 16_ctdb_config_add_ip.sh test when run against local daemons. When running against local daemons, if the interface is assigned as soon as an IP is added, then takeover would never assign this IP address.
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 06dfd13604d08910e07cbf927c338d7b9fce9a2f)
| * util: ctdb_fork() closes all sockets opened by the main daemon  Martin Schwenke  2012-10-05  2  -18/+24
| | Do some other housekeeping including stopping tevent.
| | Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 212298279557a2833ef0f81809b4a5cdac72ca02)
| * eventscripts: Auto-start/stop services in background  Martin Schwenke  2012-10-03  7  -25/+65
| | If $CTDB_SERVICE_AUTOSTARTSTOP="yes" then service start/stop is done in the background with logging.
| | Fix some unit tests for samba and winbind.
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 3a3dae4cb5ec8b4b8381a4013adda25b87641f3a)