Diffstat (limited to '__root__/doc/rgmanager-pacemaker/03.groups.txt')
-rw-r--r-- __root__/doc/rgmanager-pacemaker/03.groups.txt 272
1 files changed, 272 insertions, 0 deletions
diff --git a/__root__/doc/rgmanager-pacemaker/03.groups.txt b/__root__/doc/rgmanager-pacemaker/03.groups.txt
new file mode 100644
index 0000000..d6e95be
--- /dev/null
+++ b/__root__/doc/rgmanager-pacemaker/03.groups.txt
@@ -0,0 +1,272 @@
+IN THE LIGHT OF RGMANAGER-PACEMAKER CONVERSION: 03/RESOURCE GROUP PROPERTIES
+
+Copyright 2016 Red Hat, Inc., Jan Pokorný <jpokorny @at@ Red Hat .dot. com>
+Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3
+or any later version published by the Free Software Foundation;
+with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
+A copy of the license is included in the section entitled "GNU
+Free Documentation License".
+
+
+Preface
+=======
+
+This document elaborates on how selected resource group internal
+relationship properties (denoting the run-time behavior), formalized
+by means of LTL, map to particular RGManager (R) and
+Pacemaker (P) configuration arrangements.
+Due to the purpose of this document, "selected" here means the set of
+properties one commonly uses in case of the former cluster resource
+manager (R).
+
+Properties are categorised, each is further dissected based on
+the property variants (basically holds or doesn't, but can be more
+convoluted), and for each variant, the LTL model and R+P specifics
+are provided (when possible or practical).
+
+
+Outline
+-------
+
+Group properties derived from resource properties
+Group member vs. rest of group properties, PROPERTY(GROUP, RESOURCE)
+. FAILURE-ISOLATION
+Other group properties, PROPERTY(GROUP)
+
+
+
+Group properties derived from resource properties
+=================================================
+
+Resource group (group) is an ordered set of resources:
+
+GROUP ::= { RESOURCE1, ..., RESOURCEn },
+ RESOURCE1 < RESOURCE2
+ ...
+ RESOURCEn-1 < RESOURCEn
+
+and is a product of two resource properties applied for each
+subsequent pair of resources in linear fashion:
+
+. ORDERING
+ ORDERING(RESOURCE1, RESOURCE2, STRONG)
+ ...
+ ORDERING(RESOURCEn-1, RESOURCEn, STRONG)
+
+. COOCCURRENCE
+ COOCCURRENCE(RESOURCE1, RESOURCE2, POSITIVE)
+ ...
+ COOCCURRENCE(RESOURCEn-1, RESOURCEn, POSITIVE)
+
+As the set is ordered, let's introduce two shortcut functions:
+
+. BEFORE(GROUP, RESOURCE) -> { R | R in GROUP, R < RESOURCE }
+. AFTER(GROUP, RESOURCE) -> { R | R in GROUP, R > RESOURCE }
+
+
+
+Group member vs. rest of group properties
+=========================================
+
+Generally, a relation expressed by a predicate PROPERTY(GROUP, RESOURCE),
+assuming RESOURCE is in GROUP, implies modification of the behavior of
+the cluster wrt. the group-resource pair:
+
+PROPERTY(GROUP, RESOURCE) -> ALTER(BEFORE(GROUP, RESOURCE))
+
+
+Independence between failing resource and its group predecessors
+----------------------------------------------------------------
+
+FAILURE-ISOLATION ::= FAILURE-ISOLATION(GROUP, RESOURCE, NONE)
+ | FAILURE-ISOLATION(GROUP, RESOURCE, TRY-RESTART)
+ | FAILURE-ISOLATION(GROUP, RESOURCE, STOP)
+. FAILURE-ISOLATION(GROUP, RESOURCE, NONE) ... RESOURCE failure leads to
+ recovery of the whole group
+. FAILURE-ISOLATION(GROUP, RESOURCE, TRY-RESTART)
+ ... RESOURCE failure leads to
+ (bounded) local restarts
+ of RESOURCE and its successors
+ (AFTER(GROUP, RESOURCE)) first
+. FAILURE-ISOLATION(GROUP, RESOURCE, STOP) ... RESOURCE failure leads to
+ stopping and disabling
+ of RESOURCE and its successors
+ (AFTER(GROUP, RESOURCE))
+
+R: driven by `__independent_subtree` property of RESOURCE within GROUP
+
+P: in part, driven by `on-fail` property of `monitor` and `stop` operations
+ for RESOURCE
+
+FAILURE-ISOLATION(GROUP, RESOURCE, NONE) [1. recover the group]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+R: default, no need for that, otherwise specifying `@__independent_subtree`
+ as `0` for RESOURCE within GROUP
+
+P: specifying `migration-threshold` as `1` (+default `on-fail` values)
+ for RESOURCE, but only if original recovery policy was `relocate`,
+ so better not to do anything otherwise???
+
+
+FAILURE-ISOLATION(GROUP, RESOURCE, TRY-RESTART) [2. begin with local restarts]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+R: specifying `@__independent_subtree` as `1` or `yes`
+ + `@__max_restarts` and `@__restart_expire_time`
+
+P: specifying `migration-threshold` as a value between 2 and INFINITY
+ (inclusive) (+default `on-fail` values) for RESOURCE, but only if
+ original recovery policy was `relocate`, so better not to do anything
+ otherwise???
+
+FAILURE-ISOLATION(GROUP, RESOURCE, STOP) [3. disable unconditionally]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+R: specifying `@__independent_subtree` as `2` or `non-critical`
+
+P: default `on-fail` values, modulo `ignore` for the `monitor` (or `status`)
+ operation and `stop` for the `stop` operation, for RESOURCE ???
+
+
+
+Other group properties
+======================
+
+Recovery policy group property
+------------------------------
+
+RECOVERY ::= RECOVERY(GROUP, RESTART-ONLY)
+ | RECOVERY(GROUP, RESTART-UNTIL1, MAX-RESTARTS)
+ | RECOVERY(GROUP, RESTART-UNTIL2, MAX-RESTARTS, EXPIRE-TIME)
+ | RECOVERY(GROUP, RELOCATE)
+ | RECOVERY(GROUP, DISABLE)
+. RECOVERY(GROUP, RESTART-ONLY) ... "attempt to restart in place", unlimited
+. RECOVERY(GROUP, RESTART-UNTIL1, MAX-RESTARTS)
+ ... ditto, but after MAX-RESTARTS attempts
+ (for the whole period of group-node
+ assignment) attempt to relocate
+. RECOVERY(GROUP, RESTART-UNTIL2, MAX-RESTARTS, EXPIRE-TIME)
+ ... ditto, but after MAX-RESTARTS attempts
+ accumulated within EXPIRE-TIME window,
+ attempt to relocate
+. RECOVERY(GROUP, RELOCATE) ... move to another node
+. RECOVERY(GROUP, DISABLE) ... do not attempt anything, stop
+
+R: driven by `/cluster/rm/(service|vm)/@recovery`
+
+P: driven by OCF RA return code and/or `migration-threshold`
+
+RECOVERY(GROUP, RESTART-ONLY) [1. restart in place, unlimited]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+R: default, no need for that, otherwise specifying `@recovery` as `restart`
+ (and specifying neither `@max_restarts` nor `@restart_expire_time`,
+ or keeping `@max_restarts` at zero!)
+
+P: default, no need for that, otherwise specifying `migration-threshold`
+ as `INFINITY` (or zero?; can be overridden by OCF RA return code, anyway?)
+
+RECOVERY(GROUP, RESTART-UNTIL1, MAX-RESTARTS) [2. restart + absolute limit]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+R: driven by specifying `@max_restarts` as `MAX-RESTARTS` (value, non-positive
+ number boils down to case 1.)
+ - and, optionally, specifying `@recovery` as `restart` (or not at all!)
+
+P: driven by specifying `migration-threshold` as `MAX-RESTARTS` (value,
+ presumably non-negative, `INFINITY` or zero? boil down to case 1.)
+ (but can be overridden by OCF RA return code, anyway?)
+
+[3. restart + relative limit for number of restarts/period]
+RECOVERY(GROUP, RESTART-UNTIL2, MAX-RESTARTS, EXPIRE-TIME)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+R: driven by specifying `@max_restarts` as `MAX-RESTARTS` (value, non-positive
+ number boils down to case 1.) and `@restart_expire_time`
+ as `EXPIRE-TIME` (value, negative after expansion boils down to the
+ case 1., zero to case 2.)
+ - and, optionally, specifying `@recovery` as `restart` (or not at all!)
+
+P: driven by specifying `migration-threshold` as `MAX-RESTARTS` (value,
+ presumably non-negative, `INFINITY` or zero? boil down to case 1.) and
+ `failure-timeout` as `EXPIRE-TIME` (value, presumably positive, zero
+ boils down to case 2.)
+ (but can be overridden by OCF RA return code, anyway?)
+
+RECOVERY(GROUP, RELOCATE) [4. move to another node]
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+R: driven by specifying `@recovery` as `relocate`
+
+P: driven by specifying `migration-threshold` as 1
+ (or possibly negative number?; regardless of `failure-timeout`)
+ (but can be overridden by OCF RA return code, anyway?)
+
+RECOVERY(GROUP, DISABLE) [5. no more attempts]
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+R: driven by specifying `@recovery` as `disable`
+
+P: can only be achieved in case of AFFINITY(GROUP, NODE, FALSE)
+ for all nodes except one and specifying `migration-threshold`
+ as `1` because upon single failure, remaining
+ AFFINITY(RESOURCE, NODE, FALSE) rule for yet-enabled NODE will
+ be added, effectively preventing RESOURCE from running anywhere
+
+
+Is-enabled group property
+-------------------------
+
+ENABLED ::= ENABLED(GROUP, TRUE)
+ | ENABLED(GROUP, FALSE)
+. ENABLED(GROUP, TRUE) ... group is enabled (default assumption)
+. ENABLED(GROUP, FALSE) ... group is disabled
+
+notes
+. see also 01/cluster: FUNCTION
+
+R: except for static disabling of everything (RGManager avoidance),
+ can be partially driven by `/cluster/rm/(service|vm)/@autostart`
+ and/or run-time modification using `clusvcadm`
+ (or at least it is close???)
+
+P: via `target-role` (or possibly `is-managed`) meta-attribute [1]
+
+ENABLED(GROUP, TRUE) [1. group is enabled]
+~~~~~~~~~~~~~~~~~~~~
+
+R: (partially) driven by specifying `@autostart` as non-zero
+ (has to be sequence of digits for sure, though!)
+ - default, no need for that
+ # clusvcadm -U GROUP <-- whole service/vm only
+
+P: default, no need for that, otherwise specifying `target-role` as `Started`
+ (or possibly `is-managed` as `true`)
+ # pcs resource enable GROUP
+ # pcs resource meta GROUP target-role=
+ # pcs resource meta GROUP target-role=Started
+ or
+ # pcs resource manage GROUP
+ # pcs resource meta GROUP is-managed=
+ # pcs resource meta GROUP is-managed=true
+
+ENABLED(GROUP, FALSE) [2. group is disabled]
+~~~~~~~~~~~~~~~~~~~~~
+
+R: (partially?) driven by specifying `@autostart` as `0` (or `no`)
+ # clusvcadm -Z GROUP <-- whole service/vm only
+
+P: # pcs resource disable GROUP
+ # pcs resource meta GROUP target-role=Stopped
+ or
+ # pcs resource unmanage GROUP
+ # pcs resource meta GROUP is-managed=false
+
+
+
+References
+==========
+
+: vim: set ft=rst: <-- not exactly, but better than nothing
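
The BEFORE/AFTER shortcuts and the RECOVERY mapping described in the
document above can be sketched in Python. This is only an illustration
under stated assumptions: the function names (`before`, `after`,
`recovery_variant`, `pacemaker_meta`) and the list-of-names representation
of a group are hypothetical, the points the text itself leaves open (zero
or negative `migration-threshold`, OCF RA return codes overriding the
policy) are not modelled, and time values are passed through without any
unit conversion.

```python
from typing import Dict, List, Optional


def before(group: List[str], resource: str) -> List[str]:
    """BEFORE(GROUP, RESOURCE): members ordered before RESOURCE."""
    return group[:group.index(resource)]


def after(group: List[str], resource: str) -> List[str]:
    """AFTER(GROUP, RESOURCE): members ordered after RESOURCE."""
    return group[group.index(resource) + 1:]


def recovery_variant(recovery: Optional[str] = None,
                     max_restarts: int = 0,
                     restart_expire_time: int = 0) -> str:
    """Classify RGManager attributes into a RECOVERY variant (hypothetical helper)."""
    if recovery == "relocate":
        return "RELOCATE"
    if recovery == "disable":
        return "DISABLE"
    if recovery not in (None, "restart"):
        raise ValueError("unknown recovery policy: %r" % (recovery,))
    # `restart` or unspecified: the limits decide which restart case applies.
    if max_restarts <= 0 or restart_expire_time < 0:
        return "RESTART-ONLY"      # boils down to case 1
    if restart_expire_time == 0:
        return "RESTART-UNTIL1"    # boils down to case 2
    return "RESTART-UNTIL2"        # case 3


def pacemaker_meta(variant: str,
                   max_restarts: int = 0,
                   expire_time: int = 0) -> Dict[str, str]:
    """Meta-attributes the text suggests for each variant (assumed mapping)."""
    if variant == "RESTART-ONLY":
        return {"migration-threshold": "INFINITY"}
    if variant == "RESTART-UNTIL1":
        return {"migration-threshold": str(max_restarts)}
    if variant == "RESTART-UNTIL2":
        return {"migration-threshold": str(max_restarts),
                "failure-timeout": str(expire_time)}
    if variant == "RELOCATE":
        return {"migration-threshold": "1"}
    # DISABLE has no plain meta-attribute equivalent in the text; it needs
    # additional AFFINITY constraints, which this sketch does not generate.
    raise ValueError("no direct meta-attribute mapping for " + variant)
```

For instance, `pacemaker_meta(recovery_variant("restart", 3, 300), 3, 300)`
yields both `migration-threshold` and `failure-timeout`, while a zero
expire time drops the latter. DISABLE deliberately raises an error, since
the text maps it to node affinity rules rather than to meta-attributes.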