Runner refactoring
Closed · Public

Authored by lbrabec on Sep 2 2015, 11:46 AM.

Details

Summary

Huge refactoring of runner and logic around it. See T603 for list.
Work in progress, tests are broken, docs are incorrect. Open for discussion.

Test Plan

manual testing

Diff Detail

Repository
rLTRN libtaskotron
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
In D550#10812, @tflink wrote:

So the lesson for me to take away from this is either not to take sick days or not believe you when you tell me that "it's not important" :-P

I meant it. Moving code around and renaming classes is not an important change in my eyes.

  1. we feel that the current approach of Runner stealthily using either LocalRunner or RemoteRunner in the background, without telling you which one and giving you means to access it, is unfortunate.

Isn't that how OOP and inheritance is supposed to work?

Let's take Java for example. If you use a factory class, you receive an instance adhering to a particular interface. The same when you create it manually, you can create an ArrayList or a LinkedList but work with it as a List if you desire so. But that's not how Runner works and what I'd like to change. It doesn't return an instance, it's a facade for two very different objects, just similarly named (that's why I wanted to change their names at the same time) and I don't think that design is too useful.

  • It's very likely that we will have some specific functionality that is related only to one type of execution, but not the other. For that reason we need to allow people to access the runner instances directly, and not through some generic interface.

This doesn't make sense to me - can you elaborate on this a bit more? Who is "people" in this situation? Why would "people" need access to the runner?

The first problem I encountered was with the exitcode attribute. It was originally defined only on LocalRunner. I wanted to make use of it only when running locally, but I had no means to do so, because Runner is a facade and gives me no access to the LocalRunner instance.

Later we realized we can make exitcode work also for remote executions, and so implemented it in both LocalRunner and RemoteRunner and made it accessible through Runner. So this argument is moot. But the original idea remains - what if I need to access functionality specific for just one of those runners, what then? The work they do is so vastly different that it's very likely that we'll hit this exact problem shortly in the future.

As another example, let's say I want to skip installing libtaskotron and its deps in certain cases (we would have a cmdline switch for that). That's currently in RemoteRunner.prepare_task(). I can pull that into a separate method and then either run it or not, but how do I expose it? It has no counterpart in LocalRunner, so will I implement a fake method in LocalRunner that performs nothing, just for the sake of being able to have Runner.install_deps(bool) method that will work universally with both objects? A much better approach (IMO) would be to work with the particular instance (LocalRunner or RemoteRunner) instead of the facade (Runner) and then I can easily decide whether the current situation applies to the cmdline argument or not.
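A minimal sketch of that last approach, with hypothetical class bodies (the real LocalRunner and RemoteRunner are not shown in this review, and `get_runner`, `install_deps` and `execute` are illustrative names only):

```python
class LocalRunner:
    def run(self):
        return "ran locally"


class RemoteRunner:
    def install_deps(self):
        # Only meaningful on a remote machine; LocalRunner has no counterpart.
        return "installed libtaskotron on remote"

    def run(self):
        return "ran remotely"


def get_runner(remote):
    """Return the concrete runner instead of hiding it behind a facade."""
    return RemoteRunner() if remote else LocalRunner()


def execute(remote, skip_install=False):
    runner = get_runner(remote)
    steps = []
    # The caller knows which runner it holds, so a runner-specific step
    # needs no fake no-op method on the other class.
    if isinstance(runner, RemoteRunner) and not skip_install:
        steps.append(runner.install_deps())
    steps.append(runner.run())
    return steps
```

With the concrete instance in hand, the cmdline switch simply does not apply to local runs, and nothing has to be stubbed out in LocalRunner.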

  • LocalRunner and RemoteRunner don't really share too much similar functionality, so it doesn't seem helpful to use inheritance there.

Fair enough.

Now I realize that both my points were trying to say the same thing.

I'm not crazy about labeling the two remote modes as "disposable" and "ssh" because they both use ssh but one uses an existing VM instead of spawning a new one. That being said, I don't have any better ideas for names ATM.

That's true. Ideas welcome.

We tried to come up with some suggestions describing what this machine/code really does - deploys the task somewhere and oversees the execution. So we had ideas like "distributor", "arbiter", "governor", "controller", or, if you want to go funky, "overlord" (and then use "minion" instead of "executer"). I wanted to consult these naming options with you, but since you were not available, we picked "governor", because it seemed to fit best (even though it's a freaking tongue twister for us, or at least for me:-)). We again decided to rename DisposableRunner and SSHRunner into DisposableGovernor and SSHGovernor, possibly inheriting from RemoteGovernor, so that we have all the terminology in sync.

Does the multi-host use case (especially for the cloud tests) work well with this terminology? If we're going to change things around like this, I think that either "arbiter" or "overlord/minion" is closer to the abstraction that we'll end up hitting. Other alternatives that come to mind are "foreperson", "director" or possibly "overseer".

I have to admit we haven't thought hard about multi-host and cloud use cases, thinking something like "the task authors can call it whatever they want, that's their problem". But it's true we will have to document it somehow. "Governor" still looks fine to me, but if you think other words sound better, let's go with them. Or we can still call it "RemoteRunner" and not rename the classes if people like it. My main drive was to have our terminology and the matching classes in code consistent, so that it's easier to navigate in the code and talk in the tickets. Which of the terms do you like best?
(Btw, can we use graphics like these if we end up calling it overlord/minion? :) I suppose not. Damned copyright.)

In T603#7850, @tflink wrote:
  • Executer is arguably not a word and looks strange to me as a native English speaker - Executor may be better

I searched for executor vs executer and gathered that the first one is a legal profession and the second one is the fellow who beheads you. So we went with the second one :) You're the native speaker here, let's go with /your/ gut feeling :)

  • I think it's too early to remove vm.Image since we don't have logic to find images yet - the code that was proposed from D528 may fit well there

I assumed it will be added back once we have something to use it for. If that's imminent, great, I don't mind either way.

As far as terminology goes, I'm not a huge fan of Governor. It has implications of limiting functionality in my mind and sounds weird in this situation. Would keeping the governor bits as runner and keeping the executor bits separate still seem confusing?

I'd like to make our class names match our terminology (the document I was talking about is here), to reduce the number of names that are floating around, that's all. Do you think it's better to have "runners" in code but talk about them as "initiators" (or some other term) everywhere else?

In D550#10874, @kparal wrote:
In D550#10812, @tflink wrote:

So the lesson for me to take away from this is either not to take sick days or not believe you when you tell me that "it's not important" :-P

I meant it. Moving code around and renaming classes is not an important change in my eyes.

Ah, the limitations of text-only communication - that was supposed to be a joke.

  1. we feel that the current approach of Runner stealthily using either LocalRunner or RemoteRunner in the background, without telling you which one and giving you means to access it, is unfortunate.

Snipping some of this for the sake of brevity. I understand what you were getting at now and the new split makes sense.

I'm not crazy about labeling the two remote modes as "disposable" and "ssh" because they both use ssh but one uses an existing VM instead of spawning a new one. That being said, I don't have any better ideas for names ATM.

That's true. Ideas welcome.

Going with the idea below, what about using 'disposable' and 'persistent'? The idea of the SSHGovernor is that it's using a pre-existing remote machine to avoid the overhead of spawning disposable clients.

We tried to come up with some suggestions describing what this machine/code really does - deploys the task somewhere and oversees the execution. So we had ideas like "distributor", "arbiter", "governor", "controller", or, if you want to go funky, "overlord" (and then use "minion" instead of "executer"). I wanted to consult these naming options with you, but since you were not available, we picked "governor", because it seemed to fit best (even though it's a freaking tongue twister for us, or at least for me:-)). We again decided to rename DisposableRunner and SSHRunner into DisposableGovernor and SSHGovernor, possibly inheriting from RemoteGovernor, so that we have all the terminology in sync.

Does the multi-host use case (especially for the cloud tests) work well with this terminology? If we're going to change things around like this, I think that either "arbiter" or "overlord/minion" is closer to the abstraction that we'll end up hitting. Other alternatives that come to mind are "foreperson", "director" or possibly "overseer".

I have to admit we haven't thought hard about multi-host and cloud use cases, thinking something like "the task authors can call it whatever they want, that's their problem". But it's true we will have to document it somehow. "Governor" still looks fine to me, but if you think other words sound better, let's go with them. Or we can still call it "RemoteRunner" and not rename the classes if people like it. My main drive was to have our terminology and the matching classes in code consistent, so that it's easier to navigate in the code and talk in the tickets. Which of the terms do you like best?

If I'm the only one who doesn't care for 'Governor' then it may not make sense to change it.

If we assume that we'll eventually support multiple client spawning (cloud or desktop use case), I'm thinking it would make sense to name for that kind of scenario.

I like 'overlord' and 'minion' where the 'overlord' is responsible for spinning up and coordinating between the different 'minion' instances (disposableMinion, persistentMinion etc.) but we could also use 'director' and 'subordinate' or 'minion' if we wanted to be more serious.

(Btw, can we use graphics like these if we end up calling it overlord/minion? :) I suppose not. Damned copyright.)

And here I thought we were talking about Starcraft :)

In T603#7850, @tflink wrote:
  • Executer is arguably not a word and looks strange to me as a native English speaker - Executor may be better

I searched for executor vs executer and gathered that the first one is a legal profession and the second one is the fellow who beheads you. So we went with the second one :) You're the native speaker here, let's go with /your/ gut feeling :)

From what I read, the legal "executor" is definitely spelled that way but that spelling can also be used for any other meaning of the word. "executer" means anything that "executor" does other than the legal "executor".

Then again, other things suggest avoiding the term entirely :)

At the risk of sounding a bit like a broken record, we could use "runner" for this part and change our docs to be of the form: "the overlord spawns minions, one of which is responsible for running the core bits of the task"

  • I think it's too early to remove vm.Image since we don't have logic to find images yet - the code that was proposed from D528 may fit well there

I assumed it will be added back once we have something to use it for. If that's imminent, great, I don't mind either way.

Either way works, D528 seems to have stalled.

As far as terminology goes, I'm not a huge fan of Governor. It has implications of limiting functionality in my mind and sounds weird in this situation. Would keeping the governor bits as runner and keeping the executor bits separate still seem confusing?

I'd like to make our class names match our terminology (the document I was talking about is here), to reduce the number of names that are floating around, that's all. Do you think it's better to have "runners" in code but talk about them as "initiators" (or some other term) everywhere else?

I really should have responded in one place instead of doing the ticket first - still not a huge fan of Governor but leaving the runner terminology alone like that doesn't make sense.

Since this conversation has gotten pretty long, I'll condense my counter-proposal-ish-thing down a bit (all of which is up for discussion):

  • Instead of main, have an overlord which is responsible for coordinating the process of task running/execution from beginning to end after args are parsed etc.
  • Change the governor classes to be: disposableMinion, persistentMinion (current SSHGovernor) and leave the door open for cloudMinion
  • Possibly change the executer to runner so that we're using a common term that's unlikely to get "huh?" reactions from other people. This runner would be limited to the local execution of directives in the formula

We could substitute other terms for different levels of "make serious" - director/peon, conductor/player etc.

Or we could just leave terms and the structure of this diff as-is. Thoughts?

After discussion with Tim yesterday, here's the current plan:

  1. rename executer to executor (module and class)
  2. rename governor to minion (module and classes)
  3. use "persistent" instead of "ssh", so DisposableMinion, PersistentMinion, Minion
  4. create overlord.py with Overlord class, put there method(s) to start the whole execution, distinguish between local and remote execution, return either the proper minion or the executor class, operate those classes.
  5. main.py should keep only argument parsing, basic initialization (logging), and running:
overlord = Overlord(args)
overlord.start()
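
A rough sketch of how points 3 to 5 could fit together. The class bodies are hypothetical stubs, and the `"mode"` key in `args`, the `_RUNNERS` mapping and the `execute()` method name are assumptions made for this illustration, not the code in the diff:

```python
class Executor:
    """Runs the task on the local machine (stub)."""

    def __init__(self, args):
        self.args = args

    def execute(self):
        return "local"


class Minion:
    """Common base for remote execution (stub)."""

    def __init__(self, args):
        self.args = args

    def execute(self):
        raise NotImplementedError


class DisposableMinion(Minion):
    def execute(self):
        return "disposable"


class PersistentMinion(Minion):
    def execute(self):
        return "persistent"


class Overlord:
    """Decides how the task runs and operates the chosen class."""

    # Hypothetical mapping from a run mode name to the class handling it.
    _RUNNERS = {
        "local": Executor,
        "disposable": DisposableMinion,
        "persistent": PersistentMinion,
    }

    def __init__(self, args):
        self.args = args

    def _get_runner(self):
        mode = self.args.get("mode", "local")
        if mode not in self._RUNNERS:
            raise ValueError("unknown run mode: %s" % mode)
        return self._RUNNERS[mode](self.args)

    def start(self):
        return self._get_runner().execute()
```

Keeping the decision in one private method means main.py never needs to know which class ends up doing the work.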
lbrabec updated this revision to Diff 1510. Sep 15 2015, 10:12 AM
  • changes in names, further refactoring

If there are no further concerns, I think we can go forward with this and provide a full patch (including tests, polishing, etc).

libtaskotron/main.py
215–218

I'd leave this inside main().

libtaskotron/minion.py
40–41

The second sentence doesn't make sense now, because we'll use Minion just for remote execution.

libtaskotron/overlord.py
35

Hmm, it might get tricky to rename this method... _get_executor_or_minion()? That's probably not better than it is. :-)

Would it be easier to do it like this in start()?

if self._local_execution():
    executor = self._get_executor()
    executor.run()
    # do some things potentially specific for local execution
else:
    minion = self._get_minion()
    minion.run()
    # do some things potentially specific for remote execution

Probably doesn't matter as long as we don't have the specific bits for particular execution types.

39

We want to name this "persistent" (or persistent something) instead.

jskladan requested changes to this revision. Sep 15 2015, 12:27 PM
jskladan added inline comments.
libtaskotron/executor.py
25

How about moving the methods around so the order better reflects the call order? At the moment I really need to scroll up'n'down rather too much...

I propose:

__init__
execute
_prepare_task
_run
_validate_input
_do_actions
_do_single_action
_extract_directive_from_action
_render_action
_load_directive
_validate_env
libtaskotron/main.py
66–72

+1

98–100

Why exactly are we doing this? IIUIC it transforms {item: foo, type: koji_build} to {item: foo, type: koji_build, koji_build: foo}, but why?

libtaskotron/overlord.py
35

This would make sense, but IMHO having all the bits of the decision process in one place makes sense right now. If we come to a place where we want to do some things potentially specific for $$$ execution, then refactoring start() to something like:

if self._decide_execution_mode() == 'local':
    executor = self._get_executor()
    executor.run()
else:
    minion = self._get_minion()
    minion.run()

might be worth it, but I would not overcomplicate it right now.

40

?

This does not really make sense - you parse&set the stuff in main.py and then just plain trash it here? Also PersistentMinion really wants these IIUIC...

This revision now requires changes to proceed. Sep 15 2015, 12:27 PM
kparal added inline comments. Sep 15 2015, 12:56 PM
libtaskotron/main.py
98–100

To be able to use ${koji_build} in the formula. But it's a good question - is there some benefit over just using ${item}? It looks better, but it also complicates things a bit.

Either way, the discussion probably should not be part of this ticket, but a different one.
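
For reference, the transformation discussed for lines 98–100 can be sketched as below. This is an illustration only: `add_type_alias` is a hypothetical helper name, the sample item value is made up, and `string.Template` stands in for the formula rendering, which is not shown in this review.

```python
from string import Template


def add_type_alias(arg_data):
    """Return a copy of arg_data with the item also stored under its type name."""
    data = dict(arg_data)
    data[data["type"]] = data["item"]
    return data


# {item: foo, type: koji_build} -> {item: foo, type: koji_build, koji_build: foo}
arg_data = add_type_alias({"item": "foo-1.0-1.fc23", "type": "koji_build"})

# Both spellings now resolve to the same value in a formula line.
by_type = Template("rpmlint ${koji_build}").substitute(arg_data)
by_item = Template("rpmlint ${item}").substitute(arg_data)
```

The alias only buys readability in the formula; `${item}` already carries the same value.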

libtaskotron/overlord.py
40

Good catch, that seems to be an oversight when moving the code.

It's a good start - I think that @kparal and @jskladan caught most of the inconsistencies.

libtaskotron/main.py
15

I assume that the unused imports will be cleaned up in a later revision

libtaskotron/minion.py
20

It seems a bit odd to me that a base class is marked as protected. Wouldn't calling it BaseMinion and leaving some comments about how it shouldn't be used alone be enough?

40–41

Is that really how we want to approach this? I know that part of this was to break apart what was the runner but I like the conceptual cleanliness of keeping to the overlord/minion model as much as we can and I think that it'd be wise to keep the difference between overlord and minion clear.

The LocalMinion doesn't have to inherit from the same base class as the remote minions do if that doesn't make sense (they can inherit from RemoteMinion in that case) but I do think that we'd be wise to keep concepts clear unless there's a very good reason to diverge from that.

libtaskotron/overlord.py
35

I really don't think putting this into start() is a good idea. It would make things more simple in the short term but I fear it's a bit short-sighted. How will cloud or desktop task development work if the overlord is short-circuited like this? Wouldn't that mean that you would have to use disposable clients unless all execution happened locally and no target resource was needed?

I'd rather see the local decision passed into the overlord and delegated to a local minion that has little code in it and mostly calls an executor, so that the division of responsibility between overlord and minion is crystal clear: the overlord coordinates and oversees the minions, which do the actual work.

I don't want to needlessly add complexity but I'd really rather see a similar execution model for all 3 types that we support right now unless there's a really good reason to limit that flexibility and muddy the distinction between minion and overlord.
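
To make the proposal concrete, here is a rough sketch of the uniform model. All class bodies are hypothetical stubs rather than the code in this diff, and `RemoteMinion` taking a `host` argument is an assumption made for the example:

```python
class Executor:
    """Runs the directives of a task on the machine it lives on (stub)."""

    def execute(self):
        return "task executed"


class LocalMinion:
    """Thin wrapper: no remote setup, it only hands off to the executor."""

    def __init__(self, executor):
        self.executor = executor

    def execute(self):
        return self.executor.execute()


class RemoteMinion:
    """Stands in for the disposable and persistent minions (stub)."""

    def __init__(self, host):
        self.host = host

    def execute(self):
        return "task executed on %s" % self.host


class Overlord:
    """Coordinates minions and never runs task code itself."""

    def __init__(self, minions):
        self.minions = list(minions)

    def start(self):
        # Every execution type goes through the same call, so adding a
        # cloud or multi-host minion later does not change this method.
        return [minion.execute() for minion in self.minions]
```

The cost is one nearly empty class; the gain is that the overlord treats local and remote execution identically.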

Can we keep this moving or are we waiting for more comments? @kparal was gone last week and now @jskladan is gone this week.

I really want to keep this moving so other stuff can happen in disposable-develop and it gets merged back into develop. If we're waiting on comments from josef, I propose that we move forward on this diff and can go back to fix things later if he objects when he gets back.

lbrabec updated this revision to Diff 1570. Sep 30 2015, 11:21 AM
  • code shuffle and some docs
tflink requested changes to this revision. Sep 30 2015, 2:43 PM

I notice that the questions/concerns about the logic in Overlord's constructor and the remote-only minions have been neither discussed further nor addressed - what's the plan for this? Just leave it as-is for now and address later? If so, I'd like to see a ticket about that so it doesn't slip through the cracks.

libtaskotron/overlord.py
46

incomplete docs

This revision now requires changes to proceed. Sep 30 2015, 2:43 PM
kparal added a comment. Oct 1 2015, 3:02 PM

Here are responses to a few past remarks. I'll try to provide a more complete code review tomorrow.

libtaskotron/minion.py
40–41

My idea was that Overlord figures out what's going to happen (local or remote execution and which kind) and either runs Executor or a particular Minion. We can rename Executor to LocalMinion if you think it sounds better. I'm not sure what would be the benefit of Overlord -> LocalMinion -> Executor, when LocalMinion would be just an empty class running Executor and doing nothing else. It seems like too much abstraction? Or is LocalMinion supposed to do something more?

libtaskotron/overlord.py
35

No objection, let's keep all the decision process in _get_runner(), it's encapsulated and we can test it easily. I wouldn't move this method outside of Overlord, I think it makes sense that Overlord makes this kind of decision, and I don't know where to better put it.

lbrabec updated this revision to Diff 1578. Oct 2 2015, 10:11 AM
  • docs, quickfix of tests
kparal added inline comments. Oct 2 2015, 12:53 PM
libtaskotron/overlord.py
49–50

A few typos. What about:

:returns: either :class:`.Executor` (local run mode) or :class:`.PersistentMinion` or :class:`.DisposableMinion` (remote run mode)

I haven't tried to render this, so I'm not sure the hyperlinks will work as it is. But something like this :)

testing/functest_main.py
20

Let's make this runtask instead of main.py to make it less confusing?

lbrabec updated this revision to Diff 1583. Oct 2 2015, 1:34 PM
  • rebase
jskladan accepted this revision. Oct 2 2015, 6:04 PM

Fine with me...

lbrabec updated this revision to Diff 1588. Oct 5 2015, 8:55 AM
  • changes in test names so it reflects the refactored code
tflink accepted this revision. Oct 5 2015, 6:37 PM
This revision is now accepted and ready to land. Oct 5 2015, 6:37 PM
kparal requested changes to this revision. Oct 6 2015, 1:08 PM

I have added quite a few comments, but most of them should be completely trivial, so it just looks like I'm an old annoying nitpicker who is never satisfied!

  • library.rst is missing the new modules, so they won't show up in the docs. Please add them (I noticed it's also missing some additional recent modules, please add them all and then mock relevant authors, munchkin style).
libtaskotron/executor.py
26–27

You can hyperlink Overlord and Minion.

35

This is not an input parameter. But you can keep the documentation for the internal attributes, that's always good. It just probably needs a bit different markup. Look into different classes for inspiration.

44

This looks weird. Does this really work? arg_data['uuid'] doesn't contain a proper path, I would think. Should this have been arg_data['artifactsdir']?

49

Typo.

83–84

This is now in process_args(). No need to duplicate it here, I think.

libtaskotron/minion.py
21–23

Let's document the requirements you need to do in order to use this in one of the subclasses. You need to set self.ssh. Anything else?

Let's put this documentation either to constructor or to relevant methods, whatever makes more sense.

31

Again, this is not really an input param.

38–40

This is no longer needed, the decision is in Overlord.

109

RemoteRunner is no longer a thing.

130

Typo 2x.

135

Again RemoteRunner reference.

libtaskotron/overlord.py
22–24

Maybe add ... and orchestrates the execution :-)

23

Typo.

34

Let's document this one, thanks. Data type is quite obvious. What happens when it is None at the end? Does it have any effect without the exitcode directive?

If you already have it documented elsewhere (i.e. Executor) and the behavior is the same, you can hyperlink it from one place to a different place, no need to duplicate.

42

appropriate

44–45

I've found out that private methods are not rendered in the docs. Which makes sense, it just means we won't make use of our lovely hyperlinking (unless someone has a code editor which is able to use that, my Spyder doesn't).

So next time if you feel lazy, the markup is not that important for private methods, just for public methods (feel free to remind me if I complain next time). Of course the more markup the merrier, this is not to discourage anyone :)

Btw I have tested it and the hyperlinking works well, the format is correct.

47–50

The #: sphinx markup does not have any effect here, that works just for class attributes.

86

Typo.

98–101

This should go into main. I presume artifactsdir doesn't change, so you don't need to pass anything back.

testing/test_executor.py
21

Please grep for exectutor in the whole codebase and fix all instances :-)

401

Please move this into test_main.py and rename accordingly. It doesn't seem to be related to executor.py.

testing/test_overlord.py
132

This also seems to be related to test_main.py and not to Overlord.

This revision now requires changes to proceed. Oct 6 2015, 1:08 PM
jskladan added inline comments. Oct 7 2015, 10:04 AM
libtaskotron/minion.py
130

awww, we should have called them mini-onions instead... :)

lbrabec updated this revision to Diff 1595. Oct 7 2015, 1:15 PM
lbrabec marked 20 inline comments as done.
  • polishing
kparal accepted this revision. Oct 8 2015, 9:35 AM

Two minor things, please fix and commit. Thanks!

libtaskotron/minion.py
35

Let's add #:.

105

It's probably better to raise an instance and not the class itself.
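
A small illustration of the difference. `MinionError` is a hypothetical exception name used only here, not necessarily the one raised on that line:

```python
class MinionError(Exception):
    """Hypothetical exception type, used only for this illustration."""


def fail_with_class():
    # Legal: Python instantiates the class with no arguments,
    # so the resulting error carries no message.
    raise MinionError


def fail_with_instance():
    # Preferred: the instance can carry a message explaining what went wrong.
    raise MinionError("ssh connection to the remote machine failed")


def message_of(func):
    """Call func and return the text of the MinionError it raises."""
    try:
        func()
    except MinionError as exc:
        return str(exc)
    return None
```

Both forms raise the same exception type; only the instance form gives whoever reads the log something to go on.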

This revision is now accepted and ready to land. Oct 8 2015, 9:35 AM
kparal added inline comments. Oct 8 2015, 9:39 AM
docs/source/library.rst
1–3

Oh, and please keep this library in alphabetical order and add all missing modules, not just yours, thanks.

Closed by commit rLTRNbf89e0f7e160: Runner refactoring (authored by Lukas Brabec <lbrabec@redhat.com>). Oct 8 2015, 9:55 AM
This revision was automatically updated to reflect the committed changes.