*WIP* More hacky support, but for containers
Needs RevisionPublic

Authored by roshi on May 15 2017, 6:54 PM.

Details

Summary

This patch builds off of Martin's ansible tasks commit (rLTRN2b720ca61cb8).
I've added a Dockerfile in the 'libtaskotron/ext/' directory as
well as a module to 'libtaskotron/ext/disposable' for dealing with
Docker containers.

This expects a docker daemon to be running on the host (so is akin
to --libvirt), and will build the 'taskotron-worker' image if it's
not already found on the system.

Testing this is very similar to testing the patch Martin wrote,
except you use the '--docker' flag during invocation.

To test:
$ git clone https://github.com/stefwalter/gzip-dist-git
$ sudo runtask -d -i gzip -t koji_build -a x86_64 --docker gzip-dist-git/tests/test_rpm.yml

Test Plan
  • Run the above playbook

Diff Detail

Repository
rLTRN libtaskotron
Branch
docker-tasks (branched from develop)
Lint
Lint WarningsExcuse: Error with linter.
SeverityLocationCodeMessage
Warninglibtaskotron/executor.py:104E225flake8 E225
Unit
Unit Test Errors
Build Status
Buildable 1163
Build 1163: arc lint + arc unit

Unit TestsFailed

Excuse: WIP
TimeTest
1 mstesting.functest_logger.TestLogger.test_logfile_no_write
self = <testing.functest_logger.TestLogger instance at 0x7fcb325a9128> tmpdir = local('/tmp/pytest-of-root/pytest-6/test_logfile_no_write0')
0 mstesting.test_directive_modules.TestDirectives.test_directive_class_inherits_from_base
self = <testing.test_directive_modules.TestDirectives instance at 0x7fcb31ec6c68> def test_directive_class_inherits_from_base(self):
0 mstesting.test_directive_modules.TestDirectives.test_directive_class_process_method_exists
self = <testing.test_directive_modules.TestDirectives instance at 0x7fcb327c4170> def test_directive_class_process_method_exists(self):
1 mstesting.test_directive_modules.TestDirectives.test_directive_class_var_exists
self = <testing.test_directive_modules.TestDirectives instance at 0x7fcb3277fdd0> def test_directive_class_var_exists(self):
0 ms.testing.functest_executor
testing/functest_executor.py:8: in <module> from libtaskotron import executor libtaskotron/executor.py:17: in <module>
View Full Test Results (4 Failed · 20 Broken · 190 Passed)
roshi created this revision.May 15 2017, 6:54 PM

Looks good for a WIP. My concern here is, that with disposable minions, we do a thing where we use "the right fedora version", so e.g. fc24 packages are tested on F24 machine, and so on. I'd like to see the same for Docker - it can even be done quite easily. Not that it needs to happen for PoC, but I'd like at least a big fat "TODO/FIXME" somewhere in the code to remind you of that :)

libtaskotron/executor.py
18

Just a nitpick, but I'd rather see from libtaskotron.ext.disposable.docker import DockerClient and docker.DockerClient() respectively. Just nit-pick though.

62

Might be worth adding config.RuntaskModeName.DOCKER, so this can be used not only from the command line.

99

Y U No return a (ipaddr, port) tuple from the _spawn_container() directly?

libtaskotron/ext/disposable/docker.py
16

You seem to be using a lot of direct command line. I'd rather use docker-py [https://docker-py.readthedocs.io/en/stable/] - it does whatever the docker command can, and it is even packaged. Not a big deal, and it might even be overly complex for what you are doing here. But I though it worth mentioning.

libtaskotron/ext/docker/yumrepoinfo.conf
1 ↗(On Diff #3033)

Is duplicating yumrepoinfo.conf really necessary - I know this is WIP (and thus the "easy" method), but I'd much rather see the conf/yumrepoinfo.conf.example being used. This only adds another place where stuff can get out of sync.

roshi added a comment.May 16 2017, 1:02 PM

We can probably add some logic to manipulate the version of Fedora that goes into the container we run. I'll look into it.

Thanks for the feedback!

libtaskotron/executor.py
62

Yeah, that makes sense.

99

No real reason.

libtaskotron/ext/disposable/docker.py
16

I would rather use docker-python as well. However, there are a couple reasons I didn't use it.

First, the docker API is notorious for breaking things when they change it. Oddly enough, the CLI is about the most stable interface into docker. Second, the packaged versions we have in F24 and F25 are all on a much older version of the API. The API needs to be matched to the version the daemon is running on whatever host these containers will be running on. The 1.x series (that's packaged for 24 and 25) is no longer supported as they've moved onto 2.x - and 2.x is backwards incompatible with any code written for the 1.x.

I tried to write it in a fashion that if docker api get stabilized we can easily move to using the API without having to change much.

libtaskotron/ext/docker/yumrepoinfo.conf
1 ↗(On Diff #3033)

we'd at the very least have to have a symlink into the docker directory for docker build to work. However, my hope is that we can get these containers built and put into the fedora registry so we don't have to maintain them here.

kparal added a subscriber: kparal.May 16 2017, 2:14 PM
kparal added inline comments.
libtaskotron/executor.py
46

I don't really understand docker, but wouldn't it be better (and safer) to run on 127.0.0.1 rather than 0.0.0.0?

libtaskotron/ext/disposable/docker.py
16

Could we ping the package maintainer and ask her to update the package? I'd also feel better if we didn't have to parse the command line output.

19

Same question regarding default interface.

libtaskotron/ext/docker/yumrepoinfo.conf
1 ↗(On Diff #3033)

+1 jskladan. Duplicating these config files (not just this one, all of them) is going to be a maintenance hell. I don't know how exactly docker build process works, but it seems we could either copy in the example config files or the system installed config files dynamically?

roshi added inline comments.May 17 2017, 2:32 PM
libtaskotron/ext/disposable/docker.py
16

The API is tied to the version of the daemon being run. AIUI, all docker things would have to be updated. The CLI has and does remain stable, which is why I landed on using that. Also, the output we're parsing is from docker's go templates, not raw stdout - so we're much less likely to run into changes that affect our own output templates.

roshi updated this revision to Diff 3041.May 18 2017, 2:12 PM
  • Updated to reflect review comments.
roshi updated this revision to Diff 3042.May 18 2017, 2:27 PM

Updated diff to reflect comments in review.

roshi added a comment.May 18 2017, 2:28 PM

Sorry for the spam. Last arc diff went against master I think, since I forgot to base it off the right commit - so it was showing a ton of stuff not included in this review.

kparal added inline comments.May 19 2017, 2:03 PM
libtaskotron/ext/disposable/docker.py
16

When I look at python-docker-py package, the same version (1.1.0.6) is in F25, F26 and Rawhide. So IIUIC, this is API 1.x, and all the rest of the docker packages in the system are also based on API 1.x, because they need to be kept in sync, correct? API 2.x doesn't seem to have hit even Rawhide, yet.

So what exactly is the problem? We can use API 1.x, and when docker+API 2.x is introduced in Rawhide/Branched in the future, we can rewrite libtaskotron to support both (calling the relevant methods with the right arguments depending whether it's 1.x or 2.x). Do you think this is too much work, are the API changes vast? Or am I missing something else?

I'm quite nervous from command line parsing here. I understand it's supposed to be very stable, but... very nervous.

roshi added inline comments.May 19 2017, 7:15 PM
libtaskotron/ext/disposable/docker.py
16

I guess I don't really understand what makes you nervous about parsing stdout. I would have no problem doing this in a bash script, and CLI is an API...

There are 2 different APIs at play in my thinking here. First is the actual docker API, and the second is the docker-python API. They both have a history of changing and it feels like more of a chance for a headache. If you look at the specfile for python-docker-py ( http://pkgs.fedoraproject.org/cgit/rpms/python-docker-py.git/tree/python-docker-py.spec?h=f26#n138 ) you'll see a 4 day period where it got bumped to 2.0.2 and then dropped back to 1.10.6, as an example. I'd have to do more digging to find other times docker has broken it's API, but the sentiment I've gotten across the board is that it happens fairly regularly. Digging into this seemed like it'd take more time than it's worth, and this works now.

We don't need all the functionality the API, so it feels a bit heavy to pull in a whole new dep for such a small need. The way I see it, with the current code we have to look for changes in docker cli --format flag to break things. If we use docker-py, now we have 2 places errors can crop up instead of one.

If it breaks severely, I can port it to use docker-py; though I don't think we'll need to.

kparal added inline comments.May 22 2017, 2:21 PM
libtaskotron/ext/disposable/docker.py
16

I would have no problem doing this in a bash script

Because there are no better ways in bash...

and CLI is an API...

That's where we fundamentally differ, I believe. I don't consider CLI output to be an API, until it actually prints json/xml or something. APIs are far superior to text parsing - you have much better error handling (exceptions, typed values), you have structured data, APIs are versioned, you can get logging, etc.

First is the actual docker API, and the second is the docker-python API. They both have a history of changing and it feels like more of a chance for a headache. If you look at the specfile for python-docker-py ( http://pkgs.fedoraproject.org/cgit/rpms/python-docker-py.git/tree/python-docker-py.spec?h=f26#n138 ) you'll see a 4 day period where it got bumped to 2.0.2 and then dropped back to 1.10.6, as an example.

I talked to the maintainer (ttomecek). The 4 day period was a packaging accident. python-docker-py is API 1.x, they will keep this in all fedora releases. python-docker is API 2.x, it got renamed because upstream renamed it. They keep it in parallel in F26 and above (not sure about F24/F25 plans). According to ttomecek, the APIs are stable as long as you keep using the same major version. Between versions the APIs change a lot, that is correct. (However, since we're using only the basics, hopefully the changes would not affect us much? And it certainly should not be an unexpected change, in contrast to CLI changes, which has no stability guarantees).

If we use docker-py, now we have 2 places errors can crop up instead of one.

I don't follow here, I assume we would do everything through the python api. What's the second place?

Overall, I'm not completely opposed to cli parsing. Just trying to make sure it's really the best approach here. Of course, since the code is already written, it makes sense to use it and replace it once it gets problematic.

tflink added a subscriber: tflink.May 22 2017, 7:14 PM
roshi added inline comments.May 22 2017, 7:15 PM
libtaskotron/ext/disposable/docker.py
16

Because there are no better ways in bash...

The point was that if this was bash we'd run it in prod and never give it a second thought.

That's where we fundamentally differ, I believe. I don't consider CLI output to be an API...

By nature of running in a shell, it's an API. Return codes, awk (and --format flags) and application logging cover all the other bits you mentioned. But I digress, as this is probably better suited to a conversation over beers at Flock :)

What's the second place?

The second place is the docker daemon itself. If the daemon gets updated, but docker-py doesn't then it'll stop working - since they have to be in sync. It's the whole "now you have two problems" thing (at least in my head).

Of course, since the code is already written, it makes sense to use it and replace it once it gets problematic.

That's good enough for me :) As for the duplicated conf files, for now I'll just maintain it so we can push it out the door. I have a couple ideas for that in the future, but right now I don't feel that it should stop us from moving forward. That work?

kparal added inline comments.May 23 2017, 8:51 AM
libtaskotron/ext/disposable/docker.py
16

The second place is the docker daemon itself. If the daemon gets updated, but docker-py doesn't then it'll stop working - since they have to be in sync.

OK, but we have package maintainers to keep an eye on that. That's not our problem. And they'll surely do a good job to keep things in sync, because docker is going to be an important part of our release process and artifacts we create. I don't think this is an issue.

As for the duplicated conf files, for now I'll just maintain it so we can push it out the door. I have a couple ideas for that in the future, but right now I don't feel that it should stop us from moving forward. That work?

Sure, thanks.

jskladan requested changes to this revision.May 23 2017, 10:10 AM

Well, I'm not a huge fan of this - since not all the commands support the --format flag. This will most definitely have the same issues as testcloud does, but if the goal is to really have it now...

The only thing I'm not going to waive on is the config duplication. Fortunately, this can easily be made sane by
ADD https://pagure.io/taskotron/libtaskotron/blob/master/f/conf/yumrepoinfo.conf.example /etc/taskotron/yumrepoinfo.conf

This revision now requires changes to proceed.May 23 2017, 10:10 AM

Have you considered using ansible's docker module [1]? Seems like they have it all implemented.

http://docs.ansible.com/ansible/docker_container_module.html

roshi added a comment.EditedJun 12 2017, 2:34 PM

Have you considered using ansible's docker module [1]? Seems like they have it all implemented.

http://docs.ansible.com/ansible/docker_container_module.html

We discussed this in IRC earlier this morning:

13:38 < tflink> roshi: you've talked to folks with more docker experience about this, no?
13:39 * kparal doesn't have experience with docker-compose
13:45 < roshi> I did
13:45 < roshi> I talked to adam (maxamillion) about it and he said CLI is the least headache
13:46 < roshi> as well as other, non-fedora folks I know who deal with quite a bit of docker
13:46 < roshi> the general consensus was that for what I was looking to do (which isn't very much), --format would be the easiest to deal with in the long term
13:47 < roshi> I don't really see us doing much more with docker in this context, but I could be wrong
13:47 < maxamillion> roshi: we talked about docker-compose vs ansible-docker?
13:47 < maxamillion> (also, do you mean ansible-container ?)
13:47 < roshi> we talked about python libs vs cli parsing with the --format flag
13:47 < maxamillion> oh right
13:47 < maxamillion> +1
13:47 < maxamillion> sorry, I thought you were talking about multi-container stuff

<snip>

13:49 < roshi> I can see us moving to ansible-container at some point, but for what we're trying to do right now, I don't see the value add
13:49 < roshi> seems like a pre-optimization to me, and I like to try to follow YAGNI
13:51 < maxamillion> probably a fair assessment

(edit: it formatted weird(

roshi updated this revision to Diff 3067.Jun 12 2017, 3:03 PM
  • Update yumrepo.conf location per diff comments and remove yumrepo.conf

Martin just landed his commit in a feature branch, can you please rebase against that?

roshi updated this revision to Diff 3072.Jun 19 2017, 7:56 PM

Rebased per @kparal 's comments.

That's weird, it's still showing more than just your changes in the diff. I merged develop into feature/ansiblize. Can you try to rebase again? I'd like to read the clean diff once more, and in the current state it's not clear which of the changes are yours.

Also please make sure you're running arc diff feature/ansiblize when updating this.

roshi updated this revision to Diff 3073.Jun 20 2017, 2:37 PM

Retrying the diff... (man I hate arcanist)

jskladan requested changes to this revision.Jun 20 2017, 3:08 PM

On top of the failing tests, some of which are really weird, I suggest you investigate/fix ASAP, I have some mostly semantical issues.

libtaskotron/executor.py
40–42

This does not make much sense to me, honestly. Why do you have (created out of the blue, BTW, which is a bit naughty) self.client, which is the actual container, and then self.task_container which I'd intuitively think contains the container, but is in fact just a random int?

Please remove self.client, and use self.task_container instead. Also, insted of returning the (evidently quite important) random int (host_port from create_container), please store the host_port directly in the DockerClient object.
Whether you return the tuple (I also kind do not understand the need to return hard-coded localhost IP, especially since it is the default value of ipaddr later on in the _prepare methond, and default value of host in the DockerClientcode..) to make it "more in line with spawn_vm, or not, I don't really care, but please get rid of this particular nonsense.

libtaskotron/ext/disposable/docker.py
19–23

please unify host_ip param and self.host variable names. I suggest going with host_ip in both cases, as it is descriptive of what is going on.

152

instead of returning host_port (which seems to be quite important in the other parts of the code), please set self.host_port instead.

libtaskotron/ext/docker/namespaces.yaml
1–5

I'm not really sure this is right. I would not like for the Docker hosts to be able to just do whatever, whilst virtual machines are limited to the standard set by conf/namespaces.yaml.example

This revision now requires changes to proceed.Jun 20 2017, 3:08 PM
kparal requested changes to this revision.Jun 20 2017, 3:33 PM

Some of the test suite failures seem to be caused by missing dependencies (import hawkey or import libvirt fails). Follow the readme to install dependencies. The test suite needs to pass so that we can continue refactoring while being sure we haven't broken anything.

But a bigger problem seems to be that your code is not based on current feature/ansiblize, but on rLTRN2b720ca61cb8. That is some old version of Martin's patch and it is not part of any existing branch. You'll need to rebase it on top of feature/ansiblize, if you want to commit it there.

Also, I'd like to see some decent test suite coverage for your new code before we merge it to develop. It doesn't need to be now, but it should happen soon-ish. 'Now' would of course be better ;-) Might help to uncover some bugs already.

Thanks!

libtaskotron/executor.py
101–104

Could you change this into triple-quotes string syntax so that it's more readable? Or simple strings, but easier to read?

119–121

If possible, it's better to use the longer variant of cmdline arguments, because it's immediately understandable when reading the source code.

libtaskotron/ext/disposable/docker.py
184–186

Please move this to exceptions.py, possibly as TaskotronDockerError or something similar.

mkrizek resigned from this revision.Thu, Jul 13, 9:04 AM
jsedlak resigned from this revision.Wed, Aug 2, 10:32 AM