Add support for cloud and atomic composes
ClosedPublic

Authored by roshi on Mar 2 2017, 7:45 PM.

Diff Detail

Repository
rTRGR taskotron-trigger
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
roshi retitled this revision from to Add support for cloud and atomic composes.Mar 2 2017, 7:45 PM
roshi updated this object.
roshi edited the test plan for this revision. (Show Details)
roshi set the repository for this revision to rTRGR taskotron-trigger.
roshi added a reviewer: tflink.Mar 2 2017, 7:45 PM
roshi added a comment.Mar 2 2017, 7:48 PM

One thing in this is wrong - that's the repo to pull the task from for the CloudCompose. I'll update that here as soon as I create the task for it to pull, right now it's just pointed at an ansible repo.

roshi updated this revision to Diff 2947.Mar 2 2017, 9:42 PM

Fixed the issue I mentioned before.

roshi updated this revision to Diff 2948.Mar 2 2017, 9:44 PM

Combined the two into a single view. Thought it would display in addition to, not in place of.

roshi updated this revision to Diff 2949.Mar 3 2017, 6:46 AM

I forgot to update the strings to turn on the consumer in my trigger as well as in fedmsg.d/taskotron-trigger.py. The most recent diff includes those changes.

kparal added a subscriber: adamwill.Mar 3 2017, 9:01 AM

I wonder how this patch was created so that file context is not available, and neither test suite nor lint was run? Have you created it from a diff file? Or did you use arc diff develop? Just curious.

conf/trigger_rules.yml.example
56

I already asked Mike if he can move this to https://pagure.io/taskotron/task-fedora-cloud-tests. But I didn't even know about the other one! :-) Mike, could you also move https://pagure.io/taskotron-upstream-atomic to https://pagure.io/taskotron/task-upstream-atomic (or task-atomic or task-fedora-atomic-tests or whatever)?

Also, maybe we could start synchronizing how we use git branches (always having develop and master, and using develop for dev and stg hosts and master for production hosts).

jobtriggers/cloud_compose_complete_msg.py
32

We already have compose type. @mkrizek, couldn't we use compose type everywhere here, and then schedule the right jobs with the right data (cloud jobs with data pointing to cloud images, atomic jobs with data pointing to atomic images, etc)? I don't have a complete idea how trigger works, pardon my ignorance.

35–37

It seems too many things are hardcoded here (path, filename, arch). Couldn't we get that info from the metadata?
https://kojipkgs.fedoraproject.org/compose/twoweek/Fedora-Atomic-25-20170302.0/compose/metadata/

@adamwill might have a good advice for us, I'm not familiar too much with compose metadata.

55–57

Why not have this as the very start of the method, rather than the end?

roshi added a comment.Mar 3 2017, 1:55 PM

@kparal I created this diff manually through the web interface.

roshi added inline comments.Mar 3 2017, 2:03 PM
jobtriggers/cloud_compose_complete_msg.py
32

I'm actually returning the 'compose' type in the data for do_trigger(). Initially I thought I needed to have my own item_type - but I wasn't sure and wanted to get feedback. I feel like using 'compose' is fine and I can get rid of mentions of cloud_compose and atomic_compose.

35–37

The fedmsg doesn't provide that much data in and of itself, just status, what image it was (the compose_id, like "Fedora-Cloud-24-20170303.0") and the location of the directory where all the stuff was dumped. You can see an example here: https://apps.fedoraproject.org/datagrepper/id?id=2017-9346a5a8-31f8-4db2-9ee6-6080cb14ac6f&is_raw=true&size=extra-large

I just hard-coded the bits to get the qcow2 image out. AFAICT that directory structure isn't going to change, so I'm pretty comfortable relying on it. As for the arch, I think x86_64 is the only one we produce for cloud/atomic.

55–57

I'm just used to putting my exception handling at the end. Since we're just futzing with strings I don't imagine we'd see much of a performance improvement.

kparal added a comment.Mar 3 2017, 2:04 PM
In D1157#21504, @roshi wrote:

@kparal I created this diff manually through the web interface.

OK, then it makes sense. I ran py.test and unit tests are fine. Running arc lint --rev develop complains quite a bit, though.

kparal added inline comments.Mar 3 2017, 2:20 PM
jobtriggers/cloud_compose_complete_msg.py
35–37

Yeah, I tried to suggest that trigger downloads the metadata file ($location/metadata/something.json) and parses out the path to the image it wants. That also gets rid of hardcoding architecture. It's not that hard (but please let's add a relatively quick timeout to the metadata download call). OTOH if this is a "we need to have something done ASAP" attempt, I don't really object too much in here. But in the long run, I think we'll need to do it the proper way, so at least a FIXME here would be nice.

55–57

OK, but you never know when someone adds something non-trivial to the middle, network queries or whatever (exactly as I suggested above). I'm generally trying to verify inputs and stop execution as soon as possible.

adamwill added inline comments.Mar 3 2017, 5:04 PM
jobtriggers/cloud_compose_complete_msg.py
35–37

This is the kinda thing fedfind is for, yeah. File names and paths have changed before and could change again (they're talking about changing the Atomic image file names so they include 'AtomicHost' not just 'Atomic' right now, for instance):

https://pagure.io/pungi-fedora/pull-request/154
https://pagure.io/releng/issue/6413

I've been taking a stance that the metadata properties should be stickier than the file names; the 'AtomicHost' proposal also proposes changing the metadata 'subvariant' for the images, but I'm asking them not to do that.

You might be interested in the way we do this in the openQA scheduler, which is more or less shadowing the layout of the metadata and matching the dicts:

https://pagure.io/fedora-qa/fedora_openqa/blob/master/f/fedora_openqa/schedule.py#_59
https://pagure.io/fedora-qa/fedora_openqa/blob/master/f/fedora_openqa/config.py#_103

The openQA scheduler uses fedfind to do the work of downloading and parsing the images.json file (and for a few other helper functions): fedfind.release.get_release(location=location).all_images() gets you the list of image dicts from the compose at location, flattened a bit (I hate the unnecessary nesting by 'variant' and 'arch' in the original metadata, so the all_images version stuffs the variant and arch into each image dict and gives you a flat list). But downloading and parsing it yourself is pretty easy too, if you know the location of the compose.

roshi marked 4 inline comments as done.Mar 3 2017, 8:41 PM
roshi added inline comments.
jobtriggers/cloud_compose_complete_msg.py
35–37

Right now it's a "get this thing working" effort. I've added a FIXME.

55–57

I'm fine with that. I've moved it up in the latest diff.

roshi updated this revision to Diff 2951.Mar 3 2017, 8:42 PM
roshi marked an inline comment as done.
mkrizek requested changes to this revision.Mar 6 2017, 9:13 AM

Thanks for the patch, Mike. One major issue and a few nitpicks I found :)

conf/trigger_rules.yml.example
55

I wonder if it's a good idea to define tasks for the composes here since it differs from the way we currently define tasks, that is in grokmirror - https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/inventory/group_vars/taskotron-prod#n19.

Also, the branch needs to be named "develop", not "dev". At least in case we use grokmirror for pulling the repos.

jobtriggers/cloud_compose_complete_msg.py
47

I'd say let's not have it here if we don't use it. Same for the above.

63

I think we need to approach this a bit differently. The class is not based on fedmsg.consumers.FedmsgConsumer but you define topic and config_key. There shouldn't be those vars and the consume method since it's not a fedmsg consumer.

The purpose of Trigger classes is just to process a message (and return it to the consumer that called the process method). The reason why it's done outside of a consumer class is that it can be used within both a consumer class and in jobrunner.py which is CLI for the trigger. Have a look at jobtriggers/dist_git_commit_msg.py for example and how it's also used in jobrunner.py.

Hope that makes sense, let me know if not :)

72

This never gets called.

85

Specifying topic changed meanwhile in D1153, please rebase and change hardcoded 'prod' for config.deployment_type, thanks.

95

This always returns None :)

This revision now requires changes to proceed.Mar 6 2017, 9:13 AM
roshi updated this revision to Diff 2952.Mar 7 2017, 3:09 PM

Fixed multiple issues @mkrizek brought up.

roshi marked 5 inline comments as done.Mar 7 2017, 3:10 PM

@mkrizek I think I got everything, let me know if I missed something.

Thanks.

roshi updated this revision to Diff 2953.Mar 7 2017, 3:36 PM

Don't call do_trigger in the process() method.

mkrizek requested changes to this revision.Mar 8 2017, 2:48 PM

Apart from one issue and minor nitpick, can you please run arc lint and fix the lint issues? BTW arc lint and arc unit are run automatically if you send a patch for review via arc diff, so it's better to use that.

conf/trigger_rules.yml.example
51

So this should really be changed like this:

 - when:
     message_type: AtomicCompose
-    do:
-      - {discover: {repo: 'https://pagure.io/taskotron/task-upstream-atomic.git', branch="master"}}
-      - {discover: {repo: 'https://pagure.io/taskotron/task-fedora-cloud-tests.git', branch="master"}}
+  do:
+    - {tasks: [upstream-atomic]}
+    - {tasks: [fedora-cloud-tests]}
 
 - when:
     message_type: CloudCompose
-    do:
-     - {discover: {repo: 'https://pagure.io/taskotron/task-fedora-cloud-tests.git', branch="master"}}
+  do:
+    - {tasks: [fedora-cloud-tests]}

Apart from fixed indentation (do), the discover keyword is for discovering tasks in the repo which is not really needed since there is only one task there.

jobtriggers/cloud_compose_complete_msg.py
31

This doesn't need to be all caps, right?

This revision now requires changes to proceed.Mar 8 2017, 2:48 PM
mkrizek added inline comments.Mar 8 2017, 2:54 PM
conf/trigger_rules.yml.example
51

er...

 - when:
     message_type: AtomicCompose
-    do:
-      - {discover: {repo: 'https://pagure.io/taskotron/task-upstream-atomic.git', branch="master"}}
-      - {discover: {repo: 'https://pagure.io/taskotron/task-fedora-cloud-tests.git', branch="master"}}
+  do:
+    - {tasks: [upstream-atomic, fedora-cloud-tests]}
 
 - when:
     message_type: CloudCompose
-    do:
-     - {discover: {repo: 'https://pagure.io/taskotron/task-fedora-cloud-tests.git', branch="master"}}
+  do:
+    - {tasks: [fedora-cloud-tests]}
roshi updated this revision to Diff 2962.Mar 8 2017, 7:35 PM

Updated the diff using arcanist. Below are all the commits included in the past of this revision.

  • Add support for cloud and atomic composes
  • Fix trigger_rules.yml.example to point to the right tasts for Cloud and Atomic composes.
  • Correct the strings relating to turning on the consumer in fedmsg.d
  • Update links to repos in trigger.yml.example and move status check in cloud_compose_complete_msg.py
  • Annotate that we need to fix the hardcoded paths to the image file at some point.
  • Fix issues pointed out in D1157 by Martin.
  • Don't call do_trigger() in the process() method, that's handled by the consumer.
  • Fixed multiple arc lint issues.
roshi marked an inline comment as done.Mar 8 2017, 7:40 PM

There, I used arcanist. I have a container for it now if anyone wants it.

conf/trigger_rules.yml.example
51

I presume that it'll be able to find the tasks because of grokmirror? So grokmirror pulls all the taskotron/task-* repos and mirrors them so you can call them in this fashion? It's unclear to me how this actually functions.

jobtriggers/cloud_compose_complete_msg.py
31

Doesn't need to be, but it also doesn't hurt.

mkrizek accepted this revision.Mar 9 2017, 9:25 AM

Looks good!

conf/trigger_rules.yml.example
51

Ok, so the way it works is that the trigger sends (item, item_type, taskname, arch) to the buildmaster. The buildmaster then pulls task git repo like so git pull /var/lib/git/mirror/fedoraqa/$taskname ($taskname is what buildmaster got from the trigger, so upstream-atomic for example). /var/lib/git/mirror/fedoraqa/ is where grokmirror mirrors task repos on the buildmaster.

You can look how grokmirror and buildmaster are set up in infra's ansible playbooks [1][2][3].

Hope that helps :)

[1] https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/roles/taskotron/grokmirror
[2] https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/roles/taskotron/buildmaster-configure
[3] https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/roles/taskotron/buildmaster

This revision is now accepted and ready to land.Mar 9 2017, 9:25 AM
This revision was automatically updated to reflect the committed changes.
roshi marked an inline comment as done.