Switched from TAP to YAML
ClosedPublic

Authored by jskladan on Sep 3 2015, 8:13 PM.

Details

Summary

This is a WIP of the TAP to YAML conversion for the purposes of results-reporting.
The code is in the state where all the unittests work, but I'm yet to try it on a
real-life libtaskotron run. The documentation is practically untouched so far.

But I guess this might be a good start for discussions.

Test Plan

Unittests work

Diff Detail

Repository
rLTRN libtaskotron
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
jskladan retitled this revision from to Switched from TAP to YAML.Sep 3 2015, 8:13 PM
jskladan updated this object.
jskladan edited the test plan for this revision. (Show Details)
kparal added a comment.Sep 7 2015, 3:43 PM

I think it would be nice to include a ResultYAML sample to showcase what is going to be supported and what is not (we had a lot of ideas when brainstorming this).

I went through the main code changes (skipping docstrings and tests), even though not fully closely, and the changes look fine to me.

libtaskotron/check.py
270–271

Are we going to support multiple documents (document roots) in a single ResultYAML? If not, is there a purpose for using the triple-dash and triple-dot syntax?

There will be ResultYAML samples in the docs, which I'm working on now. Do you think it's worth going into the "not yet implemented" details anywhere else than there?

libtaskotron/check.py
270–271

meh, probably left it there accidentally...

jskladan updated this revision to Diff 1458.Sep 8 2015, 10:18 AM
  • Documentation
jskladan updated this revision to Diff 1460.Sep 8 2015, 10:43 AM
  • fixed lint warns
  • moar tests
jskladan updated this revision to Diff 1461.Sep 8 2015, 10:57 AM
  • coverage
kparal requested changes to this revision.Sep 8 2015, 2:30 PM

A few nitpicks. Looks good overall.

docs/source/writingtasks.rst
244

We concluded we don't want note to be used like this, it should present some additional information that should be highly visible. Either improve or remove the line?

291–292

Again sub-par note usage.

302

Again sub-par note usage.

docs/source/yamlformat.rst
3–5 ↗(On Diff #1461)

Maybe a title like "Task result format" might be more concrete here? We use YAML in many places, in configuration for example, and this chapter seems to talk just about the result format. It might be nice to fit YAML word into the title somehow, but I don't have any good ideas now.

9–11 ↗(On Diff #1461)

In case you didn't, please render the docs and make sure all the hyperlinks like these work.

36 ↗(On Diff #1461)

I would use quotes instead of backticks around "ok" and "not ok", so that it doesn't look like values. When I first read it, it seemed like ok and not ok were allowed values for outcome. Or make it into a longer sentence:

The ``outcome`` field defines whether the check result was OK or not OK.
40 ↗(On Diff #1461)

A verb seems to be missing. Maybe "what" was tested on a high level?

42–46 ↗(On Diff #1461)

We also have compose.

61–62 ↗(On Diff #1461)

It would be nice to see an example with multiple (2-3) results in a single YAML document, to show that you can easily report results of multiple items. Either here in "minimal version" section, or in/after the "full version" section. Simply like this (with a short description):

results:
  - item: xchat-0.5-1.fc20
    type: koji_build
    outcome: PASSED
  - item: pidgin-1.4-2.fc20
    type: koji_build
    outcome: FAILED
85 ↗(On Diff #1461)

typo: useful

90 ↗(On Diff #1461)

Maybe "allows you to add a short valuable information that should be easily visible when looking at the result overview"?

libtaskotron/check.py
74–77

I would get rid of this, it serve no more purpose now that we have artifact support. Might be part of a future diff.

libtaskotron/directives/resultsdb_directive.py
206–207

This is no longer true, we used to strip values in TAP, but your patch doesn't do it. If we intend to keep CheckDetail.output, I'd probably keep the message and introduce output stripping again. If we plan to get rid of it, let's delete this message and then remove CheckDetail.output soon.

254–255

This method has no further purpose (it calls a single method and does nothing else), let's either implement it according to the docstring, or delete it completely.

This revision now requires changes to proceed.Sep 8 2015, 2:30 PM
jskladan updated this revision to Diff 1475.Sep 9 2015, 8:03 AM
  • docs
  • removed report_summary from resultsdb directive
kparal accepted this revision.Sep 9 2015, 11:03 AM

LGTM

docs/source/yamlformat.rst
47 ↗(On Diff #1475)

The word seems to be so bare and lonely on that line... ;-)

This revision is now accepted and ready to land.Sep 9 2015, 11:03 AM
jskladan updated this revision to Diff 1477.Sep 9 2015, 11:24 AM
  • Switched from TAP to YAML
  • Docs wrangling
jskladan updated this revision to Diff 1478.Sep 9 2015, 11:28 AM
  • un-lonely the compose

I like it - looks good overall. A couple of thoughts outside the inline comments:

  • The example output is not very easily readable (a side effect of moving from TAP to YAML, not a criticism) and I wonder if we might want to write a pretty-printer for the resultyaml output instead of just relying on the bare YAML
  • Once this lands, we'll need to check on the existing tasks to make sure that they're compatible - I think that depcheck will need tweaking, at least

Is there a situation where we might reasonably want nested results? I think that'd be a "deal with that later, if we need it" kind of thing but figured that I'd ask the question.

docs/source/index.rst
105

Sometimes the docs/comments refer to a resultyaml format and other times, like here, it's just called YAML. I think we'd be better off being consistent about naming and sticking with the resultyaml moniker to reduce the potential for confusion with all the other places we're using YAML

docs/source/yamlformat.rst
4–6 ↗(On Diff #1478)

Following on this - I'd like to see consistent naming of the resultyaml (or whatever we end up calling it) from titles in documentation to filenames and comments.

In D555#10909, @tflink wrote:
  • The example output is not very easily readable (a side effect of moving from TAP to YAML, not a criticism) and I wonder if we might want to write a pretty-printer for the resultyaml output instead of just relying on the bare YAML

I'm a bit confused, do you consider this less readable than TAP?

results:
   - item: xchat-0.5-1.fc20
     type: koji_build
     outcome: PASSED

Or are you talking about some other example output?

In D555#10918, @kparal wrote:
In D555#10909, @tflink wrote:
  • The example output is not very easily readable (a side effect of moving from TAP to YAML, not a criticism) and I wonder if we might want to write a pretty-printer for the resultyaml output instead of just relying on the bare YAML

I'm a bit confused, do you consider this less readable than TAP?

results:
   - item: xchat-0.5-1.fc20
     type: koji_build
     outcome: PASSED

Or are you talking about some other example output?

Yeah, that's what I meant. I don't think it's a huge difference but I do think it shows up more with multiple results

1..41
ok - depcheck for Koji build COPASI-4.16-0.13.20150817git3bc4e9.fc22
  ---
  arch: i386
  artifact: /var/lib/taskotron/artifacts/d3ebc790-5ae7-11e5-9c54-525400062113/COPASI-4.16-0.13.20150817git3bc4e9.fc22.i386.log
  item: COPASI-4.16-0.13.20150817git3bc4e9.fc22
  outcome: PASSED
  type: koji_build
  ...
ok - depcheck for Koji build nrpe-2.15-6.fc22
  ---
  arch: i386
  artifact: /var/lib/taskotron/artifacts/d3ebc790-5ae7-11e5-9c54-525400062113/nrpe-2.15-6.fc22.i386.log
  item: nrpe-2.15-6.fc22
  outcome: PASSED
  type: koji_build
  ...

I think that TAP13 is a lot easier to scan and see where the failure is. It's not as big of a deal now that we're doing artifacts but I still think it'd be worth printing out a summary or a pretty-printed form for the times when people are reading through the raw output

In D555#10930, @tflink wrote:

I think that TAP13 is a lot easier to scan and see where the failure is. It's not as big of a deal now that we're doing artifacts but I still think it'd be worth printing out a summary or a pretty-printed form for the times when people are reading through the raw output

While I do agree, I have not found a reliable (and easy) way to add comments "to the right places", so the output could look like:


results:
   #ok
   - item: xchat-0.5-1.fc20
     type: koji_build
     outcome: PASSED
   #not ok
   - item: bugemos-0.2-1.fc20
     type: koji_build
     outcome: ABORTED

I'm not at all against us adding the comments, so one can easily scan through the results, but I would not even mention it in the documentation (of the format), as it's just eye-candy in the end (and requiring the eye-candy comments would effectively make it "not yaml", and not that easy to produce while not using libtaskotron tooling).

The good thing about yaml (without the required comments) is that this:

{results: [{item: xchat, type: koji_build, outcome: PASSED}, {item: bugemos, type: koji_build, outcome: ABORTED}]}

is also effectively a valid format of the resultyaml, and can be easier for some check-developers to produce, while exactly the same for us to consume.

It seems to me that @tflink didn't intend to have the comments part of the specification, just wanted to have it pretty printed when we dump debug output into stdout (in resultsdb directive and during variable export process).

It seems to be impossible to dump comments with PyYAML (we would have to use other libraries). So as long as we don't want this debug output to be valid yaml, we can dump it in any structure we want, and make it well readable. It won't be yaml, but it can be really nice and succinct, like this:

[DEBUG] The following would get reported:
OK:
xchat-0.5-1.fc20:    depcheck PASSED
firefox-44.1-2.fc20: depcheck PASSED
NOT OK:
bugemos-0.2-1.fc20:  depcheck ABORTED

Or introduce another third-party library for yaml-with-comments dumping (which I must say I'm not really enthusiastic about, just for some debug prints).

Personally I wouldn't mind dumping bare yaml structure directly in the logs, because it's easy and they are just debug logs. We have other places to present nicely structured and readable results (we can even generate some nice html overview by default and attach it as an artifact). And it has the benefit of developers seeing the exact structure that we pass to reporting directives. But I wouldn't mind some simplified non-yaml structure as the one shown above either.

In D555#10930, @tflink wrote:

I think that TAP13 is a lot easier to scan and see where the failure is. It's not as big of a deal now that we're doing artifacts but I still think it'd be worth printing out a summary or a pretty-printed form for the times when people are reading through the raw output

While I do agree, I have not found a reliable (and easy) way to add comments "to the right places", so the output could look like:


results:
   #ok
   - item: xchat-0.5-1.fc20
     type: koji_build
     outcome: PASSED
   #not ok
   - item: bugemos-0.2-1.fc20
     type: koji_build
     outcome: ABORTED

I'm not at all against us adding the comments, so one can easily scan through the results, but I would not even mention it in the documentation (of the format), as it's just eye-candy in the end (and requiring the eye-candy comments would effectively make it "not yaml", and not that easy to produce while not using libtaskotron tooling).

I saw that example output and was thinking "yes, that's pretty much exactly what I was looking for" until I read the text :)

While I think it's a bit more than just eye candy, I also think that there is a limit to how much that extra understandability is worth. In my mind, one of the big things that TAP has going for it is high human understandability and if that was the only factor, we probably wouldn't be moving away from that.

The good thing about yaml (without the required comments) is that this:

{results: [{item: xchat, type: koji_build, outcome: PASSED}, {item: bugemos, type: koji_build, outcome: ABORTED}]}

is also effectively a valid format of the resultyaml, and can be easier for some check-developers to produce, while exactly the same for us to consume.

Yeah, I think that one of the other wins for us switching to a results format that is YAML is that we don't have to worry about libraries that may or may not output the correct version of TAP and may not support the yamlish extension.

In D555#10988, @kparal wrote:

It seems to me that @tflink didn't intend to have the comments part of the specification, just wanted to have it pretty printed when we dump debug output into stdout (in resultsdb directive and during variable export process).

Honestly, I'd be all for adding comments if I thought we could do that sanely. However, I don't think that's going to happen unless we start writing our own parser/generator code and that sounds like a bad idea.

It seems to be impossible to dump comments with PyYAML (we would have to use other libraries). So as long as we don't want this debug output to be valid yaml, we can dump it in any structure we want, and make it well readable. It won't be yaml, but it can be really nice and succinct, like this:

[DEBUG] The following would get reported:
OK:
xchat-0.5-1.fc20:    depcheck PASSED
firefox-44.1-2.fc20: depcheck PASSED
NOT OK:
bugemos-0.2-1.fc20:  depcheck ABORTED

Yeah, that would satisfy what I was looking for - a way to quickly scan the output and see what did or didn't pass. I think that indenting the items so that they're not inline with OK/NOT OK helps make it more readable but the concept seems good.

Or introduce another third-party library for yaml-with-comments dumping (which I must say I'm not really enthusiastic about, just for some debug prints).

Yeah, that doesn't sound worth it to me. I thought about trying to print out something like what @jskladan listed (even if it wasn't part of the format) but I suspect that wouldn't be trivial and we might risk confusion over what the actual format is.

Personally I wouldn't mind dumping bare yaml structure directly in the logs, because it's easy and they are just debug logs. We have other places to present nicely structured and readable results (we can even generate some nice html overview by default and attach it as an artifact). And it has the benefit of developers seeing the exact structure that we pass to reporting directives. But I wouldn't mind some simplified non-yaml structure as the one shown above either.

I don't think it's critical to have a pretty-printed representation in the stdout but I do think that it would make it easier to read command line output for folks running tasks locally on the command line. It'd also make it easier to scan log output in one of the public instances - I usually end up scanning logs for the TAP13 when looking through logs.

In D555#11037, @tflink wrote:

While I think it's a bit more than just eye candy, I also think that there is a limit to how much that extra understandability is worth. In my mind, one of the big things that TAP has going for it is high human understandability and if that was the only factor, we probably wouldn't be moving away from that.

Yeah, that makes sense. I'm all up for an pretty-printer, would you be ok with it having a separate Diff?

In D555#10909, @tflink wrote:

Is there a situation where we might reasonably want nested results? I think that'd be a "deal with that later, if we need it" kind of thing but figured that I'd ask the question.

I was thinking about tests like rpmgril, which might want to nest the results per-tier. So there is a "deal with it later" way in the format, which allows you to add results: to the result, thus creating a nested structure (which will then get represented in the ResultsDB, but... later :))

Sometimes the docs/comments refer to a resultyaml format and other times, like here, it's just called YAML. I think we'd be better off being consistent about naming and sticking with the resultyaml moniker to reduce the potential for confusion with all the other places we're using YAML
Following on this - I'd like to see consistent naming of the resultyaml (or whatever we end up calling it) from titles in documentation to filenames and comments.

Good catches - I probably introduced the inconsistencies while discussing random stuff with @kparal. If there are no objections, I'll switch to "resultyaml" for the sake of consistency and reducing the confusion.

In D555#11037, @tflink wrote:

While I think it's a bit more than just eye candy, I also think that there is a limit to how much that extra understandability is worth. In my mind, one of the big things that TAP has going for it is high human understandability and if that was the only factor, we probably wouldn't be moving away from that.

Yeah, that makes sense. I'm all up for an pretty-printer, would you be ok with it having a separate Diff?

Yeah, another ticket/diff is fine by me.

In D555#10909, @tflink wrote:

Is there a situation where we might reasonably want nested results? I think that'd be a "deal with that later, if we need it" kind of thing but figured that I'd ask the question.

I was thinking about tests like rpmgril, which might want to nest the results per-tier. So there is a "deal with it later" way in the format, which allows you to add results: to the result, thus creating a nested structure (which will then get represented in the ResultsDB, but... later :))

Just to make sure I understand - there is something like a plan for resultsdb to handle nested results but that's to be added later when we need it?

Sometimes the docs/comments refer to a resultyaml format and other times, like here, it's just called YAML. I think we'd be better off being consistent about naming and sticking with the resultyaml moniker to reduce the potential for confusion with all the other places we're using YAML
Following on this - I'd like to see consistent naming of the resultyaml (or whatever we end up calling it) from titles in documentation to filenames and comments.

Good catches - I probably introduced the inconsistencies while discussing random stuff with @kparal. If there are no objections, I'll switch to "resultyaml" for the sake of consistency and reducing the confusion.

No objections here.

In D555#11040, @tflink wrote:

Just to make sure I understand - there is something like a plan for resultsdb to handle nested results but that's to be added later when we need it?

Yes

Closed by commit rLTRN0172820d148d: Switched from TAP to YAML (authored by Josef Skladanka <jskladan@redhat.com>). · Explain WhySep 16 2015, 1:08 PM
This revision was automatically updated to reflect the committed changes.