add CheckDetail class for easy results tracking
Closed, Public

Authored by kparal on Feb 27 2014, 5:59 PM.

Details

Summary

This adds a very stripped down version of TestDetail class from AutoQA
as CheckDetail. Its purpose is to allow simple tracking and evaluation
of check results during check execution. It's mainly useful for
multi-item checks like upgradepath. In the future, this class should
offer a simple CheckDetail->TAP output conversion (not implemented yet).

This was split out of the D10 review and is closely related to T85.
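For orientation, a rough sketch of the shape such a container class might take (the attribute names and defaults here are assumptions for illustration, not the reviewed code):

class CheckDetail(object):
    '''Sketch: a per-item container for one tested item's result and output.'''

    def __init__(self, result=None, output=None):
        self.result = result        # e.g. 'PASSED', 'FAILED', 'NEEDS_INSPECTION'
        self.output = output or []  # captured output lines for this item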

Test Plan

Unit tests are included and pass. Please review closely, this is my first
unit tests submission. Thanks.

Diff Detail

Lint
Lint Skipped
Unit
Unit Tests Skipped
tflink requested changes to this revision. Feb 27 2014, 6:32 PM

The code itself is pretty good, the only change I'd like to see is to have the tests broken up into individual tests instead of larger tests that check multiple things in the same test.

Can you file a task for generating TAP output from CheckDetail? It doesn't have to be done now, but I want to make sure it isn't lost.

testing/test_check.py
16

This should be broken up into multiple tests. In general, you want to check one thing per unit test so that when one fails, you know exactly where the failure is without a bunch more debugging.

40

this, in particular, needs to be in its own test. it wasn't immediately obvious to me why updating the result to PASS would return FAIL as a result
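For illustration, the requested one-check-per-test style could look roughly like this (hypothetical test names, not the contents of the actual diff):

from libtaskotron import check

class TestCheckDetail(object):

    def test_unset_result_reads_as_needs_inspection(self):
        detail = check.CheckDetail()
        assert detail.result == 'NEEDS_INSPECTION'

    def test_update_result_sets_initial_value(self):
        detail = check.CheckDetail()
        detail.update_result('PASSED')
        assert detail.result == 'PASSED'

    def test_update_result_does_not_downgrade(self):
        # a later PASSED must not override an earlier, more severe FAILED
        detail = check.CheckDetail()
        detail.update_result('FAILED')
        detail.update_result('PASSED')
        assert detail.result == 'FAILED'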

testing/test_python_utils.py
10

same with test_check - please split this up into separate tests instead of one big one

Can you file a task for generating TAP output from CheckDetail? It doesn't have to be done now, but I want to make sure it isn't lost.

Created as T88.

kparal updated this revision. Mar 4 2014, 12:53 PM
  • split tests into smaller chunks
tflink added a comment. Mar 5 2014, 5:19 PM

I'd still like to see the tests split up even further than this. Instead of writing comments, I went through and split them up as shown in the attached diff.

{F555}

From my initial conversation with Kamil, there is some disagreement on whether or not my proposed level of splitting up the tests is a good idea, and I'm planning to start a related conversation on qa-devel@.

kparal updated this revision. Mar 5 2014, 7:05 PM

In the meantime, I've added more comments to the tests.

tflink added a comment. Mar 5 2014, 7:35 PM

Now that I'm not just looking at the tests, I have a few other questions/concerns about this code.

What do we want to use CheckDetail for? I was under the impression that it would be an interface point between a check's internal status and reporting through/to taskotron - a uniform way for collecting results that could then be further processed so that we don't repeat the mistake of having different result handling code in every check.

If that is the case, what's the purpose of the log() method?

libtaskotron/check.py
73

why not just set "NEEDS_INSPECTION" as the default result instead of adding a property method? That seems simpler to me

86

Why do we need a setter method and an update_result method? I don't see the advantage of not combining them

What do we want to use CheckDetail for? I was under the impression that it would be an interface point between a check's internal status and reporting through/to taskotron - a uniform way for collecting results that could then be further processed so that we don't repeat the mistake of having different result handling code in every check.

I've tried to provide a reasonable explanation in the class docstring. If it's still not clear, I'll try to improve it. My idea is that CheckDetails should serve as a results container for complex test cases. In simple cases, like rpmlint, you don't need to care about it. You just run the rpmlint binary, capture the output and then call something like tap.format_TAP_msg(result='PASSED', name='firefox-20-1.fc20', desc=rpmlint_output).

However, imagine how you do it for upgradepath. You have 50 builds to test, and you need to capture and construct the TAP output properly. It's no longer that simple. You need some kind of container class to hold result+name+output for each of these 50 builds. We don't want every such check to reinvent the wheel; that's why there is the CheckDetail class. You create a new CheckDetail instance for every build tested, and track it like this:

details = { 'ucpp-1.3.4-3.fc19': CheckDetail(),
             'vpnc-0.5.3-19.svn457.fc19': CheckDetail(),
             'xmp-4.0.7-1.fc19': CheckDetail(),
             ...
           }

(Which reminds me, it might be a good idea to add a name attribute to CheckDetail, because not everyone will track the names externally as I do. It was there before, but I removed it when stripping down the class. I'll probably add it back.)

One such instance could contain something like this:
result='PASSED'
output=

============================================================
ucpp-1.3.4-3.fc19 into f19-updates
============================================================
[ OK ] f19
	Latest package: None
[ OK ] f20 + f20-updates
	Latest package: None
	Latest pending package: ucpp-1.3.4-3.fc20
[ OK ] f21
	Latest package: ucpp-1.3.4-3.fc21
	Latest pending package: None
RESULT: PASSED

And another instance:
result='PASSED'
output=

============================================================
vpnc-0.5.3-19.svn457.fc19 into f19-updates
============================================================
[ OK ] f19
	Latest package: vpnc-0.5.3-17.svn457.fc19
[ OK ] f20 + f20-updates
	Latest package: vpnc-0.5.3-19.svn457.fc20
	Latest pending package: vpnc-0.5.3-20.svn457.fc20
[ OK ] f21
	Latest package: vpnc-0.5.3-20.svn457.fc21
	Latest pending package: None
RESULT: PASSED

And at the end, you'll call something like CheckDetail.createTAP(details.values()) and it will create a combined (multi-item) TAP output of all CheckDetail instances. This is not implemented yet.
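For the record, such a createTAP() helper might look roughly like this (a sketch only, since it is not implemented yet; the exact TAP layout and the name attribute are assumptions):

def create_tap(details):
    '''Sketch: combine several CheckDetail instances into one TAP document.'''
    lines = ['TAP version 13', '1..%d' % len(details)]
    for num, detail in enumerate(details, start=1):
        status = 'ok' if detail.result == 'PASSED' else 'not ok'
        lines.append('%s %d %s' % (status, num, detail.name))
        # the captured per-item output goes into a YAML diagnostic block
        lines.append('  ---')
        lines.append('  details: |')
        for line in detail.output:
            lines.append('    ' + line)
        lines.append('  ...')
    return '\n'.join(lines)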

If that is the case, what's the purpose of the log() method?

I want the script to be usable as a standalone tool as well. That means I don't want to print just the TAP output, I want a 'normal' output printed to stdout as well, so that people can use it without running it through taskotron, or can easily debug it. I probably want to print TAP output only when the --tap option is given, or maybe --tap <filename>. The 'normal' output is also useful for debugging.

This all means that I often need the same output to go to several places:
a) stdout
b) CheckDetail.output, for future conversion to TAP

That's what CheckDetail.log() does. It allows people to easily print to stdout and track the output in a CheckDetail instance using a single call.
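In other words, log() boils down to something like this (a sketch of the described behavior; the keyword argument name is an assumption):

class CheckDetail(object):
    def __init__(self):
        self.output = []

    def log(self, msg, printout=True):
        '''Append msg to self.output and optionally echo it to stdout.'''
        self.output.append(msg)
        if printout:
            print msg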

libtaskotron/check.py
73

I went ahead and re-implemented that, and only then, thanks to my test suite, found out why it was implemented this way. Good that we use unit tests now! :-)

There's a difference between an uninitialized state and NEEDS_INSPECTION when it comes to update_result(). If you have an uninitialized state (None), you can call update_result('PASSED') and it is set to PASSED. However, if you already assign NEEDS_INSPECTION in the constructor, and then call update_result('PASSED'), it stays at NEEDS_INSPECTION, because it is a higher-priority result. Of course you can assign to the result attribute manually, but that defeats the purpose of update_result().

When you're dealing with result states, you want to have the result uninitialized until you set it for the first time. That's why it works with None internally. But if you forget to set the result completely, and call createTAP() or similar, you want to be notified that something is wrong; you don't want to see result: None. That's why it returns NEEDS_INSPECTION if you have failed to assign any result value at all (self._result is still None).
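That behavior might be sketched like this (assumed implementation; None is the internal 'unset' marker and NEEDS_INSPECTION only appears when reading an unset result):

class CheckDetail(object):
    def __init__(self):
        self._result = None            # stays None until the first real assignment

    @property
    def result(self):
        # an unset result reads as NEEDS_INSPECTION so it is never silently lost
        return self._result if self._result is not None else 'NEEDS_INSPECTION'

    @result.setter
    def result(self, value):
        # plain assignment; priority merging is update_result()'s job
        self._result = value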

86

I'll clarify the docstring more:

'''Update self.result with the provided value if and only if it has
a higher priority than its current value (according to
self.result_order). Otherwise this call does nothing.

This is useful if your check performs a number of 'sub-tasks', each
passing or failing, and you want the final result to be the worst/most
serious one of those. You can just keep calling self.update_result()
and the highest-priority result type will be stored in self.result at
the end.

@param result the result value to assign to self.result if it has a
              higher priority than its current value. Handles None
              values gracefully (no action).
@throw TaskotronValueError if any of the result keywords are not
                           specified in @classattr result_order'''
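A minimal sketch of the behavior that docstring describes (the result_order values and the use of a plain ValueError here are assumptions):

class CheckDetail(object):
    # ordered from lowest to highest priority (assumed values)
    result_order = ['PASSED', 'FAILED', 'NEEDS_INSPECTION']

    def __init__(self):
        self._result = None

    def update_result(self, result):
        '''Keep the higher-priority of the current and the provided result.'''
        if result is None:
            return                      # None is handled gracefully, no action
        if result not in self.result_order:
            raise ValueError('Unknown result type: %s' % result)  # TaskotronValueError in the real code
        if (self._result is None or
                self.result_order.index(result) >
                self.result_order.index(self._result)):
            self._result = result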

Does it explain why update_result() is different from setting the result attribute directly? Do you still see any way of combining them?

kparal updated this revision. Mar 6 2014, 10:29 AM
  • add missing test for invalid result type in constructor; also update docstring for update_result
kparal updated this revision. Mar 6 2014, 12:46 PM
  • add name attribute

I have realized that it will be needed for the createTAP() method anyway.

tflink added a comment. Mar 6 2014, 3:16 PM

My question about the purpose of CheckDetail was more of a leadup to the question about log(); I didn't understand why having print statements in log() fits with that use case. What you described fits with what I had understood of the purpose.

I want the script to be usable as a standalone tool as well. That means I don't want to print just the TAP output, I want a 'normal' output printed to stdout as well, so that people can use it without running it through taskotron, or can easily debug it. I probably want to print TAP output only when the --tap option is given, or maybe --tap <filename>. The 'normal' output is also useful for debugging.

That's what CheckDetail.log() does. It allows people to easily print to stdout and track the output in a CheckDetail instance using a single call.

Then I'll be a bit more specific about what I'm concerned about. I understand that there are other uses for CheckDetail and I agree that it should support more output types than TAP. What I'm not clear on is why it's printing during detail creation. The specific concerns I have are:

  • it's using print instead of a logging mechanism. this makes the job of centralized logging and log output control more difficult since print statements like that always go to stdout
  • it's printing while the detail object is being filled out. If someone is going to use the CheckDetail object outside of the production runner (which I also expect), why isn't it better to create the detail object, populate it with data and then log the completed CheckDetail object? What is the use case for having the output printed when the details are being added to the CheckDetail object in addition to when the output is created (in TAP or string form)?

Wouldn't both use cases be better served by handling log output (including stdout in the local case) in the task and using the CheckDetail object as a method for pulling together results and details about those results?

Does it explain why update_result() is different from setting the result attribute directly? Do you still see any way of combining them?

That depends, is there a normal/common use case where you'd want to be able to override the current _result with a lower priority result? If not, I don't see why they couldn't be combined into the same method.

libtaskotron/check.py
63

Wouldn't it be better to have name and summary default to "" (empty string) and output to [] (that's covered in the 'output or []' statement, though)? I'd rather not have to worry about getting None when looking for the name or summary.

Then I'll be a bit more specific about what I'm concerned about. I understand that there are other uses for CheckDetail and I agree that it should support more output types than TAP.

Actually, I don't consider the logging/printing mechanism to be an output type. I just consider it handy to sometimes print the contents during the check run. If it's more often useful not to print than to print, we can switch the default from print=True to print=False.

What I'm not clear on is why it's printing during detail creation.

I guess the number one answer is interactivity. I don't want my check to be silent for long minutes and then dump all the results to stdout at the end. That's very annoying for anyone who runs that script, even for developers. Of course it doesn't matter that much (it still matters a bit) in an automated testing environment.

So, as soon as I have something, I want to add it to the relevant CheckDetail instance, and I want to print it to stdout at the same time. Of course I can do that only because I handle CheckDetails sequentially, one by one. If I needed to access more of them in parallel, it would no longer make sense to print to stdout in realtime. In that case I would print to stdout once some CheckDetail output was complete, or, as the last resort, at the end of the check execution.

The specific concerns I have are:

  • it's using print instead of a logging mechanism. this makes the job of centralized logging and log output control more difficult since print statements like that always go to stdout

I was thinking a bit about our way of logging check output in the past, and maybe I have some assumptions that are not true. Let's discuss them.

  1. I believe that whenever it's possible, we should provide a targeted log output for a specific test item in question (NVR, update, compose, etc). Therefore, for upgradepath check, we shouldn't provide full 500 lines of output for foo-1.0-1.fc20, but only those 10 lines that are directly relevant to it. We already had this in AutoQA. This is kind of trivial for simple checks like rpmlint, because full output == targeted output. For upgradepath, this can be achieved using CheckDetail. For some checks like depcheck, it might be hard to provide any targeted output at all, therefore we might only provide the full check output (but we should try to make this rare).
  2. I believe it should be possible to access the full log of the check execution for any check result. We already had this in AutoQA as the full.log. But, in AutoQA it was really difficult to get a clear picture of what log is used for which purpose, and how to fill their contents properly. We must make it much easier this time. For this reason, I assume there will be only two user-available logs in Taskotron - targeted log (optional), and full log. Moreover, I really believe that full log must be automatically collected. In AutoQA, we had the requirement to use our Python class and that way we nudged people to populate the log correctly. In Taskotron, we want to allow the checks to be implemented in any language, without forcing them to use Python wrappers. Only the yaml file will control what gets run. Therefore we will need to redirect the commands' output automatically (change the stdout/stderr file descriptors and intercept all output); a rough sketch of this approach follows this list. The benefit of this approach is its very simplicity - what gets printed gets logged, it's as simple as that.
  3. Python logging mechanism is something that we use for our libraries, but not something that we use for our or third-party checks. Because full check output gets logged anyway (point 2) -> therefore simple prints are enough; and the checks can be in any programming language anyway -> therefore python-based logging can't be used in general. That's why CheckDetail uses plain printing, and no python logging calls. Of course, the check author can use python-based logging if he wishes to do so (and implements the check in Python), but I think our centralized logging should intercept mainly libtaskotron.* namespace by default. Of course this is up to discussion, I don't think it's really important at the moment.
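The automatic capture described in point 2 could be sketched roughly like this (a hypothetical helper, not an existing libtaskotron function):

import subprocess
import sys

def run_and_capture(cmd, logfile):
    '''Run cmd, echoing its combined stdout/stderr while saving it to logfile.'''
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    with open(logfile, 'w') as log:
        for line in iter(proc.stdout.readline, ''):
            sys.stdout.write(line)   # what gets printed...
            log.write(line)          # ...gets logged
    return proc.wait()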

If you combine these three bits, I hope that explains the implementation. Of course, maybe I assumed something that you're planning to do differently.

  • it's printing while the detail object is being filled out. If someone is going to use the CheckDetail object outside of the production runner (which I also expect), why isn't it better to create the detail object, populate it with data and then log the completed CheckDetail object?

As already mentioned, my primary use case was interactivity. Of course your way is also possible:

cd = CheckDetail()
cd.log('foobar', print=False)
# do something
# do something
print '\n'.join(cd.output)

If you think that log(msg, print=False) will be used a lot, we could create the method in a different way:

def log(msg):
   '''Add the string to self.output'''

def logp(msg):
   '''Add the string to self.output and print it to stdout'''

That saves you from typing print=False too often.

By the way, I'm not sure what you mean by using the CheckDetail object outside of the production runner. AIUI, it should behave the same, no matter whether run standalone or using the Taskotron runner.

What is the use case for having the output printed when the details are being added to the CheckDetail object in addition to when the output is created (in TAP or string form)?

I hope I already answered this. Realtime printing is better for interactivity than post-execution printing.

Wouldn't both use cases be better served by handling log output (including stdout in the local case) in the task and using the CheckDetail object as a method for pulling together results and details about those results?

The way you distinguish local execution and taskotron runner execution seems to indicate we're not on the same page there, but I'm not actually sure what the difference should be. Maybe a piece of code could help.

> Does it explain why update_result() is different from setting the result attribute directly? Do you still see any way of combining them?
That depends, is there a normal/common use case where you'd want to be able to override the current _result with a lower priority result? If not, I don't see why they couldn't be combined into the same method.

I'm not totally sure at the moment, but I wouldn't like to disallow that. Otherwise people would have to assign directly to self._result (if they needed it), therefore circumventing all the value checks.

The basic idea is that CheckDetail is a simple container for check results. update_result() is simply a convenience method for updating the result according to its severity. But if I force this behavior by default, it is no longer a simple container with a few convenience methods on top of it. Maybe it would help to rename update_result() to something better reflecting the fact that it works with result priority/severity? New name ideas welcome.

libtaskotron/check.py
63

Yeah, you're right, that's better. I'll update the name and summary defaults. Output is already [] by default (that can't be assigned in the method declaration, otherwise that [] would be shared across all instances of the class).
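The gotcha being avoided there, for reference (a generic illustration, not the reviewed constructor):

class Example(object):
    def __init__(self, name='', summary='', output=None):
        self.name = name
        self.summary = summary
        # a mutable default like output=[] would be created once and shared by
        # every call, so None is the default and a fresh list is made here instead
        self.output = output or []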

kparal updated this revision. Mar 10 2014, 1:40 PM
  • set name='' and summary='' by default

I believe that whenever it's possible, we should provide a targeted log output for a specific test item in question (NVR, update, compose, etc). Therefore, for upgradepath check, we shouldn't provide full 500 lines of output for foo-1.0-1.fc20, but only those 10 lines that are directly relevant to it. We already had this in AutoQA. This is kind of trivial for simple checks like rpmlint, because full output == targeted output. For upgradepath, this can be achieved using CheckDetail. For some checks like depcheck, it might be hard to provide any targeted output at all, therefore we might only provide the full check output (but we should try to make this rare).

Sure, that makes sense. I agree that we don't want to be presenting overly verbose logs to users unless they specifically ask for those verbose logs.

I believe it should be possible to access the full log of the check execution for any check result. We already had this in AutoQA as the full.log. But, in AutoQA it was really difficult to get a clear picture of what log is used for which purpose, and how to fill their contents properly. We must make it much easier this time.

Sure, easier is generally better but I also think that there is a limit to how far we can take the idea of simplicity while still keeping things useful. IIRC, we weren't really making full use of the logging mechanisms in Autotest, anyways.

For this reason, I assume there will be only two user-available logs in Taskotron - targeted log (optional), and full log. Moreover, I really believe that full log must be automatically collected.

As a semi-side note, I've been wondering whether we want to start doing reports similar to how AutoQA did them, with pretty HTML instead of just plain text log files. Not required for the initial deployment, but something that we may want to look into for the semi-near future.

In AutoQA, we had the requirement to use our Python class and that way we nudged people to populate the log correctly. In Taskotron, we want to allow the checks to be implemented in any language, without forcing them to use Python wrappers.

Realistically, there will have to be some form of wrapper in Python - even if that's just the directive that handles shell scripts, perl or whatever else we may end up supporting in the future. While we do want to refrain from python-specific stuff to keep the door open to supporting other languages, I don't see why that means we can't use python-specific conveniences when working with checks written in python.

The way I see non-python language support working is through support libraries and common definitions for:

  • logging format
  • return values

While not strictly required, I suspect that the most popular languages would benefit from common libraries to make this process easier, similar to what you're doing with CheckDetail.

Only the yaml file will control what gets run. Therefore we will need to redirect the commands' output automatically (change the stdout/stderr file descriptors and intercept all output). The benefit of this approach is its very simplicity - what gets printed gets logged, it's as simple as that.

While I see where you're going, I think that's too simple and makes any filtering and log monitoring more difficult than it needs to be. There is value in being able to easily search for and filter out errors and warnings. Just because we're limited to stdout/stderr doesn't mean that we can't use a log format to distinguish errors and warnings from info and debug messages, even if they are dumped to the same logfile.

Python logging mechanism is something that we use for our libraries, but not something that we use for our or third-party checks. Because full check output gets logged anyway (point 2) -> therefore simple prints are enough; and the checks can be in any programming language anyway -> therefore python-based logging can't be used in general. That's why CheckDetail uses plain printing, and no python logging calls. Of course, the check author can use python-based logging if he wishes to do so (and implements the check in Python), but I think our centralized logging should intercept mainly libtaskotron.* namespace by default. Of course this is up to discussion, I don't think it's really important at the moment.

And this is where we start disagreeing. I'm not overly attached to the idea of using the same logger as the rest of Taskotron. I see what you're saying about limiting output to stdout/stderr in the same way that non-python languages would be treated. At the same time, I think that just using print statements is going to be problematic and removes a valuable level of information in the form of saying "this was an error message" or "this was just a warning" or "this was just a debug message and we can probably ignore it in most cases". Using something like logging also allows us to have different log messages emitted in different configurations. This could mean that a check developer might enable debug logging when triaging a check failure/bug or that we enable debug logging in production only when we're seeing problems which need another level of information. I'm not saying that we need to do either of those specific things but it does allow us the flexibility of making any of those changes in the future. With relying solely on print statements, any change to logging verbosity needs to be done in code. It would encourage either overly verbose full logs or fewer logging statements than might be needed in a debug situation.

In the interest of being able to filter log messages for monitoring and to enable extra debug levels as needed without code changes, I think that using something like logging in the checks, even if it's a separate setup from the rest of libtaskotron, is a better choice than using bare print statements.

By the way, I'm not sure what you mean by using the CheckDetail object outside of the production runner. AIUI, it should behave the same, no matter whether run standalone or using the Taskotron runner.

While I agree that it needs to be as similar as possible, I don't see how it's possible for the runner to run identically in both production and in a dev setting. There will be some differences in environment: disabling reporting in most cases, faking some variables etc. but those changes should be minimal. At the same time, I see value in being able to use (almost) the same runner, the same configuration facility just using different values to make the local development/execution experience better and more flexible than it would be in production.

I do see value in having an easy way to mock the production environment regardless of other configuration settings, though. We would be well served by making it easy for folks to demo their tasks in a production-esque environment to minimize the amount of triage that is needed after their tasks make it into production.

I'm not totally sure at the moment, but I wouldn't like to disallow that. Otherwise people would have to assign directly to self._result (if they needed it), therefore circumventing all the value checks.

Valid point. Makes more sense to me now.

The basic idea is that CheckDetail is a simple container for check results.

If it's deciding when to print straight to stdout, I'm not sure it's a simple container

update_result() is simply a convenience method for updating the result according to its severity. But if I force this behavior by default, it is no longer a simple container with a few convenience methods on top of it. Maybe it would help to rename update_result() to something better reflecting the fact that it works with result priority/severity? New name ideas welcome.

update_result works for me

Realistically, there will have to be some form of wrapper in Python - even if that's just the directive that handles shell scripts, perl or whatever else we may end up supporting in the future.

Yeah, but that's under the hood, invisible to the check developer. I was talking about AutoQATest-like wrappers.

While we do want to refrain from python-specific stuff to keep the door open to supporting other languages, I don't see why that means we can't use python-specific conveniences when working with checks written in python.

Sure, no disagreement here.

While I see where you're going, I think that's too simple and makes any filtering and log monitoring more difficult than it needs to be. There is value in being able to easily search for and filter out errors and warnings. Just because we're limited to stdout/stderr doesn't mean that we can't use a log format to distinguish errors and warnings from info and debug messages, even if they are dumped to the same logfile.

Hmm, but taskotron itself is not limited to stdout. Only the check is/can be. Our library will have logging handlers that can send all the warnings/errors/etc to a log file (/var/log/taskotron), or remote logging server, whatever we wish. So there's no need to parse stdout of the check. Some of the critical errors from our library will be reflected there, but they will also be reflected in our advanced logging targets, i.e. the logstash server. In my view, the check output is completely orthogonal to taskotron logging. They are two separate things, and even if the check simply prints to stdout, we don't lose any important message from libtaskotron.

Of course, if the check author uses the same logging system, we can do more - we can filter the very check output for errors (not originating from our library, but from the check itself). But honestly, how many of the third-party checks will do that? I assume none; only our own checks might have this integration. Also, are we really interested in it? Let's suppose some third-party check is really integrated with our python logging system and starts spewing lots of infos and warnings in the process. Do we really want to see all of this on our log server? Because personally I don't care much if glibc-test encounters some errors - it's its test author's problem, not mine. I only care for libtaskotron to work properly.

As for the checks we have under our control - say upgradepath - it might be nice to be able to filter for errors directly coming from upgradepath, not just libtaskotron. But for the basic functionality we don't really need this. If upgradepath ends in NEEDS_INSPECTION, I go through its log and find the cause. Anything more advanced might be nice, but I'm not even sure it would be related to CheckDetail, and even if it were, it's definitely not in the scope of this patch.

At the same time, I think that just using print statements is going to be problematic and removes a valuable level of information in the form of saying "this was an error message" or "this was just a warning" or "this was just a debug message and we can probably ignore it in most cases".

In my reworked upgradepath, I use printing when I want to print, and I use python logging when I want to log. Maybe I should rename CheckDetail.log() to CheckDetail.store() or something. Because it was never meant to replace logging. You can still log and I do it.

Using something like logging also allows us to have different log messages emitted in different configurations. This could mean that a check developer might enable debug logging when triaging a check failure/bug or that we enable debug logging in production only when we're seeing problems which need another level of information.

Yeah, good idea. I assume we could have a yaml directive:

debug: True

to enable libtaskotron debugging.
If the developer needs check debugging, the best way is to implement a --debug cmdline option and then modify the yaml file to run the command with this option. We might think of even better ways, but again, it's probably out of scope here.
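For illustration, such a --debug cmdline option could be wired up in a check roughly like this (a hypothetical argparse-based sketch):

import argparse
import logging

logger = logging.getLogger('upgradepath')

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--debug', action='store_true',
                        help='also print debug messages')
    return parser.parse_args()

if __name__ == '__main__':
    args = parse_args()
    logging.basicConfig()   # simple stderr handler for standalone runs
    logger.setLevel(logging.DEBUG if args.debug else logging.INFO)
    logger.debug('debug logging enabled')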

In the interest of being able to filter log messages for monitoring and to enable extra debug levels as needed without code changes, I think that using something like logging in the checks, even if it's a separate setup from the rest of libtaskotron, is a better choice than using bare print statements.

Yeah, look here:
https://bitbucket.org/fedoraqa/task-upgradepath/src/282a315c52363eefac466b717ca5017efc7b8c4a/upgradepath.py?at=feature/T79-upgradepath

import logging
logger = logging.getLogger('upgradepath')
logger.setLevel(logging.INFO)

And I use it heavily. But it's a different use case from using CheckDetail.log(), which is for storing the main output of the check. Heck, I'll really need to rename the method to avoid this confusion.

The basic idea is that CheckDetail is a simple container for check results.

If it's deciding when to print straight to stdout, I'm not sure it's a simple container

A simple container with a few convenience methods. This is one of them. Because I got tired of this:

msg = '  Latest package: youtube-dl-2014.03.07.1-1.fc21'
check_detail.output.append(msg)
print msg

and this seemed to be much more convenient:

check_detail.log('  Latest package: youtube-dl-2014.03.07.1-1.fc21')

I think this review process has become a bit too heavyweight. I feel like we're trying to have the whole project designed and thought through into the smallest detail in advance, before committing even the simplest patch. It doesn't seem like an iterative process much. And we don't have too much time to spare. I'd rather make changes faster and adjust if we find out we need to adjust, than trying to think of all the possibilities in advance. If we discuss our logging infrastructure and how it should work in tickets like these, it will get buried in the pages and pages of text, and we will never find it again when dealing with it in the really relevant tickets. I'd rather discuss the topic separately and once we reach a conclusion, then make changes in parts of code where it is needed.

kparal updated this revision. Mar 11 2014, 1:56 PM
  • rename log() to store()
tflink accepted this revision. Mar 12 2014, 7:26 AM

Hmm, but taskotron itself is not limited to stdout. Only the check is/can be. Our library will have logging handlers that can send all the warnings/errors/etc to a log file (/var/log/taskotron), or remote logging server, whatever we wish. So there's no need to parse stdout of the check. Some of the critical errors from our library will be reflected there, but they will also be reflected in our advanced logging targets, i.e. the logstash server. In my view, the check output is completely orthogonal to taskotron logging. They are two separate things, and even if the check simply prints to stdout, we don't lose any important message from libtaskotron.

While I agree that they aren't the same things, I don't see why we wouldn't want to collect at least some of the task logs outside of the log messages produced by libtaskotron directly.

One of the things I'd eventually like to do is have searching and analytics over all of the task logs. One thought on making this happen is to use logstash + elasticsearch. While I can think of other ways to accomplish task log searching, that would likely be one of the easiest options.

Of course, if the check author uses the same logging system, we can do more - we can filter the very check output for errors (not originating from our library, but from the check itself). But honestly, how many of the third-party checks will do that? I assume none; only our own checks might have this integration. Also, are we really interested in it? Let's suppose some third-party check is really integrated with our python logging system and starts spewing lots of infos and warnings in the process. Do we really want to see all of this on our log server? Because personally I don't care much if glibc-test encounters some errors - it's its test author's problem, not mine. I only care for libtaskotron to work properly.

I suppose a lot of this comes down to how much control we want to attempt asserting over 'third-party' checks. For the error concern, we could disable execution of any tasks which abuse the error tags or have noisy output. Not sure if that's a good idea or not but it would certainly be an option.

I don't see how suggesting/requiring a format akin to [<checkname> DEBUG] <timestamp>: some debug message here would be that bad and it would be completely compatible with stdout-only.
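For instance, that kind of format is easy to produce with stdlib logging while still writing only to stdout (the format string below is an approximation of the example above, not a settled convention):

import logging
import sys

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter(
    '[%(name)s %(levelname)s] %(asctime)s: %(message)s'))

logger = logging.getLogger('upgradepath')
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.debug('some debug message here')
# prints e.g.: [upgradepath DEBUG] 2014-03-12 07:26:00,123: some debug message here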

Yeah, good idea. I assume we could have a yaml directive:

Yeah, allowing a debug state to be set via directive or via internal runner setting sounds like a good plan. We could change it on a global or a per-task basis.

I think this review process has become a bit too heavyweight. I feel like we're trying to have the whole project designed and thought through into the smallest detail in advance, before committing even the simplest patch. It doesn't seem like an iterative process much.

Yeah, I think this is the longest review I've ever participated in - both in terms of time spent on it and lines of text written :)

And we don't have too much time to spare. I'd rather make changes faster and adjust if we find out we need to adjust, than trying to think of all the possibilities in advance. If we discuss our logging infrastructure and how it should work in tickets like these, it will get buried in the pages and pages of text, and we will never find it again when dealing with it in the really relevant tickets. I'd rather discuss the topic separately and once we reach a conclusion, then make changes in parts of code where it is needed.

You bring up an excellent point. I'm reminded of the saying "never let perfect be the enemy of good". I'm still not a huge fan of print statements but not enough to delay this any longer.

kparal closed this revision. Mar 12 2014, 12:25 PM

Closed by commit rLTRNbc3b5b6e3cb6 (authored by @kparal).