add CheckDetail class for easy results tracking
Closed, Public

Authored by kparal on Feb 27 2014, 5:59 PM.

Details

Summary

This adds a very stripped down version of TestDetail class from AutoQA
as CheckDetail. Its purpose is to allow simple tracking and evaluation
of check results during check execution. It's mainly useful for
multi-item checks like upgradepath. In the future, this class should
offer a simple CheckDetail->TAP output conversion (not implemented yet).

This was split out of the D10 review and is closely related to T85.
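For orientation, a rough sketch of the shape such a container class might take (the attribute names and defaults here are assumptions for illustration, not the reviewed code):

class CheckDetail(object):
    '''Sketch: a per-item container for one tested item's result and output.'''

    def __init__(self, result=None, output=None):
        self.result = result        # e.g. 'PASSED', 'FAILED', 'NEEDS_INSPECTION'
        self.output = output or []  # captured output lines for this item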

Test Plan

Unit tests are included and pass. Please review closely, this is my first
unit tests submission. Thanks.

Diff Detail

Lint
Lint Skipped
Unit
Unit Tests Skipped
tflink requested changes to this revision. Feb 27 2014, 6:32 PM

The code itself is pretty good, the only change I'd like to see is to have the tests broken up into individual tests instead of larger tests that check multiple things in the same test.

Can you file a task for generating TAP output from CheckDetail? It doesn't have to be done now, but I want to make sure it isn't lost.

testing/test_check.py
16

This should be broken up into multiple tests. In general, you want to check one thing per unit test so that when one fails, you know exactly where the failure is without a bunch more debugging.

40

this, in particular, needs to be in its own test. it wasn't immediately obvious to me why updating the result to PASS would return FAIL as a result
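For illustration, the requested one-check-per-test style could look roughly like this (hypothetical test names, not the contents of the actual diff):

from libtaskotron import check

class TestCheckDetail(object):

    def test_unset_result_reads_as_needs_inspection(self):
        detail = check.CheckDetail()
        assert detail.result == 'NEEDS_INSPECTION'

    def test_update_result_sets_initial_value(self):
        detail = check.CheckDetail()
        detail.update_result('PASSED')
        assert detail.result == 'PASSED'

    def test_update_result_does_not_downgrade(self):
        # a later PASSED must not override an earlier, more severe FAILED
        detail = check.CheckDetail()
        detail.update_result('FAILED')
        detail.update_result('PASSED')
        assert detail.result == 'FAILED'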

testing/test_python_utils.py
10

same with test_check - please split this up into separate tests instead of one big one

Can you file a task for generating TAP output from CheckDetail? It doesn't have to be done now, but I want to make sure it isn't lost.

Created as T88.

kparal updated this revision. Mar 4 2014, 12:53 PM
  • split tests into smaller chunks
tflink added a comment. Mar 5 2014, 5:19 PM

I'd still like to see the tests split up even further than this. Instead of writing comments, I went through and split them up as shown in the attached diff.

{F555}

From my initial conversation with Kamil, there is some disagreement on whether or not my proposed level of splitting up the tests is a good idea, and I'm planning to start a related conversation on qa-devel@.

kparal updated this revision. Mar 5 2014, 7:05 PM

In the meantime, I've added more comments to the tests.

tflink added a comment. Mar 5 2014, 7:35 PM

Now that I'm not just looking at the tests, I have a few other questions/concerns about this code.

What do we want to use CheckDetail for? I was under the impression that it would be an interface point between a check's internal status and reporting through/to taskotron - a uniform way for collecting results that could then be further processed so that we don't repeat the mistake of having different result handling code in every check.

If that is the case, what's the purpose of the log() method?

libtaskotron/check.py
73

why not just set "NEEDS_INSPECTION" as the default result instead of adding a property method? That seems simpler to me

86

Why do we need a setter method and an update_result method? I don't see the advantage of not combining them

What do we want to use CheckDetail for? I was under the impression that it would be an interface point between a check's internal status and reporting through/to taskotron - a uniform way for collecting results that could then be further processed so that we don't repeat the mistake of having different result handling code in every check.

I've tried to provide a reasonable explanation in the class docstring. If it's still not clear, I'll try to improve it. My idea is that CheckDetails should serve as a results container for complex test cases. In simple cases, like rpmlint, you don't need to care about it. You just run the rpmlint binary, capture the output and then call something like tap.format_TAP_msg(result='PASSED', name='firefox-20-1.fc20', desc=rpmlint_output).

However, imagine how you do it for upgradepath. You have 50 builds to test, and you need to capture and construct the TAP output properly. It's no longer that simple. You need some kind of container class to hold result+name+output for each of these 50 builds. We don't want every such check to reinvent the wheel; that's why there is the CheckDetail class. You create a new CheckDetail instance for every build tested, and track it like this:

details = { 'ucpp-1.3.4-3.fc19': CheckDetail(),
             'vpnc-0.5.3-19.svn457.fc19': CheckDetail(),
             'xmp-4.0.7-1.fc19': CheckDetail(),
             ...
           }

(Which reminds me, it might be a good idea to add a name attribute to CheckDetail, because not everyone will track the names externally as I do. It was there before, but I removed it when stripping down the class. I'll probably add it back.)

One such instance could contain something like this:
result='PASSED'
output=

============================================================
ucpp-1.3.4-3.fc19 into f19-updates
============================================================
[ OK ] f19
	Latest package: None
[ OK ] f20 + f20-updates
	Latest package: None
	Latest pending package: ucpp-1.3.4-3.fc20
[ OK ] f21
	Latest package: ucpp-1.3.4-3.fc21
	Latest pending package: None
RESULT: PASSED

And another instance:
result='PASSED'
output=

============================================================
vpnc-0.5.3-19.svn457.fc19 into f19-updates
============================================================
[ OK ] f19
	Latest package: vpnc-0.5.3-17.svn457.fc19
[ OK ] f20 + f20-updates
	Latest package: vpnc-0.5.3-19.svn457.fc20
	Latest pending package: vpnc-0.5.3-20.svn457.fc20
[ OK ] f21
	Latest package: vpnc-0.5.3-20.svn457.fc21
	Latest pending package: None
RESULT: PASSED

And at the end, you'll call something like CheckDetail.createTAP(details.values()) and it will create a combined (multi-item) TAP output of all CheckDetail instances. This is not implemented yet.
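For the record, such a createTAP() helper might look roughly like this (a sketch only, since it is not implemented yet; the exact TAP layout and the name attribute are assumptions):

def create_tap(details):
    '''Sketch: combine several CheckDetail instances into one TAP document.'''
    lines = ['TAP version 13', '1..%d' % len(details)]
    for num, detail in enumerate(details, start=1):
        status = 'ok' if detail.result == 'PASSED' else 'not ok'
        lines.append('%s %d %s' % (status, num, detail.name))
        # the captured per-item output goes into a YAML diagnostic block
        lines.append('  ---')
        lines.append('  details: |')
        for line in detail.output:
            lines.append('    ' + line)
        lines.append('  ...')
    return '\n'.join(lines)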

If that is the case, what's the purpose of the log() method?

I want the script to be usable as a standalone tool as well. That means I don't want to print just the TAP output, I want a 'normal' output printed to stdout as well, so that people can use it without running it through taskotron, or can easily debug it. I probably want to print TAP output only when the --tap option is given, or maybe --tap <filename>. The 'normal' output is also useful for debugging.

This all means that I often need the same output to go to several places:
a) stdout
b) CheckDetail.output, for future conversion to TAP

That's what CheckDetail.log() does. It allows people to easily print to stdout and track the output in a CheckDetail instance using a single call.
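In other words, log() boils down to something like this (a sketch of the described behavior; the keyword argument name is an assumption):

class CheckDetail(object):
    def __init__(self):
        self.output = []

    def log(self, msg, printout=True):
        '''Append msg to self.output and optionally echo it to stdout.'''
        self.output.append(msg)
        if printout:
            print msg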

libtaskotron/check.py
73

I went ahead and re-implemented that, and only then, thanks to my test suite, found out why it was implemented this way. Good that we use unit tests now! :-)

There's a difference between an uninitialized state and NEEDS_INSPECTION when it comes to update_result(). If you have an uninitialized state (None), you can call update_result('PASSED') and it is set to PASSED. However, if you already assign NEEDS_INSPECTION in the constructor, and then call update_result('PASSED'), it stays at NEEDS_INSPECTION, because it is a higher-priority result. Of course you can assign to the result attribute manually, but that defeats the purpose of update_result().

When you're dealing with result states, you want to have the result uninitialized until you set it for the first time. That's why it works with None internally. But if you forget to set the result completely, and call createTAP() or similar, you want to be notified that something is wrong; you don't want to see result: None. That's why it returns NEEDS_INSPECTION if you have failed to assign any result value at all (self._result is still None).
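That behavior might be sketched like this (assumed implementation; None is the internal 'unset' marker and NEEDS_INSPECTION only appears when reading an unset result):

class CheckDetail(object):
    def __init__(self):
        self._result = None            # stays None until the first real assignment

    @property
    def result(self):
        # an unset result reads as NEEDS_INSPECTION so it is never silently lost
        return self._result if self._result is not None else 'NEEDS_INSPECTION'

    @result.setter
    def result(self, value):
        # plain assignment; priority merging is update_result()'s job
        self._result = value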

86

I'll clarify the docstring more:

'''Update self.result with the provided value if and only if it has
a higher priority than its current value (according to
self.result_order). Otherwise this call does nothing.

This is useful if your check performs a number of 'sub-tasks', each
passing or failing, and you want the final result to be the worst/most
serious one of those. You can just keep calling self.update_result()
and the highest-priority result type will be stored in self.result at
the end.

@param result the result value to assign to self.result if it has a
              higher priority than its current value. Handles None
              values gracefully (no action).
@throw TaskotronValueError if any of the result keywords are not
                           specified in @classattr result_order'''
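A minimal sketch of the behavior that docstring describes (the result_order values and the use of a plain ValueError here are assumptions):

class CheckDetail(object):
    # ordered from lowest to highest priority (assumed values)
    result_order = ['PASSED', 'FAILED', 'NEEDS_INSPECTION']

    def __init__(self):
        self._result = None

    def update_result(self, result):
        '''Keep the higher-priority of the current and the provided result.'''
        if result is None:
            return                      # None is handled gracefully, no action
        if result not in self.result_order:
            raise ValueError('Unknown result type: %s' % result)  # TaskotronValueError in the real code
        if (self._result is None or
                self.result_order.index(result) >
                self.result_order.index(self._result)):
            self._result = result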

Does it explain why update_result() is different from setting the result attribute directly? Do you still see any way of combining them?

kparal updated this revision. Mar 6 2014, 10:29 AM
  • add missing test for invalid result type in constructor; also update docstring for update_result
kparal updated this revision. Mar 6 2014, 12:46 PM
  • add name attribute

I have realized that it will be needed for the createTAP() method anyway.

tflink added a comment. Mar 6 2014, 3:16 PM

My question about the purpose of CheckDetail was more of a leadup to the question about log(); I didn't understand why having print statements in log() fits with that use case. What you described fits with what I had understood of the purpose.

I want the script to be usable as a standalone tool as well. That means I don't want to print just the TAP output, I want a 'normal' output printed to stdout as well, so that people can use it without running it through taskotron, or can easily debug it. I probably want to print TAP output only when the --tap option is given, or maybe --tap <filename>. The 'normal' output is also useful for debugging.

That's what CheckDetail.log() does. It allows people to easily print to stdout and track the output in a CheckDetail instance using a single call.

Then I'll be a bit more specific about what I'm concerned about. I understand that there are other uses for CheckDetail and I agree that it should support more output types than TAP. What I'm not clear on is why it's printing during detail creation. The specific concerns I have are:

  • it's using print instead of a logging mechanism. this makes the job of centralized logging and log output control more difficult since print statements like that always go to stdout
  • it's printing while the detail object is being filled out. If someone is going to use the CheckDetail object outside of the production runner (which I also expect), why isn't it better to create the detail object, populate it with data and then log the completed CheckDetail object? What is the use case for having the output printed when the details are being added to the CheckDetail object in addition to when the output is created (in TAP or string form)?

Wouldn't both use cases be better served by handling log output (including stdout in the local case) in the task and using the CheckDetail object as a method for pulling together results and details about those results?

Does it explain why update_result() is different from setting the result attribute directly? Do you still see any way of combining them?

That depends, is there a normal/common use case where you'd want to be able to override the current _result with a lower priority result? If not, I don't see why they couldn't be combined into the same method.

libtaskotron/check.py
63

Wouldn't it be better to have name and summary default to "" (empty string) and output to [] (that's covered in the 'output or []' statement, though)? I'd rather not have to worry about getting None when looking for the name or summary.

Then I'll be a bit more specific about what I'm concerned about. I understand that there are other uses for CheckDetail and I agree that it should support more output types than TAP.

Actually, I don't consider the logging/printing mechanism to be an output type. I just consider it handy to sometimes print the contents during the check run. If it's more often useful not to print than to print, we can switch the default from print=True to print=False.

What I'm not clear on is why it's printing during detail creation.

I guess the number one answer is interactivity. I don't want my check to be silent for long minutes and then dump all the results to stdout at the end. That's very annoying for anyone who runs that script, even for developers. Of course it doesn't matter that much (it still matters a bit) in an automated testing environment.

So, as soon as I have something, I want to add it to the relevant CheckDetail instance, and I want to print it to stdout at the same time. Of course I can do that only because I handle CheckDetails sequentially, one by one. If I needed to access more of them in parallel, it would no longer make sense to print to stdout in realtime. In that case I would print to stdout once some CheckDetail output was complete, or, as the last resort, at the end of the check execution.

The specific concerns I have are:

  • it's using print instead of a logging mechanism. this makes the job of centralized logging and log output control more difficult since print statements like that always go to stdout

I was thinking a bit about our way of logging check output in the past, and maybe I have some assumptions that are not true. Let's discuss them.

  1. I believe that whenever it's possible, we should provide a targeted log output for a specific test item in question (NVR, update, compose, etc). Therefore, for upgradepath check, we shouldn't provide full 500 lines of output for foo-1.0-1.fc20, but only those 10 lines that are directly relevant to it. We already had this in AutoQA. This is kind of trivial for simple checks like rpmlint, because full output == targeted output. For upgradepath, this can be achieved using CheckDetail. For some checks like depcheck, it might be hard to provide any targeted output at all, therefore we might only provide the full check output (but we should try to make this rare).
  2. I believe it should be possible to access the full log of the check execution for any check result. We already had this in AutoQA as the full.log. But, in AutoQA it was really difficult to get a clear picture of what log is used for which purpose, and how to fill their contents properly. We must make it much easier this time. For this reason, I assume there will be only two user-available logs in Taskotron - targeted log (optional), and full log. Moreover, I really believe that full log must be automatically collected. In AutoQA, we had the requirement to use our Python class and that way we nudged people to populate the log correctly. In Taskotron, we want to allow the checks to be implemented in any language, without forcing them to use Python wrappers. Only the yaml file will control what gets run. Therefore we will need to redirect the commands' output automatically (change the stdout/stderr file descriptors and intercept all output); a rough sketch of this approach follows this list. The benefit of this approach is its very simplicity - what gets printed gets logged, it's as simple as that.
  3. Python logging mechanism is something that we use for our libraries, but not something that we use for our or third-party checks. Because full check output gets logged anyway (point 2) -> therefore simple prints are enough; and the checks can be in any programming language anyway -> therefore python-based logging can't be used in general. That's why CheckDetail uses plain printing, and no python logging calls. Of course, the check author can use python-based logging if he wishes to do so (and implements the check in Python), but I think our centralized logging should intercept mainly libtaskotron.* namespace by default. Of course this is up to discussion, I don't think it's really important at the moment.
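The automatic capture described in point 2 could be sketched roughly like this (a hypothetical helper, not an existing libtaskotron function):

import subprocess
import sys

def run_and_capture(cmd, logfile):
    '''Run cmd, echoing its combined stdout/stderr while saving it to logfile.'''
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    with open(logfile, 'w') as log:
        for line in iter(proc.stdout.readline, ''):
            sys.stdout.write(line)   # what gets printed...
            log.write(line)          # ...gets logged
    return proc.wait()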

If you combine these three bits, I hope that explains the implementation. Of course, maybe I assumed something that you're planning to do differently.

  • it's printing while the detail object is being filled out. If someone is going to use the CheckDetail object outside of the production runner (which I also expect), why isn't it better to create the detail object, populate it with data and then log the completed CheckDetail object?

As already mentioned, my primary use case was interactivity. Of course your way is also possible:

cd = CheckDetail()
cd.log('foobar', print=False)
# do something
# do something
print '\n'.join(cd.output)

If you think that log(msg, print=False) will be used a lot, we could create the method in a different way:

def log(msg):
   '''Add the string to self.output'''

def logp(msg):
   '''Add the string to self.output and print it to stdout'''

That saves you from typing print=False too often.

By the way, I'm not sure what you mean by using the CheckDetail object outside of the production runner. AIUI, it should behave the same, no matter whether run standalone or using the Taskotron runner.

What is the use case for having the output printed when the details are being added to the CheckDetail object in addition to when the output is created (in TAP or string form)?

I hope I already answered this. Realtime printing is better for interactivity than post-execution printing.

Wouldn't both use cases be better served by handling log output (including stdout in the local case) in the task and using the CheckDetail object as a method for pulling together results and details about those results?

The way you distinguish local execution and taskotron runner execution seems to indicate we're not on the same page there, but I'm not actually sure what the difference should be. Maybe a piece of code could help.

> Does it explain why update_result() is different from setting the result attribute directly? Do you still see any way of combining them?
That depends, is there a normal/common use case where you'd want to be able to override the current _result with a lower priority result? If not, I don't see why they couldn't be combined into the same method.

I'm not totally sure at the moment, but I wouldn't like to disallow that. Otherwise people would have to assign directly to self._result (if they needed it), therefore circumventing all the value checks.

The basic idea is that CheckDetail is a simple container for check results. update_result() is simply a convenience method for updating the result according to its severity. But if I force this behavior by default, it is no longer a simple container with a few convenience methods on top of it. Maybe it would help to rename update_result() to something better reflecting the fact that it works with result priority/severity? New name ideas welcome.

libtaskotron/check.py
63

Yeah, you're right, that's better. I'll update the name and summary defaults. Output is already [] by default (that can't be assigned in the method declaration, otherwise that [] would be shared across all instances of the class).
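The gotcha being avoided there, for reference (a generic illustration, not the reviewed constructor):

class Example(object):
    def __init__(self, name='', summary='', output=None):
        self.name = name
        self.summary = summary
        # a mutable default like output=[] would be created once and shared by
        # every call, so None is the default and a fresh list is made here instead
        self.output = output or []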

kparal updated this revision. Mar 10 2014, 1:40 PM
  • set name='' and summary='' by default

I believe that whenever it's possible, we should provide a targeted log output for a specific test item in question (NVR, update, compose, etc). Therefore, for upgradepath check, we shouldn't provide full 500 lines of output for foo-1.0-1.fc20, but only those 10 lines that are directly relevant to it. We already had this in AutoQA. This is kind of trivial for simple checks like rpmlint, because full output == targeted output. For upgradepath, this can be achieved using CheckDetail. For some checks like depcheck, it might be hard to provide any targeted output at all, therefore we might only provide the full check output (but we should try to make this rare).

Sure, that makes sense. I agree that we don't want to be presenting overly verbose logs to users unless they specifically ask for those verbose logs.

I believe it should be possible to access the full log of the check execution for any check result. We already had this in AutoQA as the full.log. But, in AutoQA it was really difficult to get a clear picture of what log is used for which purpose, and how to fill their contents properly. We must make it much easier this time.

Sure, easier is generally better but I also think that there is a limit to how far we can take the idea of simplicity while still keeping things useful. IIRC, we weren't really making full use of the logging mechanisms in Autotest, anyways.

For this reason, I assume there will be only two user-available logs in Taskotron - targeted log (optional), and full log. Moreover, I really believe that full log must be automatically collected.

As a semi-side note, I've been wondering whether we want to start doing reports similar to how AutoQA did them, with pretty HTML instead of just plain text log files. Not required for the initial deployment, but something that we may want to look into for the semi-near future.

In AutoQA, we had the requirement to use our Python class and that way we nudged people to populate the log correctly. In Taskotron, we want to allow the checks to be implemented in any language, without forcing them to use Python wrappers.

Realistically, there will have to be some form of wrapper in Python - even if that's just the directive that handles shell scripts, perl or whatever else we may end up supporting in the future. While we do want to refrain from python-specific stuff to keep the door open to supporting other languages, I don't see why that means we can't use python-specific conveniences when working with checks written in python.

The way I see non-python language support working is through support libraries and common definitions for:

  • logging format
  • return values

While not strictly required, I suspect that the most popular languages would benefit from common libraries to make this process easier, similar to what you're doing with CheckDetail.

Only the yaml file will control what gets run. Therefore we will need to redirect the commands' output automatically (change the stdout/stderr file descriptors and intercept all output). The benefit of this approach is its very simplicity - what gets printed gets logged, it's as simple as that.

While I see where you're going, I think that's too simple and makes any filtering and log monitoring more difficult than it needs to be. There is value in being able to easily search for and filter out errors and warnings. Just because we're limited to stdout/stderr doesn't mean that we can't use a log format to distinguish errors and warnings from info and debug messages, even if they are dumped to the same logfile.

Python logging mechanism is something that we use for our libraries, but not something that we use for our or third-party checks. Because full check output gets logged anyway (point 2) -> therefore simple prints are enough; and the checks can be in any programming language anyway -> therefore python-based logging can't be used in general. That's why CheckDetail uses plain printing, and no python logging calls. Of course, the check author can use python-based logging if he wishes to do so (and implements the check in Python), but I think our centralized logging should intercept mainly libtaskotron.* namespace by default. Of course this is up to discussion, I don't think it's really important at the moment.

And this is where we start disagreeing. I'm not overly attached to the idea of using the same logger as the rest of Taskotron. I see what you're saying about limiting output to stdout/stderr in the same way that non-python languages would be treated. At the same time, I think that just using print statements is going to be problematic and removes a valuable level of information in the form of saying "this was an error message" or "this was just a warning" or "this was just a debug message and we can probably ignore it in most cases". Using something like logging also allows us to have different log messages emitted in different configurations. This could mean that a check developer might enable debug logging when triaging a check failure/bug or that we enable debug logging in production only when we're seeing problems which need another level of information. I'm not saying that we need to do either of those specific things but it does allow us the flexibility of making any of those changes in the future. With relying solely on print statements, any change to logging verbosity needs to be done in code. It would encourage either overly verbose full logs or fewer logging statements than might be needed in a debug situation.

In the interest of being able to filter log messages for monitoring and to enable extra debug levels as needed without code changes, I think that using something like logging in the checks, even if it's a separate setup from the rest of libtaskotron, is a better choice than using bare print statements.

By the way, I'm not sure what you mean by using the CheckDetail object outside of the production runner. AIUI, it should behave the same, no matter whether run standalone or using the Taskotron runner.

While I agree that it needs to be as similar as possible, I don't see how it's possible for the runner to run identically in both production and in a dev setting. There will be some differences in environment: disabling reporting in most cases, faking some variables etc. but those changes should be minimal. At the same time, I see value in being able to use (almost) the same runner, the same configuration facility just using different values to make the local development/execution experience better and more flexible than it would be in production.

I do see value in having an easy way to mock the production environment regardless of other configuration settings, though. We would be well served by making it easy for folks to demo their tasks in a production-esque environment to minimize the amount of triage that is needed after their tasks make it into production.

I'm not totally sure at the moment, but I wouldn't like to disallow that. Otherwise people would have to assign directly to self._result (if they needed it), therefore circumventing all the value checks.

Valid point. Makes more sense to me now.

The basic idea is that CheckDetail is a simple container for check results.

If it's deciding when to print straight to stdout, I'm not sure it's a simple container

update_result() is simply a convenience method for updating the result according to its severity. But if I force this behavior by default, it is no longer a simple container with a few convenience methods on top of it. Maybe it would help to rename update_result() to something better reflecting the fact that it works with result priority/severity? New name ideas welcome.

update_result works for me

Realistically, there will have to be some form of wrapper in Python - even if that's just the directive that handles shell scripts, perl or whatever else we may end up supporting in the future.

Yeah, but that's under the hood, invisible to the check developer. I was talking about AutoQATest-like wrappers.

While we do want to refrain from python-specific stuff to keep the door open to supporting other languages, I don't see why that means we can't use python-specific conveniences when working with checks written in python.

Sure, no disagreement here.

While I see where you're going, I think that's too simple and makes any filtering and log monitoring more difficult than it needs to be. There is value in being able to easily search for and filter out errors and warnings. Just because we're limited to stdout/stderr doesn't mean that we can't use a log format to distinguish errors and warnings from info and debug messages, even if they are dumped to the same logfile.

Hmm, but taskotron itself is not limited to stdout. Only the check is/can be. Our library will have logging handlers that can send all the warnings/errors/etc to a log file (/var/log/taskotron), or remote logging server, whatever we wish. So there's no need to parse stdout of the check. Some of the critical errors from our library will be reflected there, but they will also be reflected in our advanced logging targets, i.e. the logstash server. In my view, the check output is completely orthogonal to taskotron logging. They are two separate things, and even if the check simply prints to stdout, we don't lose any important message from libtaskotron.

Of course, if the check author uses the same logging system, we can do more - we can filter the very check output for errors (not originating from our library, but from the check itself). But honestly, how many of the third-party checks will do that? I assume none; only our own checks might have this integration. Also, are we really interested in it? Let's suppose some third-party check is really integrated with our python logging system and starts spewing lots of infos and warnings in the process. Do we really want to see all of this on our log server? Because personally I don't care much if glibc-test encounters some errors - it's its test author's problem, not mine. I only care for libtaskotron to work properly.

As for the checks we have under our control - say upgradepath - it might be nice to be able to filter for errors directly coming from upgradepath, not just libtaskotron. But for the basic functionality we don't really need this. If upgradepath ends in NEEDS_INSPECTION, I go through its log and find the cause. Anything more advanced might be nice, but I'm not even sure it would be related to CheckDetail, and even if it were, it's definitely not in the scope of this patch.

At the same time, I think that just using print statements is going to be problematic and removes a valuable level of information in the form of saying "this was an error message" or "this was just a warning" or "this was just a debug message and we can probably ignore it in most cases".

In my reworked upgradepath, I use printing when I want to print, and I use python logging when I want to log. Maybe I should rename CheckDetail.log() to CheckDetail.store() or something. Because it was never meant to replace logging. You can still log and I do it.

Using something like logging also allows us to have different log messages emitted in different configurations. This could mean that a check developer might enable debug logging when triaging a check failure/bug or that we enable debug logging in production only when we're seeing problems which need another level of information.

Yeah, good idea. I assume we could have a yaml directive:

debug: True

to enable libtaskotron debugging.
If the developer needs check debugging, the best way is to implement a --debug cmdline option and then modify the yaml file to run the command with this option. We might think of even better ways, but again, it's probably out of scope here.
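For illustration, such a --debug cmdline option could be wired up in a check roughly like this (a hypothetical argparse-based sketch):

import argparse
import logging

logger = logging.getLogger('upgradepath')

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--debug', action='store_true',
                        help='also print debug messages')
    return parser.parse_args()

if __name__ == '__main__':
    args = parse_args()
    logging.basicConfig()   # simple stderr handler for standalone runs
    logger.setLevel(logging.DEBUG if args.debug else logging.INFO)
    logger.debug('debug logging enabled')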

In the interest of being able to filter log messages for monitoring and to enable extra debug levels as needed without code changes, I think that using something like logging in the checks, even if it's a separate setup from the rest of libtaskotron, is a better choice than using bare print statements.

Yeah, look here:
https://bitbucket.org/fedoraqa/task-upgradepath/src/282a315c52363eefac466b717ca5017efc7b8c4a/upgradepath.py?at=feature/T79-upgradepath

import logging
logger = logging.getLogger('upgradepath')
logger.setLevel(logging.INFO)

And I use it heavily. But it's a different use case from using CheckDetail.log(), which is for storing the main output of the check. Heck, I'll really need to rename the method to avoid this confusion.

The basic idea is that CheckDetail is a simple container for check results.

If it's deciding when to print straight to stdout, I'm not sure it's a simple container

A simple container with a few convenience methods. This is one of them. Because I got tired of this:

msg = '  Latest package: youtube-dl-2014.03.07.1-1.fc21'
check_detail.output.append(msg)
print msg

and this seemed to be much more convenient:

check_detail.log('  Latest package: youtube-dl-2014.03.07.1-1.fc21')

I think this review process has become a bit too heavyweight. I feel like we're trying to have the whole project designed and thought through into the smallest detail in advance, before committing even the simplest patch. It doesn't seem like an iterative process much. And we don't have too much time to spare. I'd rather make changes faster and adjust if we find out we need to adjust, than trying to think of all the possibilities in advance. If we discuss our logging infrastructure and how it should work in tickets like these, it will get buried in the pages and pages of text, and we will never find it again when dealing with it in the really relevant tickets. I'd rather discuss the topic separately and once we reach a conclusion, then make changes in parts of code where it is needed.

kparal updated this revision. Mar 11 2014, 1:56 PM
  • rename log() to store()
tflink accepted this revision. Mar 12 2014, 7:26 AM

Hmm, but taskotron itself is not limited to stdout. Only the check is/can be. Our library will have logging handlers that can send all the warnings/errors/etc to a log file (/var/log/taskotron), or remote logging server, whatever we wish. So there's no need to parse stdout of the check. Some of the critical errors from our library will be reflected there, but they will also be reflected in our advanced logging targets, i.e. the logstash server. In my view, the check output is completely orthogonal to taskotron logging. They are two separate things, and even if the check simply prints to stdout, we don't lose any important message from libtaskotron.

While I agree that they aren't the same things, I don't see why we wouldn't want to collect at least some of the task logs outside of the log messages produced by libtaskotron directly.

One of the things I'd eventually like to do is have searching and analytics over all of the task logs. One thought on making this happen is to use logstash + elasticsearch. While I can think of other ways to accomplish task log searching, that would likely be one of the easiest options.

Of course, if the check author uses the same logging system, we can do more - we can filter the very check output for errors (not originating from our library, but from the check itself). But honestly, how many of the third-party checks will do that? I assume none; only our own checks might have this integration. Also, are we really interested in it? Let's suppose some third-party check is really integrated with our python logging system and starts spewing lots of infos and warnings in the process. Do we really want to see all of this on our log server? Because personally I don't care much if glibc-test encounters some errors - it's its test author's problem, not mine. I only care for libtaskotron to work properly.

I suppose a lot of this comes down to how much control we want to attempt asserting over 'third-party' checks. For the error concern, we could disable execution of any tasks which abuse the error tags or have noisy output. Not sure if that's a good idea or not but it would certainly be an option.

I don't see how suggesting/requiring a format akin to [<checkname> DEBUG] <timestamp>: some debug message here would be that bad and it would be completely compatible with stdout-only.
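For instance, that kind of format is easy to produce with stdlib logging while still writing only to stdout (the format string below is an approximation of the example above, not a settled convention):

import logging
import sys

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter(
    '[%(name)s %(levelname)s] %(asctime)s: %(message)s'))

logger = logging.getLogger('upgradepath')
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.debug('some debug message here')
# prints e.g.: [upgradepath DEBUG] 2014-03-12 07:26:00,123: some debug message here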

Yeah, good idea. I assume we could have a yaml directive:

Yeah, allowing a debug state to be set via directive or via internal runner setting sounds like a good plan. We could change it on a global or a per-task basis.

I think this review process has become a bit too heavyweight. I feel like we're trying to have the whole project designed and thought through into the smallest detail in advance, before committing even the simplest patch. It doesn't seem like an iterative process much.

Yeah, I think this is the longest review I've ever participated in - both in terms of time spent on it and lines of text written :)

And we don't have too much time to spare. I'd rather make changes faster and adjust if we find out we need to adjust, than trying to think of all the possibilities in advance. If we discuss our logging infrastructure and how it should work in tickets like these, it will get buried in the pages and pages of text, and we will never find it again when dealing with it in the really relevant tickets. I'd rather discuss the topic separately and once we reach a conclusion, then make changes in parts of code where it is needed.

You bring up an excellent point. I'm reminded of the saying "never let perfect be the enemy of good". I'm still not a huge fan of print statements but not enough to delay this any longer.

kparal closed this revision. Mar 12 2014, 12:25 PM

Closed by commit rLTRNbc3b5b6e3cb6 (authored by @kparal).