Nosetests Log Capture Pitfall

I use nosetests a lot, and one of its nicer features is its ability to capture logs emitted during tests and print them on failure, to help you understand what went wrong.

However, when using this feature, you should be aware of a small (yet annoying) pitfall. Let's take the following code, for example:

import unittest
import logging

class TestCase(unittest.TestCase):
    def test(self):
        x = []
        for i in range(10):
            x.append(i)
            logging.debug("x is now %s", x)
        assert False

It's obvious why this test fails; it is kept trivial on purpose, for the sake of simplicity. Running it with nose, however, yields the following output:

======================================================================
FAIL: test (test.TestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
File "test.py", line 10, in test
    assert False
AssertionError: 
-------------------- >> begin captured logging << --------------------
root: DEBUG: x is now [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
root: DEBUG: x is now [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
root: DEBUG: x is now [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
root: DEBUG: x is now [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
root: DEBUG: x is now [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
root: DEBUG: x is now [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
root: DEBUG: x is now [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
root: DEBUG: x is now [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
root: DEBUG: x is now [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
root: DEBUG: x is now [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
--------------------- >> end captured logging << ---------------------

What's going on? This can be very confusing at first -- how does our x list appear in its final, fully-populated state even in the very first log line? The answer lies in the way the logging module works, and in the way nose uses that mechanism to capture logs.

When a log is emitted, logging actually creates a record object describing it. This record holds the log's metadata (time, log level, etc.), and it keeps the message (or msg, as the record calls it) and the formatting args separately. The reason is mostly performance -- logging postpones the message formatting to the very last moment, when the record is formatted and emitted by the logging handlers. This saves the formatting overhead when logs end up filtered out by the log level cutoff.
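To make this concrete, here is a minimal sketch (plain Python, no nose involved) that grabs the raw LogRecord objects with a small custom handler and shows that msg and args are stored unformatted. The RecordGrabber class and the records list are mine, not part of logging or nose:

import logging

records = []

class RecordGrabber(logging.Handler):
    # Keeps the raw LogRecord objects without formatting them.
    def emit(self, record):
        records.append(record)

root = logging.getLogger()
root.addHandler(RecordGrabber())
root.setLevel(logging.DEBUG)

x = []
for i in range(3):
    x.append(i)
    logging.debug("x is now %s", x)

for record in records:
    print(record.msg, record.args)  # the same format string and the *same* list object each time
    print(record.getMessage())      # interpolation happens only now, so every line shows [0, 1, 2]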

Nose's log capture is built on a standard library logging handler, BufferingHandler, which merely stores log records in memory without formatting them, only flushing them between tests. Since the records are not formatted, msg and args are kept separately throughout the test's execution. They're only formatted if necessary, i.e. on failure, at which point the args are interpolated with whatever state they hold at that moment. In our case, that is the already-full list.
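You can reproduce the same effect outside of nose by using BufferingHandler directly. This is only a sketch of the mechanism, not nose's actual plugin code:

import logging
from logging.handlers import BufferingHandler

# A large capacity so the handler never flushes on its own during the "test".
handler = BufferingHandler(capacity=1000)
root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.DEBUG)

x = []
for i in range(10):
    x.append(i)
    logging.debug("x is now %s", x)

# Simulate "the test failed, dump the captured logs":
formatter = logging.Formatter("%(name)s: %(levelname)s: %(message)s")
for record in handler.buffer:
    print(formatter.format(record))  # every line shows the final, fully-populated list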

Needless to say, this can happen in much less trivial situations, where the inconsistency in the logs is even harder to spot.

So how do we solve this? Many would immediately jump to formatting the message eagerly:

logging.debug("x is now %s" % (x,))

Or this:

logging.debug("x is now {}".format(x))

However, I wouldn't recommend it. The delayed formatting may seem like a petty optimization, but removing it can have a real cost -- some objects have seriously non-trivial __repr__/__str__ implementations, ranging from O(n!) computations to network round-trips, as in the case of network proxy objects. You avoid this pitfall while testing, but you're left with loads of unnecessary formatting in production code.
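Here is a hedged illustration of that cost (the ExpensiveRepr class is just a stand-in I made up): with lazy %-style formatting, a filtered-out log never touches __repr__ at all, while eager formatting pays for it on every call:

import logging
import time

class ExpensiveRepr:
    def __repr__(self):
        time.sleep(0.5)  # stand-in for a costly computation or a network round-trip
        return "<expensive>"

obj = ExpensiveRepr()
logging.getLogger().setLevel(logging.WARNING)  # DEBUG records are filtered out

logging.debug("state: %s", obj)      # cheap: the record is dropped before any formatting happens
logging.debug("state: %s" % (obj,))  # pays the 0.5 seconds even though nothing gets logged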

Another point is that if you have many tests (as you should), each with many such log calls, the extra formatting is bound to slow your test suite down significantly. If you're anything like me, you want your tests to be as fast as possible for immediate feedback, so this becomes annoying.

We should treat the delayed __str__/__repr__ call as a feature, not a bug. Representing objects nicely only when we actually need to is a key convenience for diagnosing applications and for interactivity -- as long as we're aware of the possible implications of this behavior.
