3% of 666 Python codebases we checked had a silently failing unit test

4 min readFeb 16, 2022

Lets coin a name for a very special type of unit test:

Schroeder’s unit-test: a unit test that passes but fails to test the thing we want to test.

This article focuses on our code scanning of 666 Python codebases to detect one such Schroeder’s unit tests. Specifically, this gotcha (which we found in 20 codebases and later pushed an auto-generated fix for):

Do you see the mistake? It’s a relatively common mistake for a developer to make. assertEqual and assertTrue were muddled up. In fact of the 666 codebases we checked, we found 3% of them mixed the two up. This is a real problem that affects real codebases, even active large ones with a huge community and strong testing and code review practices:

Python.org (ironically enough)
Celery
UK Ministry of Justice
Ansible
Django CMS
Keras
PyTorch
Ray
Salt

The fact the problem is so widespread suggests a key cause of this mistake is because it’s so easy to miss during code review. A quick glance at the code does not raise any red flags. It “Looks Good To Me!”.

Schroeder’s unit-test

In the late 19th century Mark Twain apparently said It’s not what you know that gets you. it’s what you think you know but ain’t. What was true in late 19th century literature is also true in early 21st Python unit testing: tests that don’t really test the thing we want to test is worse than no tests at all because a Schroeder’s test gives unwarranted confidence that the feature is safe.

Have you ever encountered this workflow:

Code change committed, and the tests for the feature passes locally.
The test ran in CI during the pull request build, and no failures reported.
The change was peer reviewed and accepted during code review. (“LGTM!”)
The change was then deployed to prod… only for it to fail upon first encounter with the user
After reading the bug report proclaim to yourself “but the test passed!”

Upon revisiting the unit test it’s then noted that the unit test that gave such great confidence during the software development life cycle was in fact not testing the feature at all. If you encountered this problem then you encountered this class of unit tests which this specific TestCase.assertTrue is one of.

The TestCase.assertTrue gotcha

While pytest is very popular, 28% of codebases still use the built-in unittest package. The codebases that do use the built-in unittest package are at risk of the assertTrue gotcha:

The docs for assertTrue state that test that expr is True, but this is an incomplete explanation. Really it tests that the expr is truthy, meaning if any of the following values are used in as the first argument the test will pass:

'foo', ‘bar’, 1, [1], {‘a’}, True

And conversely with falsey values the test will fail:

'', 0, [], {}, False,

assertTrue also accepts a second argument, which is the custom error message to show if the first argument is not truthy. This call signature allows the mistake to be made and the test to pass and therefore possibly fail silently.

For example when making 20 PRs fixing the problems we found one of the tests started to fail due to faulty application logic once the test was no longer a Schroeder’s unit test:

We also found that at least two of the tests failed because the logic in the test was wrong. This is also bad if we consider unit tests to be a form of documentation describing how the product works. From a developer’s point of view, unit tests can be considered more trustworthy that comments and doc strings because comments and doc strings are claims written (perhaps long ago and perhaps now obsolete or incomplete), while a passing unit test shows a logic contract is being upheld…as long as the test is not a Schroeder’s test!

Perhaps more data is needed, but this could mean 15% (3 in 20) of Schroeder’s unit tests are hiding broken functionality.

TestCase.assertEqual

When the CodeReview.doctor github bot detects this mistake in a GitHub pull request the following solution is suggested:

You can securely scan your entire codebase online for free at CodeReview.doctor

3% of 666 Python codebases we checked had a silently failing unit test

Schroeder’s unit-test

The TestCase.assertTrue gotcha

TestCase.assertEqual

Written by Code Review Doctor

No responses yet