Summary
Instead of having to limit sanity checks to the boundaries of the program, we could re-use those as function contracts using the assert
keyword. Indeed, setting PYTHONOPTIMIZE
removes all assert
, making the check useful in dev, and free in production.
Unfortunately, the community doesn't know about the feature, and use assert
for things that should never be removed, so using the flag would likely introduce bugs into your program.
Keeping your sanity in checks
While type hints, linters and unit tests help a lot of developers to avoid introducing bugs, there is no substitute for sanity checks.
Now, experience tells us it's better to keep those checks at the program boundaries.
Consider this very important function:
def dog_to_human_years(dog_year: float, is_a_good_dog: bool=True) -> float:
dog_year *= 7
if is_a_good_dog:
dog_year -= 10
return dog_year
dog_year
should be > 0, yet you probably don't want to do this:
def dog_to_human_years(dog_year: float, is_a_good_dog: bool=True) -> float:
if dog_year <= 0:
raise ValueError("Dog years should be a positive real number")
dog_year *= 7
if is_a_good_dog:
dog_year-= 10
return dog_year
But rather put the check in a data validator module, that you use everywhere you get data input, such as file reading, web form, etc. That's what pydantic is good at. This keeps core functions simpler, separate concerns, will lead to better performances, potentially gives feedback to users and so on.
However, putting it in dog_to_human_years()
would help catching bugs in your code base, which the best practice would not.
Is there a way to have the best of both worlds?
Contract based programming with assert
Python comes with a peculiar keyword: assert
.
It's very well known from the users of the excellent pytest library, which makes the whole Python testing experience so much better.
In essence, it checks if something is True
, and if not, it raises an AssertionError
. You can even add a small error message:
>>> assert 1 > 0
>>> assert 1 < 0
Traceback (most recent call last):
Cell In[4], line 1
assert 1 < 0
AssertionError
>>> assert 1 < 0, "Yeah, no"
Traceback (most recent call last):
Cell In[5], line 1
assert 1 < 0, "Yeah, no"
AssertionError: Yeah, no
However, one thing that it little known about assert
, is that Python comes with a "-O" flag, and the associated PYTHONOPTIMIZE
environment variables, to turn on "optimized mode".
This skips all assert
completely, they are not executed at all:
$ python3.10 -O
Python 3.10.12 (main, Jun 7 2023, 12:45:35) [GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> assert 1 < 0
>>>
This means we can implement contract based programming with assert
from sanity_checks import is_valid_dog_years
def dog_to_human_years(dog_year: float, is_a_good_dog: bool=True) -> float:
assert is_valid_dog_years(dog_year), "Dog years should be a positive real number"
dog_year *= 7
if is_a_good_dog:
dog_year -= 10
return dog_year
We can now have a validation layer, and use it as a contract in this function, helping devs catching any mistake they would make themselves that would lead to dog_year being negative.
In production, you can then set PYTHONOPTIMIZE
to 1, and the check is removed, so no performance cost.
This is a very powerful tool, because you have access to the full power of the programming language, not just a subset like with a dedicated contract syntax. And unlike with very advanced typing systems, you don’t need to provide a complex proof that your function input matches your expectation. Of course this means your IDE is not going to be able to help your with that, which is a compromise we often have to make with Python.
You are not limited to one assert
, you can put several, and even one at the end of the function to check the properties of the calculation remain congruent.
Optional logging
The "optimized" mode does more than skipping assert
, it also skips any check on the __debug__
magic variable. In fact, it removes the block completely, it's not even part of the final byte code.
So you can have very expensive debugging logs like this:
for x in range(10000):
result = calculate_something_expensive()
if __debug__:
log.debug(result)
And only pay the price when you want it. It's way cheaper than using log.setLevel()
, it's even cheaper that structlog’s strategy of setting handlers to noop functions. Because there is nothing faster than no code.
Remember, we can't make the CPU go faster, we can only make it do less work.
But it was all a dream
This killer feature has been part of Python for years. Python 2.7 had it already.
And you can't use it.
You can't, because the vast majority of the community doesn't know about it.
Bet a lot of your didn't know this existed before reading the article.
And so Python coders all over the world use assert
for things that should not be removed.
Every year I attempt to use it once, in hope that this time, maybe, just maybe, I will be able to enjoy powerful free checks for my entire code base.
And every year, some dependency deep inside my virtual environment crashes.
I love this thinking and we need more of it. I assume this should work for projects with no or few dependencies.
This is akin to "Design by Contract", introduced (if I'm not mistaken) and popularized by Bertrand Meyer in his Eiffel programming language and in his great book "Object-Oriented Software Construction" (https://en.wikipedia.org/wiki/Object-Oriented_Software_Construction).