Yes, you should use a Python venv in a container like Docker
Brought to you by "/r/python answers are so bad I had to write this"
Summary
While it's perfectly workable to not use a venv in a container like a Docker image, you probably want to use one.
Indeed, truly minimal images are very rare and costly. Unless you are sure you have one and can pay the price for maintaining it that way for the whole duration of the project no matter who works on it, don't bother.
The image you are working with likely ships Python system libs, and installing your project deps as root has a non-zero chance of interacting badly with them.
You don't have to, but...
Given that the nature of a container is to contain, it's counterintuitive to think you would need another layer of isolation, and therefore indirection, on top of it.
After all, isn't the whole point of having a venv to avoid conflict between several code bases? You only have one in there, so surely we can skip it.
First, yes, you can skip it.
There is plenty of software deployed out there that installs packages directly at the root of the container, even with sudo. And it works.
As usual, it's not about possible vs. impossible, but about the bang for your buck.
And given how cheap venvs are, the ROI of using one is really good, even in a container.
The most obvious reason is homogeneity: no matter where you are, you use a venv. There is no need to deal with variations and the complexity that comes with them.
You have a venv in your small prototype, and when it moves to its own container, you still have a venv.
All docs apply the same way, no matter the project or container, no matter the image: once you are in the venv, all actions have the same consequences.
It's just less cognitive load, fewer things to know, fewer things to care about. And fewer mistakes you can make.
The orgs that actually manage to offer a very homogeneous experience across their whole stack and dev processes are very rare, even with virtualization. Images change, versions get upgraded, dev machines have different setups.
What’s more, if you are working in the context of those rare unicorns, you will work in other contexts, and people will come from other contexts.
So the main reason is like for PEP 8: the community is used to it, and this makes for a common experience.
And once again, venvs are so damn cheap.
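How cheap? In a Dockerfile, it's essentially two lines. Here is a minimal sketch, assuming a Debian-based Python image and a requirements.txt in your build context; "myapp" is a placeholder for your own package:

```dockerfile
FROM python:3.12-slim

# The one-liner that creates the venv.
RUN python -m venv /opt/venv

# "Activate" it for every subsequent instruction (and for the final
# container) by putting its bin/ first on the PATH.
ENV PATH="/opt/venv/bin:$PATH"

# pip now resolves to /opt/venv/bin/pip, so everything installs
# inside the venv, not in the system Python.
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .
CMD ["python", "-m", "myapp"]
```

There is no need to source any activate script: prepending the venv's bin/ to PATH is essentially all activation does anyway.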
How minimal is your image?
The chance that you actually use a barebones image for your container is very low. Because it's super hard.
It's hard to do. And it's hard to keep doing.
It requires a tremendous amount of ever-shifting knowledge about the distro itself and about the interactions between the various packages in an increasingly complex ecosystem, plus a great understanding of your own stack's dependencies and all the consequences that come with bootstrapping them.
It's like expecting a house builder to know the molecular composition of every material used to build the house. It's something a big project may want and can afford, but most day-to-day construction doesn't and can't.
It's for Netflix-level reproducibility and security requirements.
For the average dev, however, having things like cryptography libs, a network stack or a package manager working out of the box without having to become a part-time Debian maintainer is pretty important.
After all, it's not just about finding the exact arcane combination of minimal deps that will work with your project, it's also about making sure it stays that way for the whole project lifespan, no matter how many people work on it.
Plus in the end, are you sure you got all the Python out?
Because it turns out a lot of system packages are written in Python: yum, dnf, do-release-upgrade, aptdcon, ubuntu-security-status...
Even without all that, you will want to install dev tools in your image at some point, and they may very well be written in Python as well. Cause I hear it's a pretty popular language.
And if you start pip installing things without a venv, there will come a day when something from PyPI and something from the OS will meet, and they will not like each other.
It may not even happen right away, but a few months down the road, and you'll have a huge mess to troubleshoot.
Then it will be all about "python packaging sucks!" again. Except it will be your fault.
Just because you didn't want to use a one-liner to create a venv.
Even if the risk is 0.00001%, it's not worth it.
In my experience though, the risk is higher than that.
People don't realize it because, when they hit a problem later on, they don't link it to their decision not to use a venv. They blame the language. Or the ecosystem.
But this corrupted *.so, this missing certificate on a GET call, or this weird bad import was totally avoidable. With a venv.
This is not PHP, Ruby or JS we are talking about: you are likely not the only Python user on your machine. The creators of the distros you are deploying on may be actively using it too. So don’t cross the streams. It’s bad.
But I am careful!
Maybe.
And maybe you are super competent.
Now, again, my experience is that most devs are average (duh), and most, like most drivers, think they are above average. They either overestimate their ability to deal with complex systems or underestimate said complexity. They don't know what they don't know, and they are happy to stay oblivious to it.
But above all, they have little skin in the game. They don't pay the price for their bad architecture decisions. They think they do, but what they pay is usually just a strong inconvenience. The real price is split between the team, the company, and the users.
But OK, it's only fair that I assume that's not the case for you. You are good enough. You are realistic. You are careful.
Who else is working on the project, right now? In a year?
Don't confuse "works now on my machine" with "will work in 3 months in prod after the intern changed a thing and OpenSSL released a zero-day patch".
Not to mention, how much do you want to spend on maintaining this beautiful, pure state of venv-less installation? On docs? On training? On process?
One wrong command while changing your image and you may reintroduce a Python system dep you didn't know about. It's so easy to mess up.
And what for?
What's the goal?
If you can't answer this in a split-second, just use a venv.
There are good reasons to skip a venv, of course. But someone who has one of them already knows about it, and there is no philosophy involved, no debate.
E.g.:
- you need access to system deps from your Python code that are in the distro packages but not on PyPI, and --system-site-packages didn't work for you (see the sketch after this list);
- you use another isolation layer such as, God may have mercy on your soul, buildout;
- you compile stuff yourself, everything is entirely custom with exotic linking rules, and you want a tight level of control.
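For reference, the sketch mentioned in the first item: --system-site-packages is just a flag on venv creation that lets the venv see the distro's packages. The apt package name below is hypothetical:

```dockerfile
FROM python:3.12-slim

# Hypothetical distro package shipping a Python module that has no
# PyPI equivalent:
# RUN apt-get update && apt-get install -y python3-foo

# The venv can now import the system packages, while your own deps
# still land inside /opt/venv.
RUN python -m venv --system-site-packages /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
```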
A professional with those needs, however, doesn't need such an article and would know, by reading it, to self-exclude from it.
But...
I use --user!
You still risk PATH and shadowing issues. Use a venv.
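A sketch of the kind of thing that bites you; httpie is just an example of a package that ships a console script:

```dockerfile
FROM python:3.12-slim

# As root, --user installs into /root/.local...
RUN pip install --user httpie

# ...but /root/.local/bin is not on this image's PATH, so the "http"
# command is not found:
# RUN http --version   # fails with "command not found"

# You have to remember to fix PATH yourself, and user site-packages
# can still shadow (or be shadowed by) system packages:
ENV PATH="/root/.local/bin:$PATH"
```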
I use pyenv!
It's orthogonal. Use a venv with pyenv as well.
It's slower to build my image!
Image build time is probably not a real concern compared to the rest, except in extremely rare cases. Make sure you are actually in one of those cases.
Also, if you use uv, it will be faster than installing at the system level.
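A sketch, bootstrapping uv with pip for simplicity (there are faster ways to get it):

```dockerfile
FROM python:3.12-slim
RUN pip install uv

# uv creates and populates venvs much faster than python -m venv + pip.
RUN uv venv /opt/venv
ENV VIRTUAL_ENV=/opt/venv
ENV PATH="/opt/venv/bin:$PATH"

COPY requirements.txt .
# uv pip installs into the venv pointed to by VIRTUAL_ENV.
RUN uv pip install -r requirements.txt
```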
Soon, you won't have a choice anyway, because major distros are adopting PEP 668. This makes pip install fail in the default system Python, with an error telling you to use a virtual environment.
System-wide shared libraries are becoming less common, as newer languages avoid them. Python can't easily remove the shared package directory for compatibility reasons, but both Python and the distros can push users towards isolated package environments for each project, avoiding conflicts with the system. And this will eventually cascade down to the base images.
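You can already see it on Debian 12 ("bookworm") based images:

```dockerfile
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y python3 python3-pip python3-venv

# This now fails with "error: externally-managed-environment" and a
# message telling you to use a venv instead (PEP 668):
# RUN pip3 install requests

# This works, since PEP 668 does not apply inside a venv:
RUN python3 -m venv /opt/venv && /opt/venv/bin/pip install requests
```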
Other good reasons to use a venv in containers
If avoiding the risk of breaking your image and limiting complexity are not sufficient reasons, here are a few bonuses:
- The list of your project deps in production will be easy to compare to the one in dev. Or, if you also use the container for dev, across image upgrades.
- No need for admin rights, which you may want anyway to make attacks harder for an intruder in your container.
- You can easily use a third-party tool such as uv, poetry, or anaconda with its default values to install the dependencies, instead of being careful to force it to install at the system level.
- You can have several venvs with several versions of Python or libs to quickly test upgrades. It's much lighter than using several images.
- You can have your venv in a volume, allowing all sorts of interesting decoupling between the image and your Python deps (see the sketch below).
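A sketch combining two of those points, no admin rights and a venv on a mountable path; the user name and paths are placeholders:

```dockerfile
FROM python:3.12-slim
RUN useradd --create-home appuser
USER appuser
WORKDIR /home/appuser

# The venv is created and owned by the unprivileged user.
RUN python -m venv /home/appuser/venv
ENV PATH="/home/appuser/venv/bin:$PATH"

COPY --chown=appuser requirements.txt .
RUN pip install -r requirements.txt
```

You could then run something like docker run -v mydeps:/home/appuser/venv yourimage to keep the deps in a named volume, decoupled from the image itself.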