Summary
PDB is an ugly but convenient debugger that is always available with Python.
Using the breakpoint() function, you can pause any program at a specific line and enter a debugging shell.
In this shell you can run any Python code and access the program state at this line.
You can also use PDB commands to help you explore your program:
help lists all commands or shows the help on a command
quit exits the debugger
list . shows you where you are
next executes the next line
continue runs the program until the next stop
until line runs the program until a line number
jump line skips the execution until a line number
display shows the result of an expression when it changes
step and return go in and out of function calls
up and down zoom in and out of the call stack
Why learn to use PDB?
When I started programming, I used a very crude method of debugging: print().
But now that I have almost two decades of experience, I can tell you that... I still mostly use print().
Although now I type print() way faster.
Once in a while, some bugs escape the magical transcendence of print(), and I have to use tooling.
The most fundamental of tools for the job is the debugger, and I meet more and more coders that have never used one, so I decided to write a post for them.
There are plenty of debuggers for Python, and a lot of editors come with one. However, we are very lucky because the language itself provides one by default!
It's really bare-bones, but it's always there. No matter your OS, the Python version, or what tools you use, PDB is always there for you.
Also, it's fast. Debuggers tend to slow down the programs they debug, but PDB has very little overhead.
This is why, despite the fact it is not the comfiest debugger in town, nor the prettiest, ...nor the anythingest, it's good to know how to use it.
In fact, if you know how to use PDB, you know how to use any other debugger, so it's time well invested.
First step in PDB
Imagine you have a small script that checks blood type compatibility:
compat = {
    "O-": ["O-"],
    "O+": ["O+", "O-"],
    "A-": ["A-", "O-"],
    "A+": ["A+", "A-", "O-", "O+"],
    "B-": ["B-", "O-"],
    "B+": ["B-", "B+", "O-", "O+"],
    "AB-": ["AB-", "B-", "O-", "A-"],
    "AB+": ["AB+", "O+", "A-", "A+", "B-", "B+", "AB-", "AB+", "O-"],
}


def survive(blood_type, donated_blood):
    return donated_blood in compat[blood_type]


def main():
    blood_type = input("Enter your blood type: ")
    donated_blood = input("Enter the blood type you received: ")

    if survive(blood_type, donated_blood):
        print("No, not I, I will survive")
    else:
        print("ded")


if __name__ == "__main__":
    main()
Sometimes, it gets a KeyError:
Enter your blood type: a+
Enter the blood type you received: b-
Traceback (most recent call last):
  File "/home/user/Work/ecriture/bytecode.dev/newsletter/20230508_pdb/script.py", line 28, in <module>
    main()
  File "/home/user/Work/ecriture/bytecode.dev/newsletter/20230508_pdb/script.py", line 21, in main
    if survive(blood_type, donated_blood):
  File "/home/user/Work/ecriture/bytecode.dev/newsletter/20230508_pdb/script.py", line 14, in survive
    return donated_blood in compat[blood_type]
KeyError: 'a+'
We can explore the state of the program right before the error by calling breakpoint() just before line 14:
def survive(blood_type, donated_blood):
    breakpoint()
    return donated_blood in compat[blood_type]
And start the program all over again.
This will run the program until this point, and start the Python debugger:
Enter your blood type: a+
Enter the blood type you received: a-
> /home/user/Work/ecriture/bytecode.dev/newsletter/20230508_pdb/script.py(15)survive()
-> return donated_blood in compat[blood_type]
(Pdb)
(Pdb) is the prompt of a new type of shell, a debugging shell. It has access to the entire state of the program at this point. You can enter any valid Python code, provided it fits on one line:
(Pdb) print('hello')
hello
(Pdb) from datetime import date
(Pdb) date.today()
datetime.date(2023, 5, 9)
(Pdb)
But more interestingly, you can use the current state of the program in the code:
(Pdb) print(donated_blood)
a-
(Pdb) compat[blood_type]
*** KeyError: 'a+'
And just like that, we have found that this is the particular part of the code that triggers the error. We can now experiment live:
(Pdb) "a+" in compat
False
(Pdb) compat.keys()
dict_keys(['O-', 'O+', 'A-', 'A+', 'B-', 'B+', 'AB-', 'AB+'])
(Pdb) "A+" in compat
True
So the problem was that we used a lowercase "a", and the dictionary contains uppercase "A". We can even check a solution in the debugger:
(Pdb) compat[blood_type.upper()]
['A+', 'A-', 'O-', 'O+']
At this stage, we can stop our session by using quit (without parentheses):
(Pdb) quit
Traceback (most recent call last):
  ...
  File "/usr/lib/python3.8/bdb.py", line 113, in dispatch_line
    if self.quitting: raise BdbQuit
bdb.BdbQuit
This quite literally crashes the program to exit immediately. Don’t worry, no snakes have been harmed.
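The fix we tried in the debugger can then be moved into the script. Here is a minimal sketch, assuming all we want is to normalize what the user types. The strip() calls and the trimmed-down table are my own additions for illustration, not part of the original script:

```python
# Trimmed-down compatibility table, just enough for the example.
compat = {
    "O-": ["O-"],
    "A-": ["A-", "O-"],
    "A+": ["A+", "A-", "O-", "O+"],
}


def survive(blood_type, donated_blood):
    # Normalize both inputs so "a+" or " A+ " match the uppercase keys.
    blood_type = blood_type.strip().upper()
    donated_blood = donated_blood.strip().upper()
    return donated_blood in compat[blood_type]


print(survive("a+", "a-"))  # True
print(survive("a-", "A+"))  # False
```

Note that a blood type that is not in the table at all would still raise a KeyError. Handling that is a separate decision.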
PDB commands
We just used quit without parentheses, and it did something. That's uncommon in Python.
It's because quit is not regular Python code, but rather a PDB command.
The most important command is help, which on its own lists all the other commands:
(Pdb) help

Documented commands (type help <topic>):
========================================
EOF    c          d        h         list      q        rv       undisplay
a      cl         debug    help      ll        quit     s        unt
alias  clear      disable  ignore    longlist  r        source   until
args   commands   display  interact  n         restart  step     up
b      condition  down     j         next      return   tbreak   w
break  cont       enable   jump      p         retval   u        whatis
bt     continue   exit     l         pp        run      unalias  where

Miscellaneous help topics:
==========================
exec  pdb
And if help is given a command name, it prints some information about this command. E.g., to get some help about the next command:
(Pdb) help next
n(ext)
        Continue execution until the next line in the current function
        is reached or it returns.
The list command
list . (notice the dot) will list the code around the line where you are. E.g., if I put a break point here:
breakpoint()
print("No, not I, I will survive")
Then:
(Pdb) list .
 18         blood_type = "A+" or input("Enter your blood type: ")
 19         donated_blood = "A+" or input("Enter the blood type you received: ")
 20
 21         if survive(blood_type, donated_blood):
 22             breakpoint()
 23  ->         print("No, not I, I will survive")
 24         else:
 25             print("ded")
 26
 27
 28     if __name__ == "__main__":
The arrow tells you the next line to be executed is line 23.
If you don't pass the dot, list will show the 11 lines that follow the previous list call, paginating through the file. It's not that useful, so use the dot.
The next command
next will execute the next line (the one with the arrow in list). If we are in this context:
(Pdb) list .
 18         blood_type = "A+" or input("Enter your blood type: ")
 19         donated_blood = "A+" or input("Enter the blood type you received: ")
 20
 21         if survive(blood_type, donated_blood):
 22             breakpoint()
 23  ->         print("No, not I, I will survive")
 24         else:
 25             print("ded")
 26
 27
 28     if __name__ == "__main__":
Then using next will do:
(Pdb) next
No, not I, I will survive
--Return--
> /home/user/Work/ecriture/bytecode.dev/newsletter/20230508_pdb/script.py(23)main()->None
-> print("No, not I, I will survive")
You can see before the "--Return--" that "No, not I, I will survive" has been printed.
The continue command
continue will carry on the program execution until the next break point. If no break point is encountered, the program will execute normally until it ends.
The until command
until line will carry on the program execution until the line number you give to it is reached. Very useful to go through a loop. If we are here:
(Pdb) list .
 18         breakpoint()
 19
 20  ->     for x in range(10):
 21             print(x)
 22
 23         print("Dobby is freeeeeeee")
 24
Then we can do:
(Pdb) until 23
0
1
2
3
4
5
6
7
8
9
> /home/user/Work/ecriture/bytecode.dev/newsletter/20230508_pdb/script.py(23)main()
-> print("Dobby is freeeeeeee")
This executes everything until we reach line 23 (which is not executed yet).
until will honor break points, so it may not reach the line you give to it if there is a break point on the way.
The jump command
jump is like until, but it goes directly to the line, and does not execute any code in between. Useful to skip some code you don't want to run. Also, it skips break points on the way. With the same example as before:
(Pdb) jump 23
> /home/user/Work/ecriture/bytecode.dev/newsletter/20230508_pdb/script.py(23)main()
-> print("Dobby is freeeeeeee")
You can see nothing is printed, because the loop is not executed at all and we jump directly to line 23.
The display command
display code will run the Python code you give to it and display the value. It will do this when you call it, then every time some part of the program is executed AND the value changes. So it may display the value if you call next, continue, until, etc.
You can use it to keep track of some calculation as you explore the program without having to print() it every time. E.g., let's display the current time:
(Pdb) display datetime.now()
display datetime.now(): datetime.datetime(2023, 5, 9, 9, 26, 9, 113475)
(Pdb) next
> /home/user/Work/ecriture/bytecode.dev/newsletter/20230508_pdb/script.py(20)main()
-> donated_blood = "A+" or input("Enter the blood type you received: ")
display datetime.now(): datetime.datetime(2023, 5, 9, 9, 26, 10, 922923) [old: datetime.datetime(2023, 5, 9, 9, 26, 9, 113475)]
(Pdb) next
> /home/user/Work/ecriture/bytecode.dev/newsletter/20230508_pdb/script.py(22)main()
-> if survive(blood_type, donated_blood):
display datetime.now(): datetime.datetime(2023, 5, 9, 9, 26, 12, 93853) [old: datetime.datetime(2023, 5, 9, 9, 26, 10, 922923)]
In some debuggers, this feature is called "watch expression" or "spy expression".
If you don't want to see it anymore, call undisplay.
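To make the "only when the value changes" part concrete, here is a small sketch of the idea in plain Python. It is an illustration, not how PDB is implemented: the Watch class and the state dictionary are hypothetical names I made up, and the message format only imitates the output above.

```python
class Watch:
    """Re-evaluate an expression and report it only when its value changes."""

    _UNSET = object()  # marker meaning "never evaluated yet"

    def __init__(self, expression):
        self.expression = expression
        self.last = self._UNSET

    def check(self, namespace):
        # Evaluate the expression against the given variables.
        value = eval(self.expression, {}, namespace)
        if self.last is self._UNSET:
            message = f"display {self.expression}: {value!r}"
        elif value != self.last:
            message = f"display {self.expression}: {value!r}  [old: {self.last!r}]"
        else:
            message = None  # unchanged, stay quiet
        self.last = value
        return message


watch = Watch("total")
state = {"total": 0}
reports = []
for step in [0, 5, 0, 2]:
    state["total"] += step
    report = watch.check(state)  # plays the role of "the debugger stopped again"
    if report:
        reports.append(report)
        print(report)
```

The third iteration adds 0, so nothing is reported for it. In this sketch the check only happens when check() is called, which is the same reason a display expression can only show you a change at a moment the debugger has stopped.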
The "step" and "return" commands
step and return are used together to get inside and exit functions or methods.
Indeed, if you are here:
 25
 26         breakpoint()
 27  ->     if survive(blood_type, donated_blood):
 28             print("No, not I, I will survive")
And you call next, you will execute survive(blood_type, donated_blood) and go to line 28. But you will not see what happens inside survive.
step is like next, but for this particular case. Instead of running the whole survive function in one go, it enters the function and stops on its first line:
(Pdb) step
--Call--
> /home/user/Work/ecriture/bytecode.dev/newsletter/20230508_pdb/script.py(13)survive()
-> def survive(blood_type, donated_blood):
(Pdb) list .
  8         "AB-": ["AB-", "B-", "O-", "A-"],
  9         "AB+": ["AB+", "O+", "A-", "A+", "B-", "B+", "AB-", "AB+", "O-"],
 10     }
 11
 12
 13  -> def survive(blood_type, donated_blood):
 14         return donated_blood in compat[blood_type]
 15
 16
 17     def main():
 18         for x in range(10):
This way, you can call next and see how this function is working step by step.
return does the opposite of step. You use it from inside a function to get to the end of its execution immediately:
(Pdb) return
--Return--
> /home/user/Work/ecriture/bytecode.dev/newsletter/20230508_pdb/script.py(14)survive()->True
-> return donated_blood in compat[blood_type]
Now, you can next your way out of the function and go back where you were before step.
The up and down commands
up and down are my favorite commands. They don't execute anything, but they zoom in and out of the code by letting you go up and down the call stack.
If this means nothing to you, it's a way to answer the old question, "who is the idiot passing this function the wrong parameters?".
Let's say I'm here:
(Pdb) list .
 10     }
 11
 12
 13     def survive(blood_type, donated_blood):
 14         breakpoint()
 15  ->     return donated_blood in compat[blood_type]
 16
 17
 18     def main():
 19         for x in range(10):
 20             print(x)
If I want to know what part of the code is passing me blood_type, I can use up:
(Pdb) up
> /home/user/Work/ecriture/bytecode.dev/newsletter/20230508_pdb/script.py(27)main()
-> if survive(blood_type, donated_blood):
(Pdb) list .
 22         print("Dobby is freeeeeeee")
 23
 24         blood_type = "A+" or input("Enter your blood type: ")
 25         donated_blood = "A+" or input("Enter the blood type you received: ")
 26
 27  ->     if survive(blood_type, donated_blood):
 28             print("No, not I, I will survive")
 29         else:
 30             print("ded")
 31
 32
I've now zoomed out to see the bigger picture, and I can see that line 27 is where this parameter comes from.
Of course I can use down to go back to my previous zoom level, or use up again to zoom out even more (although in this program, I'm already at the top).
It's easy to confuse up/down with step/return, but they don't do the same thing. up/down don't execute anything, they just change your point of view. Plus, you can use step on any function call you come across, but down can only be used if you called up before.
Despite that, after a step, I like to use up + next to get out of a function instead of return + next. I find using up usually gives me the same result and is easier. Also, up lets you peek at what happens upstairs and go back, while return is quite definitive.
It's a matter of taste, really.
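If the call stack still feels abstract, you can look at it from plain Python. This is a small sketch using the standard inspect module. The functions are cut-down stand-ins for the ones in the script, and survive returns the stack instead of a boolean purely for the demonstration:

```python
import inspect


def survive(blood_type, donated_blood):
    # inspect.stack() lists frames from the current one outward,
    # which is the same direction "up" walks through.
    return [frame.function for frame in inspect.stack()]


def main():
    return survive("A+", "A+")


frames = main()
print(frames[:2])  # ['survive', 'main']
```

The first entry is where you are, the second is what you would see after one up, and so on. The rest of the list depends on how the script was started.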
Tips and tricks
Most commands have a short alias, often a single letter: next can be abbreviated n, help can be shortened to h, etc.
If you need to run a command several times in a row, you can enter it once, then press enter several times. In PDB, an empty prompt reruns the previous command. Very useful if you want to next many times.
Commands can conflict with regular Python code. E.g., the list() Python function and the list PDB command, or the "a" PDB command and an "a" variable. In this case, PDB commands have priority if they are at the start of the line, and Python code otherwise. You can force PDB to understand something as Python code by starting with "!". E.g., !list().
You can have as many breakpoint() calls as you want, don't limit yourself to one. In fact, there is even a command to add break points to any line while running PDB: break.
You can put breakpoint() in an if, if you want to trigger it only under a condition. It's so useful there is even a command for that: condition. But I confess I usually write the if manually because my editor makes it easy.
Prior to Python 3.7, breakpoint() didn't exist, and people had to type import pdb; pdb.set_trace().
There are better shell debuggers. E.g., ipdb is like PDB + IPython. You can set which debugger breakpoint() starts with the environment variable PYTHONBREAKPOINT. If this means nothing to you, there is another article for that.
We didn't cover all commands in this article. Once you get comfortable with PDB, explore the other ones.
If you start a Python script using python -m pdb path/to/script.py, you will start PDB in post-mortem mode. This will start a PDB shell immediately, but if you continue, it will run the entire program normally. However, if the program crashes, it will open a debugging shell right where the exception occurred. Very useful to debug a crash. The command supports -m itself, so you can do python -m pdb -m module_to_debug.
Annoyingly, post-mortem debugging drops you into a debugger at the very start of the program, which forces you to continue to get to the exception. You can pass -c c to avoid this, as it runs the continue command automatically for you.
The PDB shell allows only one line of code, and has weird scoping. If you feel that it's limiting you, type interact and you will be dropped into a regular shell. Exit the shell to resume debugging.
If you pass --pdb to pytest, it will start a debugging shell at every failing test, at the line where it failed.
The following shell function is very useful on Linux:
debug_module() {
    if python -c "import ipdb" &>/dev/null; then
        python -m ipdb -c c -m "$@"
    else
        python -m pdb -c c -m "$@"
    fi
}
With this, calling debug_module your_module will automatically use all the tricks in the book: use ipdb if it exists, use the double "-m", call "-c c", etc.
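Two of the tips above, the breakpoint() inside an if and the pluggable debugger, can be shown in a few lines. breakpoint() simply calls sys.breakpointhook(), which is the mechanism PYTHONBREAKPOINT relies on. In this sketch I replace the hook with a made-up recording_hook so the example runs without stopping at a prompt. With the default hook, the same breakpoint() call would open PDB:

```python
import sys

hits = []


def recording_hook(*args, **kwargs):
    # Hypothetical stand-in for a debugger: remember the caller's local variables.
    caller = sys._getframe(1)
    hits.append(dict(caller.f_locals))


def count():
    for x in range(10):
        if x == 7:  # only "break" on the iteration we care about
            breakpoint()


original_hook = sys.breakpointhook
sys.breakpointhook = recording_hook
try:
    count()
finally:
    # Put the real hook back so breakpoint() opens a debugger again.
    sys.breakpointhook = original_hook

print(hits)  # [{'x': 7}]
```

The hook fires once, on the seventh iteration only. Setting the environment variable PYTHONBREAKPOINT=0 goes one step further and turns every breakpoint() call into a no-op, which is handy if one was left in by mistake.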
Great article. pdb is one of those tools which has a pretty large ROI payoff and everyone should learn to use. One point I feel should be mentioned (and the main reason I switched to using the PyCharm debugger) is that pdb doesn't support debugging multithreaded programs very well (See https://github.com/python/cpython/issues/85743, https://github.com/python/cpython/issues/67352 and https://github.com/python/cpython/issues/65480)
Loving this post. I use pdb because I’m debugging code that executes within a docker container and docker compose. I use breakpoint(), step, continue, next, and list quite a bit. Jump, until, and display are new to me though. Does display stop and show a value whenever a value changes, even if you’re next-ing over a function call?