This post talks about Python exception reference cycles: what they are, why they can be problematic and how to avoid them.
What are Python exception reference cycles
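A Python exception has a reference cycle when it has a reference chain back to itself (i.e. “exc -> … -> exc”). A typical case is a function that stores a caught exception in a local variable: the exception references its traceback, the traceback references the function's frame, and the frame references the local variable, which points back to the exception.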
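A minimal sketch of such a cycle is shown here. The names `cycle`, `exc` and `points_back` are illustrative assumptions, not taken from the original example. It walks the chain from the exception to its traceback, to the frame, and back to the local variable.

```python
def cycle():
    try:
        raise ValueError("boom")
    except ValueError as e:
        # `e` is deleted automatically when the except block ends,
        # but `exc` is an ordinary local and keeps pointing at the exception.
        exc = e
    return exc


exc = cycle()

# Follow the chain: exception -> traceback -> frame -> local variable.
frame = exc.__traceback__.tb_frame
points_back = frame.f_locals["exc"] is exc
print(points_back)  # True
```

The exception is returned here only so that the chain can be inspected from the outside. The cycle exists whether or not the caller keeps a reference.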
Why are Python exception reference cycles bad
To understand why Python exception reference cycles are bad, we need to understand some behaviors of Python GC and Python exceptions.
Python GC
Python has two ways to find objects to deallocate: reference counting finds objects whose ref counts are zero and the garbage collector finds objects with circular references. For exceptions with reference cycles, only the garbage collector can deallocate them.
Unless triggered manually with gc.collect(), the Python GC runs whenever the number of allocations minus the number of deallocations exceeds a threshold. Unlike in many other languages, it is not triggered by memory pressure, so it is purely count based. The implication is that a program can hold lots of memory in a few large objects with reference cycles without triggering GC, because the count threshold isn't reached. In the worst case, the program can run out of memory (OOM).
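The count-based behavior can be observed with the standard `gc` module. The following is a sketch under stated assumptions: the `Node` class, the `make_garbage` helper and the payload size are hypothetical, and the exact threshold and counter values depend on the Python version. It creates a few large cyclic objects and checks that they stay alive until a collection is requested explicitly.

```python
import gc
import weakref


class Node:
    """Few objects, large memory footprint."""

    def __init__(self, size):
        self.payload = bytearray(size)  # adds memory, not object count
        self.me = self                  # self-reference: refcount never hits zero


def make_garbage(n, size):
    """Create n cyclic objects, drop them, and return weak references to them."""
    refs = []
    for _ in range(n):
        refs.append(weakref.ref(Node(size)))
    return refs


gc.collect()  # start from a clean state so the counters are near zero
print("thresholds:", gc.get_threshold())
refs = make_garbage(3, 1_000_000)
print("counts:", gc.get_count())

alive_before = sum(r() is not None for r in refs)
gc.collect()
alive_after = sum(r() is not None for r in refs)
print(alive_before, alive_after)  # 3 0
```

Three megabyte-sized objects barely move the allocation counter, so nothing prompts an automatic collection even though all of them are already garbage.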
Python exceptions
Unlike in many other languages, a raised Python exception references, directly or indirectly through its traceback, all frames of the call stack from main all the way to where the exception is raised, and each frame references all of its local variables. As a result, a Python exception can keep lots of objects alive (i.e. prevent their ref counts from going down to zero).
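To illustrate, the sketch below walks an exception's traceback and lists the local variables of every frame it reaches. The functions `inner`, `outer`, `catch` and `locals_by_frame` are hypothetical names chosen for this example.

```python
def inner():
    inner_local = "inner data"
    raise RuntimeError("boom")


def outer():
    outer_local = "outer data"
    inner()


def locals_by_frame(exc):
    """Map each function on the traceback to the names of its local variables."""
    result = {}
    tb = exc.__traceback__
    while tb is not None:
        frame = tb.tb_frame
        result[frame.f_code.co_name] = sorted(frame.f_locals)
        tb = tb.tb_next
    return result


def catch():
    try:
        outer()
    except RuntimeError as e:
        return locals_by_frame(e)


reachable = catch()
print(reachable)
# {'catch': ['e'], 'outer': ['outer_local'], 'inner': ['inner_local']}
```

The traceback itself covers the frames from the handler down to the raise site. Each frame's `f_back` attribute additionally links to its caller, which is how the frames further up the stack stay reachable as well.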
The net effect is that an exception with a reference cycle can keep lots of objects alive until the next GC run, which can be greatly delayed. The program may OOM or run out of other resources (e.g. open files) before that.
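The effect can be reproduced with a small program. This is a hypothetical sketch rather than the original listing: the `Giant` class stands in for any object that owns a lot of memory, and `fail`, `handle` and the `events` list are names assumed for illustration.

```python
import gc

events = []


class Giant:
    """Stand-in for an object that owns a lot of memory."""

    def __init__(self):
        self.data = bytearray(1_000_000)

    def __del__(self):
        events.append("Giant released")


def fail():
    raise ValueError("boom")


def handle():
    giant = Giant()
    try:
        fail()
    except ValueError as e:
        exc = e  # creates the cycle: exc -> traceback -> this frame -> exc


gc.collect()  # clean slate, so an automatic collection is unlikely to interfere
handle()
events.append("handle() returned")
gc.collect()
events.append("gc.collect() returned")
print(events)
# ['handle() returned', 'Giant released', 'gc.collect() returned']
```

The `Giant` instance is not part of the cycle itself. It is merely a local variable of a frame that is, which is enough to keep it alive after `handle()` returns.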
Running a program with such a cycle shows that the Giant object is released only when the garbage collector runs. We now rely on GC to deallocate the Giant object, which is undesirable.
How to avoid Python exception reference cycles
It turns out that the issue described above is well known to the Python developers, and there is an implemented PEP that helps us break the exception reference cycle. Basically, we should avoid having local variables point to the exception. In our example, this means making sure that no local variable still points to the exception by the time the function returns, which breaks the reference cycle.
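One way to do this, sketched under the assumption that the exception is needed after the except block, is to delete the local variable in a `finally` clause. The names `Giant`, `fail`, `handle` and `events` are hypothetical, and the cyclic GC is disabled during the call only to show that reference counting alone frees the object.

```python
import gc

events = []


class Giant:
    """Stand-in for an object that owns a lot of memory."""

    def __init__(self):
        self.data = bytearray(1_000_000)

    def __del__(self):
        events.append("Giant released")


def fail():
    raise ValueError("boom")


def handle():
    giant = Giant()
    try:
        fail()
    except ValueError as e:
        exc = e
    try:
        events.append(f"handling {type(exc).__name__}")
    finally:
        # Drop the local reference so the frame no longer points at the exception.
        del exc


gc.disable()  # rule out the cyclic GC: only reference counting can free objects
try:
    handle()
    events.append("handle() returned")
finally:
    gc.enable()
print(events)
# ['handling ValueError', 'Giant released', 'handle() returned']
```

When the exception is only needed inside the except block, using the `as` name directly is simpler, since Python deletes that name automatically when the block ends.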