Part IV: Memory Management and Object Layout

Part IV of this guide delves into the intricate details of how Python manages memory and structures its objects in memory. Understanding these concepts is crucial for writing efficient Python code, optimizing performance, and debugging memory-related issues. This part covers the low-level memory layout of Python objects, the garbage collection mechanism, and how Python’s memory allocator works.

Table of Contents

10. Deep Dive into Object Memory Layout

11. Runtime Memory Management & Garbage Collection

12. Memory Allocator Internals & GC Tuning


10. Deep Dive into Object Memory Layout

Understanding how Python objects are structured in memory is perhaps one of the most fundamental insights for truly comprehending Python’s runtime behavior, performance characteristics, and memory footprint. Every piece of data in Python, from a simple integer to a complex custom class instance, adheres to a specific low-level memory layout. This chapter dissects these layouts in detail, revealing the underlying C structures that power Python’s object model and memory management.

10.1. The Great Unification: PyObject Layout

At the bedrock of CPython’s object system is the PyObject C structure. Every single Python object, regardless of its type, begins with this standard header. This uniformity is what allows the CPython interpreter to treat all data consistently: to count references, determine types, and perform basic operations generically. The PyObject header typically contains two crucial fields: ob_refcnt, a signed integer that tracks the number of strong references to the object, and ob_type, a pointer to the type object describing the object’s type, methods, and attributes.

While PyObject provides the essential object header, most complex Python objects (those that can participate in reference cycles, such as lists, dictionaries, custom class instances, and other mutable containers) are also tracked by the garbage collector. For these objects, an additional header, PyGC_Head, is prepended before the PyObject_HEAD in memory. This PyGC_Head contains two pointers, _gc_next and _gc_prev, which link the object into a doubly-linked list used by the generational garbage collector to manage and traverse objects.

Let’s visualize this common prefix assuming Py_ssize_t and pointers (ptr) are both 8 bytes on a 64-bit system:

Mental Diagram: PyGC_Head and PyObject_HEAD

Assuming ssize_t = 8 bytes, ptr = 8 bytes

+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+\
|                 *_gc_next                     | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | PyGC_Head (16 bytes)
|                 *_gc_prev                     | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ /
|                 ob_refcnt                     | \
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | PyObject_HEAD (16 bytes)
|                 *ob_type                      | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ /

Following this universal header, the specific data unique to that object type is stored.
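
From Python you can check which objects the collector tracks, i.e., which ones carry the extra PyGC_Head, via gc.is_tracked(). A small sketch (exact tracking rules vary slightly across CPython versions):

import gc

class Node:
    pass

print(gc.is_tracked(42))       # False: an int holds no references, so it cannot form a cycle
print(gc.is_tracked("hello"))  # False: same reasoning for strings
print(gc.is_tracked([1, 2]))   # True: a list can reference arbitrary objects
print(gc.is_tracked(Node()))   # True: class instances are tracked as well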

10.2. Memory Layout of User Defined Classes

When you define a standard Python class without explicitly using __slots__, instances of that class are highly dynamic. Their attributes are stored in a dynamically allocated dictionary, accessible via a special instance attribute called __dict__. This dictionary provides immense flexibility, allowing you to add, remove, or modify attributes on an instance at any time, even after it’s been created. Consider the following class:

class A:
    def __init__(self):
        self.x = 0
        self.y = 1
        self.z = 2

The memory layout for an instance of this class starts with the standard PyGC_Head and PyObject_HEAD (total 32 bytes on 64-bit). Immediately following these headers, the instance holds pointers to two additional internal objects: a pointer to its __dict__ (the dictionary storing its attributes like x, y, z) and potentially a pointer to __weakref__ (if the object supports weak references). These are themselves Python objects and thus incur their own memory overhead.

Mental Diagram: Class (without __slots__)

             +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ \
             |                    *_gc_next                  | |
             +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | PyGC_Head (16 bytes)
             |                    *_gc_prev                  | |
C object --> +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ /
instance     |                    ob_refcnt                  | \
pointer      +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | PyObject_HEAD (16 bytes)
             |                    *ob_type                   | |
             +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ /
             |                    *__dict__                  | --> 8 bytes (ptr to dict object holding x, y, z)
             +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
             |                    *__weakref__               | --> 8 bytes (ptr to weakref object)
             +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
             |                      ...                      | (additional instance-specific C data, rare)

The __dict__ itself is a PyDictObject which has its own memory footprint. Each attribute like self.x stores a pointer to the actual integer object within this __dict__. This flexibility comes at a memory cost: every instance carries a __dict__ pointer, and the dictionary itself consumes memory, even if the class doesn’t have any instance attributes. This overhead becomes significant when creating a large number of instances.

import sys

def recursive_size(obj):
    size = sys.getsizeof(obj)
    print(f"Size of {obj.__class__.__name__}: {size} bytes")
    if hasattr(obj, "__dict__"):
        print("Size of __dict__:", sys.getsizeof(obj.__dict__))
        size += sys.getsizeof(obj.__dict__) + sum(recursive_size(v) for v in obj.__dict__.values())
    if hasattr(obj, "__slots__"):
        size += sum(recursive_size(getattr(obj, slot)) for slot in obj.__slots__)
    return size

print("A basic size:", A.__basicsize__)   # Output: 16 (PyObject_HEAD)
# sys.getsizeof: PyGC_Head + PyObject_HEAD + __dict__ + __weakref__ pointers
print("A instance size including all pointers:", sys.getsizeof(A()))     # Outout: 48
print("A instance size including all attributes:", recursive_size(A()))  # Output: 428

# Output:
# A basic size: 16
# A instance size including all pointers: 48
# Size of A: 48 bytes
# Size of __dict__: 296
# Size of int: 28 bytes
# Size of int: 28 bytes
# Size of int: 28 bytes
# A instance size including all attributes: 428

10.3. Memory Layout of Slotted Classes

The __slots__ mechanism provides a way to tell Python not to create a __dict__ for each instance of a class. Instead, it reserves a fixed amount of space directly within the object’s C structure for the specified attributes. This significantly reduces the memory footprint for instances, especially when you have many of them, and also speeds up attribute access since there’s no dictionary lookup involved.

class B:
    __slots__ = "x", "y", "z"
    def __init__(self):
        self.x = 0
        self.y = 1
        self.z = 2

When __slots__ is defined, the named attributes ("x", "y", "z") are essentially mapped to offsets within the instance’s memory block, right after the standard object headers. This is much like how C structs work. If __slots__ is an empty tuple (__slots__ = ()), the instance will be as small as possible, containing only the basic PyGC_Head and PyObject_HEAD. If specific slots are defined, pointers to the values for those slots are directly embedded in the instance memory.
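
Because the attribute set is fixed when the class is created, assigning any name not listed in __slots__ raises an AttributeError, which makes the mechanism easy to observe (Point is an illustrative class, not from the examples above):

class Point:
    __slots__ = ("x", "y")

p = Point()
p.x = 1          # fine: "x" has a reserved slot
try:
    p.z = 3      # no slot reserved for "z", and there is no __dict__ to fall back on
except AttributeError as e:
    print(e)     # 'Point' object has no attribute 'z'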

Mental Diagram: B object instance layout (with __slots__)

             +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ \
             |                    *_gc_next                  | |
             +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | PyGC_Head (16 bytes)
             |                    *_gc_prev                  | |
B object --> +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ /               \
instance     |                    ob_refcnt                  | \               |
pointer      +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | PyObject_HEAD |
             |                    *ob_type                   | |               |
             +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ /               |
             |                       *x                      | \               | basic size
             +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ |               |
             |                       *y                      | | extra slots   |
             +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ |               |
             |                       *z                      | |               |
             +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ /               /
print("B basic size:", B.__basicsize__) # PyObject_Head + slot pointers
print("B instance size including all pointers:", sys.getsizeof(B()))    # include PyGC_Head
print("B instance size including all attributes:", recursive_size(B())) # include slot values

# Output:
# B basic size: 40
# B instance size including all pointers: 56
# Size of B: 56 bytes
# Size of int: 28 bytes
# Size of int: 28 bytes
# Size of int: 28 bytes
# B instance size including all attributes: 140

10.4. Memory Layout of Core Built-ins

Beyond user-defined classes, Python’s core built-in types also adhere to specific memory layouts, often optimized for their particular behavior and common use cases. Most variable-sized built-in types utilize a PyObject_VAR_HEAD, which is an extension of the PyObject_HEAD that includes an ob_size field. This ob_size field stores the number of elements or items within the variable-sized part of the object. On a 64-bit system, the PyObject_VAR_HEAD typically is 24 bytes (16 bytes for PyObject_HEAD + 8 bytes for ob_size).

Integer int

Python integers are (almost) arbitrary-precision, meaning they can represent numbers of any size, limited only by available memory. This is achieved by storing the integer’s value as a sequence of base-$2^{30}$ digits. Each digit is stored in an unsigned 32-bit C integer, i.e., 4 bytes. (Older or 32-bit builds use base-$2^{15}$ digits stored in unsigned shorts.) Python also needs to remember the number of digits in the integer, which is stored in the ob_size field of the PyObject_VAR_HEAD. The sign of the integer is encoded in the same field: a negative ob_size indicates a negative integer, a positive ob_size a positive one, and zero is represented by ob_size = 0.

This means that the limit on a Python integer is, in practice, bounded only by available memory. The maximum value of a digit is $2^{30}-1$ and the maximum number of digits is $2^{63}-1$ (the largest Py_ssize_t), so the largest representable Python integer is

\[(2^{30})^{2^{63}-1} - 1 \approx 2^{30 \cdot 2^{63}} \approx 2^{2^{68}} \approx 10^{88848372616373700000}\]

This number has about $10^{20}$ digits in base 10, and we would need roughly 37 million terabytes of memory just to store its $2^{63}-1$ four-byte digits.

Mental Diagram: int object layout

+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ \
|                 ob_refcnt                     | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ |
|                 *ob_type                      | | PyObject_VAR_HEAD (24 bytes)
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ |
|                 ob_size                       | | (8 bytes: number of 'digits', sign included)
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ /
|               digit[0] (value)                | (4 bytes for each digit)
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|               digit[1] (value)                |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                     ...                       |

Note that even int(0) is allocated with one digit, so the smallest possible Python integer takes up 28 bytes.

import sys
print("Int basic size:", int.__basicsize__)              # PyObject_VAR_HEAD (24 bytes)
print("Int item size:", int.__itemsize__)                # Size of each digit (4 bytes)
print("Int (0) size:", sys.getsizeof(0))                 # Even zero has 1 digit
print("Int (2^30-1) size:", sys.getsizeof(2**30 - 1))    # One digit, within 30 bits
print("Int (2^30) size:", sys.getsizeof(2**30))          # Two digits, exceeds 30 bits
print("Int (2^60-1) size:", sys.getsizeof(2**60 - 1))    # Two digits, still within 60 bits
print("Int (512-bit) size:", sys.getsizeof(2**511 - 1))  # Multiple digits

# Output:
# Int basic size: 24
# Int item size: 4
# Int (0) size: 28
# Int (2^30-1) size: 28
# Int (2^30) size: 32
# Int (2^60-1) size: 32
# Int (512-bit) size: 96

You can inspect the C implementation of int here.

Boolean bool

Python booleans are a subclass of integers, with True represented as 1 and False as 0, meaning they have the memory footprint of a one-digit integer.

import sys
print("Bool basic size:", bool.__basicsize__)  # Output: 24 (PyObject_VAR_HEAD)
print("Bool full size:", sys.getsizeof(True))  # Output: 28 (24 + 4 for the single digit)

However, only two instances of bool are ever created in a Python session (True and False); Python stores them as singletons. So when you create a list of 10 booleans, it does not take up $10 \cdot 28 = 280$ bytes for the values, but only $10 \cdot 8 = 80$ bytes for the pointers to the two singleton instances (plus the list object itself).

x = bool(1)    # bool(0) = False, bool(>0) = True
y = bool(10)
print(f"True is Bool(1) is Bool(10): {x is True and x is y}")  # Output: True

Floating Point Number float

Python’s floating-point numbers are implemented using the C double type, which typically occupies 8 bytes (64 bits) of memory. This representation follows the IEEE 754 standard for double-precision floating-point arithmetic, allowing for a wide range of values and precision. The float object in Python includes the standard PyObject_HEAD and a field for the actual floating-point value.
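
A quick check of these numbers (values shown for a typical 64-bit CPython build; floats hold no references to other objects and therefore carry no PyGC_Head):

import sys
print("Float basic size:", float.__basicsize__)  # 24 = PyObject_HEAD (16) + C double (8)
print("Float size:", sys.getsizeof(3.14))        # 24: no PyGC_Head, no variable-size part

# Output:
# Float basic size: 24
# Float size: 24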

You can inspect the C implementation of float here.

String str

Python strings are immutable sequences of Unicode characters. Their memory layout is highly optimized. A string object (PyUnicodeObject in C) includes PyGC_Head, PyObject_HEAD, and then specific fields for strings: length (number of characters), hash (cached hash value), state (flags like ASCII/compact), and finally, the actual Unicode character data. CPython uses an optimized “compact” representation where ASCII strings use 1 byte per character, while more complex Unicode characters might use 1, 2 or 4 bytes per character, based on the string content, to save memory. The actual character data is stored directly adjacent to the object header and it may or may not be terminated by a C null-terminator (\0).

Note that the character data needs to be correctly aligned (characters can be 1, 2, or 4 bytes long), so a padding field ensures the character data starts at a correctly aligned memory address.

Mental Diagram: str object layout

+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ \
|                  ob_refcnt                    | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | PyObject_HEAD (16 bytes)
|                  *ob_type                     | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ /
|                   length                      | (8 bytes)
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    hash                       | (8 bytes)
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                   state                       | (8 bytes) - internal details
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                  padding                      | (0 to 24 bytes to correctly align character data)
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                char_data[0]                   | (1 byte per char for ASCII)
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                char_data[1]                   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ...                        |

So the basic size of a string object is 40 bytes + padding.

import sys
print("String (empty) basic size:", str.__basicsize__)  # Output: 64 (includes max madding)
print("String (empty):", sys.getsizeof(""))             # Output: 41 = 40 + \0   (no padding needed)
print("String (1 char):", sys.getsizeof("a"))           # Output: 42 = 40 + 1 + \0
print("String (4 chars):", sys.getsizeof("abcd"))       # Output: 45 = 40 + 4 + \0
print("String (Unicode):", sys.getsizeof("řeřicha"))    # Output: 72

You can inspect the C implementation of str here.

Dynamic List list

Lists are mutable, ordered sequences that store pointers to other Python objects. Python lists are implemented as dynamic arrays, meaning they can grow and shrink as elements are added or removed. The list object (PyListObject in C) starts with a PyObject_VAR_HEAD, whose ob_size field holds the current number of elements in the list. This is followed by a pointer to the actual array of pointers (**ob_item) that holds references to the elements, and by the ob_alloc field, which records the total allocated capacity of that array. The following logical invariants always hold: 0 <= ob_size <= ob_alloc; len(list) == ob_size; and ob_item == NULL implies ob_size == ob_alloc == 0.

Mental Diagram: list object layout

+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ \
|                 ob_refcnt                     | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ |
|                 *ob_type                      | | PyObject_VAR_HEAD (24 bytes)
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ |
|                 ob_size                       | | (number of elements currently in list)
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ /
|                **ob_item                      | --> pointer to array of pointers to PyObjects (elements)
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                 ob_alloc                      | --> allocated capacity for internal array
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

This means the __basicsize__ of a list is 40 bytes (on a 64-bit system). Note that when you actually create a list, Python prepends the PyGC_Head, so the total size of an empty list is 56 bytes (40 + 16 for PyGC_Head).

import sys
print("List basic size:", list.__basicsize__)       # PyObject_HEAD + 2 pointers
print("Empty list size:", sys.getsizeof([]))        # includes GB_HEAD (16 bytes), ob_item=NULL
print("List with 1 item", sys.getsizeof([1]))       # includes the item pointer
print("List with 2 items:", sys.getsizeof([1, 2]))  # includes 2 item pointers

# Output (On a 64-bit system):
# List basic size: 40
# Empty list size: 56
# List with 1 item 64
# List with 2 items: 72

When you append to a list and the current ob_alloc capacity is insufficient, Python allocates a larger array (over-allocating by roughly 12.5% to amortize the reallocation cost and keep append O(1) on average), copies the existing pointers, and updates ob_alloc. This pre-allocation strategy means that sys.getsizeof() for a list reports the size of the list object itself plus the currently allocated space for its pointers, not just the space for its ob_size elements. The exact formula CPython uses to calculate the new capacity for a requested length $newsize$ is:

\[new\_alloc = 4 \cdot \Bigl\lfloor \frac{newsize + \lfloor newsize / 8 \rfloor + 6}{4} \Bigr\rfloor\]

That is, the requested length grows by one eighth of itself plus a small constant, and the result is rounded down to a multiple of 4; the constant guarantees the new allocation is always strictly larger than the requested length.
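
The growth trace below was produced with a small probe along these lines (a sketch: the capacity is inferred from sys.getsizeof by subtracting the empty-list size and dividing by the 8-byte pointer size, which assumes a 64-bit build):

import sys

lst = []
last_cap = 0
print(f"# Initial capacity: {last_cap} (size: {sys.getsizeof(lst)} bytes)")
for i in range(1, 300):
    lst.append(None)
    # Infer capacity: bytes beyond an empty list, divided by the pointer size
    cap = (sys.getsizeof(lst) - sys.getsizeof([])) // 8
    if cap != last_cap:
        factor = f"{cap / last_cap:.3f}" if last_cap else "NaN"
        print(f"# List size changed at {i} elements. New capacity = {cap}, factor = {factor}")
        last_cap = cap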

# Initial capacity: 0 (size: 56 bytes)
# List size changed at 1 elements. New capacity = 4, factor = NaN
# List size changed at 5 elements. New capacity = 8, factor = 2.000
# List size changed at 9 elements. New capacity = 16, factor = 2.000
# List size changed at 17 elements. New capacity = 24, factor = 1.500
# List size changed at 25 elements. New capacity = 32, factor = 1.333
# List size changed at 33 elements. New capacity = 40, factor = 1.250
# List size changed at 41 elements. New capacity = 52, factor = 1.300
# List size changed at 53 elements. New capacity = 64, factor = 1.231
# List size changed at 65 elements. New capacity = 76, factor = 1.188
# List size changed at 77 elements. New capacity = 92, factor = 1.211
# List size changed at 93 elements. New capacity = 108, factor = 1.174
# List size changed at 109 elements. New capacity = 128, factor = 1.185
# List size changed at 129 elements. New capacity = 148, factor = 1.156
# List size changed at 149 elements. New capacity = 172, factor = 1.162
# List size changed at 173 elements. New capacity = 200, factor = 1.163
# List size changed at 201 elements. New capacity = 232, factor = 1.160
# List size changed at 233 elements. New capacity = 268, factor = 1.155
# List size changed at 269 elements. New capacity = 308, factor = 1.149

You can inspect the C implementation of list here.

Tuple tuple

Tuples are immutable, ordered sequences, storing pointers to other Python objects. Unlike lists, tuples are fixed-size once created. A tuple object (PyTupleObject in C) has a PyObject_VAR_HEAD (with ob_size indicating the number of elements), and its elements are stored directly as a contiguous array of PyObject* pointers immediately following the header. Since tuples are immutable, this array is allocated once and its size never changes.

Mental Diagram: tuple object layout

+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ \
|                 ob_refcnt                     | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ |
|                 *ob_type                      | | PyObject_VAR_HEAD (24 bytes)
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ |
|                 ob_size                       | / (number of elements in tuple)
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ \
|            items[0] (ptr to PyObject)         | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ |
|            items[1] (ptr to PyObject)         | | --> fixed-size array of pointers to elements
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ |
|                    ...                        | /

The total size of a tuple is its __basicsize__ plus ob_size * __itemsize__, plus the 16-byte PyGC_Head once instantiated.

import sys
print("Empty tuple basic size:", tuple.__basicsize__)  # PyObject_VAR_HEAD (24 bytes)
print("Empty tuple item size:", tuple.__itemsize__)    # Size of each item (ptr to element)
print("Empty tuple:", sys.getsizeof(()))               # includes GC_Head (+16 bytes)
print("Tuple (1,2,3):", sys.getsizeof((1, 2, 3)))      # includes 3 item pointers (3 * 8 bytes)

# Output:
# Empty tuple basic size: 24
# Empty tuple item size: 8
# Empty tuple: 40
# Tuple (1,2,3): 64

You can inspect the C implementation of tuple here.

Hashset set

Sets are unordered collections of unique, hashable elements. Their internal implementation (PySetObject in C) is based on a hash table, represented as an array of setentry structs. Each setentry represents one element, storing a reference to the object together with its cached hash. Hash tables must allocate considerably more slots than the actual number of entries to maintain sparsity, which reduces collisions and keeps average-case operations O(1).

Mental Diagram: set object layout

+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ \
|                  ob_refcnt                    | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | PyObject_HEAD (16 bytes)
|                  *ob_type                     | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ /
|                    fill                       | --> Number of active and dummy entries
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    used                       | --> Number of active entries
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    mask                       | --> The table contains mask + 1 slots, and that's a power of 2
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                                               |
|              setentry *table                  | --> pointer to the internal hash table array
|                                               |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                                               |
|           other set-specific fields           | --> in total 152 bytes (on a 64-bit system)
|                                               |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

Notably, table points to a fixed-size small table embedded directly in the set object for small sets, or to separately malloc’ed memory for bigger ones. The embedded small table holds 8 setentry slots, which is enough for very small sets.

import sys
print("Empty set basic size:", set.__basicsize__)      # base size
print("Empty set:", sys.getsizeof(set()))              # includes PyGC_Head (16 bytes)
print("Set {1,2,3}:", sys.getsizeof({1, 2, 3}))        # very small set, no resizing needed
print("Set {1,2,3,4,5}:", sys.getsizeof({1,2,3,4,5}))  # larger set

# Output (On a 64-bit system):
# Empty set basic size: 200
# Empty set: 216
# Set {1,2,3}: 216
# Set {1,2,3,4,5}: 472

The set grows proportionally to the number of elements, with the internal hash table growing by a factor of 4 each time it is resized.
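
The trace below can be reproduced with a probe in the same spirit as the list one (a sketch: the slot count is inferred from the bytes sys.getsizeof reports beyond an empty set, divided by the 16-byte setentry size, so it only reflects the externally allocated table):

import sys

s = set()
empty = sys.getsizeof(s)
last_slots = 0
for i in range(5000):
    s.add(i)
    # Infer externally allocated slots: extra bytes / 16-byte setentry
    slots = (sys.getsizeof(s) - empty) // 16
    if slots and slots != last_slots:
        factor = f", factor: {slots / last_slots}" if last_slots else ""
        print(f"# Set size changed at {len(s)} elements: New slots = {slots}{factor}")
        last_slots = slots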

# Set size changed at 5 elements: New slots = 32
# Set size changed at 19 elements: New slots = 128, factor: 4.0
# Set size changed at 77 elements: New slots = 512, factor: 4.0
# Set size changed at 307 elements: New slots = 2048, factor: 4.0
# Set size changed at 1229 elements: New slots = 8192, factor: 4.0
# Set size changed at 4915 elements: New slots = 32768, factor: 4.0

You can inspect the C implementation of set here.

Dictionary dict

Dictionaries are mutable mappings of unique, hashable keys to values. Like sets, they are implemented using hash tables (PyDictObject in C), storing key-value pairs as entries in an internal array. Each entry typically holds the hash of the key, a pointer to the key object, and a pointer to the value object. Dictionaries also employ a strategy of over-allocating space to maintain a low load factor, which helps ensure efficient O(1) average-case lookup, insertion, and deletion times.

Mental Diagram: dict object layout (simplified)

+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ \
|                   ob_refcnt                   | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | PyObject_HEAD (16 bytes)
|                   *ob_type                    | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ /
|                   ma_used                     | --> Number of items in the dictionary
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                                               |
|       Internal dictionary representation      |
|                                               |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

A dictionary has a smaller base size than a set because it doesn’t contain a preallocated small table.

import sys
print("Empty dict basic size:", dict.__basicsize__)  # base size
print("Empty dict:", sys.getsizeof({}))              # includes PyGC_Head (16 bytes)
print("Dict {1:'a',2:'b',3:'c'}:", sys.getsizeof({1:'a', 2:'b', 3:'c'}))

# Output (on a 64-bit system):
# Empty dict basic size: 48
# Empty dict: 64
# Dict {1:'a',2:'b',3:'c'}: 224

The dictionary grows proportionally to the number of elements, with the internal representation roughly doubling in size at each resize.
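
The same probing technique works for dictionaries (a sketch: the "representation" is taken to be whatever sys.getsizeof reports beyond the 64-byte empty dict):

import sys

d = {}
empty = sys.getsizeof(d)
last_rep = 0
for i in range(6000):
    d[i] = None
    rep = sys.getsizeof(d) - empty  # bytes of the internal representation
    if rep != last_rep:
        factor = f", factor: {rep / last_rep:.2f}" if last_rep else ""
        print(f"# Dict size changed at {len(d)} elements: New representation = {rep}{factor}")
        last_rep = rep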

# Dict size changed at 1 elements: New representation = 160
# Dict size changed at 6 elements: New representation = 288, factor: 1.80
# Dict size changed at 11 elements: New representation = 568, factor: 1.97
# Dict size changed at 22 elements: New representation = 1104, factor: 1.94
# Dict size changed at 43 elements: New representation = 2200, factor: 1.99
# Dict size changed at 86 elements: New representation = 4624, factor: 2.10
# Dict size changed at 171 elements: New representation = 9240, factor: 2.00
# Dict size changed at 342 elements: New representation = 18448, factor: 2.00
# Dict size changed at 683 elements: New representation = 36888, factor: 2.00
# Dict size changed at 1366 elements: New representation = 73744, factor: 2.00
# Dict size changed at 2731 elements: New representation = 147480, factor: 2.00
# Dict size changed at 5462 elements: New representation = 294928, factor: 2.00

You can inspect the C implementation of dict here.

Key Takeaways

Every Python object begins with a PyObject header (ob_refcnt and ob_type); objects that can participate in reference cycles additionally carry a 16-byte PyGC_Head. Plain class instances pay for a dynamically allocated __dict__, while __slots__ embeds attribute pointers directly in the instance, saving memory and speeding up attribute access. Variable-sized built-ins use PyObject_VAR_HEAD, and the container types (list, set, dict) over-allocate as they grow, so sys.getsizeof reports allocated capacity rather than just occupied space.

11. Runtime Memory Management & Garbage Collection

Python’s reputation for ease of use often belies the sophisticated machinery humming beneath its surface, especially concerning memory management. Unlike languages where developers explicitly manage memory (e.g., C/C++), Python largely automates this crucial task. However, a deep understanding of its internal mechanisms, particularly CPython’s approach, is vital for writing performant, predictable, and memory-efficient applications, and for diagnosing subtle memory-related bugs. This chapter delves into the core principles and components that govern how Python allocates, tracks, and reclaims memory during program execution.

11.1. PyObject Layout (Revision)

At the very heart of CPython’s memory model lies a fundamental principle: everything is an object. Numbers, strings, lists, functions, classes, modules, and even None and True/False – all are represented internally as PyObject structs in C. This uniformity is a cornerstone of Python’s flexibility and dynamism, allowing the interpreter to handle disparate data types in a consistent manner through a generic object interface. This “object-all-the-way-down” philosophy simplifies the interpreter’s design, as it doesn’t need to special-case different data types for fundamental operations like memory management or type checking.

Every PyObject in CPython begins with a standard header, which provides essential metadata that the interpreter uses to manage the object. This header is typically composed of at least two fields: ob_refcnt (object reference count) and ob_type (pointer to type object). The ob_refcnt is a C integer that tracks the number of strong references pointing to this object, forming the basis of Python’s primary memory reclamation strategy, reference counting. The ob_type is a pointer to the object’s type object, which is itself a PyObject that describes the object’s type, methods, and attributes. This pointer allows the interpreter to perform runtime type checking and dispatch method calls correctly.

For objects whose size can vary, such as lists, tuples, or strings, CPython uses a slightly different but related structure called PyVarObject. This struct extends the basic PyObject header with an additional ob_size field, which stores the number of elements or bytes the variable-sized object contains. Imagine a layered structure: a PyObject provides the universal base, PyVarObject adds variability for collections, and then specific object implementations (like PyListObject or PyDictObject) append their unique data fields. This consistent header allows the interpreter’s core C routines to manipulate objects generically, casting them to a PyObject* pointer to access their reference count or type, regardless of the specific Python type they represent.

It’s crucial to understand that while PyObject_HEAD is fundamental to all Python objects, some objects additionally incorporate a PyGC_Head during their runtime existence. This PyGC_Head, typically prepended before the PyObject_HEAD in memory, contains two pointers: _gc_next and _gc_prev. These pointers are used by Python’s generational garbage collector to link objects into a doubly-linked list, enabling efficient traversal and management of objects that can participate in reference cycles. Therefore, a Python object in memory can be thought of as consisting of an optional PyGC_Head, followed by its fixed PyObject_HEAD, and then any object-specific data.

Mental Diagram: PyObject Layout (General Form)

+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+\
|                 *_gc_next                     | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | PyGC_Head (16 bytes), for GC-tracked objects
|                 *_gc_prev                     | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ /
|                 ob_refcnt                     | \
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | PyObject_HEAD (16 bytes), all objects have this
|                 *ob_type                      | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ /
|                                               |
|               Object-Specific Data            | --> optional, varies by type
|                                               |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

This uniform PyObject interface means that any piece of Python code, or any C extension interacting with Python objects, can safely retrieve an object’s reference count or its type, enabling a consistent and efficient underlying object model.

11.2. Reference Counting & The Garbage Collector

CPython employs a hybrid memory management strategy that combines two primary mechanisms: reference counting for immediate object reclamation and a generational garbage collector for resolving reference cycles. This two-pronged approach aims to balance performance (quick reclamation) with correctness (handling circular references).

Reference Counting is the simplest and most prevalent form of memory management in CPython. Every PyObject maintains a counter, ob_refcnt, which tracks the number of strong references pointing to it. When an object is created, its ob_refcnt is initialized to one. Each time a new reference to the object is created (e.g., variable assignment, passing an object as an argument, storing it in a container), its ob_refcnt is incremented via Py_INCREF (a C macro). Conversely, when a reference is removed (e.g., variable goes out of scope, del statement, container cleared), its ob_refcnt is decremented via Py_DECREF. When ob_refcnt drops to zero, it means no strong references to the object remain. At this point, the object is immediately deallocated, and its memory is returned to the allocator. This deterministic and prompt reclamation makes reference counting very efficient for most common scenarios, as memory is freed as soon as it’s no longer needed, reducing memory footprint and fragmentation.
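
You can watch the counter from Python via sys.getrefcount (note that the call itself temporarily adds one reference to its argument):

import sys

x = ["a", "b"]
print(sys.getrefcount(x))  # 2: the variable x plus the temporary argument reference
y = x                      # new strong reference -> Py_INCREF
print(sys.getrefcount(x))  # 3
del y                      # reference removed -> Py_DECREF
print(sys.getrefcount(x))  # 2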

However, reference counting has a critical limitation: it cannot detect and reclaim reference cycles. A reference cycle occurs when two or more objects refer to each other in a closed loop, even if no external references point to the cycle. For example, if object A refers to object B, and object B refers to object A, both A and B will have an ob_refcnt of at least 1, preventing them from being deallocated, even if they are otherwise unreachable. This would lead to a memory leak. To address this, CPython incorporates a generational garbage collector (GC) that runs periodically.

The generational garbage collector operates on top of reference counting specifically to find and reclaim objects involved in reference cycles. It is based on the generational hypothesis, which posits that most objects are either very short-lived or very long-lived. To optimize collection, objects are divided into three “generations” (0, 1, and 2). Newly created objects start in generation 0. If an object survives a garbage collection cycle, it is promoted to the next generation. The GC runs more frequently on younger generations (e.g., generation 0 is collected most often, generation 2 least often) because it’s statistically more likely to find short-lived objects that are no longer needed there.

The PyGC_Head plays a crucial role in this generational garbage collection process. As mentioned earlier, this header is present in objects that are collectable, meaning they can potentially participate in reference cycles. The PyGC_Head contains two pointers, _gc_next and _gc_prev, which form a doubly-linked list. Each generation (0, 1, 2) maintains its own linked list of objects currently residing in that generation. When the garbage collector runs, it traverses these linked lists to identify unreachable objects, even those whose reference counts are non-zero due to circular references.
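
The generation thresholds and current counters are visible from Python (the defaults shown are typical for CPython 3.12 and earlier; newer versions with the incremental collector may differ):

import gc

print(gc.get_threshold())  # commonly (700, 10, 10); exact defaults vary by version
print(gc.get_count())      # allocations minus deallocations tracked per generation
gc.collect(0)              # manually collect only the youngest generation
print(gc.get_count())      # the generation-0 counter has been reset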

Now, let’s address why some objects, like int and str, do not have a PyGC_Head, while others, such as custom class instances, list, set, and dict, do: only objects that can contain references to other objects can ever participate in a reference cycle, so only those need to be tracked by the cyclic garbage collector. An integer or a string is atomic; it can never refer back to anything, so reference counting alone suffices and the 16-byte PyGC_Head would be pure overhead. Containers and class instances, by contrast, can hold arbitrary references and must therefore be linked into the collector’s generation lists (you can check this at runtime with gc.is_tracked()).

11.3. Object Identity and Object Reuse

In Python, every object has a unique identity, a type, and a value. The built-in id() function returns the “identity” of an object. This identity is an integer that is guaranteed to be unique and constant for that object during its lifetime. In CPython, id(obj) typically returns the memory address of the object in CPython’s memory space. This makes id() a powerful tool for understanding object uniqueness and how Python manages memory.

Understanding object identity is crucial for distinguishing between objects that have the same value but are distinct entities in memory, versus objects that are actually the same instance. For mutable objects like lists, this distinction is clear: a = [1, 2], b = [1, 2] will result in id(a) != id(b), even though a == b (they have the same value). This is because a and b are two separate list objects in memory. For immutable objects, the behavior can be less intuitive due to CPython’s optimization known as “interning.”

Interning is a technique where CPython pre-allocates and reuses certain immutable objects to save memory and improve performance. The most common examples are small integers and certain strings. Small integers (typically from -5 to 256) are interned, meaning that any reference to these numbers will point to the exact same object in memory. This is why id(10) == id(10) holds true, and even id(10) == id(5 + 5). This optimization is possible because integers are immutable; their value never changes, so there’s no risk of one user inadvertently modifying another’s “10.” Similarly, short, common strings and string literals in source code are often interned.

# Small integers are interned
a = 10
b = 5 + int(2.5 * 2)
print(f"id(a): {id(a)}, id(b): {id(b)}, a is b: {a is b}")  # True

# Larger integers are not typically interned
x = 1000
y = 500 + int(2.5 * 200)
print(f"id(x): {id(x)}, id(y): {id(y)}, x is y: {x is y}")  # False (usually)

# Compile-time constant folding can make even large integers share one object
x = 100_001
y = 100_000 + 1
print(f"id(x): {id(x)}, id(y): {id(y)}, x is y: {x is y}")  # True (usually)

# String interning
s1 = "hello_world!!!"
s2 = "hello_world" + 3 * "!"
print(f"id(s1): {id(s1)}, id(s2): {id(s2)}, s1 is s2: {s1 is s2}")  # True

# May or may not be interned depending on CPython version/optimizations
s3 = "hello" + "_" + "world"
s4 = "hello_world"
print(f"id(s3): {id(s3)}, id(s4): {id(s4)}, s3 is s4: {s3 is s4}")  # True or False

# Output (may vary based on CPython version):
# id(a): 140736174359752, id(b): 140736174359752, a is b: True
# id(x): 2381950788208, id(y): 2381953569648, x is y: False
# id(x): 2381953568560, id(y): 2381953568560, x is y: True
# id(s1): 2381953840176, id(s2): 2381953840176, s1 is s2: True
# id(s3): 2381953843056, id(s4): 2381953843056, s3 is s4: True

Understanding id() and interning is vital for debugging and for making correct comparisons. The is operator checks for object identity (id(obj1) == id(obj2)), while the == operator checks for value equality (obj1.__eq__(obj2)). While id() provides insight into CPython’s memory management, it should generally not be used for comparison in application logic, as Python’s interning behavior for non-small integers and non-literal strings is an implementation detail and not guaranteed across different Python versions or implementations. Always use == for value comparison unless you specifically need to check if two variables refer to the exact same object in memory.

11.4. Weak References

The reference counting mechanism, while efficient, imposes a strict ownership model: as long as an object has a strong reference, it cannot be garbage collected. This can become problematic in scenarios where you need to refer to an object without preventing its collection, such as implementing caches, circular data structures (where weak references can help break cycles), or event listeners where the listener should not keep the subject alive. This is where weak references come into play.

A weak reference to an object does not increment its ob_refcnt. This means that if an object is only referenced by weak references, and its strong reference count drops to zero, it becomes eligible for garbage collection. When the object is collected, any weak references pointing to it are automatically set to None or become “dead” (they will return None when dereferenced). This allows you to build data structures where objects can be “observed” or “linked” without creating memory leaks.

Python provides the weakref module to work with weak references. The most common weak reference types are weakref.ref() for individual objects and weakref.proxy() for proxy objects that behave like the original but raise a ReferenceError if the original object is collected. You can also use weakref.WeakKeyDictionary and weakref.WeakValueDictionary, which are specialized dictionaries that hold weak references to their keys or values, respectively. This makes them ideal for caches where you want entries to be automatically removed if the cached object is no longer referenced elsewhere.
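
A minimal cache sketch built on WeakValueDictionary; in CPython, reference counting deallocates the object the moment its last strong reference disappears, so the cache entry vanishes immediately:

import weakref

class Resource:
    def __init__(self, name):
        self.name = name

cache = weakref.WeakValueDictionary()
res = Resource("config")
cache["config"] = res        # the cache does not keep the object alive
print("config" in cache)     # True
del res                      # last strong reference gone -> object collected
print("config" in cache)     # False: the dead entry was removed automatically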

For instances of standard user-defined classes (i.e., those not using __slots__), CPython’s memory layout includes a dedicated, internal slot for __weakref__ management. This is not an entry in the instance’s __dict__, but rather a specific pointer or offset within the underlying C structure of the PyObject itself, typically named tp_weaklistoffset in the PyTypeObject that defines the class. This internal slot serves as the anchor point for all weak references pointing to a particular object instance. When you create a weak reference (e.g., weakref.ref(obj)), the weakref machinery registers this weak reference with the object via this internal slot. This registration allows Python to efficiently iterate through and “clear” (set to None) all associated weak references immediately before the object is deallocated. It ensures that once an object is gone, any weak pointers to it become invalid.

import weakref

class MyObject:
    def __init__(self, name):
        self.name = name
        print(f"MyObject {self.name} created")
    def __del__(self):
        print(f"MyObject {self.name} deleted")
    def __str__(self) -> str:
        return f"MyObject({self.name})"

obj = MyObject("Strong")
weak_obj_ref = weakref.ref(obj)

# You can access the __weakref__ slot, though it's typically an internal detail
# It will be a weakref.ReferenceType object or None if no weakrefs exist
print(f"__weakref__ attribute before del: {type(obj.__weakref__)}")
print(f"Dereferencing weak_obj_ref: {weak_obj_ref()}")
del obj
print(f"Dereferencing weak_obj_ref after del: {weak_obj_ref()}")

# Output:
# MyObject Strong created
# __weakref__ attribute before del: <class 'weakref.ReferenceType'>
# Dereferencing weak_obj_ref: MyObject(Strong)
# MyObject Strong deleted
# Dereferencing weak_obj_ref after del: None

In contrast to standard classes, instances of classes that explicitly define __slots__ do not have the __weakref__ attribute by default. The core purpose of __slots__ is to achieve greater memory efficiency and faster attribute access by creating a fixed, compact memory layout for instances, without the overhead of a dynamic __dict__. When __slots__ are used, Python only allocates memory for the attributes explicitly listed in the __slots__ tuple. Since the __weakref__ slot is an optional feature for object instances, it is not included in this minimalist layout unless you specifically request it. If you define __slots__ in a class and still need its instances to be targetable by weak references, you must explicitly include '__weakref__' as one of the entries in the __slots__ tuple. This tells Python to reserve the necessary space in the instance’s fixed memory layout for managing weak references, thus allowing weakref.ref() to target instances of that slotted class.
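
A short demonstration of both behaviors (the class names here are illustrative):

import weakref

class Compact:
    __slots__ = ("x",)                # no __weakref__ slot reserved

class CompactWeakrefable:
    __slots__ = ("x", "__weakref__")  # explicitly reserve the weakref slot

try:
    weakref.ref(Compact())
except TypeError as e:
    print("Compact:", e)              # cannot create weak reference to 'Compact' object

r = weakref.ref(CompactWeakrefable())  # works now
print(r)  # <weakref at 0x...; dead> (the temporary instance is already gone)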

The weakref module is an indispensable tool for advanced memory management patterns. It allows developers to break undesired strong reference cycles, implement efficient caches (like memoization or object pools) that don’t indefinitely hold onto memory, and design flexible event systems where listeners don’t prevent the objects they’re observing from being collected. By understanding how the __weakref__ attribute ties into the memory layout and the implications for slotted classes, you can write more robust and memory-efficient Python applications, especially those dealing with long-running processes or large numbers of objects where precise memory control is paramount.

11.5. Memory Usage Tracking

While Python’s automatic memory management simplifies development, it can sometimes hide memory issues, such as unexpected object retention or gradual memory growth (memory leaks). Python provides powerful tools in its standard library to inspect and track memory usage, helping developers diagnose and resolve such problems. The gc module provides an interface to the generational garbage collector, and tracemalloc offers detailed tracing of memory blocks allocated by Python.

The gc module allows you to interact with the generational garbage collector. You can manually trigger collections using gc.collect(), which can be useful for debugging or in scenarios where you need to reclaim memory at specific points. More importantly, gc provides debugging flags (gc.DEBUG_STATS, gc.DEBUG_COLLECTABLE, gc.DEBUG_UNCOLLECTABLE) that can be set to get detailed output during GC runs. gc.DEBUG_STATS will print statistics about collected objects, while gc.DEBUG_COLLECTABLE will print information about objects found to be collectable, and gc.DEBUG_UNCOLLECTABLE (the most critical for debugging leaks) will show objects that were found to be part of cycles but could not be reclaimed (e.g., due to __del__ methods in cycles).

import gc

# Enable GC debugging stats
gc.set_debug(gc.DEBUG_STATS | gc.DEBUG_UNCOLLECTABLE)

class MyLeakObject:
    def __init__(self, name):
        self.name = name
        self.ref = None # Will create a cycle later

    def __del__(self):
        print(f"__del__ called for {self.name}")

# Create a reference cycle
a = MyLeakObject("A")
b = MyLeakObject("B")
a.ref = b
b.ref = a

# Delete external references; objects are now only part of a cycle
del a
del b

print("Attempting to collect garbage...")
gc.collect()
print("Collection finished.")

# Output:
# Attempting to collect garbage...
# gc: collecting generation 2...
# gc: objects in each generation: 357 5086 0
# gc: objects in permanent generation: 0
# __del__ called for A
# __del__ called for B
# gc: done, 2 unreachable, 0 uncollectable, 0.0010s elapsed
# Collection finished.

For more granular memory profiling, the tracemalloc module is invaluable. Introduced in Python 3.4, tracemalloc tracks memory allocations made by Python. It allows you to take snapshots of all currently allocated blocks, compare two snapshots to see what grew or shrank between them, group statistics by file and line (or by full traceback), and inspect the traceback showing exactly where a given block was allocated.

import tracemalloc

tracemalloc.start()

# Simulate memory allocation in a loop
snapshots = []
data = []
for i in range(4):
    data += [str(j) * (100 + i * 50) for j in range(1000 + i * 500)]       # line 9
    more_data = [bytearray(1024 + i * 512) for _ in range(500 + i * 200)]  # line 10
    snapshots.append(tracemalloc.take_snapshot())
    # Deallocate some memory
    if i % 2 == 1:
        data.clear()
        del more_data

# Compare snapshots to see memory changes over iterations
for idx in range(1, len(snapshots)):
    print(f"\n--- Snapshot {idx} vs {idx-1} ---")
    stats = snapshots[idx].compare_to(snapshots[idx - 1], "lineno")
    for stat in stats[:2]:
        print(stat)

tracemalloc.stop()

# Output:
# --- Snapshot 1 vs 0 ---
# module.py:9:  size=1118 KiB (+788 KiB),  count=2501 (+1500), average=458 B
# module.py:10: size=1095 KiB (+563 KiB),  count=1401 (+400),  average=800 B

# --- Snapshot 2 vs 1 ---
# module.py:10: size=1858 KiB (+763 KiB),  count=1802 (+401),  average=1056 B
# module.py:9:  size=1441 KiB (+323 KiB),  count=2001 (-500),  average=738 B

# --- Snapshot 3 vs 2 ---
# module.py:9:  size=3731 KiB (+2290 KiB), count=4501 (+2500), average=849 B
# module.py:10: size=2820 KiB (+962 KiB),  count=2202 (+400),  average=1311 B

By combining the gc module for understanding garbage collection behavior and tracemalloc for detailed allocation tracing, developers can gain profound insights into their application’s memory footprint, detect unwanted object retention, and efficiently debug memory-related performance bottlenecks or leaks.

11.6. Stack Frames & Exceptions

Exception handling is a critical aspect of robust software, and Python’s mechanism for this is closely tied to its runtime execution model, particularly the concept of stack frames. When a function is called in Python, a new frame object (a PyFrameObject in C) is created on the call stack. This frame object encapsulates the execution context of that function call.

A stack frame contains all the necessary information for a function to execute and resume correctly: f_code, a reference to the compiled code object being executed; f_locals and f_globals, the local and global namespaces; f_lasti, the index of the last executed bytecode instruction; f_back, a pointer to the caller’s frame; and the evaluation stack for intermediate values.
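
You can walk this chain of frames from Python itself; a minimal sketch using the CPython-specific sys._getframe():

import sys

def inner():
    frame = sys._getframe()  # the currently executing frame
    while frame is not None:
        print(f"{frame.f_code.co_name} at line {frame.f_lineno}")
        frame = frame.f_back  # follow the chain toward the caller

def outer():
    inner()

outer()

# Output (line numbers depend on where the code lives in the file):
# inner at line ...
# outer at line ...
# <module> at line ...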

When an exception occurs within a function, Python looks for an appropriate except block in the current frame. If none is found, the exception propagates up the call stack. This process is called stack unwinding. The interpreter uses the f_back pointer of the current frame to move to the caller’s frame, and the search for an except block continues there. This continues until an except block is found to handle the exception, or until the top of the call stack (the initial entry point of the program) is reached, at which point the unhandled exception causes the program to terminate and prints a traceback.

def third_function():
    print("Inside third_function - dividing by zero")
    # This will raise a ZeroDivisionError
    result = 1 / 0
    print("This line will not be reached.")

def second_function():
    print("Inside second_function")
    third_function()
    print("This line in second_function will not be reached.")

def first_function():
    print("Inside first_function")
    try:
        second_function()
    except ZeroDivisionError as e:
        print(f"Caught an error in first_function: {e}")
    print("first_function finished.")

first_function()

# Output:
# Inside first_function
# Inside second_function
# Inside third_function - dividing by zero
# Caught an error in first_function: division by zero
# first_function finished.

In the example above, when 1 / 0 occurs in third_function, an exception is raised. third_function doesn’t handle it, so the stack unwinds to second_function. second_function also doesn’t handle it, so it unwinds further to first_function. first_function has a try...except block for ZeroDivisionError, so it catches the exception, prints the message, and then continues execution from that point. The traceback you see when an unhandled exception occurs is essentially a representation of these stack frames, showing the function calls (and their locations) from where the exception originated, all the way up to where it was (or wasn’t) handled.

Understanding stack frames is essential for effective debugging and for optimizing recursive functions. Each frame object consumes memory, and excessively deep recursion can lead to a RecursionError (due to exceeding the interpreter’s recursion limit, which prevents stack overflow from unbounded recursion) before a true system stack overflow. This knowledge allows developers to reason about program flow, debug exceptions more effectively, and understand the memory overhead associated with function calls.

Advanced Exception Handling: try, except, else, and finally

While the basic try and except blocks are fundamental for catching and handling errors, Python’s exception handling construct offers more nuanced control flow through the else and finally clauses. Mastering these allows for cleaner, more robust, and semantically correct error management within your applications, moving beyond simple error suppression to proper resource management and distinct success/failure pathways.

The full syntax of an exception handling block is try...except...else...finally. Each part plays a distinct role: the try block wraps the code that might raise; except blocks catch and handle specific exception types; the else block runs only if the try block completed without raising, which keeps success-path code outside the protected region; and the finally block runs unconditionally, whether an exception was raised, handled, or left unhandled, making it the natural place for cleanup.
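
A compact sketch showing all four clauses and their execution order:

def parse(text):
    try:
        value = int(text)          # may raise ValueError
    except ValueError:
        print(f"not a number: {text!r}")
        return None
    else:
        print(f"parsed: {value}")  # runs only when try succeeded
        return value
    finally:
        print("attempt finished")  # always runs, even on return

parse("42")   # parsed: 42 / attempt finished
parse("abc")  # not a number: 'abc' / attempt finished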

A key best practice is to keep except blocks specific and minimal, handling only the direct error conditions you anticipate and know how to recover from. Avoid broad except Exception: unless absolutely necessary, as it can hide unexpected bugs. When using finally, focus solely on resource deallocation or state reset. Avoid complex logic within finally to prevent new exceptions that could obscure the original error. Properly structured try-except-else-finally blocks are a hallmark of robust Python code, ensuring both error resilience and proper resource management.

Key Takeaways

CPython reclaims most objects immediately through reference counting; the generational garbage collector exists specifically to find and free reference cycles that refcounting alone cannot. Object identity (is, id()) is distinct from value equality (==), and interning of small integers and identifier-like strings is an implementation detail you should not rely on. Weak references let caches and observers refer to objects without keeping them alive, and the gc and tracemalloc modules provide the visibility needed to diagnose retention problems and leaks.

12. Memory Allocator Internals & GC Tuning

Having explored the fundamental PyObject structure, reference counting, and the generational garbage collector, we now descend another layer into CPython’s memory management: the underlying memory allocators and advanced garbage collector tuning. Understanding how CPython requests and manages memory from the operating system, and how it optimizes for common object types, is crucial for truly mastering performance and memory efficiency in long-running or memory-intensive Python applications. This chapter will reveal these intricate mechanisms and provide the tools to inspect and fine-tune them.

12.1. Memory Allocation: obmalloc and Arenas

CPython doesn’t directly call malloc and free for every single object allocation and deallocation. Doing so would incur significant overhead due to frequent system calls and general-purpose allocator complexities. Instead, CPython implements its own specialized memory management layer on top of the system’s malloc (or VirtualAlloc on Windows), primarily for Python objects. This layer is often referred to as obmalloc (object malloc). The obmalloc allocator is designed to be highly efficient for the small, numerous, and frequently created/destroyed objects that characterize typical Python programs.

The core strategy of obmalloc revolves around a hierarchical structure of arenas and pools. At the highest level, CPython requests large chunks of memory from the operating system. These large chunks, typically 256KB on 64-bit systems, are called arenas. An arena is essentially a large, contiguous block of memory designated for Python object allocation. CPython pre-allocates a few arenas, and more are requested from the OS as needed. This reduces the number of direct system calls to malloc, as subsequent Python object allocations can be satisfied from within these already-allocated arenas.

Each arena is further subdivided into a fixed number of pools. A pool is a smaller, fixed-size block of memory, typically 4KB. Critically, each pool is dedicated to allocating objects of a specific size class. For instance, one pool might only allocate 16-byte objects, another 32-byte objects, and so on. This “size-segregated” approach is incredibly efficient because it eliminates the need for complex metadata or fragmentation management within the pool. When an object of a particular size is requested, obmalloc can quickly find a pool designated for that size and allocate a block from it.

Mental Diagram: obmalloc Hierarchy

+-------------------------------------------------+
|                  Operating System               |
+-------------------------------------------------+
                        ^
                        |  Requests Large Chunks
                        v
+-------------------------------------------------+
|               CPython obmalloc Layer            |
+-------------------------------------------------+
|                                                 |
+-------------------------------------------------+
|            Arena 1  (e.g., 256KB)               |
|  +----------+ +----------+ +----------+         |
|  | Pool A   | | Pool B   | | Pool C   | ...     |
|  | (16-byte)| | (32-byte)| | (64-byte)|         |
|  +----------+ +----------+ +----------+         |
|     | Allocates specific size objects           |
|     v                                           |
|   +---+ +---+ +---+                             |
|   |Obj| |Obj| |Obj|  (e.g., 16-byte objects)    |
|   +---+ +---+ +---+                             |
+-------------------------------------------------+
|            Arena 2  (e.g., 256KB)               |
|  +----------+ +----------+ +----------+         |
|  | Pool D   | | Pool E   | | Pool F   | ...     |
|  +----------+ +----------+ +----------+         |
+-------------------------------------------------+
|           (More Arenas as needed)               |

This tiered allocation strategy offers several benefits: reduced system call overhead, improved cache locality (objects of similar sizes are often grouped), and minimized internal fragmentation within pools since all blocks within a pool are of the same uniform size. This specialized allocator is a cornerstone of CPython’s ability to handle the rapid creation and destruction of numerous small Python objects efficiently.
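
CPython can dump the live state of this hierarchy: sys._debugmallocstats() is a CPython-specific diagnostic that prints per-size-class block usage, pool counts, and arena totals to stderr (its output format varies between versions):

import sys

# Writes obmalloc statistics to stderr: blocks in use per size class,
# pools per size class, and arena-level totals.
sys._debugmallocstats()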

12.2. Small-object Optimizations: Free Lists

Building upon the obmalloc arena/pool structure, CPython employs further optimizations for very common, small, and frequently deallocated objects: free lists. A free list is essentially a linked list of deallocated objects of a particular type or size. When an object is deallocated (i.e., its reference count drops to zero), instead of immediately returning its memory to the system or even to the obmalloc pool, CPython might place it onto a type-specific free list.

The most prominent examples of objects managed by free lists are floats, tuples, lists, and dicts, particularly at small sizes. (Small integers in the range -5 to 256 are covered by a related but distinct optimization: they are preallocated once and shared as singletons.) For instance, there is a free list for float objects and per-size free lists for small tuples, as well as free lists for list and dictionary objects. When a new object of a matching type and size is requested, CPython first checks the corresponding free list. If a previously deallocated object is available, it is simply reinitialized and reused, bypassing the entire allocation path (arena, pool, and any system calls). This is extremely fast.

# Example of potential free list reuse (implementation detail, not guaranteed)
a = [1, 2]
print(f"id(a): {id(a)}")
del a  # 'a' is deallocated; its block may be pushed onto the list free list

c = [1, 2]  # may pop the very block that 'a' occupied off the free list
print(f"id(c): {id(c)}")  # can match id(a) if free list reuse happened

# Possible output:
# id(a): 1725905330560
# id(c): 1725905330560

The benefits of free lists are substantial: they virtually eliminate allocation and deallocation overhead for very common operations, drastically reducing CPU cycles spent on memory management. The mechanism exploits the observation that many programs repeatedly create and destroy temporary objects of similar types and sizes. By holding onto these deallocated blocks, CPython avoids repeated trips to the underlying allocator. However, free lists are bounded: if one exceeds its maximum length (e.g., 80 entries for list objects in CPython), further deallocated objects are returned to the obmalloc pool for general reuse. This prevents the free lists themselves from hoarding memory for rarely reused objects.
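
The same reuse pattern can often be observed with floats, which maintain their own free list. This is purely illustrative; whether the address is actually reused is an implementation detail:

x = 3.14
addr = id(x)
del x                  # the float's block may be pushed onto the float free list
y = 2.718              # allocating a new float may pop that same block
print(id(y) == addr)   # frequently True on CPython, never guaranteed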

12.3. String Interning

String interning is a powerful optimization in CPython aimed at reducing memory consumption and speeding up string comparisons. Because strings are immutable, identical string values can safely point to the same object in memory without any risk of one being modified and affecting others. Interning is the process by which CPython maintains a pool of unique string objects: when a candidate string is encountered, CPython first checks this pool; if an identical string already exists, the existing object is reused, otherwise the new string is added. Relatedly, str objects cache their hash directly in the string’s own C structure (rather than in the generic PyObject header), so repeated dictionary lookups and comparisons avoid rehashing.

CPython automatically interns certain strings:

  - Identifiers that appear in source code: variable, function, class, attribute, and argument names.
  - Compile-time string constants that look like identifiers, i.e., consist only of ASCII letters, digits, and underscores.
  - The empty string and single-character Latin-1 strings, which are cached as shared objects (a closely related optimization).

Strings created dynamically (e.g., from user input, network data, or string concatenation results) are generally not automatically interned unless they meet specific criteria or are explicitly interned using sys.intern().

import sys

s1 = "my_string" # Literal, likely interned
s2 = "my_string" # Same literal, refers to the same object
print(f"s1 is s2: {s1 is s2}") # True

s3 = "my" + "_" + "string" # Dynamically created, might not be interned
s4 = "my_string"           # Depends on CPython optimization at compile time
print(f"s3 is s4 (dynamic vs literal): {s3 is s4}") # False or True

s5 = sys.intern("another_string") # Explicitly interned
s6 = "another_string"
print(f"s5 is s6 (explicitly interned): {s5 is s6}") # True

The benefits of interning are two-fold:

  1. Memory Reduction: Instead of multiple copies of identical string data, there’s only one. This can significantly reduce memory footprint in applications that use many repeated strings (e.g., parsing JSON/XML where keys repeat, or large sets of identical categorical data).
  2. Performance Improvement for Comparisons: When comparing two interned strings, Python can simply compare their memory addresses (using is or a quick internal PyObject_RichCompareBool check) instead of performing a character-by-character comparison. This O(1) identity check is much faster than an O(N) character-by-character comparison, where N is the string length. While == still performs a value comparison, it often benefits from interning checks first.
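
An illustrative micro-benchmark can make the comparison benefit visible; exact timings vary by machine and CPython version, and the strings here are made up for illustration:

import sys
import timeit

base = "x" * 10_000
a = base + "!"                       # built at runtime: a distinct object
b = "".join([base, "!"])             # equal value, separate object
print(a is b, a == b)                # False True: equality must compare characters

ia = sys.intern(a)
ib = sys.intern("".join([base, "!"]))
print(ia is ib)                      # True: both resolve to one pooled object

print(timeit.timeit("a == b", globals={"a": a, "b": b}, number=100_000))
print(timeit.timeit("ia == ib", globals={"ia": ia, "ib": ib}, number=100_000))
# The second comparison hits the identity fast path and runs much faster.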

Beyond strings, CPython also shares other immutable objects:

  - Small integers in the range -5 to 256, preallocated once at interpreter startup.
  - The empty tuple, the empty string, and single-character Latin-1 strings.
  - The singletons None, True, False, and Ellipsis.

These sharing mechanisms contribute significantly to CPython’s overall memory efficiency and performance, reducing both allocation overhead and the need for expensive comparisons.
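
A few identity checks make these sharing mechanisms visible; all of the results below are CPython implementation details rather than language guarantees:

# CPython implementation details; results may differ on other interpreters
a, b = 256, 256
print(a is b)       # True: ints in [-5, 256] are preallocated singletons
c = int("500")      # constructed at runtime to avoid compile-time constant sharing
d = int("500")
print(c is d)       # False: 500 lies outside the cached range
t1, t2 = (), tuple()
print(t1 is t2)     # True: the empty tuple is a shared singleton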

12.4. GC Tunables and Thresholds

The generational garbage collector, described in Chapter 11, is not a black box; its behavior can be inspected and subtly tuned through the gc module. Understanding its internal “tunables” allows developers to optimize its performance for specific application workloads, especially in long-running services where predictable memory behavior is critical. The primary tunables are the collection thresholds.

CPython’s GC maintains three generations (0, 1, and 2), each with an associated threshold: threshold0, threshold1, and threshold2. They do not all count the same thing. threshold0 bounds the number of allocations minus deallocations of tracked objects that may occur before a generation-0 collection is triggered, while threshold1 and threshold2 count how many collections of the next-younger generation may run before generations 1 and 2 are themselves collected.

You can inspect and modify these thresholds using gc.get_threshold() and gc.set_threshold(). The default thresholds are typically (2000, 10, 10). This means:

  - threshold0 = 2000: a generation-0 collection runs once allocations minus deallocations of tracked objects exceed 2000.
  - threshold1 = 10: generation 1 is collected after every 10 generation-0 collections.
  - threshold2 = 10: generation 2 (a full collection) runs after every 10 generation-1 collections.

import gc

# Get current thresholds
print(f"Default GC thresholds: {gc.get_threshold()}")

# Set new thresholds (e.g., for more frequent/less frequent collection)
gc.set_threshold(1000, 5, 5) # Example: collect every generation more often than the defaults
print(f"New GC thresholds: {gc.get_threshold()}")

# Output:
# Default GC thresholds: (2000, 10, 10)
# New GC thresholds: (1000, 5, 5)

Tuning these thresholds depends heavily on your application’s memory allocation patterns. For applications with many short-lived objects, you might consider decreasing threshold0 to collect more frequently, freeing memory sooner. For applications with many long-lived objects and fewer cycles, increasing thresholds might reduce the overhead of unnecessary GC runs. However, overly aggressive collection can introduce performance pauses, while overly infrequent collection can lead to higher memory usage. The best approach involves profiling and experimentation, as discussed in the next section.
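
To see how close each generation is to its threshold, and what the collector has done so far, gc.get_count() and gc.get_stats() are useful companions to the threshold functions:

import gc

print(gc.get_threshold())   # e.g., (2000, 10, 10)
print(gc.get_count())       # e.g., (43, 2, 1): pending counts for gens 0, 1, 2

# Per-generation lifetime statistics: 'collections', 'collected', 'uncollectable'
for gen, stats in enumerate(gc.get_stats()):
    print(f"generation {gen}: {stats}")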

12.5. Optimizing Long-Running Processes

For long-running Python services, such as web servers, background workers, or data processing pipelines, memory behavior can be a critical concern. Gradual memory growth (memory leaks), sudden spikes in memory usage, or unpredictable pauses due to garbage collection cycles can severely impact performance and stability. Effective optimization requires systematic profiling and careful tuning.

Profiling Memory Behavior:

  - tracemalloc (standard library): records the source location of every allocation; snapshots can be compared to find where memory grows over time (see the sketch below).
  - The gc module itself: gc.get_count(), gc.get_stats(), and debug flags such as gc.DEBUG_UNCOLLECTABLE reveal how often collections run and whether uncollectable objects accumulate.
  - OS-level metrics: tracking the process’s resident set size (RSS) over time catches gradual growth that in-process tools may miss.
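
A minimal tracemalloc sketch; the list comprehension is a hypothetical stand-in for a representative slice of your workload:

import tracemalloc

tracemalloc.start()

data = [str(i) * 10 for i in range(100_000)]  # hypothetical workload

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:5]:
    print(stat)            # top five allocation sites by total size

tracemalloc.stop()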

Tuning Strategies:

  1. Adjusting GC Thresholds: Based on profiling data, you might adjust gc.set_threshold(). If your application frequently creates and destroys many short-lived objects, a lower threshold0 might free memory faster. If objects tend to be long-lived, higher thresholds could reduce collection overhead. Experimentation with different values while monitoring memory and performance is key.
  2. Disabling/Enabling GC: For short, bursty tasks, or specific phases of an application where you know no cycles will form, gc.disable() can temporarily turn off the GC to avoid collection overhead. Remember to re-enable it with gc.enable() and ideally call gc.collect() afterward to clean up any cycles that accumulated. This is a powerful but risky tool and should only be used after thorough analysis; a minimal sketch combining it with manual collection appears after this list.
  3. Manual Collection: In some long-running processes, especially after processing a large batch of data or completing a significant logical unit of work, explicitly calling gc.collect() can be beneficial. This allows you to reclaim memory deterministically rather than waiting for the automatic thresholds to be met, which can smooth out performance by preventing large, unpredictable collection pauses.
  4. Identifying and Breaking Cycles: The most effective optimization is preventing leaks from reference cycles in the first place. Use gc.DEBUG_UNCOLLECTABLE and tracemalloc to find the offending objects. Before Python 3.4, cycles involving objects with __del__ methods were uncollectable (the order of __del__ calls in a cycle is ambiguous) and ended up in gc.garbage; since PEP 442 such cycles are collectable, but they can still keep large object graphs alive far longer than intended. Restructuring your code to break these cycles (e.g., using weakref, as discussed in Chapter 11) remains the ultimate solution.
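
A minimal sketch of strategies 2 and 3 combined, assuming a hypothetical batch-processing job (transform is a made-up stand-in for real per-item work):

import gc

def transform(item):
    # hypothetical stand-in for real per-item work
    return {"value": item, "squared": item * item}

def process_batch(items):
    gc.disable()              # avoid GC pauses during the hot loop
    try:
        return [transform(x) for x in items]
    finally:
        gc.enable()
        gc.collect()          # deterministically reclaim any cycles afterwards

print(len(process_batch(range(10_000))))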

Optimizing long-running processes is an iterative process of profiling, identifying bottlenecks, applying tuning strategies, and re-profiling to measure the impact.

12.6. GC Hooks and Callbacks

Beyond simple threshold tuning, the gc module provides powerful introspection and extensibility points through its advanced hooks and callbacks. These features allow developers to gain deeper insights into the GC’s operation and even influence application-specific behavior around collection events, facilitating advanced debugging and resource management.

The most prominent feature in this category is gc.callbacks. This is a list that you can append callable objects to. These callbacks are invoked by the garbage collector before it starts and after it finishes a collection cycle. Each callback receives two arguments:

  - phase: a string, either "start" (a collection is about to begin) or "stop" (a collection has just finished).
  - info: a dict with the keys generation (which generation is being collected), collected (the number of objects collected), and uncollectable (the number of objects that could not be collected and were added to gc.garbage).

By registering callbacks, you can:

  - Log or time collection cycles to correlate GC pauses with application latency.
  - Watch the uncollectable count as an early warning sign of memory leaks.
  - Trigger application-specific behavior around collections, such as deferring latency-sensitive work while one is in progress.

import gc

def gc_callback(phase, info):
    if phase == "start":
        print(f"GC: Collection started for generation {info['generation']}")
    elif phase == "stop":
        print(f"GC: Collection ended. Collected: {info['collected']}, Uncollectable: {info['uncollectable']}")
        if info['uncollectable'] > 0:
            print("  Potential memory leak detected! Uncollectable objects found.")
            for obj in gc.garbage:
                print(f"    Uncollectable: {type(obj)} at {id(obj)}")

# Register the callback
gc.callbacks.append(gc_callback)
# Trigger a collection to see the callback in action
gc.collect()
# Don't forget to remove callbacks if no longer needed, especially in tests
gc.callbacks.remove(gc_callback)

The gc module also offers gc.get_objects() and gc.get_referrers(), which are invaluable for advanced debugging. gc.get_objects() returns a list of every object the collector is currently tracking; it can be very large, but it is a powerful way to inspect a program’s state. gc.get_referrers(*objs) returns a list of the objects that directly refer to any of its arguments. This is especially useful for debugging reference cycles: an unexpected referrer often points straight at the source of a leak. Combined with custom callbacks, these tools give developers deep insight into and control over the garbage collection process, enabling highly optimized, memory-stable Python applications.
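
For example, a deliberately constructed cycle can be explored with gc.get_referrers(); the Node class below is made up for illustration:

import gc

class Node:
    def __init__(self, name):
        self.name = name
        self.peer = None

a = Node("a")
b = Node("b")
a.peer, b.peer = b, a    # deliberate reference cycle

# Referrers of 'b' include the dict holding a's attributes and the
# namespace where 'b' itself is bound.
for ref in gc.get_referrers(b):
    print(type(ref))

del a, b
print(gc.collect())      # nonzero: the Node cycle (and its dicts) is reclaimed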

Key Takeaways

  - CPython allocates most objects through its own obmalloc layer, which carves OS-provided arenas into size-class-segregated pools, avoiding per-object system calls.
  - Free lists recycle recently deallocated objects of common types (floats, tuples, lists, dicts), making many allocations and deallocations nearly free.
  - Interning and singleton sharing (identifier-like strings, small integers, the empty tuple) cut memory usage and enable O(1) identity-based comparisons.
  - The generational GC’s thresholds can be inspected and tuned via gc.get_threshold() and gc.set_threshold(), but changes should be driven by profiling.
  - gc.callbacks, gc.get_objects(), and gc.get_referrers() provide hooks and introspection for diagnosing leaks and smoothing collection pauses in long-running services.

Where to Go Next