Commit 9a7e205

sergey-miryanov, nascheme, zanieb, and hugovk authored
[3.14] GH-148726: Forward-port generational GC (#148720)
Co-authored-by: Neil Schemenauer <nas@arctrix.com>
Co-authored-by: Sergey Miryanov <sergey.miryanov@gmail.com>
Co-authored-by: Zanie Blue <contact@zanie.dev>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Neil Schemenauer <nas-github@arctrix.com>
1 parent 78c5e54 commit 9a7e205

12 files changed

Lines changed: 5582 additions & 6096 deletions

Doc/data/python3.14.abi

Lines changed: 5054 additions & 5062 deletions
Large diffs are not rendered by default.

Doc/library/gc.rst

Lines changed: 23 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -40,18 +40,11 @@ The :mod:`!gc` module provides the following functions:
4040

4141
.. function:: collect(generation=2)
4242

43-
Perform a collection. The optional argument *generation*
43+
With no arguments, run a full collection. The optional argument *generation*
4444
may be an integer specifying which generation to collect (from 0 to 2). A
4545
:exc:`ValueError` is raised if the generation number is invalid. The sum of
4646
collected objects and uncollectable objects is returned.
4747

48-
Calling ``gc.collect(0)`` will perform a GC collection on the young generation.
49-
50-
Calling ``gc.collect(1)`` will perform a GC collection on the young generation
51-
and an increment of the old generation.
52-
53-
Calling ``gc.collect(2)`` or ``gc.collect()`` performs a full collection
54-
5548
The free lists maintained for a number of built-in types are cleared
5649
whenever a full collection or collection of the highest generation (2)
5750
is run. Not all items in some free lists may be freed due to the
@@ -63,6 +56,9 @@ The :mod:`!gc` module provides the following functions:
6356
.. versionchanged:: 3.14
6457
``generation=1`` performs an increment of collection.
6558

59+
.. versionchanged:: 3.14.5
60+
``generation=1`` performs collection of the middle generation.
61+
6662

6763
.. function:: set_debug(flags)
6864

@@ -78,20 +74,19 @@ The :mod:`!gc` module provides the following functions:
7874

7975
.. function:: get_objects(generation=None)
8076

81-
8277
Returns a list of all objects tracked by the collector, excluding the list
83-
returned. If *generation* is not ``None``, return only the objects as follows:
84-
85-
* 0: All objects in the young generation
86-
* 1: No objects, as there is no generation 1 (as of Python 3.14)
87-
* 2: All objects in the old generation
78+
returned. If *generation* is not ``None``, return only the objects tracked by
79+
the collector that are in that generation.
8880

8981
.. versionchanged:: 3.8
9082
New *generation* parameter.
9183

9284
.. versionchanged:: 3.14
9385
Generation 1 is removed
9486

87+
.. versionchanged:: 3.14.5
88+
Generation 1 is reintroduced to maintain GC behavior from 3.13.
89+
9590
.. audit-event:: gc.get_objects generation gc.get_objects
9691

9792
.. function:: get_stats()
@@ -118,33 +113,33 @@ The :mod:`!gc` module provides the following functions:
118113
Set the garbage collection thresholds (the collection frequency). Setting
119114
*threshold0* to zero disables collection.
120115

121-
The GC classifies objects into two generations depending on whether they have
122-
survived a collection. New objects are placed in the young generation. If an
123-
object survives a collection it is moved into the old generation.
124-
125-
In order to decide when to run, the collector keeps track of the number of object
116+
The GC classifies objects into three generations depending on how many
117+
collection sweeps they have survived. New objects are placed in the youngest
118+
generation (generation ``0``). If an object survives a collection it is moved
119+
into the next older generation. Since generation ``2`` is the oldest
120+
generation, objects in that generation remain there after a collection. In
121+
order to decide when to run, the collector keeps track of the number of object
126122
allocations and deallocations since the last collection. When the number of
127123
allocations minus the number of deallocations exceeds *threshold0*, collection
128-
starts. For each collection, all the objects in the young generation and some
129-
fraction of the old generation is collected.
124+
starts. Initially only generation ``0`` is examined. If generation ``0`` has
125+
been examined more than *threshold1* times since generation ``1`` has been
126+
examined, then generation ``1`` is examined as well.
127+
With the third generation, things are a bit more complicated,
128+
see `Collecting the oldest generation <https://github.com/python/cpython/blob/ff0ef0a54bef26fc507fbf9b7a6009eb7d3f17f5/InternalDocs/garbage_collector.md#collecting-the-oldest-generation>`_ for more information.
130129

131130
In the free-threaded build, the increase in process memory usage is also
132131
checked before running the collector. If the memory usage has not increased
133132
by 10% since the last collection and the net number of object allocations
134133
has not exceeded 40 times *threshold0*, the collection is not run.
135134

136-
The fraction of the old generation that is collected is **inversely** proportional
137-
to *threshold1*. The larger *threshold1* is, the slower objects in the old generation
138-
are collected.
139-
For the default value of 10, 1% of the old generation is scanned during each collection.
140-
141-
*threshold2* is ignored.
142-
143135
See `Garbage collector design <https://github.com/python/cpython/blob/3.14/InternalDocs/garbage_collector.md>`_ for more information.
144136

145137
.. versionchanged:: 3.14
146138
*threshold2* is ignored
147139

140+
.. versionchanged:: 3.14.5
141+
*threshold2* is restored to match Python 3.13 behavior.
142+
148143

149144
.. function:: get_count()
150145
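The documentation changes above describe `gc.collect()`, `gc.get_objects()` and the three thresholds. A minimal sketch of how those calls can be exercised from Python follows. The helper names (`describe_gc`, `collect_each_generation`, `invalid_generation_raises`) are hypothetical and exist only for illustration. What each generation number actually does depends on the interpreter version, as the `versionchanged` notes in this diff indicate.

```python
import gc


def describe_gc():
    """Return the thresholds, the counters, and the tracked-object count per generation."""
    thresholds = gc.get_threshold()  # (threshold0, threshold1, threshold2)
    counts = gc.get_count()          # current per-generation counters
    # get_objects(generation=...) has been available since Python 3.8.
    per_gen = {g: len(gc.get_objects(generation=g)) for g in range(3)}
    return thresholds, counts, per_gen


def collect_each_generation():
    """Call gc.collect() for generations 0, 1 and 2 and record the return values."""
    # Each call returns the sum of collected and uncollectable objects.
    return {g: gc.collect(g) for g in range(3)}


def invalid_generation_raises():
    """Report whether an out-of-range generation number raises ValueError."""
    try:
        gc.collect(3)
    except ValueError:
        return True
    return False


if __name__ == "__main__":
    print(describe_gc())
    print(collect_each_generation())
    print(invalid_generation_raises())
```

The per-generation object counts vary from run to run, so the sketch prints them rather than assuming particular values.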

Include/internal/pycore_gc.h

Lines changed: 5 additions & 38 deletions
@@ -118,21 +118,6 @@ static inline void _PyObject_GC_SET_SHARED(PyObject *op) {
118118
/* Bit 1 is set when the object is in generation which is GCed currently. */
119119
#define _PyGC_PREV_MASK_COLLECTING ((uintptr_t)2)
120120

121-
/* Bit 0 in _gc_next is the old space bit.
122-
* It is set as follows:
123-
* Young: gcstate->visited_space
124-
* old[0]: 0
125-
* old[1]: 1
126-
* permanent: 0
127-
*
128-
* During a collection all objects handled should have the bit set to
129-
* gcstate->visited_space, as objects are moved from the young gen
130-
* and the increment into old[gcstate->visited_space].
131-
* When object are moved from the pending space, old[gcstate->visited_space^1]
132-
* into the increment, the old space bit is flipped.
133-
*/
134-
#define _PyGC_NEXT_MASK_OLD_SPACE_1 1
135-
136121
#define _PyGC_PREV_SHIFT 2
137122
#define _PyGC_PREV_MASK (((uintptr_t) -1) << _PyGC_PREV_SHIFT)
138123

@@ -159,13 +144,11 @@ typedef enum {
159144
// Lowest bit of _gc_next is used for flags only in GC.
160145
// But it is always 0 for normal code.
161146
static inline PyGC_Head* _PyGCHead_NEXT(PyGC_Head *gc) {
162-
uintptr_t next = gc->_gc_next & _PyGC_PREV_MASK;
147+
uintptr_t next = gc->_gc_next;
163148
return (PyGC_Head*)next;
164149
}
165150
static inline void _PyGCHead_SET_NEXT(PyGC_Head *gc, PyGC_Head *next) {
166-
uintptr_t unext = (uintptr_t)next;
167-
assert((unext & ~_PyGC_PREV_MASK) == 0);
168-
gc->_gc_next = (gc->_gc_next & ~_PyGC_PREV_MASK) | unext;
151+
gc->_gc_next = (uintptr_t)next;
169152
}
170153

171154
// Lowest two bits of _gc_prev is used for _PyGC_PREV_MASK_* flags.
@@ -207,10 +190,6 @@ static inline void _PyGC_CLEAR_FINALIZED(PyObject *op) {
207190

208191
extern void _Py_ScheduleGC(PyThreadState *tstate);
209192

210-
#ifndef Py_GIL_DISABLED
211-
extern void _Py_TriggerGC(struct _gc_runtime_state *gcstate);
212-
#endif
213-
214193

215194
/* Tell the GC to track this object.
216195
*
@@ -220,7 +199,7 @@ extern void _Py_TriggerGC(struct _gc_runtime_state *gcstate);
220199
* ob_traverse method.
221200
*
222201
* Internal note: interp->gc.generation0->_gc_prev doesn't have any bit flags
223-
* because it's not object header. So we don't use _PyGCHead_PREV() and
202+
* because it's not an object header. So we don't use _PyGCHead_PREV() and
224203
* _PyGCHead_SET_PREV() for it to avoid unnecessary bitwise operations.
225204
*
226205
* See also the public PyObject_GC_Track() function.
@@ -244,19 +223,12 @@ static inline void _PyObject_GC_TRACK(
244223
"object is in generation which is garbage collected",
245224
filename, lineno, __func__);
246225

247-
struct _gc_runtime_state *gcstate = &_PyInterpreterState_GET()->gc;
248-
PyGC_Head *generation0 = &gcstate->young.head;
226+
PyGC_Head *generation0 = _PyInterpreterState_GET()->gc.generation0;
249227
PyGC_Head *last = (PyGC_Head*)(generation0->_gc_prev);
250228
_PyGCHead_SET_NEXT(last, gc);
251229
_PyGCHead_SET_PREV(gc, last);
252-
uintptr_t not_visited = 1 ^ gcstate->visited_space;
253-
gc->_gc_next = ((uintptr_t)generation0) | not_visited;
230+
_PyGCHead_SET_NEXT(gc, generation0);
254231
generation0->_gc_prev = (uintptr_t)gc;
255-
gcstate->young.count++; /* number of tracked GC objects */
256-
gcstate->heap_size++;
257-
if (gcstate->young.count > gcstate->young.threshold) {
258-
_Py_TriggerGC(gcstate);
259-
}
260232
#endif
261233
}
262234

@@ -291,11 +263,6 @@ static inline void _PyObject_GC_UNTRACK(
291263
_PyGCHead_SET_PREV(next, prev);
292264
gc->_gc_next = 0;
293265
gc->_gc_prev &= _PyGC_PREV_MASK_FINALIZED;
294-
struct _gc_runtime_state *gcstate = &_PyInterpreterState_GET()->gc;
295-
if (gcstate->young.count > 0) {
296-
gcstate->young.count--;
297-
}
298-
gcstate->heap_size--;
299266
#endif
300267
}
301268
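After this change, `_PyObject_GC_TRACK` only appends the object to the end of the circular, doubly linked `generation0` list, and `_PyObject_GC_UNTRACK` only unlinks it. The counters and the `_Py_TriggerGC` call are gone from both. The pointer updates can be modelled in Python as below. This is a rough sketch, not the C implementation: `GCHead`, `make_generation_head`, `gc_track`, `gc_untrack` and `tracked_names` are hypothetical names, and the flag bits stored in `_gc_prev` are ignored.

```python
class GCHead:
    """Simplified stand-in for PyGC_Head: plain references, no flag bits."""

    def __init__(self, name=None):
        self.name = name
        self.next = None
        self.prev = None


def make_generation_head():
    """Create an empty circular list: the head points at itself."""
    head = GCHead("generation0")
    head.next = head
    head.prev = head
    return head


def gc_track(generation0, node):
    """Append node at the tail of the list, mirroring the order of the C updates."""
    assert node.next is None, "object already tracked"
    last = generation0.prev      # current tail
    last.next = node
    node.prev = last
    node.next = generation0      # the new tail points back at the head
    generation0.prev = node


def gc_untrack(node):
    """Unlink node from whichever list it is on."""
    prev, nxt = node.prev, node.next
    prev.next = nxt
    nxt.prev = prev
    node.next = None             # a cleared next pointer marks "not tracked"
    node.prev = None


def tracked_names(generation0):
    """Walk the list from head to tail and collect the node names."""
    names = []
    cur = generation0.next
    while cur is not generation0:
        names.append(cur.name)
        cur = cur.next
    return names
```

Because the head is a sentinel that links to itself when the list is empty, neither tracking nor untracking needs a special case for the first or last element.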

Include/internal/pycore_interp_structs.h

Lines changed: 22 additions & 12 deletions
@@ -195,11 +195,6 @@ struct gc_generation_stats {
195195
Py_ssize_t uncollectable;
196196
};
197197

198-
enum _GCPhase {
199-
GC_PHASE_MARK = 0,
200-
GC_PHASE_COLLECT = 1
201-
};
202-
203198
/* If we change this, we need to change the default value in the
204199
signature of gc.collect. */
205200
#define NUM_GENERATIONS 3
@@ -215,8 +210,13 @@ struct _gc_runtime_state {
215210
int enabled;
216211
int debug;
217212
/* linked lists of container objects */
213+
#ifndef Py_GIL_DISABLED
214+
struct gc_generation generations[NUM_GENERATIONS];
215+
PyGC_Head *generation0;
216+
#else
218217
struct gc_generation young;
219218
struct gc_generation old[2];
219+
#endif
220220
/* a permanent generation which won't be collected */
221221
struct gc_generation permanent_generation;
222222
struct gc_generation_stats generation_stats[NUM_GENERATIONS];
@@ -227,13 +227,6 @@ struct _gc_runtime_state {
227227
/* a list of callbacks to be invoked when collection is performed */
228228
PyObject *callbacks;
229229

230-
Py_ssize_t heap_size;
231-
Py_ssize_t work_to_do;
232-
/* Which of the old spaces is the visited space */
233-
int visited_space;
234-
int phase;
235-
236-
#ifdef Py_GIL_DISABLED
237230
/* This is the number of objects that survived the last full
238231
collection. It approximates the number of long lived objects
239232
tracked by the GC.
@@ -246,6 +239,7 @@ struct _gc_runtime_state {
246239
the first time. */
247240
Py_ssize_t long_lived_pending;
248241

242+
#ifdef Py_GIL_DISABLED
249243
/* True if gc.freeze() has been used. */
250244
int freeze_active;
251245

@@ -261,6 +255,22 @@ struct _gc_runtime_state {
261255
#endif
262256
};
263257

258+
#ifndef Py_GIL_DISABLED
259+
#define GC_GENERATION_INIT \
260+
.generations = { \
261+
{ .threshold = 2000, }, \
262+
{ .threshold = 10, }, \
263+
{ .threshold = 10, }, \
264+
},
265+
#else
266+
#define GC_GENERATION_INIT \
267+
.young = { .threshold = 2000, }, \
268+
.old = { \
269+
{ .threshold = 10, }, \
270+
{ .threshold = 10, }, \
271+
},
272+
#endif
273+
264274
#include "pycore_gil.h" // struct _gil_runtime_state
265275

266276
/**** Import ********/
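The default thresholds in `GC_GENERATION_INIT` (2000, 10, 10) feed the cascade described in the `gc.rst` changes: generation 0 is examined once its counter exceeds *threshold0*, and an older generation is examined once the generation below it has been examined more than that generation's threshold number of times. The toy model below is a sketch of the counter bookkeeping only. `GenerationalModel` is a hypothetical name, no objects are tracked or freed, and the extra condition CPython applies before collecting the oldest generation is deliberately left out.

```python
class GenerationalModel:
    """Toy model of the threshold counters of a three-generation collector."""

    def __init__(self, thresholds=(2000, 10, 10)):
        self.thresholds = list(thresholds)
        self.counts = [0, 0, 0]
        self.log = []  # generation number chosen at each automatic collection

    def _pick_generation(self):
        # Choose the oldest generation whose counter exceeds its threshold.
        # Assumption: generation 2 uses the same plain rule here, although the
        # real collector applies an additional check for it.
        for gen in (2, 1, 0):
            if self.counts[gen] > self.thresholds[gen]:
                return gen
        return None

    def _collect(self, gen):
        # Collecting generation `gen` also covers all younger generations:
        # bump the next older counter and reset the counters that were covered.
        if gen + 1 < len(self.counts):
            self.counts[gen + 1] += 1
        for g in range(gen + 1):
            self.counts[g] = 0
        self.log.append(gen)

    def allocate(self):
        """Record one net allocation; return the generation collected, if any."""
        self.counts[0] += 1
        if self.thresholds[0] == 0:  # threshold0 == 0 disables collection
            return None
        if self.counts[0] <= self.thresholds[0]:
            return None
        gen = self._pick_generation()
        self._collect(gen)
        return gen
```

With small thresholds such as `(3, 2, 2)` the pattern is easy to follow by hand: every fourth allocation triggers a collection, the first three of those examine only generation 0, and the fourth also examines generation 1.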

Include/internal/pycore_runtime_init.h

Lines changed: 1 addition & 7 deletions
@@ -137,13 +137,7 @@ extern PyTypeObject _PyExc_MemoryError;
137137
}, \
138138
.gc = { \
139139
.enabled = 1, \
140-
.young = { .threshold = 2000, }, \
141-
.old = { \
142-
{ .threshold = 10, }, \
143-
{ .threshold = 0, }, \
144-
}, \
145-
.work_to_do = -5000, \
146-
.phase = GC_PHASE_MARK, \
140+
GC_GENERATION_INIT \
147141
}, \
148142
.qsbr = { \
149143
.wr_seq = QSBR_INITIAL, \
