gh-148571: [JIT] Preserve family-head recorder layouts for specialized opcode families#148730
gh-148571: [JIT] Preserve family-head recorder layouts for specialized opcode families#148730cocolato wants to merge 6 commits intopython:mainfrom
Conversation
| family_head = analysis.instructions[inst.family.name] | ||
| if family_head.properties.records_value: | ||
| source_inst = family_head | ||
| is_family_override = family_head is not inst |
There was a problem hiding this comment.
I think marking it as an override only when the member's recording differs from the head's is better than forcing GET_ITER through a redundant transformation.
There was a problem hiding this comment.
Yes, but more changes are needed. We might need to add a metadata mapping table for the family slot later on, which would also help us handle multi-record operations. Let’s wait for others to review this.
| uint8_t indices[MAX_RECORDED_VALUES]; | ||
| } _PyOpcodeRecordEntry; | ||
| PyAPI_DATA(const _PyOpcodeRecordEntry) _PyOpcode_RecordEntries[256]; | ||
| PyAPI_DATA(const uint8_t) _PyOpcode_RecordIsFamilyOverride[256]; |
markshannon
left a comment
There was a problem hiding this comment.
Thanks for working on this.
I'm a bit worried that we have to record values (instead of types), if any instruction in the family needs to record a value.
That could potentially cause a lot of values to be recorded that we don't need delaying objects from being freed.
The could be a problem if we want to record values for one specialization of a common instruction like LOAD_ATTR or BINARY_OP
Would it be possible to record the value(s) by family, but extract the type (or whatever we need) when copying the recording value to the uop?
It would be more complex, but it looks like you have most of the pieces already.
| return result; | ||
| } | ||
|
|
||
| static const _Py_RecordTraceTransformFn record_trace_transforms[MAX_UOP_ID + 1] = { |
There was a problem hiding this comment.
Having this written manually instead of generated seems error prone. Can we generate in record_function_generator.py?
Also this is going to be a large table. Given that we only reference it when needs_family_transform is true, it might be better as a switch statement in a function.
Refer: #148571 (comment)
TRACE_RECORDcaptures snapshot values before execution, but the bytecode subsequently translated into the trace may have been transformed into specialized ops within the same family.So if this specialized op inherits the recorder layout from the family head but actually requires a derived value—such as a type object, generator function, or bound method payload—a mismatch arises between the recorder layout and the consumption semantics.
The fix implemented in this change is as follows:
Have the base opcode of each affected family uniformly record the widest raw/base value required by the entire family, while leaving the semantics of existing
_RECORD_*uops unchanged. If a specialized family member still requires a derived view, the optimizer projects the raw recorded value into the form it actually needs to consume on-demand during the trace translation phase.Some related issues: #148716 #148618
recorded_valuesleak when specialized op deopts #148571