Fix segfault in TypeTreeHelper boost when reading past buffer end #374
SNWCreations wants to merge 1 commit into
Add an 'exhausted' flag to ReaderT that is set when any bounds check fails. Check this flag at the entry of read_typetree_value and read_typetree_value_array to immediately return NULL (with a Python ValueError set) instead of continuing to operate on invalid state. Also add NULL checks after PyList_New calls that could fail when allocation is requested for a large count derived from misaligned data.
Unrelated to the PR, which would be good to have: your sample files look fine, so this is just a bug in parsing SerializeReference in UnityPy and in whatever version of AssetStudio you're using. It may be a case of nested managedReference or some other bug. We should probably open a new GitHub issue about it.
Thanks for the note. Just to clarify, this PR addresses exactly the crash scenario described: when the C++ boost path encounters a SerializeReference/ManagedReferencesRegistry structure where the serialized data is shorter than the type tree implies, it currently segfaults instead of raising a Python exception. The fix ensures it fails gracefully with a ValueError. Whether the data mismatch itself is a Unity serialization quirk or an upstream bug is a separate question, but at least UnityPy won't crash on it anymore.
Problem
When read_typetree (C++ boost path) parses certain MonoBehaviour objects whose type tree describes more data than is actually available in the object's byte range, the parser crashes with an access violation (exit code 0xC0000005 on Windows) instead of raising a Python exception.
This happens with MonoBehaviours that have complex serialization structures (e.g. [SerializeReference] fields with ManagedReferencesRegistry). The embedded type tree correctly describes the field structure, but the actual serialized data for some fields is shorter than what the type tree implies. Asset Studio handles the same scenario gracefully with a warning.
UnityPy's pure Python read_value path would also handle this correctly (bounds checks raise Python exceptions). The crash only occurs in the C++ boost path.
Root Cause
When a bounds check fires deep in a recursive read_typetree_value call chain, the function correctly returns NULL and sets PyErr. However, if the error occurs at a point where the caller has already partially consumed data and subsequent recursive calls are made before the NULL propagates fully, the parser can enter an inconsistent state. Additionally, PyList_New(length) calls were not checked for NULL: if a large (but positive) length value is read from misaligned data, the allocation can fail, and the subsequent PyList_SET_ITEM on the NULL pointer causes the segfault.
Fix
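As a rough, standalone model of the sticky-flag approach described in this section (not the PR's actual code), the sketch below uses a hypothetical Reader struct in place of ReaderT and plain bool returns in place of NULL plus a Python error. The names Reader, read_u32, and read_u32_array are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for ReaderT: a cursor over one object's byte range.
struct Reader {
    const std::uint8_t* ptr;
    const std::uint8_t* end;
    bool exhausted;  // sticky: set on the first failed bounds check, never cleared

    Reader(const std::uint8_t* begin, std::size_t size)
        : ptr(begin), end(begin + size), exhausted(false) {
        assert(begin != nullptr || size == 0);
    }
};

// Bounds-checked primitive read. Returns false (the analogue of returning
// NULL) and marks the reader exhausted if fewer than 4 bytes remain.
bool read_u32(Reader& r, std::uint32_t& out) {
    if (r.exhausted) return false;
    if (static_cast<std::size_t>(r.end - r.ptr) < sizeof(out)) {
        r.exhausted = true;
        return false;
    }
    std::memcpy(&out, r.ptr, sizeof(out));  // host byte order, enough for a sketch
    r.ptr += sizeof(out);
    return true;
}

// Array read: a u32 count followed by that many u32 values.
bool read_u32_array(Reader& r, std::vector<std::uint32_t>& out) {
    if (r.exhausted) return false;  // entry guard: fail fast after any earlier failure
    std::uint32_t count = 0;
    if (!read_u32(r, count)) return false;
    out.clear();
    for (std::uint32_t i = 0; i < count; ++i) {
        std::uint32_t value = 0;
        if (!read_u32(r, value)) return false;  // a bogus count stops at the buffer end
        out.push_back(value);
    }
    return true;
}
```

With a buffer that declares five elements but holds only one, read_u32_array returns false, the flag is set, and every later read on the same reader fails immediately instead of touching memory past the end.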
Added exhausted flag to ReaderT: Once any bounds check fails, this flag is set. At the entry of read_typetree_value and read_typetree_value_array, the flag is checked and NULL is returned immediately. This ensures that even if one code path doesn't perfectly propagate NULL, all subsequent parsing attempts fail fast with a proper Python ValueError.
Added NULL checks after PyList_New: In the generic array path and read_pair_array, the return value of PyList_New is now checked before use.
Set exhausted flag in all bounds-check failure paths: Every read_* function that detects an out-of-bounds condition now sets reader->exhausted = true before returning NULL/error.
Impact
Callers of obj.read() will now receive a ValueError exception instead of a process crash when the object data cannot be fully parsed.
Reproduction
Any Unity bundle containing MonoBehaviour objects with [SerializeReference] fields where the serialized data is shorter than the type tree's full structure. The crash occurs when calling obj.read() on such objects with UnityPyBoost available.
Sample files: erroring_bundles.zip
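The allocation check mentioned under Fix follows a general pattern: test the allocator's result before writing any element. The sketch below illustrates that pattern without the CPython API. An injectable allocator stands in for PyList_New so that a failure can be simulated, and make_filled, default_alloc, and failing_alloc are hypothetical names, not functions from UnityPy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Allocator signature, injectable so an allocation failure can be simulated.
using AllocFn = void* (*)(std::size_t);

void* default_alloc(std::size_t n) { return std::malloc(n); }
void* failing_alloc(std::size_t) { return nullptr; }  // always reports failure

// Hypothetical helper: builds an array of `count` copies of `value`.
// Returns nullptr on failure instead of writing through a null pointer.
std::uint32_t* make_filled(std::size_t count, std::uint32_t value, AllocFn alloc) {
    assert(alloc != nullptr);
    // Reject counts whose byte size would overflow size_t.
    if (count > SIZE_MAX / sizeof(std::uint32_t)) return nullptr;
    std::size_t bytes = count * sizeof(std::uint32_t);
    // Request at least one byte so a zero-length array is not mistaken for failure.
    void* raw = alloc(bytes != 0 ? bytes : 1);
    if (raw == nullptr) return nullptr;  // check before any element is written
    std::uint32_t* items = static_cast<std::uint32_t*>(raw);
    for (std::size_t i = 0; i < count; ++i) {
        items[i] = value;
    }
    return items;
}
```

Passing failing_alloc makes the helper return nullptr before any store happens, which is the behavior the unchecked path lacked. The caller releases a successful result with std::free when default_alloc is used.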