Memory usage of GC target object may increase by 8 bytes from Python 3.7.4

The x86-64 ABI requires that int128 and long double types be placed on 16byte boundaries.

However, Python's pymalloc implementation is designed to return blocks of memory with 8-byte boundaries. .. .. Are you okay? If you try to malloc a structure containing a type that requires 16byte alignment, it should be a multiple of 16byte, so malloc (sizoef (struct)) is safe and malloc (sizeof (structure) + not a multiple of 16). ) Seems to be a problem in such cases.

The more problematic is the GC header. The GC header in a 64-bit environment is 24 bytes. As the name "header" implies, it is placed at the beginning of GC-targeted objects (objects that may make up circular references, such as tuples, lists, and dicts).

When creating a GC target object, allocate memory like malloc (sizeof (GC header) + object size) and then use the address shifted by the GC header for a pointer of type PyObject *. So even if malloc returns a 16byte boundary address, the Python object will be stored at the +24byte address, and if the object's structure has an int128 or long double type, it will be placed across the 16byte boundary. I will.

It seems that recent compilers are actually using instructions that assume 16-byte boundaries, and instead of "theoretically violating but actually working without problems", they are "really segfaulting". It has become.

I didn't notice it until recently, but Python 2.7 reported this phenomenon earlier, and 2.7.15 increased the GC header from 24byte to 32byte. (changelog, [commit](https://github.com/python/cpython/ commit / 0b91f8a668201fc58fa732b8acc496caedfdbae0), issue)

A similar issue was recently reported in Python 3.7 (Crash issue, Old ubsan error issue. org / issue27987)) So it looks like we'll have to apply the same fix unless we find another good way to not break ABI compatibility. For rare types such as long double and int128, it's a shame for me to have to +8 bytes of memory consumption for tuples that are large and never contain those types.

By the way, in Python 3.8, the contents of the GC header are reduced from 3 words (24 bytes in a 64-bit environment) to 2 words (16 bytes in the same), and the memory usage of the GC target objects str, bytes, int, float, etc. is reduced to 8. I was successful in cutting the bite. Therefore, it is not affected by this problem and no extra padding is required. The result is a 16 byte reduction compared to Python 2.7.15 or later and maybe 3.7.4 or later. I'm really glad that this improvement was successful.

Recommended Posts

Memory usage of GC target object may increase by 8 bytes from Python 3.7.4
Roughly estimate the total memory usage of an object
Usage of Python locals ()
Find the diameter of the graph by breadth-first search (Python memory)
[Python] Correct usage of map
[python] Value of function object (?)
Sample usage of Python pickle
Basic usage of Python f-string
[Python] Correct usage of join
Executing a large number of Python3 Executor.submit may consume a lot of memory