The problem
Yesterday I wasted about half an hour chasing a bug because I was mindlessly generating a 5x5 matrix with the list repeat syntax, aka list * integer
in pure Python:
n = 5
array = [[0] * n] * n
When I tried to flip some specific bits in this matrix, I observed that entire columns got flipped, instead of just one element:
array[0][0] = 1
print(array)
# Output: [[1, 0, 0, 0, 0],
# [1, 0, 0, 0, 0],
# [1, 0, 0, 0, 0],
# [1, 0, 0, 0, 0],
# [1, 0, 0, 0, 0]]
In retrospect, the behaviour makes sense, but in the middle of a much larger program, it was a bit hard to find.
Furthermore, I usually have numpy
available to help with
multi-dimensional array, so I definitely forgot how pure Python worked.
To demonstrate what’s going on under the hood, let’s generate an array of objects with list repeat syntax:
from dataclasses import dataclass
@dataclass
class Num:
val: int
zero = Num(0)
array = [zero] * 5
print(array)
# Output: [Num(val=0), Num(val=0), Num(val=0), Num(val=0), Num(val=0)]
In the example above, I’m using the repeat syntax to generate an array with 5 Num
objects, each having the value of 0
.
The problem is it actually isn’t 5 different Num
objects. It’s the same Num
object used 5 times.
array[0] is array[1]
# Output: True
Modify any of these references will cause the entire list to get updated:
array[0].val = 1
print(array)
# Output: [Num(val=1), Num(val=1), Num(val=1), Num(val=1), Num(val=1)]
Following this logic, in my original multi-dimensional array, I didn’t generate 5 distinct rows of number but repeated the same row of number 5 times. In other words, since array[0]
and array[1]
refer to the same list
object, mutating array[0][0]
will mutate array[1][0]
as well.
The correct fix
To correctly generate distinct objects in a list, we will have to use a list comprehension to make sure that new objects are created in each iteration:
array = [Num(val=0) for _ in range(5)]
array[0] == array[1] # Output: True -- the objects in the list still have the same value (0)
array[0] is array[1] # Output: False -- but they are completely different object now
Putting it all together, if you want to generate a multi-dimensional array in pure Python, use list comprehension to ensure the rows are different list objects:
array = [[0 for _ in range(num_cols)] for _ in range(num_rows)]]
The wrong, but fun, fix
After finishing my program, I went back to take a look at CPython’s implementation of list_repeat
to see why it works the way it does under the hood:
static PyObject *
list_repeat(PyListObject *a, Py_ssize_t n)
{
Py_ssize_t i, j;
Py_ssize_t size;
PyListObject *np;
PyObject **p, **items;
PyObject *elem;
if (n < 0)
n = 0;
if (n > 0 && Py_SIZE(a) > PY_SSIZE_T_MAX / n)
return PyErr_NoMemory();
size = Py_SIZE(a) * n;
if (size == 0)
return PyList_New(0);
np = (PyListObject *) list_new_prealloc(size);
if (np == NULL)
return NULL;
if (Py_SIZE(a) == 1) {
items = np->ob_item;
elem = a->ob_item[0];
for (i = 0; i < n; i++) {
items[i] = elem;
Py_INCREF(elem);
}
}
else {
p = np->ob_item;
items = a->ob_item;
for (i = 0; i < n; i++) {
for (j = 0; j < Py_SIZE(a); j++) {
*p = items[j];
Py_INCREF(*p);
p++;
}
}
}
Py_SET_SIZE(np, size);
return (PyObject *) np;
}
It’s been a while since I did C, so to understand this code, it’s worth noting that:
a
is a pointer to aPyListObject
- The elements in
a
is accessed througha->ob_item
. This is a pointer to the pointers of the underlying PyObject in the list, as defined here. That’s why we seeelem = a->ob_item[0];
withelem
being a pointer to aPyObject
for example.
In this implemenation, to repeat list a
n
times in Python, the interpreter will first allocate a new list np
with the correct size. Then during each iteration from 1 to n, the value of each pointer in a->ob_item
get copied over to np
.
To illustrate, in our [Num(val=0)] * 5
example earlier:
Num(val=0)
will be aPyObject
. Let’s say its memory address is0xABC
.- The new list
np
will contain0xABC
repeated 5 times. - Modifying any element in
np
will modify the object located at0xABC
.
In other words, the list repeat syntax gives me a list of the same pointer to the same object.
Fun fact: I found Guido’s original commit implementing list_repeat from 1991 here. I ❤️ git.
I think this behaviour is desirable because the alternative is to deepcopy an arbitrary PyObject
, which, in my limited understanding, is virtually impossible to do safely in CPython. I tried to monkey around with memcpy
a bit:
static PyObject *
list_repeat(PyListObject *a, Py_ssize_t n)
{
Py_ssize_t i, j;
Py_ssize_t size;
PyListObject *np;
PyObject **p, **items;
PyObject *elem;
PyTypeObject *elem_type;
Py_ssize_t elem_size;
if (n < 0)
n = 0;
if (n > 0 && Py_SIZE(a) > PY_SSIZE_T_MAX / n)
return PyErr_NoMemory();
size = Py_SIZE(a) * n;
if (size == 0)
return PyList_New(0);
np = (PyListObject *) list_new_prealloc(size);
if (np == NULL)
return NULL;
if (Py_SIZE(a) == 1) {
items = np->ob_item;
elem = a->ob_item[0];
elem_type = Py_TYPE(elem);
elem_size = elem_type->tp_basicsize + abs(Py_SIZE(elem)) * elem_type->tp_itemsize;
for (i = 0; i < n; i++) {
PyObject *s = (PyObject *)PyObject_Malloc(elem_size);
memcpy(s, elem, elem_size);
items[i] = s;
Py_INCREF(s);
Py_INCREF(elem);
}
}
else {
p = np->ob_item;
items = a->ob_item;
for (i = 0; i < n; i++) {
for (j = 0; j < Py_SIZE(a); j++) {
elem = items[j];
elem_type = Py_TYPE(elem);
elem_size = elem_type->tp_basicsize + abs(Py_SIZE(elem)) * elem_type->tp_itemsize;
PyObject *s = (PyObject *)PyObject_Malloc(elem_size);
memcpy(s, elem, elem_size);
*p = s;
Py_INCREF(*p);
p++;
}
}
}
Py_SET_SIZE(np, size);
return (PyObject *) np;
}
As expected, due to the shallow nature of memcpy
, even though we get new pointer to the object in the list created by list_repeat
, it still won’t work correctly:
from dataclasses import dataclass
@dataclass
class Num:
val: int
zero = Num(0)
array = [zero] * 5
print(array)
# Output: [Num(val=0), Num(val=0), Num(val=0), Num(val=0), Num(val=0)]
array[0] is array[1]
# Output: False // it was True before
# Even though we have new pointer, they are still not deepcopies
array[0].val = 1
array[1].val
# Output: 1
In any case, it was really fun trying to work out the reason for a seemingly trivial bug and see if I could hack the language to behave differently.