Ayoub Omari

Reputation: 906

Are strings addressed by reference or value in Python?

If I have a Python list such as ['aaa', 'bbb'], is it stored in 2 x 8 bytes (with 64-bit addressing), meaning the list holds only pointers to the strings? Or is it stored in [len('aaa') + len('bbb')] * size_of_char bytes, meaning the characters of each string are stored contiguously in the list itself?

Upvotes: 1

Views: 544

Answers (2)

Mad Physicist

Reputation: 114330

Under the hood in CPython, every object is handled through a pointer to a PyObject. The subtype PyListObject has, among its structure fields, a pointer to an array of pointers to PyObjects.

Strings are also a subtype of PyObject, generally implemented as PyUnicodeObject. Similarly to a list, a string contains a pointer to the buffer containing its elements.

So the sequence of pointers actually looks like this:

  1. Pointer to list object
  2. Pointer to list buffer
  3. Pointer to string object
  4. Pointer to string data
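A quick way to see from Python itself that list slots refer to string objects rather than embed their characters is to check object identity. This is only a minimal sketch; the variable names are made up for illustration.

```python
# One string object, referenced from several list slots.
s = "aaa" * 1000
a = [s, s]        # two slots, both referring to the same string object
b = list(a)       # shallow copy: a new slot array, but the same string objects

same_within_list = a[0] is a[1]    # both slots refer to one object
same_across_lists = a[0] is b[0]   # the copy shares the string, not a duplicate
lists_are_distinct = a is not b    # only the list object itself was copied

print(same_within_list, same_across_lists, lists_are_distinct)
```

If the list stored the characters inline, copying the list would have to duplicate the string data, and the `is` checks could not succeed.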

You can deduce that your list buffer can't consist of [len('aaa') + len('bbb')] * size_of_char bytes for several reasons:

  1. Everything in Python is an object, so at the very least you need to have space for the additional metadata.
  2. Lists can hold any kind of object, not just fixed-length strings. How would you index into a list whose elements have different sizes?
  3. Characters can have different sizes in Unicode. The number of bytes in a string and the number of characters are not directly related. This brings us back to both #1 and #2.
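The points above can be checked roughly with sys.getsizeof. The sketch below assumes CPython, where getsizeof on a list reports the list object and its slot array but not the objects the slots refer to. The helper two_strings is hypothetical and exists only so that both lists are built the same way.

```python
import sys

def two_strings(n):
    # Build a two-element list whose strings each have n characters.
    return ["a" * n, "b" * n]

small = two_strings(3)
large = two_strings(10_000)

# The list's own footprint does not depend on how long its strings are.
same_list_size = sys.getsizeof(small) == sys.getsizeof(large)

# The string objects themselves do grow with their length.
longer_string_is_bigger = sys.getsizeof(large[0]) > sys.getsizeof(small[0])

# Character count and encoded byte count are not the same thing.
text = "a\u00e9\u20ac"                    # 'a', 'é', '€'
char_count = len(text)
utf8_byte_count = len(text.encode("utf-8"))

print(same_list_size, longer_string_is_bigger, char_count, utf8_byte_count)
```

On CPython this should show that the two lists have the same size even though their strings differ by a factor of several thousand, which is what you would expect if the list holds only pointers.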

In general, if you are curious about the internal workings of CPython, look into the C API docs and the source code.

Upvotes: 2

Marcus.Aurelianus

Reputation: 1518

One way to see an object's address in CPython is to use id(). The list and each of its strings are separate objects with their own addresses:

>>> a=['aaa', 'bbb']

>>> id(a)
62954056

>>> id(a[0])
62748912

>>> id(a[1])
61749544
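As a small follow-up sketch (the names are illustrative only), id() also shows that assigning to a list slot just swaps which object the slot refers to. Note that id() returning a memory address is a CPython detail; the language only guarantees that the value is unique while the object is alive.

```python
a = ["aaa", "bbb"]

list_id_before = id(a)
old_first = a[0]              # keep the original string alive
old_first_id = id(old_first)

a[0] = "ccc"                  # rebind slot 0 to a different string object

list_unchanged = id(a) == list_id_before       # same list object as before
slot_repointed = id(a[0]) != old_first_id      # slot 0 now refers elsewhere
old_string_intact = old_first == "aaa"         # the old string was not modified

print(list_unchanged, slot_repointed, old_string_intact)
```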

Further reading is here [understanding-python-variables and memory management].

Upvotes: 3
