LeopardShark

Reputation: 4416

Create a new numpy array from list or tuple

When creating new numpy arrays, you can make them like this:

a = numpy.array((2, 5))
b = numpy.array((a[0] + 1, 10))

or like this:

a = numpy.array([2, 5])
b = numpy.array([a[0] + 1, 10])

Which way is better?

Upvotes: 1

Views: 343

Answers (2)

srcLegend

Reputation: 74

After a few tests based on @leopardshark's answer, it seems that tuples are only faster when the array is initialized from constants. When it is initialized from a tuple or list of variables, the difference is negligible.

Constants test setup

import dis, timeit

list_timing = timeit.timeit('numpy.array([2, 5])', setup = 'import numpy', number = 1000000)
tuple_timing = timeit.timeit('numpy.array((2, 5))', setup = 'import numpy', number = 1000000)

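# timeit.timeit returns the total time in seconds for all 1,000,000 runs, not a mean
print(f"List total time: {list_timing}")
# List total time: 0.6392972
print(f"Tuple total time: {tuple_timing}")
# Tuple total time: 0.6296533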

The disassembled outputs for this case are the same as in @leopardshark's answer.

Variables test setup

import dis, timeit
x, y = 2, 5

list_timing = timeit.timeit('numpy.array([x, y])', setup = 'import numpy; x, y = 2, 5', number = 1000000)
tuple_timing = timeit.timeit('numpy.array((x, y))', setup = 'import numpy; x, y = 2, 5', number = 1000000)

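# timeit.timeit returns the total time in seconds for all 1,000,000 runs, not a mean
print(f"List total time: {list_timing}")
# List total time: 0.6279472
print(f"Tuple total time: {tuple_timing}")
# Tuple total time: 0.6288363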

# dis.dis prints the disassembly itself and returns None, so no print() is needed
dis.dis('numpy.array([x, y])')
#  1           0 LOAD_NAME                0 (numpy)
#              2 LOAD_METHOD              1 (array)
#              4 LOAD_NAME                2 (x)
#              6 LOAD_NAME                3 (y)
#              8 BUILD_LIST               2
#             10 CALL_METHOD              1
#             12 RETURN_VALUE

dis.dis('numpy.array((x, y))')
#  1           0 LOAD_NAME                0 (numpy)
#              2 LOAD_METHOD              1 (array)
#              4 LOAD_NAME                2 (x)
#              6 LOAD_NAME                3 (y)
#              8 BUILD_TUPLE              2
#             10 CALL_METHOD              1
#             12 RETURN_VALUE

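The disassembled outputs are identical apart from BUILD_LIST versus BUILD_TUPLE: in both cases the container is built at run time, which is why the timings are much closer.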
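Since the differences above are around 1%, a single timeit.timeit run may not be enough to separate them from noise. Below is a minimal sketch of a steadier measurement that takes the minimum over several repeats with timeit.repeat. The helper name best_time and the repeat/number values are my own choices for illustration, and to keep it dependency-free it only times building the list or tuple from variables, not the numpy.array call itself.

```python
import timeit


def best_time(stmt, setup="pass", repeat=5, number=100_000):
    """Hypothetical helper: fastest total time (seconds) over `repeat` runs.

    Each run executes `stmt` `number` times; the minimum is the run
    least disturbed by other activity on the machine.
    """
    return min(timeit.repeat(stmt, setup=setup, repeat=repeat, number=number))


setup = "x, y = 2, 5"

# Time only the container construction from variables.
list_best = best_time("[x, y]", setup)
tuple_best = best_time("(x, y)", setup)

print(f"List best total time:  {list_best:.4f} s")
print(f"Tuple best total time: {tuple_best:.4f} s")
```

To time the full call instead, the same helper should work with 'numpy.array([x, y])' as the statement and 'import numpy; x, y = 2, 5' as the setup, as in the tests above.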

Upvotes: 2

LeopardShark

Reputation: 4416

Tuples are about 10% faster.

>>> import dis, timeit
>>> timeit.timeit("numpy.array((2, 5))", setup="import numpy")
0.9039838570024585
>>> timeit.timeit("numpy.array([2, 5])", setup="import numpy")
1.0044978570003877

I got the same results with the numpy.array((a[0] + 1, 10)) example as well. The dis module reveals the reason for the difference in the constant case:

>>> dis.dis("numpy.array((2, 5))")
  1           0 LOAD_NAME                0 (numpy)
              2 LOAD_METHOD              1 (array)
              4 LOAD_CONST               0 ((2, 5))
              6 CALL_METHOD              1
              8 RETURN_VALUE
>>> dis.dis("numpy.array([2, 5])")
  1           0 LOAD_NAME                0 (numpy)
              2 LOAD_METHOD              1 (array)
              4 LOAD_CONST               0 (2)
              6 LOAD_CONST               1 (5)
              8 BUILD_LIST               2
             10 CALL_METHOD              1
             12 RETURN_VALUE

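The tuple of constants is folded at compile time into a single constant object that one LOAD_CONST pushes onto the stack, whereas the list has to be built at run time from its elements with BUILD_LIST on every call.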
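One way to check this without reading bytecode is to look at the constants stored on the compiled code object. This is a minimal sketch assuming CPython (constant folding is an implementation detail, so other interpreters may behave differently). The variable names are mine, and nothing is executed, so numpy does not even need to be installed.

```python
# Compile the two expressions without running them.
tuple_code = compile("numpy.array((2, 5))", "<tuple>", "eval")
list_code = compile("numpy.array([2, 5])", "<list>", "eval")

# Assumption: on CPython the constant tuple is stored whole in co_consts.
tuple_is_folded = (2, 5) in tuple_code.co_consts

# The list version carries no precomputed (2, 5); the list is built at run time.
list_has_tuple = (2, 5) in list_code.co_consts

print("tuple folded into a constant:", tuple_is_folded)
print("list version has a precomputed tuple:", list_has_tuple)
```

If the first value is True and the second is False, that matches the disassembly above: the tuple already exists when the call runs, while the list is created fresh each time.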

Upvotes: 2
