Why are the means different when appending items to an np.array and a list in Python?

Question

I have code that looks like this:

import numpy as np

list_a = []
arr = np.array([])
for _ in range(1,101):
    list_a.append(np.random.randint(1,1001,100).mean())
    arr = np.append(arr, np.random.randint(1,1001,100).mean())

print(f'casted list to np.array mean - {np.array(list_a).mean()}')
print(f'old school average - {sum(list_a)/len(list_a)}')
print(f'just arr.mean - {arr.mean()}')
print(f'first array element - {arr[0]}')
print(f'first list element - {list_a[0]}')
print(f'last arr element - {arr[99]}')
print(f'last list element - {list_a[99]}')

This prints:

casted list to np.array mean - 498.9785
old school average - 498.97850000000005
just arr.mean - 499.5889000000001
first array element - 510.76
first list element - 518.8
last arr element - 527.54
last list element - 521.58

Why do I get means that are not equal, and why are the first and last elements (and, I assume, the rest too) not equal when they are appended inside the same loop? Is there a difference between casting a list to np.array and taking the mean vs. appending items to an np.array and taking the mean?

Karl Knechtel · Accepted Answer

Why do I get means that are not equal

Because the data is different, as you found with the other tests.

and why are the first and last elements (and, I assume, the rest too) not equal when they are appended inside the same loop?

Because each time through the loop, you call np.random.randint(1,1001,100).mean() once to get a value to append to list_a, and then you call it a second time to get a value to append to arr. np.random.randint produces random numbers, so each of those two calls produces a different array. The means of those arrays are therefore different, and so are the values stored in list_a and arr.
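To get matching contents, the random sample has to be drawn once per iteration and the same result appended to both containers. A minimal sketch of that change, where the names value and same_elements are introduced here only for illustration:

```python
import numpy as np

list_a = []
arr = np.array([])
for _ in range(100):
    # Draw the sample once, so both containers receive the same value.
    value = np.random.randint(1, 1001, 100).mean()
    list_a.append(value)
    arr = np.append(arr, value)

# Both containers now hold identical values, element by element.
same_elements = np.array_equal(np.array(list_a), arr)
print(same_elements)
print(np.isclose(np.array(list_a).mean(), arr.mean()))
```

With this version, arr[0] equals list_a[0], arr[99] equals list_a[99], and the two means agree.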

Is there a difference when casting list to np.array and getting the mean vs just appending items to a np.array and getting the mean?

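Python has no "casting" as such; np.array(list_a) builds a new array from the values in the list. But no, there is no difference: you get the same value either way. I know, your output shows 498.9785 in one case and 498.97850000000005 in the other, but both of those were computed from list_a, and they differ only in the last digit or so that a floating-point number can hold. Floating-point arithmetic is not exact, so adding the same numbers by a different method can round slightly differently.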
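When two results may differ only by that kind of rounding, it is usually better to compare them with a tolerance than with ==. A small sketch, assuming the default tolerance of math.isclose is acceptable for this data (the names values, mean_numpy and mean_plain are made up for the example):

```python
import math
import numpy as np

# One list of sample means, averaged in two different ways.
values = [np.random.randint(1, 1001, 100).mean() for _ in range(100)]

mean_numpy = np.array(values).mean()    # NumPy's own summation
mean_plain = sum(values) / len(values)  # left-to-right Python sum

# The two results may or may not be bit-for-bit identical,
# but they agree to within floating-point rounding.
print(mean_numpy, mean_plain)
print(math.isclose(mean_numpy, mean_plain))
```

The printed means can differ in the final digits from run to run, while the tolerance-based comparison still reports them as equal.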
