ChangeMyName

Reputation: 7408

How to get access to a column in a 2D list in Python?

I am using a 2D list, and I'd like to calculate its mean value by row. The following is my code:

import numpy as np

mylist = np.zeros((2,120))    # This gives you a 2 by 120 2D list with 2 rows, and 120 columns
average_list = np.zeros(120)

for col in xrange(120):
    average_list[col] = np.mean(mylist[:][col])

However, the above chunk generates this:

IndexError: index 2 is out of bounds for axis 0 with size 2

As I found during debugging, the problem is the col index in np.mean(mylist[:][col]).

What am I doing wrong here?

Thanks.

Upvotes: 1

Views: 4080

Answers (4)

unutbu

Reputation: 879321

One way to fix your code (with minimal changes) would be

for col in xrange(120):
    average_array[col] = np.mean(myarray[:, col])

However, a better way would be to avoid the for-loop and use axis=0:

average_array = myarray.mean(axis=0)   # 1

axis=0 tells mean to take the mean over the first axis, i.e. across the rows, which gives one mean per column.
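
As a quick check that the loop and the vectorized call agree, here is a minimal sketch. It assumes Python 3 (so range instead of xrange), and the sample data built with np.arange is made up for illustration:

```python
import numpy as np

# Hypothetical sample data: 2 rows, 120 columns
myarray = np.arange(240, dtype=float).reshape(2, 120)

# Loop version: one mean per column
loop_result = np.zeros(120)
for col in range(120):
    loop_result[col] = np.mean(myarray[:, col])

# Vectorized version: collapse axis 0 (the rows)
vectorized_result = myarray.mean(axis=0)

print(vectorized_result.shape)                      # (120,)
print(np.allclose(loop_result, vectorized_result))  # True
```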


A small example may help you see the difference between myarray[:][col] and myarray[:, col]:

In [7]: myarray = np.arange(6).reshape(2,3)

In [8]: myarray
Out[8]: 
array([[0, 1, 2],
       [3, 4, 5]])

In [9]: myarray[:][0]
Out[9]: array([0, 1, 2])

In [10]: myarray[:, 0]
Out[10]: array([0, 3])

As you can see, myarray[:][0] selects the 0th row of a (view of) myarray, because myarray[:] just returns the whole array again. So myarray[:][col] raises an IndexError when col is greater than 1, since there are only 2 rows.
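
The same small array can be used to reproduce the error itself. This is only a sketch (Python 3 syntax assumed); it relies on the fact that basic slicing of a NumPy array returns a view that shares memory with the original rather than an independent copy:

```python
import numpy as np

myarray = np.arange(6).reshape(2, 3)

# myarray[:] is the whole array again, sharing memory with the original
whole = myarray[:]
print(np.shares_memory(whole, myarray))  # True

# myarray[:][2] asks for row 2, which does not exist in a 2-row array
try:
    myarray[:][2]
    raised = False
except IndexError as exc:
    raised = True
    print(exc)

# myarray[:, 2] asks for column 2, which does exist
column = myarray[:, 2]
print(column)  # [2 5]
```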

Upvotes: 2

Matt

Reputation: 17629

Not directly an answer to your question, but you can specify an axis to calculate the mean on:

np.mean(mylist, axis=0)

axis=0 averages over the rows and gives you one mean per column, whereas axis=1 averages over the columns and gives you one mean per row.
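
To see which axis gets collapsed, a small sketch may help. The 2 by 3 data here is invented for illustration and stands in for the 2 by 120 array in the question:

```python
import numpy as np

# Hypothetical stand-in for mylist: 2 rows, 3 columns
mylist = np.arange(6, dtype=float).reshape(2, 3)

over_rows = np.mean(mylist, axis=0)  # collapses the 2 rows
over_cols = np.mean(mylist, axis=1)  # collapses the 3 columns

print(over_rows)  # [1.5 2.5 3.5]
print(over_cols)  # [1. 4.]
```

The result always has the shape of the input with the chosen axis removed, so axis=0 is the one that produces 120 values for the array in the question.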

Upvotes: 1

perreal

Reputation: 97938

When you do mylist[:] you get the whole 2D array back (a view of it, not a single column), and then with mylist[:][col] you are indexing the first dimension, i.e. the rows. Try this:

for col in xrange(120):
    average_list[col] = np.mean([ x[col] for x in mylist] )

But unutbu's answer is far more efficient.

Upvotes: 1

Bibhas Debnath

Reputation: 14929

mylist has only 2 rows in it, and mylist[:][col] indexes those rows, so index 2 is out of bounds.

>>> mylist
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.]])

Upvotes: 1
