Reputation: 630
I have two variables coming from different functions. The first one, a, is:
<class 'numpy.ndarray'>
(100,)
while the other one, b, is:
<class 'numpy.ndarray'>
(100, 1)
If I try to correlate them via:
from scipy.stats import pearsonr
r, p = pearsonr(a, b)  # pearsonr returns (correlation, p-value), in that order
I get:
r = max(min(r, 1.0), -1.0)
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
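My questions are: what is the difference between these two shapes, and how do I convert one into the other so that pearsonr works?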
Upvotes: 3
Views: 5519
Reputation: 3806
You'll need to reshape the first one so that both arrays have the same shape: a = a.reshape((100, 1))
reshape returns an array with the new shape (it does not modify the original in place), turning the 1D array [1, 2, 3, ..., 100] into the 2D array [[1], [2], [3], ..., [100]].
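As a small sketch of the reshape described above (the array contents are made up for illustration), note that the original array keeps its shape unless you assign the result back:

```python
import numpy as np

a = np.arange(1, 101)          # 1D array [1, 2, ..., 100], shape (100,)
a_col = a.reshape((100, 1))    # 2D column [[1], [2], ..., [100]], shape (100, 1)

print(a.shape)      # (100,)   the original array is not modified
print(a_col.shape)  # (100, 1)
```

Whether two (100, 1) arrays are accepted by pearsonr may depend on the SciPy version, so it is worth checking against the version you have installed.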
Upvotes: 0
Reputation: 577
First question's answer: a is a vector (1D), and b is a matrix (2D, with a single column). See this Stack Overflow question for more details: Difference between numpy.array shape (R, 1) and (R,)
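A likely reason the difference matters here is broadcasting: when a (100,) array meets a (100, 1) array in elementwise arithmetic, NumPy pairs every element of one with every element of the other. I have not traced the exact SciPy version in the question line by line, so treat this as a sketch of the mechanism rather than a statement about pearsonr's internals:

```python
import numpy as np

a = np.zeros(100)        # shape (100,)
b = np.zeros((100, 1))   # shape (100, 1)

# Broadcasting stretches a along rows and b along columns,
# so the elementwise product is a 100 x 100 matrix, not 100 numbers.
prod = a * b
print(prod.shape)  # (100, 100)
```

A result with more than one element cannot be used in a plain comparison such as min(r, 1.0), which is consistent with the "truth value of an array" error in the question.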
Second question's answer:
Converting one to the other form should work fine. The function you are calling appears to expect vectors, so flatten b with b = b.reshape(-1), which converts it to a single dimension (a vector). See the example below for reference:
>>> import numpy as np
>>> from scipy.stats import pearsonr
>>> a = np.random.random((100,))
>>> b = np.random.random((100,1))
>>> print(a.shape, b.shape)
(100,) (100, 1)
>>> r, p = pearsonr(a, b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\xyz\Appdata\Local\Continuum\Anaconda3\lib\site-packages\scipy\stats\stats.py", line 3042, in pearsonr
    r = max(min(r, 1.0), -1.0)
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
>>> b = b.reshape(-1)
>>> r, p = pearsonr(a, b)  # returns (correlation, p-value)
>>> print(r, p)
0.10899671932026986 0.280372238354364
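For reference, reshape(-1) is not the only way to get a 1D array out of a single column. The sketch below uses random data with an arbitrary fixed seed (my choice, not from the question) and shows two equivalent alternatives, ravel() and column indexing:

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)   # fixed seed, arbitrary choice for this sketch
a = rng.random(100)              # shape (100,)
b = rng.random((100, 1))         # shape (100, 1)

# Three ways to turn the single column into a 1D array
b_flat = b.reshape(-1)
b_ravel = b.ravel()
b_col = b[:, 0]

r, p = pearsonr(a, b_flat)       # pearsonr returns (correlation, p-value)
print(b_flat.shape, b_ravel.shape, b_col.shape)
```

All three give the same values with shape (100,), so any of them can be passed to pearsonr.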
Upvotes: 3
Reputation: 943
(100, 1) is a 2D array made of 100 rows of length 1, like [[1], [2], [3], [4]],
while (100,) is a 1D array, like [1, 2, 3, 4]:
import numpy as np
a1 = np.array([[1], [2], [3], [4]])  # 2D, shape (4, 1)
a2 = np.array([1, 2, 3, 4])          # 1D, shape (4,)
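To make the difference concrete, here is a short sketch that compares the number of dimensions and what indexing returns for the two small arrays above:

```python
import numpy as np

a1 = np.array([[1], [2], [3], [4]])  # shape (4, 1)
a2 = np.array([1, 2, 3, 4])          # shape (4,)

print(a1.ndim, a2.ndim)  # 2 1
print(a1[0])             # [1]  (a row, itself a length-1 array)
print(a2[0])             # 1    (a scalar)
```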
Upvotes: 4