Victor

Reputation: 191

ValueError: operands could not be broadcast together with shapes (3,) (6,)

I am trying to compute the correlations and distances below, but I get an error when the arrays are not the same size. I know I can handle different-sized arrays manually, but can you help me correct this code?

import scipy
from scipy.stats import pearsonr, spearmanr
from scipy.spatial import distance

x = [5, 3, 2.5]
y = [4, 3, 2, 4, 3.5, 4]

pearsonr(x, y)                          # Error
scipy.spatial.distance.euclidean(x, y)  # Error
spearmanr(x, y)                         # Error
scipy.spatial.distance.jaccard(x, y)    # Error

Upvotes: 1

Views: 827

Answers (1)

MSeifert

Reputation: 152850

For the distances, you can compute every pairwise distance with scipy.spatial.distance.cdist. Its inputs must be 2-dimensional, even if each subarray contains just one element, for example:

def make2d(lst):
    return [[i] for i in lst]

>>> scipy.spatial.distance.cdist(make2d([5,3,2.5]), make2d([4,3,2,4,3.5,4]))
array([[ 1. ,  2. ,  3. ,  1. ,  1.5,  1. ],
       [ 1. ,  0. ,  1. ,  1. ,  0.5,  1. ],
       [ 1.5,  0.5,  0.5,  1.5,  1. ,  1.5]])

You can choose a different metric (like jaccard):

>>> scipy.spatial.distance.cdist(make2d([5,3,2.5]), make2d([4,3,2,4,3.5,4]), metric='jaccard')
array([[ 1.,  1.,  1.,  1.,  1.,  1.],
       [ 1.,  0.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.,  1.]])
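
For one-dimensional points like these, the Euclidean distance between two values is just their absolute difference, so the matrix from the first cdist call can also be sketched with plain NumPy broadcasting. This is an illustrative alternative rather than part of the original answer, and the name `pairwise` is made up here:

```python
import numpy as np

x = np.array([5, 3, 2.5])
y = np.array([4, 3, 2, 4, 3.5, 4])

# x[:, None] has shape (3, 1) and y[None, :] has shape (1, 6).
# Broadcasting expands both to (3, 6), one entry per (x[i], y[j]) pair.
pairwise = np.abs(x[:, None] - y[None, :])

print(pairwise.shape)
print(pairwise)
```

The original error comes from combining shapes (3,) and (6,) directly, which cannot be broadcast; adding the extra axis is what makes the shapes compatible.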

But for the statistics functions (pearsonr, spearmanr) I have no idea how you want that to work: they are defined on paired observations, so they require same-length arrays by definition. You may need to consult their documentation.
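
If the intent is only to correlate the values that actually pair up, one possible workaround is to truncate both inputs to the shorter length first. That is purely an assumption about what is wanted, since the question does not say how the extra values in y should be treated; `n`, `x_cut` and `y_cut` are illustrative names:

```python
from scipy.stats import pearsonr, spearmanr

x = [5, 3, 2.5]
y = [4, 3, 2, 4, 3.5, 4]

# Assumption: only the first min(len(x), len(y)) observations are paired.
n = min(len(x), len(y))
x_cut, y_cut = x[:n], y[:n]

# Both functions return a (statistic, p-value) pair.
r, r_pvalue = pearsonr(x_cut, y_cut)
rho, rho_pvalue = spearmanr(x_cut, y_cut)

print(r, rho)
```

Truncation silently discards the last three values of y, so it only makes sense if those values really have no counterpart in x.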

Upvotes: 1
