Reputation: 79
I am a Python beginner. I get the error message "ValueError: operands could not be broadcast together with shapes".
Here is my data:
import numpy as np
spent = np.array([
10, 10, 13, 12, 109, 17, 31, 1, 39, 41, 45,
41, 71, 161, 39, 115, 5, 51, 58, 334, 165, 1032,
40, 52, 21, 68, 79, 482, 10, 265, 60, 67, 12,
53, 188, 32, 397, 51, 17, 156, 100, 85, 53, 95,
68, 308, 53, 675, 78, 27, 219, 45, 45, 30, 61,
16, 72, 80, 96, 1386, 370, 16, 81, 28, 43, 90,
33, 66, 77])
visit = np.array([
19, 13, 16, 16, 18, 9, 12, 3, 15, 16, 16, 3, 4, 11, 11, 11, 11,
12, 12, 12, 13, 13, 14, 14, 15, 15, 5, 6, 6, 7, 7, 7, 7, 7,
17, 17, 8, 8, 8, 4, 4, 13, 8, 4, 4, 9, 20, 10, 11, 11, 14,
12, 12, 15, 12, 12, 13, 13, 13, 13, 14, 14, 14, 15, 16, 16, 18, 11,
6])
My task is to select the entries where spent>100 and visit>10 hold together. In other words, I would like to find the people who paid more than $100 among the people who visited more than 10 times. I have tried the following code.
a=spent[spent>100] & [visit>10]
print(a)
But I get the error message "ValueError: operands could not be broadcast together with shapes". Could you advise me how to deal with this? I have no idea what is wrong.
Upvotes: 1
Views: 2427
Reputation: 394449
IIUC you don't need the mask on spent per se like you've done:
In[16]:
a=(spent>100) & (visit>10)
a
Out[16]:
array([False, False, False, False, True, False, False, False, False,
False, False, False, False, True, False, True, False, False,
False, True, True, True, False, False, False, False, False,
False, False, False, False, False, False, False, True, False,
False, False, False, False, False, False, False, False, False,
False, False, False, False, False, True, False, False, False,
False, False, False, False, False, True, True, False, False,
False, False, False, False, False, False], dtype=bool)
This gives you a boolean mask that is True only at the positions where both conditions are met. You can then use it to index the original arrays. For example, using it against spent:
In[18]:
spent[a]
Out[18]: array([ 109, 161, 115, 334, 165, 1032, 188, 219, 1386, 370])
Your error was that you masked your original array first, which produced an array with a different shape than the visit mask you're trying to broadcast it against:
print(spent[spent>100].shape)
print((visit>10).shape)
(16,)
(69,)
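The same mismatch can be reproduced on a much smaller scale. This is only a sketch with made-up data: `spent_small` and `visit_small` are hypothetical five-element arrays, not the arrays from the question.

```python
import numpy as np

# Hypothetical toy data, much smaller than the arrays in the question.
spent_small = np.array([50, 150, 20, 300, 80])
visit_small = np.array([12, 5, 11, 15, 3])

filtered = spent_small[spent_small > 100]  # keeps only 2 of the 5 values
mask = visit_small > 10                    # still has all 5 entries

print(filtered.shape)  # (2,)
print(mask.shape)      # (5,)

try:
    filtered & mask    # lengths 2 and 5 cannot be broadcast together
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Once the first array has been filtered, its elements no longer line up one-to-one with the second array, so an element-wise `&` has nothing sensible to pair up.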
You could compound the conditions into the same mask:
In[20]:
spent[(spent > 100) & (visit > 10)]
Out[20]: array([ 109, 161, 115, 334, 165, 1032, 188, 219, 1386, 370])
This produces the same result.
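If you want to know which people match rather than how much they spent, the same mask can be turned into positions. A minimal sketch, again assuming hypothetical toy arrays (`spent_small`, `visit_small`) in place of the real data:

```python
import numpy as np

# Hypothetical toy data standing in for the arrays in the question.
spent_small = np.array([50, 150, 20, 300, 80])
visit_small = np.array([12, 5, 11, 15, 3])

# The parentheses matter: & binds more tightly than >, so each
# comparison has to be wrapped before the two are combined.
mask = (spent_small > 100) & (visit_small > 10)

positions = np.flatnonzero(mask)  # indices at which mask is True
print(positions)                  # [3]
print(spent_small[positions])     # [300]
```

Indexing with `positions` selects the same elements as indexing with `mask` directly; the index form is just handier when you need to look up the same people in other arrays.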
Upvotes: 3
Reputation: 4866
A possible solution would be using list comprehensions:
[(x, y) for x, y in zip(visit, spent) if x > 10 and y > 100]
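You could also use NumPy as follows. Note that this expression returns a boolean array over the people who visited more than 10 times, not the spent values themselves: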
spent[visit > 10] > 100
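To get the amounts rather than True/False flags, the filtering can be done in two steps. This is a sketch on hypothetical toy arrays (`spent_small`, `visit_small`), not the data from the question:

```python
import numpy as np

# Hypothetical toy data standing in for the arrays in the question.
spent_small = np.array([50, 150, 20, 300, 80])
visit_small = np.array([12, 5, 11, 15, 3])

# Step 1: spending of the people who visited more than 10 times.
frequent = spent_small[visit_small > 10]
print(frequent)       # [ 50  20 300]

# Step 2: comparing gives flags over that subset only ...
flags = frequent > 100
print(flags)          # [False False  True]

# ... so index the subset with the flags to get the values.
big_spenders = frequent[flags]
print(big_spenders)   # [300]
```

The second mask is computed on the already-filtered array, so both sides have the same length and no broadcasting error occurs.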
Upvotes: 0