I have a data table like this:
Item Colour Item Range Item Size
789 COLOUR-BLUE RANGE-PANT SIZE-XXL
2507 COLOUR-BLACK RANGE-OTHER SIZE-XXL
2376 COLOUR-BLACK RANGE-JACKET SIZE-S
1378 COLOUR-WHITE RANGE-OTHER SIZE-L
598 COLOUR-BLUE RANGE-JACKET SIZE-M
1589 COLOUR-BLUE RANGE-JACKET SIZE-L
2580 COLOUR-BLACK RANGE-SHIRT SIZE-L
366 COLOUR-BLUE RANGE-PANT SIZE-XXL
2320 COLOUR-WHITE RANGE-OTHER SIZE-L
1247 COLOUR-GREEN RANGE-PANT SIZE-M
2224 COLOUR-BLACK RANGE-JACKET SIZE-L
3615 COLOUR-BLACK RANGE-OTHER SIZE-S
4176 COLOUR-GREEN RANGE-PANT SIZE-XL
1640 COLOUR-BLACK RANGE-PANT SIZE-S
1136 COLOUR-WHITE RANGE-OTHER SIZE-M
3437 COLOUR-BLACK RANGE-JACKET SIZE-S
4448 COLOUR-WHITE RANGE-OTHER SIZE-S
1188 COLOUR-WHITE RANGE-SHIRT SIZE-XXL
3332 COLOUR-GREEN RANGE-OTHER SIZE-M
1080 COLOUR-WHITE RANGE-OTHER SIZE-XXL
I want to select only a subset of the rows using the following mask:
mask = (df['Item Colour'] == 'COLOUR-WHITE') & (df['Item Range'] in ['RANGE-JACKET','RANGE-PANT']) & (df['Item Size'] not in ['SIZE-XXL'])
I tried df[mask], but it raises this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
How can I avoid this error?
I have done this so far:
import numpy as np
import pandas as pd
df = pd.read_clipboard()
df.drop(['Item','Item.2','Size'], inplace=True,axis=1)
df.columns = ['Item Colour', 'Item Range', 'Item Size']
print(df)
mask = (df['Item Colour'] == 'COLOUR-WHITE') & (df['Item Range'] in ['RANGE-JACKET','RANGE-PANT']) & (df['Item Size'] not in ['SIZE-XXL'])
dff = df[mask]
dff
Update: this still does not work.
mask = (df['Item Colour'] == 'COLOUR-WHITE').all()\
& (df['Item Range'] in ['RANGE-JACKET','RANGE-PANT']).all()\
& ( ~df['Item Size'].isin(['SIZE-XXL']).all())
df[mask]
Upvotes: 1
Views: 47
Reputation: 1726
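The problem comes from the way you build the mask. Python's in operator asks whether the whole Series is an element of the list, so Python has to turn the result of comparing the Series with each list element into a single True/False, and that is what raises the ValueError. For an element-wise membership test, use the Series method pd.Series.isin([item1, item2, ...]). So, instead of: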
df['Item Range'] in ['RANGE-JACKET','RANGE-PANT']
do:
df['Item Range'].isin(['RANGE-JACKET','RANGE-PANT'])
To negate, for the 'not in':
df['Item Size'] not in ['SIZE-XXL']
you can do:
~df['Item Size'].isin(['SIZE-XXL'])
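Putting the pieces together, here is a minimal sketch of the full mask. It assumes the first column of the table is the index and uses only a handful of rows copied from the question, not the whole table. No .all() is needed: .all() reduces a Series to a single True/False, whereas df[...] needs one boolean per row. The second mask, with COLOUR-BLUE, is a variation added purely for illustration, because none of the white items in the posted data are jackets or pants, so the original mask legitimately selects nothing.

```python
import pandas as pd

# A subset of the rows from the question (assumption: the first column is the index).
rows = [
    (789, "COLOUR-BLUE", "RANGE-PANT", "SIZE-XXL"),
    (2376, "COLOUR-BLACK", "RANGE-JACKET", "SIZE-S"),
    (1378, "COLOUR-WHITE", "RANGE-OTHER", "SIZE-L"),
    (598, "COLOUR-BLUE", "RANGE-JACKET", "SIZE-M"),
    (1589, "COLOUR-BLUE", "RANGE-JACKET", "SIZE-L"),
    (1188, "COLOUR-WHITE", "RANGE-SHIRT", "SIZE-XXL"),
    (1247, "COLOUR-GREEN", "RANGE-PANT", "SIZE-M"),
    (1080, "COLOUR-WHITE", "RANGE-OTHER", "SIZE-XXL"),
]
df = pd.DataFrame(
    [r[1:] for r in rows],
    index=[r[0] for r in rows],
    columns=["Item Colour", "Item Range", "Item Size"],
)

# Each term is a boolean Series with one value per row; & combines them row by row.
# The parentheses around the == comparison are required because & binds tighter than ==.
mask = (
    (df["Item Colour"] == "COLOUR-WHITE")
    & df["Item Range"].isin(["RANGE-JACKET", "RANGE-PANT"])
    & ~df["Item Size"].isin(["SIZE-XXL"])
)
white = df[mask]  # empty: no white item is a jacket or a pant

# Illustrative variation (not from the question): same filter, different colour.
blue_mask = (
    (df["Item Colour"] == "COLOUR-BLUE")
    & df["Item Range"].isin(["RANGE-JACKET", "RANGE-PANT"])
    & ~df["Item Size"].isin(["SIZE-XXL"])
)
blue = df[blue_mask]

print(white)
print(blue)
```

With these rows, the blue variation keeps items 598 and 1589, and the original white mask returns an empty DataFrame without raising an error.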
Upvotes: 2