Reputation: 141
I have a dataframe with products in some hierarchy (Category
then Sub_type
).
Categories = ['Construção','Ferramentas']
Sub_types = ["Aerógrafos","Tintas","Proteção de Superfícies","Rolos","Lixas"]
A few products in the sub_type = 'Lixas'
appears on both categories.
When I filter only by sub_type
and group them by category
, it will display on both categories.
df_destaque =df[df.sub_type.isin(["Aerógrafos","Tintas","Proteção de Superfícies","Rolos","Lixas"])]
I want "Lixas" to appear only on category
== "Ferramentas", I dont want it on "Construção".
I tried to filter using category == 'Construção'
and sub_type == ['Aerógrafos','Proteção de Superfícies','Rolos','Tintas']
(as you can see, I didnt include "Lixas").
Then added a OR condition using operator |
to do the same thing to the other category
. category == 'Ferramentas'
and sub_type == ['Aerógrafos','Lixas']
But as expected, it didnt work
This is what I tried:
df_destaque = df[
((df['category']=='Construção')&(df['sub_type']==["Aerógrafos","Tintas","Proteção de Superfícies","Rolos"]))
|
((df['category']=='Ferramentas')&(df['sub_type']==["Aerógrafos","Lixas"]))]
I got a ValueError: ('Lengths must match to compare', (4627,), (4,))
Can somebody help me? Thanks
Upvotes: 0
Views: 57
Reputation: 2128
You are doing correct. Just that you can not compare df['sub_type']
with [...]
using ==
, this is possible only if df['sub_type']
and [...]
are of the same length. Also this is not what you want, what you want is to check for all the values in df['sub_type']
if it's in [...]
or not. .isin
is a suitable function for that. So you replace ==
with isin. Keep (df['category']=='Construção')
and (df['category']=='Ferramentas')
Use:
df_destaque = df[
((df['category']=='Construção')&(df['sub_type'].isin(["Aerógrafos","Tintas","Proteção de Superfícies","Rolos"])))
|
((df['category']=='Ferramentas')&(df['sub_type'].isin(["Aerógrafos","Lixas"])))]
Upvotes: 1