Reputation: 1617
I am reading data from a CSV file and have a dataframe like this:
product_title variatons_color
T-shirt ['yellow','ornage']
T-shirt []
T-shirt ['blue','green']
My expected dataframe looks like this:
product_title variatons_color
T-shirt ['yellow','ornage']
T-shirt
T-shirt ['blue','green']
I want to remove the empty lists. How can I do that in pandas?
Update 1
I tried the solutions from Scott Boston, Ynjxsjmh and BENY. All of them fill None into every row, but I need None only in the rows where the list is empty.
When I run type(df.loc[0,'variations_color'])
it returns str.
Upvotes: 0
Views: 152
Reputation: 153460
Just apply len:
df.loc[df['variations_color'].apply(len) == 0, 'variations_color'] = ''
or
df.loc[df['variations_color'].apply(len) == 0, 'variations_color'] = np.nan
Output:
product_title variations_color
0 T-shirt [yellow, orange]
1 T-shirt NaN
2 T-shirt [blue, green]
Given this df:
df = pd.DataFrame({'product_title':['T-shirt']*3,
'variations_color':[['yellow', 'orange'],[],['blue', 'green']]})
However, if your dataframe structure is like this:
df = pd.DataFrame({'product_title':['T-shirt']*3,
'variations_color':['[yellow, orange]','[]','[blue, green]']})
Then, you can use the following:
df.loc[df['variations_color'] == '[]', 'variations_color'] = np.nan
Output:
product_title variations_color
0 T-shirt [yellow, orange]
1 T-shirt NaN
2 T-shirt [blue, green]
Note the difference: in the first example,
type(df.loc[0,'variations_color'])
returns list, while in the second it returns str. The printed representations of the two dataframes are identical, so you can't tell them apart just by looking at the output. In Python, it is always important to know the datatype of the object you're working with.
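To make the list-versus-str distinction concrete, here is a minimal sketch of a helper that handles both cases in one pass. The function name blank_empty_lists and the two sample frames are made up for illustration; it assumes an empty entry is either a real empty list or the literal string '[]' (optionally padded with whitespace).

```python
import numpy as np
import pandas as pd


def blank_empty_lists(df, column):
    """Return a copy of df where empty lists in `column` become NaN.

    Hypothetical helper: treats both a real empty list and the
    string '[]' as "empty"; every other value is left untouched.
    """
    out = df.copy()
    is_empty = out[column].apply(
        lambda v: (isinstance(v, list) and len(v) == 0)
        or (isinstance(v, str) and v.strip() == '[]')
    )
    out.loc[is_empty, column] = np.nan
    return out


# Same data twice: once as real lists, once as strings (as read from a CSV).
as_lists = pd.DataFrame({'product_title': ['T-shirt'] * 3,
                         'variations_color': [['yellow', 'orange'], [], ['blue', 'green']]})
as_strings = pd.DataFrame({'product_title': ['T-shirt'] * 3,
                           'variations_color': ["['yellow','orange']", '[]', "['blue','green']"]})

cleaned_lists = blank_empty_lists(as_lists, 'variations_color')
cleaned_strings = blank_empty_lists(as_strings, 'variations_color')
print(cleaned_lists)
print(cleaned_strings)
```

Checking the type explicitly avoids relying on len, which would report 2 for the string '[]'.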
Upvotes: 2
Reputation: 323226
Assign using a boolean check:
df.loc[~df['variatons_color'].astype(bool),'variatons_color'] = ''
Update
df.loc[df['variatons_color'].eq('[]'),'variatons_color'] = ''
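The reason two variants are needed can be sketched as follows. It assumes the column is object dtype, holding either real lists or their string form; the Series names are made up for illustration. An empty list is falsy, but the string '[]' is a non-empty string and therefore truthy, so the boolean check never flags it.

```python
import pandas as pd

# Hypothetical sample columns: real lists vs. their string form.
lists = pd.Series([['yellow', 'orange'], [], ['blue', 'green']])
strings = pd.Series(["['yellow','orange']", '[]', "['blue','green']"], dtype=object)

# Truthiness check: flags the empty list ...
list_mask = ~lists.astype(bool)
# ... but not the string '[]', because any non-empty string is truthy.
string_bool_mask = ~strings.astype(bool)
# Comparing against the literal text works for the string column.
string_eq_mask = strings.eq('[]')

print(list_mask.tolist())
print(string_bool_mask.tolist())
print(string_eq_mask.tolist())
```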
Upvotes: 2
Reputation: 116
Look here!
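import ast
import pandas as pd
from io import StringIO
data = '''
product_title variatons_color
T-shirt ['yellow','ornage']
T-shirt []
T-shirt ['blue','green']
'''
# sep=r'\s+' splits on whitespace (replaces the deprecated delim_whitespace=True)
df = pd.read_csv(StringIO(data), sep=r'\s+')
# ast.literal_eval parses the list literals without executing arbitrary code
df.variatons_color = df.variatons_color.apply(ast.literal_eval)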
df
'''
product_title variatons_color
0 T-shirt [yellow, ornage]
1 T-shirt []
2 T-shirt [blue, green]
'''
type(df.iat[0, 1])
# list
df.mask(df.applymap(len) == 0, None)
'''
product_title variatons_color
0 T-shirt [yellow, ornage]
1 T-shirt None
2 T-shirt [blue, green]
'''
Done!
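As a variation on the steps above, the parsing and the empty-list replacement can also happen while the file is being read, using the converters parameter of read_csv. This is only a sketch: parse_colors is a hypothetical helper name, and it assumes every cell in the column is a valid Python list literal.

```python
import ast
from io import StringIO

import pandas as pd

data = """product_title variatons_color
T-shirt ['yellow','ornage']
T-shirt []
T-shirt ['blue','green']
"""


def parse_colors(cell):
    """Parse one CSV cell into a list; return None for an empty list."""
    values = ast.literal_eval(cell)
    return values if values else None


# The converter runs on the raw text of each cell in the named column.
df = pd.read_csv(StringIO(data), sep=r'\s+',
                 converters={'variatons_color': parse_colors})
print(df)
```

Doing the conversion at load time means the rest of the code only ever sees real lists or missing values, never the string form.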
Upvotes: 0
Reputation: 185
import ast
import pandas as pd
df = pd.DataFrame({'product_title':['T-shirt']*3,
'variations_color':[['yellow', 'orange'],[],['blue', 'green']]})
# Going through str() lets the check work for real lists and for strings such as "[]"
df['variations_color'] = df['variations_color'].apply(lambda x: None if len(ast.literal_eval(str(x))) == 0 else x)
df
Upvotes: 0
Reputation: 29982
You can try
df['variatons_color'] = df['variatons_color'].apply(lambda lst: lst if len(lst) else '')
print(df)
product_title variatons_color
0 T-shirt [yellow, ornage]
1 T-shirt
2 T-shirt [blue, green]
Upvotes: 3