Reputation: 13821
This should be a relatively simple question.
Below is a sample of my df column:
title2
1 (, 2 ct, , )
2 (, 1 ct, , )
3 (, 2 ct, , )
4 NaN
5 (, 2 ct, , )
6 (, 5 ct, , )
7 (, 7 ounce, , )
8 (, 1 gal, , )
9 NaN
10 NaN
I would like to convert the whole column to a proper string column - i.e. my desired output would be:
title2
1 2ct
2 1ct
3 2ct
4 NaN
5 2ct
6 5ct
7 7 ounce
8 1gal
9 NaN
10 NaN
I have tried the following commands, but none seem to work:
title['title3'] = title['title2'].agg(' '.join)
title['title3'] = title['title2'].apply(lambda x: ''.join(x))
title['title3'] = title['title2'].astype(str)
title['title3'] = title['title2'].values.astype(str)
The answer given in this post: Convert a pandas column containing tuples to string, also does not help me unfortunately.
Can someone shed some light on this? Thank you all.
Upvotes: 1
Views: 1072
Reputation: 10624
Try the following. I assume that the tuples and NaNs are stored as strings in your column; if not, let me know and I will adjust the solution:
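def clear(x):
    # Leave missing values alone (assumes they are stored as the string 'NaN')
    if x == 'NaN':
        return 'NaN'
    # Split the string form on commas and trim whitespace around each piece
    parts = [i.strip() for i in str(x).split(',')]
    # Return the first piece that contains a digit, e.g. '2 ct'
    return [i for i in parts if any(k in '0123456789' for k in i)][0]

df['title2'] = df['title2'].apply(clear)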
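If the column actually holds Python tuples (with real NaN floats for the missing rows) rather than strings, something along these lines may be closer to what you need. This is only a sketch: the helper name `join_tuple` and the three-row sample frame are made up for illustration, and it assumes every tuple element is a string.

```python
import numpy as np
import pandas as pd

def join_tuple(value):
    """Join the non-empty parts of a tuple; return anything else unchanged."""
    if isinstance(value, tuple):
        # Drop the empty strings and keep the text that remains, e.g. '2 ct'
        return "".join(part.strip() for part in value if part.strip())
    # NaN (a float) and any other non-tuple value passes through as-is
    return value

# Hypothetical sample mirroring the question's column
df = pd.DataFrame(
    {"title2": [("", "2 ct", "", ""), np.nan, ("", "7 ounce", "", "")]}
)
df["title3"] = df["title2"].map(join_tuple)
```

The `isinstance` check is what keeps `''.join(x)` from being called on a NaN float, which is one likely reason the `apply(lambda x: ''.join(x))` attempt in the question fails. If you also want the inner space removed, chain `.replace(" ", "")` onto the joined string.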
Upvotes: 1
Reputation: 37
Using regex:
import re
df['title3'] = df['title2'].apply(lambda x: re.sub('[^A-Za-z0-9]', '', str(x)))
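One caveat: `str(x)` turns a missing value into the literal string 'nan', so the NaN rows stop being missing. If that matters, a guarded variant could look like the sketch below. The helper name `keep_alnum` and the sample frame are assumptions for illustration, and it assumes the values are stored as strings such as "(, 2 ct, , )".

```python
import re

import numpy as np
import pandas as pd

def keep_alnum(value):
    """Strip everything except letters and digits from string values."""
    if isinstance(value, str):
        return re.sub(r"[^A-Za-z0-9]", "", value)
    # Non-strings (e.g. a NaN float) are returned unchanged
    return value

# Hypothetical sample mirroring the question's column, stored as strings
df = pd.DataFrame({"title2": ["(, 2 ct, , )", np.nan, "(, 7 ounce, , )"]})
df["title3"] = df["title2"].map(keep_alnum)
```

Note that this also removes the space inside each value, so "7 ounce" becomes "7ounce".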
Upvotes: 1
Reputation: 11504
This will do the trick:
demo_data['title2'] = demo_data['title2'].astype(str).map(lambda x: x.lstrip(",'[ (").rstrip(" ,'])"))
demo_data['title2'] = demo_data['title2'].str.replace("', '", ",", regex=False)
demo_data['title2'] = demo_data['title2'].astype(str).map(lambda x: x.lstrip(",'[ (").rstrip(" ,'])"))
demo_data['title2'] = demo_data['title2'].str.replace(" ", "", regex=False)
which gives:
ID title2
0 1 2ct
1 2 1ct
2 3 2ct
3 4 nan
4 5 2ct
5 6 5ct
6 7 7ounce
7 8 1gal
8 9 nan
9 10 nan
Upvotes: 1