Reputation: 167
Given a column of strings in a DataFrame, the following code strips every non-digit character, leaving only the digits. What I want instead is to keep just the letter part, without the dot, and, whenever a cell contains only a number in string form, replace it with the string 'number'. To be clear, the cells in this column have the following values:
'a. 12','b. 75','23', 'c/a 34', '85', 'a 32', 'b 345'
and I want to replace the cell values in this column with the following:
'a', 'b', 'number', 'c/a', 'number', 'a' , 'b'
How do I do that?
import pandas as pd

l2 = ['a. 12', 'b. 75', '23', 'c/a 34', '85', 'a 32', 'b 345']
df = pd.DataFrame({'col1': l2})
# Current attempt: removes every non-digit character, so only the digits remain
df['col1'] = df['col1'].str.replace(r'\D', '', regex=True)
print(df)
Upvotes: 0
Views: 625
Reputation: 260640
Based on your example, the rule seems to be: (1) replace cells that contain only numbers with 'number', and (2) remove the trailing dot/space/digits from the rest:
df['col1'] = df['col1'].str.replace(r'^[\d\s]+$', 'number', regex=True).str.replace(r'\.?\s*\d*$', '', regex=True)
output:
     col1
0       a
1       b
2  number
3     c/a
4  number
5       a
6       b
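To see what each of the two patterns does on its own, here is a minimal sketch that applies the same logic to one value at a time with the standard-library `re` module. The helper name `clean_cell` is hypothetical (it is not part of pandas or of the code above), and it uses `re.fullmatch` in place of the `^...$` anchors, which should behave the same for these sample values.

```python
import re


def clean_cell(value: str) -> str:
    """Hypothetical helper mirroring the two chained replacements."""
    # Step 1: a cell made up only of digits/whitespace becomes 'number'.
    if re.fullmatch(r'[\d\s]+', value):
        return 'number'
    # Step 2: drop an optional dot, then whitespace, then digits at the end.
    return re.sub(r'\.?\s*\d*$', '', value)


values = ['a. 12', 'b. 75', '23', 'c/a 34', '85', 'a 32', 'b 345']
cleaned = [clean_cell(v) for v in values]
print(cleaned)  # ['a', 'b', 'number', 'c/a', 'number', 'a', 'b']
```

If a per-value function is easier to maintain than chained regexes, something like df['col1'].map(clean_cell) should give the same column, at the cost of running Python code for each row.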
Upvotes: 1