Reputation: 3811
Hi, I have a DataFrame that looks like this:
Product
Prod1
Prod 1
Prod2
Prod 2
Prod 2
Prod 3
Prod3 and so on
I basically want to convert Prod1, Prod2, Prod3, etc. to categorical variables. To do that, I need to remove the blank space between Prod and the number (for example, the space between Prod and 1), so that Prod1 and Prod 1 both become Prod1 and there are no duplicate entries for the same product.
Expected output for the above table:
Product
Prod1
Prod1
Prod2
Prod2
Prod2
Prod3
Prod3 and so on
The answers I found using strip and similar methods only covered a single sentence. I want an answer that can be applied to the entire table and removes the spaces between all the words in a column.
Upvotes: 1
Views: 1369
Reputation: 320
I guess this would be the simplest way!
df['Product'] = df['Product'].str.replace(' ','')
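To tie this back to the goal in the question (categorical variables), here is a minimal sketch. The sample frame is an assumption modeled on the table in the question, and regex=False is only there to make the literal-space match explicit regardless of pandas version.

```python
import pandas as pd

# Sample data modeled on the question (an assumption; the real frame may differ).
df = pd.DataFrame(
    {"Product": ["Prod1", "Prod 1", "Prod2", "Prod 2", "Prod 2", "Prod 3", "Prod3"]}
)

# Remove every literal space character in the column.
df["Product"] = df["Product"].str.replace(" ", "", regex=False)

# With the duplicates merged, convert the column to a categorical dtype.
df["Product"] = df["Product"].astype("category")

print(df["Product"].cat.categories.tolist())
```

After the replacement only three distinct categories should remain, one per product.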
Upvotes: 1
Reputation: 150785
Let's try str.replace with the following pattern to remove spaces between Prod and digits.
# regex=True is needed in pandas 2.0+, where str.replace defaults to a literal match
df['Product'] = df.Product.str.replace(r'(Prod)(\s+)(\d)', r'\1\3', regex=True)
Output:
Product
0 Prod1
1 Prod1
2 Prod2
3 Prod2
4 Prod2
5 Prod3
6 Prod3 and so on
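A small sketch of why a targeted pattern can be preferable to stripping all whitespace: it only touches the gap between Prod and the digit and leaves other spaces alone. The sample Series and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample values, including one with trailing text.
s = pd.Series(["Prod 1", "Prod  2", "Prod 3 and so on"])

# Targeted: only collapse whitespace between "Prod" and the following digit.
targeted = s.str.replace(r"(Prod)\s+(\d)", r"\1\2", regex=True)

# Blanket: remove all whitespace everywhere in the string.
everything = s.str.replace(r"\s+", "", regex=True)

print(targeted.tolist())
print(everything.tolist())
```

Which one is right depends on whether the column can hold other text whose spaces should survive.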
Upvotes: 3
Reputation: 82785
Using str.split().agg("".join)
Ex:
df['Product'] = df['Product'].str.split().agg("".join)
#or (regex=True is needed in pandas 2.0+, where the default is a literal match)
#df['Product'] = df['Product'].str.replace(r"\s+", "", regex=True)
print(df)
Output:
Product
0 Prod1
1 Prod1
2 Prod2
3 Prod2
4 Prod2
5 Prod3
6 Prod3
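Since the question asks for something that can be applied across the table, here is a rough sketch of the same split-and-join idea looped over several columns. The "Other" column and the cols list are hypothetical, and .str.join("") is used here as an equivalent way to glue the split pieces back together.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Product": ["Prod1", "Prod 1", "Prod 2"],
        "Other": ["A 1", "A1", "B  2"],  # hypothetical second text column
    }
)

# Columns to clean (an assumption; list whichever text columns apply).
cols = ["Product", "Other"]

for col in cols:
    # split() with no arguments splits on any run of whitespace;
    # str.join("") concatenates the pieces with nothing in between.
    df[col] = df[col].str.split().str.join("")

print(df)
```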
Upvotes: 4