Reputation: 173
I want to split on commas, and then remove the commas. I start out with a dataframe with 2 columns that I read in from a csv file.
[name] [feature1, feature2, feature3] - the features are all in one cell and each row may have a different number of features.
I made a sub-df from the main df with this code (pulling out the 2 columns I want):
df_features = df.loc[:, ['name', 'features']]
Then split on the features column to separate them with this code:
df_features_split = df_features.features.str.split(expand=True,)
It splits the features into their own columns which is what I want, but leaves the commas after the text. I want to get rid of it. I tried:
df_features_split=df_features_split.replace(',', '')
but it does not remove the commas. I think it may need to be more specific, but I'm not quite sure. Any help would be appreciated.
Here is a sample of my df before it was split. Sorry, I hope the format is okay. There are 2 rows of the df.
1 The Beehive Loop Trail beach, dogs-no, forest, lake, views, wild-flowers, wildlife
2 Cadillac North Ridge Trail dogs-leash, forest, kids, partially-paved, views, wild-flowers, wildlife
Thank you!
Upvotes: 1
Views: 702
Reputation: 30002
You are really close to the answer. What you are missing is the pat argument of pandas.Series.str.split(). Without it, the string is split on whitespace, so the commas stay attached to each piece.
df_features_split = df.features.str.split(pat=',', expand=True)
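For a fuller picture, here is a minimal sketch built on a hypothetical two-row frame that imitates the sample in the question (the real data comes from a CSV, so the exact contents are an assumption). Since the features look like they are separated by a comma plus a space, splitting on ',' alone would probably leave a leading space on each piece, so the sketch strips it afterwards. It also shows why the replace attempt had no effect: DataFrame.replace(',', '') only replaces cells whose entire value is ',', while regex=True makes it substitute inside each string.

```python
import pandas as pd

# Hypothetical frame modeled on the two sample rows in the question.
df = pd.DataFrame({
    "name": ["The Beehive Loop Trail", "Cadillac North Ridge Trail"],
    "features": [
        "beach, dogs-no, forest, lake, views, wild-flowers, wildlife",
        "dogs-leash, forest, kids, partially-paved, views, wild-flowers, wildlife",
    ],
})

# Split on the comma, then strip the space left at the start of each piece.
df_features_split = df["features"].str.split(pat=",", expand=True)
df_features_split = df_features_split.apply(lambda col: col.str.strip())

# Alternative: keep the original whitespace split and remove the trailing
# commas with a regex-based replace (substring match instead of whole cell).
whitespace_split = df["features"].str.split(expand=True)
cleaned = whitespace_split.replace(",", "", regex=True)

print(df_features_split)
```

Both routes should give the same columns on this sample. The whitespace split would break apart any feature that itself contains a space, so splitting on the comma is likely the safer choice.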
Upvotes: 1