Reputation: 5
I have a pandas dataframe with a column named ranking_pos. All the rows of this column look like this: #123 of 12,216.
The output I need is only the number of the ranking, so for this example: 123 (as an integer).
How do I extract the number after the # and get rid of the " of 12,216" part?
Currently the dtype of the column is object. Just converting it to integer with .astype() doesn't work because of the other characters.
Upvotes: 0
Views: 52
Reputation: 3260
You can use .str.extract:
df['ranking_pos'].str.extract(r'#(\d+)', expand=False).astype(int)
or you can use .str.split():
df['ranking_pos'].str.split(' of ').str[0].str.replace('#', '').astype(int)
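To check both approaches side by side, here is a small sketch on made-up data. The sample rows are hypothetical and only assume the "#<rank> of <total>" format from the question. expand=False is passed so that .str.extract returns a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Hypothetical sample rows in the format described in the question
df = pd.DataFrame(
    {"ranking_pos": ["#123 of 12,216", "#7 of 12,216", "#4501 of 12,216"]}
)

# Approach 1: capture the digits that follow '#'
by_extract = df["ranking_pos"].str.extract(r"#(\d+)", expand=False).astype(int)

# Approach 2: keep the part before ' of ', then drop the literal '#'
by_split = (
    df["ranking_pos"]
    .str.split(" of ")
    .str[0]
    .str.replace("#", "", regex=False)
    .astype(int)
)

print(by_extract.tolist())  # [123, 7, 4501]
print(by_split.tolist())    # [123, 7, 4501]

# Write the result back to the column
df["ranking_pos"] = by_extract
```

If some rows might not match the pattern, .str.extract gives NaN for them and .astype(int) will raise, so those rows would need to be handled first.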
Upvotes: 1
Reputation: 318
df["ranking_pos"] = df["ranking_pos"].str.replace(r"^#|\s+of\s+.*$", "", regex=True).astype(int)
Upvotes: 0