Reputation: 1852
I have a pandas dataframe with about 1,500 rows and 15 columns. For one specific column, I would like to remove the first 3 characters of each value. As a simple example, here is a dataframe:
import pandas as pd
d = {
    'Report Number': ['8761234567', '8679876543', '8994434555'],
    'Name': ['George', 'Bill', 'Sally'],
}
d = pd.DataFrame(d)
I would like to remove the first three characters from each field in the Report Number column of dataframe d.
Upvotes: 65
Views: 138216
Reputation: 23141
You can also call str.slice. To remove the first 3 characters from each string:
df['Report Number'] = df['Report Number'].str.slice(3)
To keep only the 2nd through 4th characters of each string (positions 1 to 3, counting from 0):
df['Report Number'] = df['Report Number'].str.slice(1, 4)
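As a quick check against the frame from the question, here is a minimal sketch. It assumes the question's name d rather than df, and it writes to new columns (trimmed and middle are illustrative names, not from the original post) so the original values stay intact:

```python
import pandas as pd

d = pd.DataFrame({
    'Report Number': ['8761234567', '8679876543', '8994434555'],
    'Name': ['George', 'Bill', 'Sally'],
})

# Drop the first 3 characters; .str[3:] is the indexing form of .str.slice(3).
d['trimmed'] = d['Report Number'].str.slice(3)
same = d['Report Number'].str[3:]

# Positions 1, 2 and 3 (0-based), i.e. the 2nd through 4th characters.
d['middle'] = d['Report Number'].str.slice(1, 4)

print(d)
```

Assigning the result back to d['Report Number'] instead would overwrite the column in place, as in the lines above.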
Upvotes: 3
Reputation: 164673
It is worth noting that Pandas "vectorised" str methods are no more than Python-level loops.
Assuming clean data, you will often find a list comprehension more efficient:
# Python 3.6.0, Pandas 0.19.2
d = pd.concat([d]*10000, ignore_index=True)
%timeit d['Report Number'].str[3:] # 12.1 ms per loop
%timeit [i[3:] for i in d['Report Number']] # 5.78 ms per loop
Note these aren't equivalent, since the list comprehension does not deal with null data and other edge cases. For these situations, you may prefer the Pandas solution.
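To illustrate the null-data caveat, here is a small sketch. The Series s with a missing entry is made up for the example, and the guarded comprehension is just one possible workaround, not something from the timings above:

```python
import pandas as pd

# Hypothetical data: one report number is missing.
s = pd.Series(['8761234567', None, '8994434555'])

# The .str accessor skips the missing entry and leaves it missing.
trimmed = s.str[3:]

# The plain list comprehension fails, because the missing value
# (None or NaN, depending on dtype) cannot be sliced.
try:
    [i[3:] for i in s]
    comprehension_failed = False
except TypeError:
    comprehension_failed = True

# A guarded comprehension only slices actual strings.
safe = [i[3:] if isinstance(i, str) else i for i in s]

print(trimmed)
print(comprehension_failed)
print(safe)
```

The guard adds a per-item isinstance check, so some of the speed advantage is likely lost; it is worth timing on your own data before choosing.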
Upvotes: 7