Reputation: 1852

Remove first x number of characters from each row in a column of a Python dataframe

I have a pandas dataframe with about 1,500 rows and 15 columns. For one specific column, I would like to remove the first 3 characters of each row. As a simple example, here is a dataframe:

import pandas as pd

d = {
    'Report Number':['8761234567', '8679876543','8994434555'],
    'Name'         :['George', 'Bill', 'Sally']
     }

d = pd.DataFrame(d)

I would like to remove the first three characters from each field in the Report Number column of dataframe d.

Upvotes: 65

Answers (3)

cottontail

Reputation: 23141

You can also call str.slice. To remove the first 3 characters from each string:

df['Report Number'] = df['Report Number'].str.slice(3)

To keep only the 2nd through 4th characters of each string:

df['Report Number'] = df['Report Number'].str.slice(1, 4)
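
For reference, here is a small self-contained sketch of both calls, using the sample data from the question. The name df follows this answer rather than the question's d, and the variables dropped and middle are only illustrative.

```python
import pandas as pd

# Sample data copied from the question; `df` is the name used in this answer.
df = pd.DataFrame({
    'Report Number': ['8761234567', '8679876543', '8994434555'],
    'Name': ['George', 'Bill', 'Sally'],
})

# Drop the first 3 characters (same result as .str[3:]).
dropped = df['Report Number'].str.slice(3)

# Keep only the 2nd through 4th characters (positions 1, 2 and 3).
middle = df['Report Number'].str.slice(1, 4)

print(dropped.tolist())  # ['1234567', '9876543', '4434555']
print(middle.tolist())   # ['761', '679', '994']
```

Neither call modifies df until the result is assigned back to the column.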

Upvotes: 3

jpp

Reputation: 164673

It is worth noting that Pandas "vectorised" str methods are little more than Python-level loops.

Assuming clean data, you will often find a list comprehension more efficient:

# Python 3.6.0, Pandas 0.19.2

d = pd.concat([d]*10000, ignore_index=True)

%timeit d['Report Number'].str[3:]           # 12.1 ms per loop
%timeit [i[3:] for i in d['Report Number']]  # 5.78 ms per loop

Note that these aren't equivalent: the list comprehension returns a plain list and does not handle null data or other edge cases. For those situations, you may prefer the Pandas solution.
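
To illustrate the null-data difference, here is a rough sketch with a made-up column containing one missing value. The series s and the isinstance guard are assumptions for illustration and are not part of the question's data.

```python
import pandas as pd

# Hypothetical column with one missing value (not from the question).
s = pd.Series(['8761234567', None, '8994434555'])

# The .str accessor propagates the missing value instead of failing.
sliced = s.str[3:]
print(sliced.isna().tolist())  # [False, True, False]

# A plain list comprehension raises, because the missing value cannot be sliced.
try:
    [i[3:] for i in s]
    comprehension_failed = False
except TypeError:
    comprehension_failed = True
print(comprehension_failed)  # True

# One possible guard: slice only real strings and pass everything else through.
guarded = [i[3:] if isinstance(i, str) else i for i in s]
```

The guard brings back some per-element overhead, so it is worth re-timing on your own data before assuming the comprehension is still faster.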

Upvotes: 7

EdChum

Reputation: 394041

Use the vectorised str methods to slice each string entry:

In [11]:
d['Report Number'] = d['Report Number'].str[3:]
d

Out[11]:
     Name Report Number
0  George       1234567
1    Bill       9876543
2   Sally       4434555

Upvotes: 105
