Lui Hellesoe
Lui Hellesoe

Reputation: 384

Removing trailing numbers inside a column pandas

I have a column with 25k + records containing similar information in the example data frame.

Example data frame:

df = pd.DataFrame({'Address': ['Corner Great North Rd & Pipiwai Rd RD 6 Whangarei 2104',
                               '2305/142 Shakespeare Road Takapuna North Shore 0622',
                              '29 Stilwell Rd Mt Albert Auckland', '10/70 Atkinson Ave Otahuhu Auckland 1062']})

I'm trying to remove the trailing numbers in each record so ideally I would get a column that contains:

Address
--------
Corner Great North Rd & Pipiwai Rd RD 6 Whangarei
2305/142 Shakespeare Road Takapuna North Shore
29 Stilwell Rd Mt Albert Auckland
10/70 Atkinson Ave Otahuhu Auckland

I've tried using regex to remove all the characters at the end of string until it hits the first white space.

pattern = re.sub("(.*\s).*", '\\1', str)

df['Address'] = df.str.replace(pattern, '', regex=True)

This throws a TypeError exception which I think is caused from the numbers in the string. However I believe this code could also delete any trailing words deleting information I want to keep.

My question: Is there a regex pattern that could be applied for the entire column?

Upvotes: 0

Views: 718

Answers (2)

sophocles
sophocles

Reputation: 13821

Or you can use rstrip() and remove the trailing digits:

df['Address'] = df['Address'].str.rstrip(' 0123456789') # removed an additional trailing space

Prints:

df
Out[196]: 
                                             Address
0  Corner Great North Rd & Pipiwai Rd RD 6 Whanga...
1    2305/142 Shakespeare Road Takapuna North Shore 
2                  29 Stilwell Rd Mt Albert Auckland
3                10/70 Atkinson Ave Otahuhu Auckland 

Upvotes: 1

Tim Biegeleisen
Tim Biegeleisen

Reputation: 520968

Using str.replace:

df['Address'] = df['Address'].str.replace(r'\s+\d+$', '')

The regex pattern \s+\d+$ will match one or more whitespace characters followed by a number trailing to the end of the address.

Upvotes: 2

Related Questions