Mauro Del Nook

Reputation: 114

Removing everything inside square brackets in a pandas Series or DataFrame

Good day. Is it possible to remove everything inside square brackets, including the brackets themselves? Thanks in advance.

import pandas as pd
df = pd.DataFrame({'City': ['Santiago [1]','Madrid [2]','Barcelona [2]']})
df

            City
0   Santiago [1]
1     Madrid [2]
2  Barcelona [2]

Desired output:

        City
0   Santiago
1     Madrid
2  Barcelona

Upvotes: 0

Views: 201

Answers (4)

BENY

Reputation: 323266

Use split + strip

df.City=df.City.str.split('[').str[0].str.strip()
df
        City
0   Santiago
1     Madrid
2  Barcelona

Upvotes: 2

Allen Qin

Reputation: 19947

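This should work with one or more [xxx] groups appearing anywhere in your string. The non-greedy .*? keeps each match from running past the nearest closing bracket, and strip() removes the whitespace left behind.

df.City.str.split(r'\[.*?\]').str.join('').str.strip()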
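
Whether the pattern is greedy matters once a string holds more than one bracketed group. Here is a minimal sketch of the difference using only the standard re module; the sample string is made up for illustration and is not from the question.

```python
import re

# Hypothetical value with two bracketed groups.
text = "Santiago [1] de Chile [2]"

# Greedy: .* runs from the first '[' to the last ']',
# so the text between the two groups is lost as well.
greedy = "".join(re.split(r"\[.*\]", text))

# Non-greedy: .*? stops at the nearest ']',
# so each group is removed on its own.
lazy = "".join(re.split(r"\[.*?\]", text))

print(repr(greedy))  # 'Santiago '
print(repr(lazy))    # 'Santiago  de Chile '
```

Either way some whitespace is left behind, so a final strip (or a whitespace cleanup) is still likely to be needed.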

Upvotes: 0

unfussygarlic

Reputation: 153

YOBEN_S's answer is perfect. I am just adding an alternative that doesn't need strip(): calling split() with no arguments splits the string on whitespace, so the first piece is the city name.

df.City=df.City.str.split().str[0]
df
        City
0   Santiago
1     Madrid
2  Barcelona

EDIT: As Nick commented, this doesn't work for city names that contain whitespace. Here's an alternative that still splits on whitespace but drops only the last piece:

df = pd.DataFrame({'City': ['Santiago [1]','Madrid [2]','Barcelona [2]','New York [2]','India and China [10]']})
df.City=df.City.apply(lambda x : " ".join(x.split()[:-1]))
df
              City
0         Santiago
1           Madrid
2        Barcelona
3         New York
4  India and China
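
The same idea can probably be expressed without apply, using the vectorized str.rsplit. This is only a sketch, and it assumes every value ends with a space followed by a single bracketed token; a value without that suffix would lose its last word.

```python
import pandas as pd

df = pd.DataFrame({'City': ['Santiago [1]', 'New York [2]', 'India and China [10]']})

# Split once from the right on a space and keep the left part,
# i.e. drop only the final space-separated token.
df['City'] = df['City'].str.rsplit(' ', n=1).str[0]

print(df['City'].tolist())  # ['Santiago', 'New York', 'India and China']
```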

Upvotes: 1

sammywemmy

Reputation: 28644

Regex could work here as well: capture all characters before the last [ with a lookahead.

df['extract'] = df.City.str.extract(r'(.*(?=\[))')

            City    extract
0   Santiago [1]   Santiago
1     Madrid [2]     Madrid
2  Barcelona [2]  Barcelona
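
Note that the lookahead pattern keeps the space in front of the bracket, so the extracted values end with a trailing space. A pattern-based replace is one way to avoid that. The following is a sketch rather than a definitive solution; the extra 'New York [2]' row is added here only to show that names containing spaces survive.

```python
import pandas as pd

df = pd.DataFrame({'City': ['Santiago [1]', 'Madrid [2]', 'Barcelona [2]', 'New York [2]']})

# Remove each [...] group together with any whitespace in front of it,
# then trim whatever whitespace remains at the ends.
df['City'] = df['City'].str.replace(r'\s*\[[^\]]*\]', '', regex=True).str.strip()

print(df['City'].tolist())  # ['Santiago', 'Madrid', 'Barcelona', 'New York']
```

Passing regex=True explicitly is deliberate: the default for str.replace changed between pandas versions, so spelling it out keeps the behaviour the same on both.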

Upvotes: 0
