Pandas - How to groupby special partial string

Question

I have a .csv like the following:

Population  Region
1001        Rigolet (N.L.)
2000        Nain (N.L.)
3000        Lot 63 (P.E.I.)
4000        Lot 53 (P.E.I.)
5000        Burnt Islands (N.L.)
6000        Burgeo (N.L.)
7000        Ham-Nord (Que.)
8000        Chesterville (Que.)
1000        Warwick (Que.)
9000        Prince (Ont.)
1002        Wawa (Ont.)

I'd like to group by the trailing part in parentheses of each string in the Region column, such as '(N.L.)' or '(Ont.)'.

How could I do this?

Thanks a lot!

Erfan · Accepted Answer

Use Series.str.rsplit with n=1 so each string is split once, on the first whitespace from the right, and take the last piece with .str[-1]. Then group by these values:

grps = df['Region'].str.rsplit(n=1).str[-1]
df.groupby(grps)['Population'].sum()  # or any other aggregation

When we print grps:

print(grps)
0       (N.L.)
1       (N.L.)
2     (P.E.I.)
3     (P.E.I.)
4       (N.L.)
5       (N.L.)
6       (Que.)
7       (Que.)
8       (Que.)
9       (Ont.)
10      (Ont.)
Name: Region, dtype: object
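
To see the whole thing end to end, here is a minimal sketch that rebuilds the sample data from the question in memory instead of reading the .csv. Summing Population is only an assumed example of the aggregation, since the question does not say what should be computed per group, and the name totals is made up for illustration:

```python
import pandas as pd

# Sample data copied from the question (built inline instead of read from the .csv)
df = pd.DataFrame({
    "Population": [1001, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 1000, 9000, 1002],
    "Region": [
        "Rigolet (N.L.)", "Nain (N.L.)", "Lot 63 (P.E.I.)", "Lot 53 (P.E.I.)",
        "Burnt Islands (N.L.)", "Burgeo (N.L.)", "Ham-Nord (Que.)",
        "Chesterville (Que.)", "Warwick (Que.)", "Prince (Ont.)", "Wawa (Ont.)",
    ],
})

# Split once from the right on whitespace and keep the last piece, e.g. "(N.L.)"
grps = df["Region"].str.rsplit(n=1).str[-1]

# Assumed aggregation: total population per parenthesized suffix
totals = df.groupby(grps)["Population"].sum()
print(totals)
# Region
# (N.L.)      14001
# (Ont.)      10002
# (P.E.I.)     7000
# (Que.)      16000
# Name: Population, dtype: int64
```

This relies on the suffix itself containing no spaces, which holds for every row in the sample. If you want the group labels without the parentheses, chaining .str.strip("()") onto grps should do it.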
