Reputation: 335
I'm grouping a list of transactions by UK Postcode, but I only want to group by the first part of the postcode. So, UK post codes are in two parts, outward and inward, separated by a [space]. e.g. W1 5DA.
subtotals = df.groupby('Postcode').count()
Is the way I'm doing it now, the way I've thought about doing it at the moment is adding another column to the DataFrame with just the first word of the Postcode column, and then grouping by that... but I'm wondering if there's any easier way to do it.
Thank you
Upvotes: 2
Views: 2370
Reputation: 863511
I think you need groupby
by Series
created by split
by first space:
subtotals = df.groupby(df['Postcode'].str.split().str[0]).count()
Sample:
df = pd.DataFrame({'Postcode' :['W1 5DA','W1 5DA','W2 5DA']})
print (df)
Postcode
0 W1 5DA
1 W1 5DA
2 W2 5DA
print (df['Postcode'].str.split().str[0])
0 W1
1 W1
2 W2
Name: Postcode, dtype: object
subtotals = df.groupby(df['Postcode'].str.split().str[0]).count()
print (subtotals)
Postcode
Postcode
W1 2
W2 1
Check also What is the difference between size and count in pandas?
Upvotes: 4