Reputation: 867
I have two pandas dataframes and I want to get some unique row counts from one dataframe (responses) as column values in the other dataframe (contacts).
import pandas as pd
contacts = pd.read_csv('contacts.csv', encoding='ISO-8859-1')
responses = pd.read_csv('campaign_responses.csv', encoding='ISO-8859-1')
contacts.head()
           contact_id  job_title  country  Email  Webinar
0  0031B00002cPLuFQAW    manager       US      0        0
1  0031B00002Z2zMYQAZ      admin       UK      0        0
2  003a000001nHioCAAS    manager       DE      0        0
Note: Email and Webinar will be 0 for all rows. They're placeholder values for the moment.
responses.head()
   campaign_type          contact_id
0          Email  0031B00002cPLuFQAW
1        Webinar  0031B00002Z2zMYQAZ
2        Webinar  0031B00002cPLuFQAW
3        Webinar  0031B00002cPLuFQAW
4          Email  003a000001nHioCAAS
5          Email  003a000001nHioCAAS
I'd like to get a count of how many times each contact has responded to each campaign type as an attribute in the contacts data frame.
The final contacts data frame should look like this (based on the data above):
           contact_id  job_title  country  Email  Webinar
0  0031B00002cPLuFQAW    manager       US      1        2
1  0031B00002Z2zMYQAZ      admin       UK      0        1
2  003a000001nHioCAAS    manager       DE      2        0
Upvotes: 1
Views: 72
Reputation: 1509
Short and simple:
responses.groupby(['contact_id', 'campaign_type']).size().unstack('campaign_type', fill_value=0)
Edit: neither short nor simple, see other answer.
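For reference, here is a minimal sketch of the groupby approach run against the sample rows from the question. The inline DataFrame is an assumption standing in for campaign_responses.csv, and the name counts is only illustrative.

```python
import pandas as pd

# Sample rows copied from the question's responses.head() output
# (assumed stand-in for campaign_responses.csv).
responses = pd.DataFrame({
    "campaign_type": ["Email", "Webinar", "Webinar", "Webinar", "Email", "Email"],
    "contact_id": [
        "0031B00002cPLuFQAW", "0031B00002Z2zMYQAZ", "0031B00002cPLuFQAW",
        "0031B00002cPLuFQAW", "003a000001nHioCAAS", "003a000001nHioCAAS",
    ],
})

# Count rows per (contact_id, campaign_type) pair, then pivot the
# campaign_type level into columns, filling missing pairs with 0.
counts = (
    responses.groupby(["contact_id", "campaign_type"])
    .size()
    .unstack("campaign_type", fill_value=0)
)
print(counts)
```

Note that the level passed to unstack has to match one of the groupby keys, here campaign_type.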
Upvotes: 1
Reputation: 323286
Seems like you need
pd.crosstab(responses.contact_id, responses.campaign_type)
Out[37]:
campaign_type       Email  Webinar
contact_id
0031B00002Z2zMYQAZ      0        1
0031B00002cPLuFQAW      1        2
003a000001nHioCAAS      2        0
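To get these counts into contacts as the question asks, one option is to join the crosstab back on contact_id. The following is only a sketch: the inline DataFrames are assumptions that mirror the question's sample data, and dropping the placeholder Email and Webinar columns before the join is a design choice made here, not something stated in the question.

```python
import pandas as pd

# Sample data mirroring the question (assumed stand-ins for the CSV files).
contacts = pd.DataFrame({
    "contact_id": ["0031B00002cPLuFQAW", "0031B00002Z2zMYQAZ", "003a000001nHioCAAS"],
    "job_title": ["manager", "admin", "manager"],
    "country": ["US", "UK", "DE"],
    "Email": [0, 0, 0],
    "Webinar": [0, 0, 0],
})
responses = pd.DataFrame({
    "campaign_type": ["Email", "Webinar", "Webinar", "Webinar", "Email", "Email"],
    "contact_id": [
        "0031B00002cPLuFQAW", "0031B00002Z2zMYQAZ", "0031B00002cPLuFQAW",
        "0031B00002cPLuFQAW", "003a000001nHioCAAS", "003a000001nHioCAAS",
    ],
})

# One row per contact_id, one column per campaign_type.
counts = pd.crosstab(responses["contact_id"], responses["campaign_type"])

# Drop the placeholder columns so the join does not hit a name collision,
# then left-join the counts onto contacts via the contact_id column.
result = contacts.drop(columns=["Email", "Webinar"]).join(counts, on="contact_id")

# Contacts with no responses at all would get NaN from the left join;
# turn those into 0 and restore integer dtype.
count_cols = list(counts.columns)
result[count_cols] = result[count_cols].fillna(0).astype(int)
print(result)
```

A left join keeps every row of contacts, including contacts that never appear in responses.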
Upvotes: 4