Reputation: 13
I have a data frame that is similar to the following:
                  Time Account_ID Device_ID  Zip_Code
0  2011-02-02 12:02:19     ABC123    A12345     83420
1  2011-02-02 13:35:12     EFG456    B98765     37865
2  2011-02-02 13:54:57     EFG456    B98765     37865
3  2011-02-02 14:45:20     EFG456    C24568     37865
4  2011-02-02 15:08:58     ABC123    A12345     83420
5  2011-02-02 15:25:17     HIJ789    G97352     97452
How do I make a plot with the number of unique account IDs on the y-axis and the number of unique device IDs associated with a single account ID on the x-axis?
In this example, the "1" bin on the x-axis would have a height of 2, since accounts "ABC123" and "HIJ789" each have only one unique device ID. The "2" bin would have a height of 1, since account "EFG456" has two unique device IDs associated with it.
EDIT
This is the output I got from trying:
df.groupby("Account_ID")["Device_ID"].nunique().value_counts().plot.bar()
Upvotes: 1
Views: 267
Reputation: 3664
You can combine groupby, nunique, and value_counts like this:
df.groupby("Account_ID")["Device_ID"].nunique().value_counts().plot.bar()
Edit: Code used to recreate your data:
import pandas as pd

df = pd.DataFrame({'Time': {0: '2011-02-02 12:02:19', 1: '2011-02-02 13:35:12', 2: '2011-02-02 13:54:57',
                            3: '2011-02-02 14:45:20', 4: '2011-02-02 15:08:58', 5: '2011-02-02 15:25:17'},
                   'Account_ID': {0: 'ABC123', 1: 'EFG456', 2: 'EFG456', 3: 'EFG456', 4: 'ABC123', 5: 'HIJ789'},
                   'Device_ID': {0: 'A12345', 1: 'B98765', 2: 'B98765', 3: 'C24568', 4: 'A12345', 5: 'G97352'},
                   'Zip_Code': {0: 83420, 1: 37865, 2: 37865, 3: 37865, 4: 83420, 5: 97452}})
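To see what each step of the chained one-liner produces, here is a minimal sketch that breaks it apart on the sample data. The variable names `devices_per_account` and `counts` are just illustrative, and the extra `sort_index()` is an addition that is not in the one-liner above: `value_counts` orders its result by frequency, so sorting by index keeps the bins in numeric order.

```python
import pandas as pd

# Only the two columns needed for the count (sample data from the question).
df = pd.DataFrame({
    "Account_ID": ["ABC123", "EFG456", "EFG456", "EFG456", "ABC123", "HIJ789"],
    "Device_ID": ["A12345", "B98765", "B98765", "C24568", "A12345", "G97352"],
})

# Step 1: number of distinct devices seen for each account.
devices_per_account = df.groupby("Account_ID")["Device_ID"].nunique()

# Step 2: how many accounts have 1 device, 2 devices, and so on.
# sort_index() orders the bins by device count rather than by frequency.
counts = devices_per_account.value_counts().sort_index()

print(devices_per_account)
print(counts)
```

Calling `counts.plot.bar()` on the result should then draw the "1" bin with height 2 and the "2" bin with height 1 (plotting requires matplotlib to be installed).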
Upvotes: 1