Reputation: 105
I have a data frame column whose cells each hold an array, and I need to count the elements in each array.
This is the code I'm using:
indi = data_1.query("'2016-11-22' <= login_date <= '2016-12-22'").groupby(['employer_key','account_id','login_date']).count().reset_index()
indi_1 = indi.groupby(['employer_key']).account_id.unique().reset_index()
indi_1
which gives me this:
employer_key account_id
0 boeing [17008601, 17008645, 17008698, 17008952, 17009...]
1 dell_inc [10892711, 10892747, 10894032, 10894676, 10894...]
2 google [9215462, 9216605, 9217052, 9218693, 9222937, ...]
3 sprint_corporation [9858036, 9858809, 9859191, 9859350, 9859498, ...]
4 walmart [2515412, 2517367, 2519765, 2520049, 2526763, ...]
I want to replace each array with the number of account IDs it contains, so the result looks like this:
employer_key account_id
0 boeing 5000
1 dell_inc 289
2 google 789
3 sprint_corporation 154670
4 walmart 4689
How can I do this? I'm using pandas. I'm also very new to Python, so the simpler the better.
Upvotes: 0
Views: 286
Reputation: 215047
If the account_id column contains lists or arrays (groupby(...).unique() produces NumPy arrays), you can use str.len()
to calculate the number of elements in each cell. Here df stands for your indi_1:
df['account_id_count'] = df.account_id.str.len()
df
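As a rough end-to-end sketch of the above: the small data frame below is made-up sample data standing in for data_1 (assumed to be already filtered by date), and the names sample and counts are hypothetical. It also shows nunique() as a possible shortcut that skips building the arrays altogether; that is an alternative, not part of the original suggestion.

```python
import pandas as pd

# Made-up sample rows standing in for the already date-filtered data_1.
sample = pd.DataFrame({
    "employer_key": ["boeing", "boeing", "boeing", "google", "google", "google"],
    "account_id": [1, 1, 2, 3, 4, 5],
})

# Same shape as indi_1 in the question: one array of unique IDs per employer.
indi_1 = sample.groupby("employer_key").account_id.unique().reset_index()

# str.len() returns the length of each cell, which here is the array length.
indi_1["account_id_count"] = indi_1.account_id.str.len()
print(indi_1)

# Alternative sketch: count distinct IDs directly, without the intermediate arrays.
counts = sample.groupby("employer_key").account_id.nunique().reset_index()
print(counts)
```

If only the counts are needed, the nunique() version may be the simpler of the two, since it produces the employer_key / account_id layout from the question in one step.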
Upvotes: 2