Reputation: 15
I have the following data frame:
Sport | AgeGroup |
---|---|
Baseball | 20s |
Football | 20s |
Baseball | 30s |
Baseball | 20s |
Football | 20s |
Football | 20s |
Football | 30s |
And the goal here is to get the counts between the two columns such that it gets the counts of each column into a list of lists in the following format: [['baseball', '20s', 2], ['baseball', '30s', 1], ['football', '20s', 3], ['football', '30s', 1]]
where each list consists of the format [sport, ageGroup, count where both exist in the table].
Thus far, I've gotten what I needed in a way by doing: sport_age_count = df.groupby(["sport", "ageGroup"]).size()
. The problem here though is that the result isn't indexable and only treats the count like a singular list. I get the following result with the simple code I have above:
Sport | AgeGroup | |
---|---|---|
Baseball | 20s | 2 |
30s | 1 | |
Football | 20s | 3 |
30s | 1 |
The problem is, when I try to use baseball, football, 20s, or 30s, it won't let me. I can only access the counts. Also, besides the fact, I'd really like this to be formatted into the bold list of lists like above as well. I wasn't sure if there's any alternative python tricks out there besides using extensive for loops to build this list myself.
Upvotes: 1
Views: 51
Reputation: 195458
Another solution, using .agg()
:
print(
df.groupby(["Sport", "AgeGroup"], as_index=False)
.size()
.agg(list, axis=1)
.tolist()
)
Prints:
[['Baseball', '20s', 2], ['Baseball', '30s', 1], ['Football', '20s', 3], ['Football', '30s', 1]]
Upvotes: 2
Reputation: 5955
If you want an array (makes for easier indexing) you can reset the index and use to_numpy()
after your groupby
sport_age_count.reset_index().to_numpy()
array([['Baseball', '20s', 2],
['Baseball', '30s', 1],
['Football', '20s', 3],
['Football', '30s', 1]], dtype=object)
But if you want a list of lists, you can use numpy's tolist()
sport_age_count.reset_index().to_numpy().tolist()
[['Baseball', '20s', 2],
['Baseball', '30s', 1],
['Football', '20s', 3],
['Football', '30s', 1]]
Upvotes: 1