Reputation: 45
I'm trying to perform a cross tab query on a Pandas data frame. It looks something like this:
| Category | Score1 | Score2 | Score3 | Score4
| 1 | 2 | 3 | 5 | 12
| 1 | 3 | 4 | 2 | 5
| 2 | 1 | 6 | 8 | 6
I want to create a cross tab query that get's me something like this, with category as a column and the scores as rows:
Category| 1 | 2 | 3 |
Score 1 | | | |
Score 2 | | | |
Score 3 | | | |
The values in the blank spaces will be the averages for each category for each score.
The existing data file is a csv that is read into a Pandas DataFrame. I looked at the pd.crosstab()
function, but I can't seem to tie it in to the dataframe. Also it has a parameter for aggfunc
, which I set (using numpy) aggfunc=np.average
, however I can't figure out the value
parameter.
I'm I missing something, or should I be using Pandas at all for this?
Upvotes: 1
Views: 3821
Reputation: 29635
A solution with pd.crosstab
probably exists, but you can also perform what you want with groupby
, mean
and T
(transpose) such as:
df_output = df.groupby('Category').mean().T
and you get:
Category 1 2
Score1 2.5 1.0
Score2 3.5 6.0
Score3 3.5 8.0
Score4 8.5 6.0
Upvotes: 2