Techman224
Techman224

Reputation: 45

Pandas crosstab averages

I'm trying to perform a cross tab query on a Pandas data frame. It looks something like this:

| Category | Score1 | Score2 | Score3 | Score4
| 1        | 2      | 3      | 5      | 12
| 1        | 3      | 4      | 2      | 5
| 2        | 1      | 6      | 8      | 6

I want to create a cross tab query that get's me something like this, with category as a column and the scores as rows:

Category| 1        | 2      | 3      |        
Score 1 |          |        |        |    
Score 2 |          |        |        |      
Score 3 |          |        |        |

The values in the blank spaces will be the averages for each category for each score.

The existing data file is a csv that is read into a Pandas DataFrame. I looked at the pd.crosstab() function, but I can't seem to tie it in to the dataframe. Also it has a parameter for aggfunc, which I set (using numpy) aggfunc=np.average, however I can't figure out the value parameter.

I'm I missing something, or should I be using Pandas at all for this?

Upvotes: 1

Views: 3821

Answers (1)

Ben.T
Ben.T

Reputation: 29635

A solution with pd.crosstab probably exists, but you can also perform what you want with groupby, mean and T (transpose) such as:

df_output = df.groupby('Category').mean().T

and you get:

Category    1    2
Score1    2.5  1.0
Score2    3.5  6.0
Score3    3.5  8.0
Score4    8.5  6.0

Upvotes: 2

Related Questions