Pivot table in pandas to count unique values

Question

I have a dataframe of English words with French translations, here is a simplified version:

English	French
to provide	fournir
to provide	fournir
to provide	offrir, fournir
to provide	offrir
to provide	fournir
to provide	offrir

I would like to create a pivot table which counts the occurences of each translation, like so:

English	French translation	Number of times
to provide	fournir	3
to provide	offrir, fournir	1
to provide	offrir	2

I have tried using the code below, but (1) it doesn't retain the French translations, and (2) there is only one row for each English word, whereas I would expect several rows, one for each given French translation.

df_eng_word_trans = pd.pivot_table(df_user_vocabulary,index='english', values='french',aggfunc=pd.Series.nunique)

English	French
to provide	712
to argue	521

In this example from my real data, there is only ONE row for "to provide", whereas I want several, with the count for each different French translation. I have also tried the following code:

df_eng_word_trans = pd.pivot_table(df_user_vocabulary,index=['english','french'], values='french', aggfunc=pd.Series.nunique)

But I get the following error:

ValueError: Grouper for 'french' not 1-dimensional

As a bonus, I also need to expand the French translations where there are two (or more) words and add them to the count, like so, but I think this is stage two:

English	French translation	Number of times
to provide	fournir	4
to provide	offrir	3

Thank you for your help.

Corralien · Accepted Answer

Use value_counts:

>>> df.value_counts(['English', 'French']) \
      .rename('Number of times').reset_index()

      English           French  Number of times
0  to provide          fournir                3
1  to provide           offrir                2
2  to provide  offrir, fournir                1

Pivot table in pandas to count unique values

Answers (1)

Related Questions