pandas summarize by count_values as rows

Question

I am trying to summarize some data by their categories in different columns in a dataframe. This is the data.

         feature1  feature2  featurem
brand1   good      none      good
brand2   bad       good      bad
brand..  none      none      good
brandn   good      none      none

I would like to have a table that shows me how many from [good,bad,none] I have per feature. So that would tell me how many brands have good in feature 1, bad in feature 1 and none in feature 1 and so on.

I know, for instance, that with

df["feature1"].value_counts()

I can get such values for each feature separately, but I would like to add it to a new dataframe where I have all the features. How can I do this?

A resulting table would look like this:

Scott Boston · Accepted Answer

Try this, using apply and passing pd.Series.value_counts:

df = pd.DataFrame({'feature '+str(i):np.random.choice(['Good','Bad','none'], 20) for i in range(1,10)})

df.apply(pd.Series.value_counts)

Output:

      feature 1  feature 2  feature 3  feature 4  feature 5  feature 6  \
Bad           6         12          6         10          6          4   
Good          6          2          8          5          6          9   
none          8          6          6          5          8          7   

      feature 7  feature 8  feature 9  
Bad           3          6          7  
Good          3          6          4  
none         14          8          9

pandas summarize by count_values as rows

Answers (2)

Related Questions