Reputation: 21
I am trying to create a summary table from a dataframe that looks like below example. The columns have a set list of unique values.
tdf = pd.DataFrame({"A": ["ind1", "ind2", "ind1", "ind3", "ind3", "ind1", "ind1"],
"B": ["ind3", "ind1", "ind3", "ind1", "ind1","ind3", "ind2"],
"C": ["ind1","ind1","ind2","ind2","ind3","ind3","ind3"],
"D": ["ind3","ind1","ind2","ind3","ind2","ind1","ind3"],
"E": ["ind1","ind3","ind1","ind1","ind2","ind2","ind2"]})
I'd then need to create a new table-like object that has a header that correspond to the columns and 3 rows with frequency counts of the set index values.
setvalues = ['ind1','ind2','ind3']
result = pd.DataFrame({"A": [4,1,2],
"B": [3,1,3],
"C": [2,2,3],
"D": [2,2,3],
"E": [3,3,1]})
I tried pivot tables but it wasn't returning the required format. In Excel I could just set the index values and do a simple COUNTIF on the columns but I am struggling to implement it in Python.
Upvotes: 0
Views: 153
Reputation: 8865
import pandas as pd
tdf = pd.DataFrame({"A": ["ind1", "ind2", "ind1", "ind3", "ind3", "ind1", "ind1"],
"B": ["ind3", "ind1", "ind3", "ind1", "ind1","ind3", "ind2"],
"C": ["ind1","ind1","ind2","ind2","ind3","ind3","ind3"],
"D": ["ind3","ind1","ind2","ind3","ind2","ind1","ind3"],
"E": ["ind1","ind3","ind1","ind1","ind2","ind2","ind2"]})
full = tdf.apply(pd.value_counts).fillna(0);
print(full)
Upvotes: 1
Reputation: 323366
Here we can using value_counts
tdf.apply(pd.Series.value_counts)
A B C D E
ind1 4 3 2 2 3
ind2 1 1 2 2 3
ind3 2 3 3 3 1
Upvotes: 2