Reputation: 11
Columns of my dataframe are the survey questions and the rows are the responses. The unique response choices were (1 - Strong Disagree, 2 - Disagree, 3 - Neutral, 4- Agree, 5- Strongly Agree). The rows have all the respondents selections and I ideally would like the columns to be the unique response choices with a sum of how many people selected that response per question and the rows to be each question.
Not sure how to get that - any suggestions?
My original data
Tried to transpose which successfully made the questions rows but now I have 100s of unique "rows" for each response
Ultimate goal would be to group-by the questions and have the response choices under each question with the columns as a sum for each response
Upvotes: 1
Views: 716
Reputation: 11
Here is the final result with your code and the group-by! final results!
Upvotes: 0
Reputation: 500
This is a bit quick and dirty but it might help
EDIT Updated to transform the tally into a pandas data frame
Setup example dataframe
df = pd.DataFrame ({ 'question_1' : ['1 - strongly agree','1 - strongly agree','2 - agree'],
'question_2' : ['3 - neutral','2 - agree','2 - agree'],
'question_3' : ['1 - strongly agree','2 - agree','3 - neutral'],
'question_4' : ['4 - disagree','5 - strongly disgree','5 - strongly disgree'],
'question_5' : ['3 - neutral','2 - agree','2 - agree']} )
Get the value_counts() for each column
ls_flat = []
for col in df.columns:
for index in df[col].value_counts().index:
print(col,index, df[col].value_counts()[index])
ls_flat.append([col,index, df[col].value_counts()[index]])
Make this list into a dataframe
df_flat = pd.DataFrame(ls_flat)
Rename columns to something more meaningful
df_flat.columns = ['question', 'response', 'tally']
This will create a dataframe which looks like this
Upvotes: 1