Reputation: 35
I am just getting into regression analysis again and have started practicing. I now have a COVID data set with age group as a categorical variable (say 0-10, 20-30 ...). Another column holds the number of hospitalizations in each age group.
I am trying to run a regression analysis on how age (independent variable) influences hospitalization (dependent variable). Because age group is a categorical variable, the output of the regression analysis is not really insightful.
This is how the dataset looks right now:
If I run a simple LM, I get a wacky output:
I am scratching my head over how to transform this dataset so that the regression gives meaningful insights. For example: is age an influencing variable for hospitalizations? What is the coefficient, e.g. one age group higher, hospitalizations rise by ...?
Thank you very much for any help!
Upvotes: 0
Views: 995
Reputation: 5429
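Try replacing each age group with the mean or the median age in that group, and use that value as a numeric predictor. The regression then estimates a single slope for age instead of one coefficient per category.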
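A minimal sketch of that idea in plain Python, in case it helps. It assumes the group labels look like `0-10` and uses the midpoint of each range as a stand-in for the mean or median age, which is only an approximation when individual ages are not available. The helper names `group_midpoint` and `fit_line` and the numbers are made up for illustration and are not taken from the question's dataset.

```python
def group_midpoint(label):
    """Turn an age-group label such as '20-30' into its numeric midpoint (25.0)."""
    low, high = label.split("-")
    return (float(low) + float(high)) / 2


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up illustrative values, NOT the real data from the question.
age_groups = ["0-10", "10-20", "20-30", "30-40"]
hospitalizations = [10, 20, 30, 40]

# Replace each categorical label with a numeric age, then fit one slope.
ages = [group_midpoint(g) for g in age_groups]
intercept, slope = fit_line(ages, hospitalizations)
print(f"intercept={intercept:.2f}, slope per year of age={slope:.2f}")
```

The slope is then read as the change in hospitalizations per additional year of age; multiplying it by the width of a group gives the change per age group. Whether a straight line is a reasonable fit for the real data is a separate question worth checking with a plot.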
Upvotes: 1