Reputation: 1
I'm trying to fit a binary logistic regression model to estimate the odds of a disease. Here is the original disease outbreak data (there are 196 observations; I have left some of the entries out below):
Column 1: ID (person)
Column 2: Age of the person
Column 3: SES (Socio-economic status of the person) (1=upper class, 2=middle class, 3=lower class)
Column 4: Sect (categorical: two different regions)
Column 5: Y (1=disease, 0=no disease)
Column 6: Savings (1=person has savings, 0=no savings)
1 33 1 1 0 1
2 35 1 1 0 1
3 6 1 1 0 0
...
194 31 3 1 0 0
195 85 3 1 0 1
196 24 2 1 0 0
I tried the following command to fit the binary regression model:
lm1=glm(Y~factor(Age)+factor(SES)+factor(Sect)+factor(Savings),family=binomial("logit"))
summary(lm1)
Not surprisingly, the output is a mess because there are too many age terms (the ages range from 2 to 85). Could someone help me modify the command so that I get age estimates in, for example, 5- or 10-year intervals?
Also, the model above doesn't include any interaction terms. If I wanted to include, say, an SES*Age interaction and still see the age estimates for every 5 or 10 years, how should I write the command?
Upvotes: 0
Views: 3847
Reputation: 19618
Use cut to turn a numeric variable into a factor; see ?cut for more information.
The argument you will probably be interested in is breaks=. If you pass a single number, cut divides the whole range into that many equal-width intervals, as in the example below. You can also pass a vector of numbers to specify exactly where the intervals are split.
data(mtcars)
mydata <- mtcars
# Here I cut the whole numeric range into 10 equal intervals
mydata$myhp <- cut(mydata$hp, 10)
# Here is what the data looks like (first 8 rows):
                   mpg cyl  disp  hp drat    wt  qsec vs am gear carb        myhp
Mazda RX4         21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4   (108,137]
Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4   (108,137]
Datsun 710        22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1  (80.1,108]
Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1   (108,137]
Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2   (165,194]
Valiant           18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1  (80.1,108]
Duster 360        14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4   (222,250]
Merc 240D         24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2 (51.7,80.1]
> str(mydata)
'data.frame': 32 obs. of 12 variables:
$ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
....
$ myhp: Factor w/ 10 levels "(51.7,80.1]",..: 3 3 2 3 5 2 7 1 2 3 ...
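Applied to the setup in the question, a rough sketch might look like the following. I don't have the real data, so the data frame dat is simulated purely for illustration (only the column names come from the question), and AgeGrp is a name I made up. The break points are passed as a vector to get 10-year bands; use by = 5 for 5-year bands.

```r
# Simulated stand-in for the disease data (NOT the real data):
# column names follow the question, values are randomly generated.
set.seed(1)
n <- 196
dat <- data.frame(
  Age     = sample(2:85, n, replace = TRUE),
  SES     = sample(1:3, n, replace = TRUE),
  Sect    = sample(1:2, n, replace = TRUE),
  Savings = sample(0:1, n, replace = TRUE)
)
dat$Y <- rbinom(n, 1, plogis(-2 + 0.03 * dat$Age))

# Explicit break points give 10-year bands: [0,10), [10,20), ..., [80,90).
# right = FALSE closes each interval on the left, so age 10 falls in [10,20).
dat$AgeGrp <- cut(dat$Age, breaks = seq(0, 90, by = 10), right = FALSE)
table(dat$AgeGrp)

# Main-effects model: one coefficient per age band,
# each relative to the first band (the reference level).
fit_main <- glm(Y ~ AgeGrp + factor(SES) + factor(Sect) + factor(Savings),
                family = binomial("logit"), data = dat)
summary(fit_main)

# SES-by-age-band interaction: factor(SES) * AgeGrp expands to both
# main effects plus all SES:AgeGrp interaction terms.
fit_int <- glm(Y ~ factor(SES) * AgeGrp + factor(Sect) + factor(Savings),
               family = binomial("logit"), data = dat)
summary(fit_int)
```

With only 196 observations, the interaction model spreads the data over many SES-by-age cells, so some estimates may come out as NA or very unstable; wider age bands may be more practical there.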
Upvotes: 2