Reputation: 31
I am very new to R and coding in general, so I apologize in advance for anything that may seem silly.
I performed an ANOVA and wanted to run TukeyHSD on my data. At first, it worked fine. Then I created two data sets, each one subset to include just one of the two dose types. The ANOVA still works on these subsets, but TukeyHSD yields this error:
Error in `[.data.frame`(mf, mf.cols[[i]]) : undefined columns selected
What does that mean? I checked the column names in my newly created data sets and they are all present.
Thank you so much!!
Here is the dataset I created and the error I received.
df1 <- Flor_Group_1_2019_EC[Flor_Group_1_2019_EC$Dose=="IM", ]
df2 <- Flor_Group_1_2019_EC[Flor_Group_1_2019_EC$Dose=="SC", ]
aov1 = aov(`CFU/g`~Treatment+`Time Point`, data=df1)
summary(aov1)
              Df    Sum Sq   Mean Sq F value   Pr(>F)
Treatment      3 3.068e+15 1.023e+15   7.774 7.98e-05 ***
`Time Point`  16 2.065e+16 1.291e+15   9.810 7.20e-16 ***
Residuals    134 1.763e+16 1.316e+14
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
1 observation deleted due to missingness
TukeyHSD(aov1)
Error in `[.data.frame`(mf, mf.cols[[i]]) : undefined columns selected
colnames(df1)
[1] "Steer" "Dose" "Time Point" "Treatment" "Average"
[6] "CFU/g" "Log"
Upvotes: 3
Views: 4007
Reputation: 13309
After going through some old source code, I figured out that this has to do with the column names in the data set.
In my data, as in the data in this post, the column names are not syntactically valid R names: `CFU/g` contains a slash and `Time Point` contains a space. You can quote such names with backticks (``), but backticked names do not always work well with base R functions. The solution is to rename the columns as follows:
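# base R
names(df1) <- gsub("CFU/g", "CFU", names(df1), fixed = TRUE)
names(df1) <- gsub("Time Point", "time", names(df1), fixed = TRUE)
# tidyverse (an alternative to the two lines above, not an extra step);
# assign the result back, otherwise df1 keeps its old names
df1 <- dplyr::rename(df1, CFU = `CFU/g`,
                     time = `Time Point`)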
You can then rebuild your models and redo the TukeyHSD:
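df1 <- Flor_Group_1_2019_EC[Flor_Group_1_2019_EC$Dose=="IM", ]
df2 <- Flor_Group_1_2019_EC[Flor_Group_1_2019_EC$Dose=="SC", ]
# Rename after subsetting: re-creating df1 from the original data brings back the old names
names(df1)[names(df1) == "CFU/g"] <- "CFU"
names(df1)[names(df1) == "Time Point"] <- "time"
aov1 <- aov(CFU ~ Treatment + time, data = df1)
TukeyHSD(aov1)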
NOTE: I cannot provide a reproducible example because I cannot readily create an example dataset. I did, however, solve this issue as described in this answer.
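For readers who want to try the rename-then-refit pattern without the original data, here is a minimal sketch on made-up data. The `toy` data frame, its factor levels, and its random values are invented for illustration and are not the asker's data. The sketch only shows the pattern of renaming and refitting; it is not claimed to reproduce the original error, which may depend on the R version.

```r
# Hypothetical toy data (assumption: not the asker's data set)
set.seed(1)
toy <- data.frame(
  Treatment = factor(rep(c("A", "B", "C", "D"), each = 6)),
  tp        = factor(rep(c("d0", "d1", "d2"), times = 8)),
  y         = rnorm(24)
)

# Give the columns the same kind of non-syntactic names as in the question
names(toy) <- c("Treatment", "Time Point", "CFU/g")

# Rename to syntactic names by exact match (no regex escaping needed)
names(toy)[names(toy) == "CFU/g"]      <- "CFU"
names(toy)[names(toy) == "Time Point"] <- "time"

# Refit with the plain names, then run the post-hoc test
fit <- aov(CFU ~ Treatment + time, data = toy)
tk  <- TukeyHSD(fit)
print(tk)
```

Matching names exactly with `==` avoids having to think about regular-expression metacharacters, which can matter for names containing characters such as `.` or `(`.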
Upvotes: 5