Reputation: 101
I'm wondering if there is a way to correct for multiple comparisons using lsmeans that will allow me to correct for only a certain number of comparisons? I have found [R lsmeans adjust multiple comparison, but I don't think this gets at what I need.
Basically, the best model to explain my data is a complete model, and so I'm looking at interactions between three factors (year, river and sex) (there are four rivers and 4 different years). I could just do the following (TL is total length, and is a response variable) TL_interact = lm(TL ~ YearRiverSex, data = Walleye_alldata_nu) followed by the lsmeans turtle = lsmeans(possum, specs = pairwise~PeriodRiverSex, adjust = "FDR")
But this corrects for 496 comparisons. The actual number of comparisons that I am interested in is 56. These can be achieved by using the slightly altered code
TL_lsmeans1of2 = lsmeans(TL_interact, specs = pairwise~Year|Sex*River, adjust = "FDR")
then using summary so that I can manipulate this data as a dataframe TL_lsmeans1of2_df = summary(TL_lsmeans1of2$contrasts)
a sample of the output from this is
Sex = F, River = Chalifour:
contrast estimate SE df t.ratio p.value
Y2002/03 - Y2015 74.1746032 18.994645 1497 3.905 0.0006
Y2002/03 - Y2016 33.4750958 16.963837 1497 1.973 0.0730
Y2002/03 - Y2017 45.7222222 19.604254 1497 2.332 0.0396
Y2015 - Y2016 -40.6995074 14.468508 1497 -2.813 0.0149
Y2015 - Y2017 -28.4523810 17.489789 1497 -1.627 0.1248
Y2016 - Y2017 12.2471264 15.260011 1497 0.803 0.4224
Sex = M, River = Chalifour:
contrast estimate SE df t.ratio p.value
Y2002/03 - Y2015 49.4788034 6.054656 1497 8.172 <.0001
Y2002/03 - Y2016 36.4539394 5.893992 1497 6.185 <.0001
Y2002/03 - Y2017 55.7266667 6.352504 1497 8.772 <.0001
Y2015 - Y2016 -13.0248640 5.645109 1497 -2.307 0.0254
Y2015 - Y2017 6.2478632 6.122289 1497 1.021 0.3077
Y2016 - Y2017 19.2727273 5.963447 1497 3.232 0.0019
and then, selecting certain rows of the output for TL_lsmeans2of2 = lsmeans(TL_interact, specs = pairwise~Year*Sex|River, adjust = "FDR")
I won't talk about this second lsmeans output b/c the same solution to my question will be applied there. However, with the lines that I select from this second lsmeans statement, together with the results of the first lsmeans statement, I have 56 comparisons.
I tried to adjust the p-value using the t.ratio and the degrees of freedom using the pt command using TL_lsmeans1of2_df$p.val.adjusted = pt(TL_lsmeans1of2_df$t.ratio, TL_lsmeans1of2_df$df)
But the results are just wacky
Sex = F, River = Chalifour:
contrast estimate SE df t.ratio p.value p.val.adjusted
Y2002/03 - Y2015 74.1746032 18.994645 1497 3.905 0.0006 0.999950797
Y2002/03 - Y2016 33.4750958 16.963837 1497 1.973 0.0730 0.975678658
Y2002/03 - Y2017 45.7222222 19.604254 1497 2.332 0.0396 0.990090497
Y2015 - Y2016 -40.6995074 14.468508 1497 -2.813 0.0149 0.002486354
Y2015 - Y2017 -28.4523810 17.489789 1497 -1.627 0.1248 0.051995069
Y2016 - Y2017 12.2471264 15.260011 1497 0.803 0.4224 0.788822739
Any thoughts about how I could do get a new p-value using an appropriate correction for 56 comparisons would be greatly appreciated. And, if you think it is better for me to code my contrasts differently (I saw this post using 0s and 1s [https://rcompanion.org/rcompanion/h_01.html], I would also gladly take any suggestions on how to approach this.
Thank you, Ella Bowles
Upvotes: 0
Views: 405
Reputation: 327
From a strict programming perspective you can compute all the 496 contrast with no correction, then subset the results selecting just the results you are interested in, and finally correct the p-values using the p.adjust
function in base R using FDR.
Anyway, from a statistical perspective, if you are within an ANOVA framework, you should think about the legitimacy of post-hoc testing just a subset of the contrast the constitute your significant interaction.
Upvotes: 1