Reputation: 7
Could you please help me perform a Kruskal-Wallis test on a subset of my data? I would like to test for differences in "N" between "Producers".
names(Isotope.Data)
[1] "Species" "Name" "Group" "Simple_Group" "Trophic_Group"
[6] "Sample" "N" "C"
In my CSV file I have a column "Trophic_Group" which separates Consumers and Producers.
table(Isotope.Data$Trophic_Group)
Consumer Producers
61 18
Under the column heading Simple_Group, I have three Producers: Rhodophyta, Seagrass and Phaeophyceae.
table(Isotope.Data$Simple_Group)
Abalone Loliginidae Octopus Phaeophyceae Rhodophyta Seagrass Teleost Tunicate
24 2 12 6 9 3 20 3
I have tried numerous things, but I get various error messages. Would anyone be able to improve on the following code?
kruskal.test(C ~ Simple_Group, data = Isotope.Data, subset = Isotope.Data$Trophic_Group = "Producers")
P.S. I have created a separate CSV file which only includes Primary Producers. However, a subsequent Dunn test of multiple comparisons, used to determine which levels differ from each other, gives different significance levels from the one run on the file that includes both Consumers and Producers.
Upvotes: 0
Views: 2535
Reputation: 23
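You can also use the map() function from the package purrr to apply a function to each group once the data has been split (group_split() and %>% come from dplyr):
library(dplyr)
library(purrr)
# pass each piece as data; the formula must be the first argument
test <- df %>% group_split(phase) %>% map(~ kruskal.test(val ~ distance, data = .x))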
test
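If only the p-values are needed, the same idea can be sketched with map_dbl(). This is a minimal example under assumptions: the data frame is a made-up stand-in with the same columns (val, distance, phase) as the dummy data in the other answer, and base split() is used instead of group_split() so that the result keeps the group names.

```r
library(purrr)

# Made-up example data: three phases, two distance classes in each phase
set.seed(1)
df <- data.frame(
  val      = runif(60, min = 0, max = 100),
  distance = rep(1:2, each = 3, length.out = 60),
  phase    = rep(c("a", "b", "c"), 20)
)

# One Kruskal-Wallis test per phase; keep only the p-value of each test
p_values <- map_dbl(
  split(df, df$phase),
  ~ kruskal.test(val ~ distance, data = .x)$p.value
)
p_values
```

The result is a numeric vector named by phase, which is easier to inspect or adjust for multiple testing than a list of full test objects.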
Upvotes: 1
Reputation: 4080
Maybe this answer will be helpful? It is based on @user295691's answer:
Kruskal-Wallis test: create lapply function to subset data.frame?
Here you identify the individual groups within which you want to test for differences, and use split (or the subset argument) to restrict the test to the right rows of your data frame.
Dummy example:
# create data
val<-runif(60, min = 0, max = 100)
distance<-floor(runif(60, min=1, max=3))
phase<-rep(c("a", "b", "c"), 20)
df<-data.frame(val, distance, phase)
# get unique groups
ii<-unique(df$phase)
# run Kruskal test, specify the subset
kruskal.test(val ~ distance, data = df,
subset = phase == "c")
And now apply the kruskal.test to each group using split:
lapply(split(df, df$phase), function(d) { kruskal.test(val ~ distance, data=d) })
or create a function:
lapply(ii, function(i) { kruskal.test(df$val ~ df$distance, subset=df$phase==i )})
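Both produce test results for each group. The output below comes from the second form (hence the df$val by df$distance labels), and the exact numbers will vary because the data are random: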
[[1]]
Kruskal-Wallis rank sum test
data: df$val by df$distance
Kruskal-Wallis chi-squared = 0.14881, df = 1, p-value = 0.6997
[[2]]
Kruskal-Wallis rank sum test
data: df$val by df$distance
Kruskal-Wallis chi-squared = 0.11688, df = 1, p-value = 0.7324
[[3]]
Kruskal-Wallis rank sum test
data: df$val by df$distance
Kruskal-Wallis chi-squared = 0.0059524, df = 1, p-value = 0.9385
Or just get the p-values (notice the addition of $p.value after the kruskal.test):
lapply(ii, function(i) {
kruskal.test(df$val ~ df$distance,
subset=df$phase==i )$p.value
}
)
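Applied to the question itself, no splitting is needed, because only one subset (the Producers) is tested. Note that the call in the question uses a single = where the comparison operator == is required, and it tests C although the text asks about N. Below is a minimal sketch, assuming a simulated stand-in for Isotope.Data: the column names and group sizes follow the question, but the N values are invented purely so the code runs.

```r
# Simulated stand-in for the question's Isotope.Data.
# Group names and counts follow the question; the N values are made up.
set.seed(42)
groups <- rep(
  c("Abalone", "Loliginidae", "Octopus", "Phaeophyceae",
    "Rhodophyta", "Seagrass", "Teleost", "Tunicate"),
  times = c(24, 2, 12, 6, 9, 3, 20, 3)
)
producers <- c("Phaeophyceae", "Rhodophyta", "Seagrass")

Isotope.Data <- data.frame(
  Simple_Group  = groups,
  Trophic_Group = ifelse(groups %in% producers, "Producers", "Consumer"),
  N             = rnorm(length(groups), mean = 8, sd = 2)
)

# Column names in subset are looked up inside data, so no Isotope.Data$ prefix
res <- kruskal.test(N ~ Simple_Group, data = Isotope.Data,
                    subset = Trophic_Group == "Producers")
res

# Equivalent: filter first, then test on the reduced data frame
only_producers <- subset(Isotope.Data, Trophic_Group == "Producers")
res2 <- kruskal.test(N ~ Simple_Group, data = only_producers)
```

Both calls test the same 18 rows across the three producer groups, so they report the same statistic with 2 degrees of freedom.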
Upvotes: 2