Reputation: 135
I have this data and am trying to run kruskal.test() over a list of data frames:
df_list <- list(
`1.3.A` =
tibble::tribble(
~Person, ~Height, ~Weight,
"Alex", 175L, 75L,
"Gerard", 180L, 85L,
"Clyde", 179L, 79L,
"Alex", 175L, 75L,
"Gerard", 180L, 85L,
"Clyde", 179L, 79L
),
`2.2.A` =
tibble::tribble(
~Person, ~Height, ~Weight,
"Alex", 175L, 75L,
"Gerard", 180L, 85L,
"Clyde", 179L, 79L,
"Alex", 175L, 75L,
"Gerard", 180L, 85L,
"Clyde", 179L, 79L
),
`1.1.B` =
tibble::tribble(
~Person, ~Height, ~Weight,
"Alex", 175L, 75L,
"Gerard", 180L, 85L,
"Clyde", 179L, 79L,
"Alex", 175L, 75L,
"Gerard", 180L, 85L,
"Clyde", 179L, 79L
)
)
I am trying to run kruskal.test on each of these three data frames, but after hours of searching for a solution I have not succeeded. I am new to R.
My failed attempts are:
snake <- function(i){
kruskal.test(df$Height ~ df$Person, data = i)
}
snail <- lapply(df_list, "[[", snake)
df_list %>% kruskal.test(df$Height ~ df$Person)
sapply(df_list, function(i) { kruskal.test(df$Height ~ df$Person, data = i)})
Map(function(x) kruskal.test(Height ~ Person), get(df_list))
Map(function(df_list, .f(kruskal.test(Height ~ Person)))
lapply(mget(df_list), function(x) kruskal.test(Height ~ Person))
bunny <- df_list %>%
kruskal_test(df$Height ~ Person, data = .)
Summary: I am trying to run kruskal.test() on each data frame in a list. How can I pass a formula through lapply() or Map() so that kruskal.test() runs on every data frame in the list?
Upvotes: 0
Views: 174
Reputation: 11957
Your code references an object called "df", which does not appear to exist. Also, when calling kruskal.test with the arguments kruskal.test(formula, data), there is no need to reference the data frame inside the formula. Providing kruskal.test a "data" argument causes the function to look up the formula's symbols in the provided data first. In other words, if data frame "x" contains columns "Height" and "Person", then the following would work:
kruskal.test(Height ~ Person, data = x)
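As a minimal sketch of that lookup rule, the snippet below builds a stand-in data frame called "x" (hypothetical, copied from the values in the question) and runs the test twice: once with the formula interface and once by passing the vectors directly. Both calls should give the same result, which suggests the formula version really is reading "Height" and "Person" from the data argument.

```r
# Hypothetical data frame "x", mirroring one element of df_list from the question
x <- data.frame(
  Person = c("Alex", "Gerard", "Clyde", "Alex", "Gerard", "Clyde"),
  Height = c(175L, 180L, 179L, 175L, 180L, 179L)
)

# Formula interface: Height and Person are looked up in `data` first
res_formula <- kruskal.test(Height ~ Person, data = x)

# Default interface: pass the values and the grouping factor explicitly
res_vectors <- kruskal.test(x$Height, factor(x$Person))

# The two p-values should agree
res_formula$p.value
res_vectors$p.value
```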
In your example, you shouldn't reference df. Notice that the code below creates an anonymous function with an argument called "i", and that "i" is then passed as the data argument:
lapply(df_list, function(i) kruskal.test(Height ~ Person, data = i))
$`1.3.A`
Kruskal-Wallis rank sum test
data: Height by Person
Kruskal-Wallis chi-squared = 5, df = 2, p-value = 0.08208
$`2.2.A`
Kruskal-Wallis rank sum test
data: Height by Person
Kruskal-Wallis chi-squared = 5, df = 2, p-value = 0.08208
$`1.1.B`
Kruskal-Wallis rank sum test
data: Height by Person
Kruskal-Wallis chi-squared = 5, df = 2, p-value = 0.08208
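If you only need the p-values rather than the full printed output, one possible follow-up is to pull them out of each result. The sketch below assumes a base-R stand-in for df_list (plain data frames built by a hypothetical helper, make_df, instead of tibbles) and also shows the Map() form asked about in the question.

```r
# Hypothetical helper that rebuilds one data frame from the question
make_df <- function() {
  data.frame(
    Person = c("Alex", "Gerard", "Clyde", "Alex", "Gerard", "Clyde"),
    Height = c(175L, 180L, 179L, 175L, 180L, 179L),
    Weight = c(75L, 85L, 79L, 75L, 85L, 79L)
  )
}

# Stand-in for df_list, keeping the same element names
df_list <- list(`1.3.A` = make_df(), `2.2.A` = make_df(), `1.1.B` = make_df())

# One test per data frame, as in the lapply() call above
results <- lapply(df_list, function(i) kruskal.test(Height ~ Person, data = i))

# Extract a single number from each test result into a named numeric vector
p_values <- vapply(results, function(r) r$p.value, numeric(1))
p_values

# The same loop written with Map(); the list names are carried over
results_map <- Map(function(i) kruskal.test(Height ~ Person, data = i), df_list)
```

vapply() is used here instead of sapply() so that the result is guaranteed to be a numeric vector with one value per data frame.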
Upvotes: 2