Reputation: 23
I am new to R, so sorry if the answer is obvious. I am trying to perform operations on tibbles and their values/columns while these tibbles are part of a list. Previously I would load each of what are now tibbles manually as a data.frame (csv data) and perform the operations manually on the data.frame. Unfortunately this is tiresome, so I am trying to get all the operations in my script done for all my data.frames at the same time. For example, what has worked for me so far is adding 0.7 to every element of the column named 'Temperature' in each tibble in the list. I did it like this:
for(i in seq_along(Data_List)) {Data_List[[i]]$Temperature <- Data_List[[i]]$Temperature + 0.7}
However I now would like to perform different tasks: primarily I need to divide my tibbles into sequences. When I worked with the one data.frame at a time, this is what I did:
df_Sitting <- df[1:12, ]
df_Standing <- df[13:26, ]
df_LigEx <- df[27:35, ]
df_VigEx <- df[36:42, ]
df_After <- df[43:54, ]
How do I adjust it properly for the list of all my tibbles/data.frames I now have? Secondly, I want to perform descriptive statistics, Pearson Correlation and Lin Correlation. Additionally I created a ggplot and a Bland-Altman-Plot. I did it like this:
describe(df$Temperature)
describe(df$Temp_core)
cor.test(df)
library(epiR)
epi.ccc(df$Temp_core, df$Temperature, ci = "z-transform",
        conf.level = 0.95, rep.measure = FALSE, subjectid)
mdata <- melt(df, id="Time")
ggplot(data = mdata, aes(x = Time, y = value))+
  geom_point(aes(group= variable, color = variable))+
  geom_line(aes(group= variable, color = variable))
library(BlandAltmanLeh)
BlandAltman_df <- bland.altman.plot(df$Temp_core, df$Temperature, graph.sys = "ggplot2")
print(BlandAltman_df +theme(plot.title=element_text(hjust = 0.5)))
I now want to run all the functions above for the entire list of tibbles and the variables within them at once, and get all the corresponding statistics and plots, so that I can later create a Markdown document. I tried lapply, but it somehow does not work. I hope I formulated the question correctly. I appreciate the help!
PS, here is the output from dput(head(df, 20))
structure(list(Time = structure(c(52465, 52525, 52585, 52645,
52705, 52765, 52825, 52885, 52945, 53005, 53065, 53125, 53185,
53245, 53305, 53365, 53425, 53485, 53545, 53605), class = c("hms",
"difftime"), units = "secs"), Temp_core = c(35.565, 36.097, 36.38,
36.591, 36.782, 36.927, 37.067, 37.149, 37.208, 37.249, 37.276,
37.296, 37.327, 37.349, 37.356, 37.376, 37.393, 37.397, 37.409,
37.432), Temperature = c(33.87, 34.52, 34.85, 35.12, 35.37, 35.59,
35.74, 35.82, 35.95, 3600, 36.06, 36.17, 36.23, 36.18, 36.16,
36.18, 36.19, 36.19, 36.37, 36.37)), row.names = c(NA, -20L), class = c("tbl_df",
"tbl", "data.frame"))
Upvotes: 1
Views: 119
Reputation: 76605
You can lapply the tests and plot code to the list members and return lists of test results and plots. Something like the following.
library(ggplot2)
library(epiR)
library(BlandAltmanLeh)

Data_List <- lapply(Data_List, \(X){
  X[["Temperature"]] <- X[["Temperature"]] + 0.7
  X
})

cor_test_list <- lapply(Data_List, \(X) cor.test(formula = ~ Temperature + Temp_core, data = X))

lin_test_list <- lapply(Data_List, \(X){
  epi.ccc(
    X[["Temp_core"]],
    X[["Temperature"]],
    ci = "z-transform",
    conf.level = 0.95,
    rep.measure = FALSE
  )
})

gg_plot_list <- lapply(Data_List, \(X){
  mdata <- reshape2::melt(X, id = "Time")
  ggplot(data = mdata, aes(x = Time, y = value)) +
    geom_point(aes(group = variable, color = variable)) +
    geom_line(aes(group = variable, color = variable))
})

BlandAltman_List <- lapply(Data_List, \(X){
  BlandAltman_df <- bland.altman.plot(X$Temp_core, X$Temperature, graph.sys = "ggplot2")
  BlandAltman_df +
    theme(plot.title = element_text(hjust = 0.5))
})
To access the test results, once again use *apply loops together with extraction functions.
sapply(cor_test_list, "[[", "estimate")
# df_a.cor df_b.cor df_c.cor
#0.7425467 0.5259107 0.4572278
sapply(cor_test_list, "[[", "statistic")
# df_a.t df_b.t df_c.t
#7.680738 4.283887 3.561892
sapply(cor_test_list, "[[", "p.value")
# df_a df_b df_c
#6.709843e-10 8.771860e-05 8.434625e-04
sapply(lin_test_list, "[[", "rho.c")
sapply(lin_test_list, "[[", "sblalt")
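The extracted values can also be gathered into a single data frame, which may be handy for a Markdown table later. A minimal sketch in base R only; the toy data and the helper name tidy_cor are made up for illustration and are not part of the code above:

```r
# Toy list of data frames standing in for Data_List (hypothetical data)
set.seed(1)
toy_list <- lapply(1:3, function(i) {
  core <- rnorm(30, mean = 37, sd = 0.3)
  data.frame(Temp_core = core,
             Temperature = core - 1 + rnorm(30, sd = 0.2))
})
names(toy_list) <- paste("df", letters[1:3], sep = "_")

cor_test_list <- lapply(toy_list, function(X) {
  cor.test(~ Temperature + Temp_core, data = X)
})

# tidy_cor() is a hypothetical helper: one row of key numbers per test
tidy_cor <- function(ct) {
  data.frame(
    estimate  = unname(ct$estimate),
    statistic = unname(ct$statistic),
    p.value   = ct$p.value,
    conf.low  = ct$conf.int[1],
    conf.high = ct$conf.int[2]
  )
}

# Stack the rows; the list names become the row names
cor_summary <- do.call(rbind, lapply(cor_test_list, tidy_cor))
cor_summary
```

Keeping the full htest objects in cor_test_list and deriving the table from them means nothing is lost if other fields are needed later.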
The plots can be plotted one by one:
gg_plot_list[[1]]
BlandAltman_List[[1]]
or in a loop with print.
for(i in seq_along(gg_plot_list))
  print(gg_plot_list[[i]])
Or write them to a graphics device (a file on disk).
for(i in seq_along(gg_plot_list)) {
  filename <- sprintf("Rplot%03d.png", i)
  png(filename = filename)
  print(gg_plot_list[[i]])
  dev.off()
}
Test data, built from the built-in iris data set:
Data_List <- iris[1:2]
names(Data_List) <- c("Temp_core", "Temperature")
Data_List$Time <- rep(1:50, 3)
Data_List <- split(Data_List, iris$Species)
names(Data_List) <- paste("df", letters[1:3], sep = "_")
Data_List <- lapply(Data_List, \(x){row.names(x) <- NULL; x})
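Descriptive statistics can be collected with the same pattern. A sketch on the iris-based test data, using only base R (mean, sd, min, max) as a stand-in for describe(), since the question does not say which package its describe() comes from; desc_stats is a hypothetical helper name:

```r
# Rebuild the iris-based test list (three data frames of 50 rows each)
dat <- iris[1:2]
names(dat) <- c("Temp_core", "Temperature")
Data_List <- split(dat, iris$Species)
names(Data_List) <- paste("df", letters[1:3], sep = "_")

# desc_stats() is a hypothetical helper returning a named numeric vector
desc_stats <- function(x) {
  c(n = length(x), mean = mean(x), sd = sd(x), min = min(x), max = max(x))
}

# One column per data frame, one row per statistic
temp_stats <- sapply(Data_List, function(X) desc_stats(X[["Temperature"]]))
core_stats <- sapply(Data_List, function(X) desc_stats(X[["Temp_core"]]))
temp_stats
core_stats
```

Because every call returns a vector of the same length, sapply simplifies the result to a matrix that can be printed as a table.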
Upvotes: 1
Reputation: 20137
Working with a list of objects of some other type (here, tibbles) is totally doable in R. Firstly, I suggest replacing the for loop over seq_along with lapply or, since you are already using the tidyverse, purrr::map:
for(i in seq_along(Data_List)) {
  Data_List[[i]]$Temperature <- Data_List[[i]]$Temperature + 0.7
}
becomes:
modified_data_list <- purrr::map(Data_List, function(df){
  dplyr::mutate(df, Temperature = Temperature + 0.7)
})
You can apply the same principle to the analysis code from your question. Note that I use purrr::walk here instead of map because the function does not return a modified data frame; it is called for its "side effects", such as drawing the plots:
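library(ggplot2)
library(epiR)
library(BlandAltmanLeh)
# describe() is assumed to come from a package you already load (e.g. psych or Hmisc)
# walk() returns its input invisibly, so there is nothing new to assign
purrr::walk(Data_List, function(df){
  # Nothing is auto-printed inside a function, so print each result explicitly
  print(describe(df$Temperature))
  print(describe(df$Temp_core))
  # cor.test() needs the two vectors, not the whole data frame
  print(cor.test(df$Temp_core, df$Temperature))
  print(epi.ccc(df$Temp_core, df$Temperature, ci = "z-transform",
                conf.level = 0.95, rep.measure = FALSE))
  mdata <- reshape2::melt(df, id = "Time")
  print(ggplot(data = mdata, aes(x = Time, y = value)) +
    geom_point(aes(group = variable, color = variable)) +
    geom_line(aes(group = variable, color = variable)))
  BlandAltman_df <- bland.altman.plot(df$Temp_core, df$Temperature, graph.sys = "ggplot2")
  print(BlandAltman_df + theme(plot.title = element_text(hjust = 0.5)))
})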
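The row-range split from the question fits the same pattern: write a function for one data frame, then apply it to the list. A minimal sketch in base R, assuming every data frame has at least 54 rows and shares the same phase boundaries; the toy data and the names phase_rows and split_phases are made up for illustration:

```r
# Toy list: two data frames with 54 rows each (hypothetical data)
toy_list <- list(
  df_a = data.frame(Time = 1:54, Temperature = seq(34, 37, length.out = 54)),
  df_b = data.frame(Time = 1:54, Temperature = seq(35, 38, length.out = 54))
)

# Row ranges taken from the question
phase_rows <- list(
  Sitting  = 1:12,
  Standing = 13:26,
  LigEx    = 27:35,
  VigEx    = 36:42,
  After    = 43:54
)

# split_phases() is a hypothetical helper: one sub-data-frame per phase
split_phases <- function(df) {
  lapply(phase_rows, function(rows) df[rows, , drop = FALSE])
}

# Result: a list (per data frame) of lists (per phase)
phase_list <- lapply(toy_list, split_phases)

# Access one phase of one data frame
phase_list$df_a$Standing

# The same functions can then be applied per phase
sapply(phase_list$df_a, function(p) mean(p$Temperature))
```

If the phase boundaries differ between recordings, the fixed row ranges would no longer be appropriate; a phase column stored in each data frame combined with split() might be a better fit in that case.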
Upvotes: 0