millie0725

Reputation: 371

How to apply a function to specific dataframes within a list

I have several dataframes, which each contain temperature data, that I put into a list (shown below is some mock data):

df1 <- data.frame(XH_warmed_air_1m = c(60, 70, 80, 90, 100),
                  XH_ambient_air_1m = c(60, 70, 80, 90, 100))
df2 <- data.frame(XH_warmed_air_1m = c(60, 70, 80, 90, 100),
                  XH_ambient_air_1m = c(60, 70, 80, 90, 100))
df3 <- data.frame(XH_warmed_air_1m = c(0, 10, 20, 30, 40),
                  XH_ambient_air_1m = c(0, 10, 20, 30, 40))

list <- list(df1=df1, df2=df2, df3=df3)

df1 and df2 in this list contain temperature data in Fahrenheit, which needs to be converted to Celsius (df3's data is already in Celsius). So I wrote a function that converts both columns to Celsius:

f_to_c <- function(df){
  df[["XH_warmed_air_1m"]] <- fahrenheit.to.celsius(df[["XH_warmed_air_1m"]])
  df[["XH_ambient_air_1m"]] <- fahrenheit.to.celsius(df[["XH_ambient_air_1m"]])
  return(df)
}

I can use lapply to apply the function to the entire list, but this corrupts df3's data by converting it as if it were Fahrenheit when it is already in Celsius:

list <- lapply(list, f_to_c)

I would like to apply this function only to the dataframes that need it, which I've attempted below. However, this results in the error message "Error in df[["XH_warmed_air_1m"]] : subscript out of bounds":

list <- lapply(list$df1, f_to_c)

What method could I use to apply this function only to the dataframes that contain temperatures in Fahrenheit?

Using R version 3.5.1, Mac OS X 10.13.6

Upvotes: 0

Views: 347

Answers (4)

slava-kohut

Reputation: 4233

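You can check whether a data frame contains temperatures in Fahrenheit or Celsius. I assume here that if any value is less than or equal to 0, the data frame is in Celsius. Note that this heuristic will miss Celsius data that never reaches 0. Use a scalar if/else rather than ifelse(), because ifelse() is vectorised and does not return the whole data frame:

list <- lapply(list, function(x) if (any(x <= 0)) x else f_to_c(x))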
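The difference between a scalar if/else and ifelse() can be seen in a minimal sketch. It assumes the same heuristic as above (any value at or below 0 means Celsius); the names temps and convert_if_fahrenheit are illustrative only and are not part of the question.

```r
# Base R only. `temps` and `convert_if_fahrenheit` are hypothetical names.
fahrenheit.to.celsius <- function(x) (x - 32) / 1.8

f_to_c <- function(df) {
  df[["XH_warmed_air_1m"]]  <- fahrenheit.to.celsius(df[["XH_warmed_air_1m"]])
  df[["XH_ambient_air_1m"]] <- fahrenheit.to.celsius(df[["XH_ambient_air_1m"]])
  df
}

temps <- list(
  df1 = data.frame(XH_warmed_air_1m = c(60, 70, 80, 90, 100),
                   XH_ambient_air_1m = c(60, 70, 80, 90, 100)),
  df3 = data.frame(XH_warmed_air_1m = c(0, 10, 20, 30, 40),
                   XH_ambient_air_1m = c(0, 10, 20, 30, 40))
)

# Scalar if/else: either branch returns a complete data frame.
convert_if_fahrenheit <- function(df) {
  if (any(df <= 0)) df else f_to_c(df)
}
converted <- lapply(temps, convert_if_fahrenheit)

# ifelse() shapes its result after the length-one test,
# so what comes back is no longer a data frame.
vectorised <- ifelse(any(temps$df3 <= 0), temps$df3, f_to_c(temps$df3))
is.data.frame(vectorised)
```

With the scalar version, df3 passes through unchanged and df1 is converted as a whole.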

Upvotes: 0

akrun

Reputation: 886938

Another option is map from purrr, applied to the same subset of the list:

library(purrr)
list[1:2] <-  map(list[1:2], f_to_c)
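If positions are fragile (for example, when the order of the list may change), purrr's map_at can target elements by name and returns the whole list in one step. This is a sketch under the assumption that the Fahrenheit data frames are known by name; temps and fahrenheit_names are illustrative names, not part of the question.

```r
library(purrr)

fahrenheit.to.celsius <- function(x) (x - 32) / 1.8

f_to_c <- function(df) {
  df[["XH_warmed_air_1m"]]  <- fahrenheit.to.celsius(df[["XH_warmed_air_1m"]])
  df[["XH_ambient_air_1m"]] <- fahrenheit.to.celsius(df[["XH_ambient_air_1m"]])
  df
}

mock_f <- data.frame(XH_warmed_air_1m = c(60, 70, 80, 90, 100),
                     XH_ambient_air_1m = c(60, 70, 80, 90, 100))
mock_c <- data.frame(XH_warmed_air_1m = c(0, 10, 20, 30, 40),
                     XH_ambient_air_1m = c(0, 10, 20, 30, 40))
temps <- list(df1 = mock_f, df2 = mock_f, df3 = mock_c)

# Names of the elements assumed to hold Fahrenheit data.
fahrenheit_names <- c("df1", "df2")

# map_at() applies f_to_c only to the named elements and
# leaves every other element of the list untouched.
converted <- map_at(temps, fahrenheit_names, f_to_c)
```

Because the result is the full list, no assignment back into a subset is needed.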

Upvotes: 0

MarBlo

Reputation: 4524

As I understand it, you want to apply the function only to those data frames in your list that contain temperatures in Fahrenheit. The only thing in your data that indicates whether the numbers are in Celsius or Fahrenheit is the temperature itself. So I chose the condition that a data frame whose maximum temperature is at most 42 is treated as Celsius, and anything above that as Fahrenheit.

You can then build this condition in with keep and map from purrr. Note that keep drops the data frames that fail the condition, so df3 does not appear in the result.


library(tidyverse)

df1 <- data.frame(XH_warmed_air_1m = c(60, 70, 80, 90, 100),
                  XH_ambient_air_1m = c(60, 70, 80, 90, 100))
df2 <- data.frame(XH_warmed_air_1m = c(60, 70, 80, 90, 100),
                  XH_ambient_air_1m = c(60, 70, 80, 90, 100))
df3 <- data.frame(XH_warmed_air_1m = c(0, 10, 20, 30, 40),
                  XH_ambient_air_1m = c(0, 10, 20, 30, 40))
list <- list(df1=df1, df2=df2, df3=df3)

fahrenheit.to.celsius <- function(x) (x - 32) / 1.8

f_to_c <- function(df){
  df[["XH_warmed_air_1m"]] <- fahrenheit.to.celsius(df[["XH_warmed_air_1m"]])
  df[["XH_ambient_air_1m"]] <- fahrenheit.to.celsius(df[["XH_ambient_air_1m"]])
  return(df)
}


list %>% 
  keep(~{max(.x$XH_ambient_air_1m) > 42}) %>% 
  map(., f_to_c)
#> $df1
#>   XH_warmed_air_1m XH_ambient_air_1m
#> 1         15.55556          15.55556
#> 2         21.11111          21.11111
#> 3         26.66667          26.66667
#> 4         32.22222          32.22222
#> 5         37.77778          37.77778
#> 
#> $df2
#>   XH_warmed_air_1m XH_ambient_air_1m
#> 1         15.55556          15.55556
#> 2         21.11111          21.11111
#> 3         26.66667          26.66667
#> 4         32.22222          32.22222
#> 5         37.77778          37.77778
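If the Celsius data frames should stay in the result, purrr's map_if is one way to do it: it applies the function only where a predicate holds and passes the other elements through unchanged. A minimal sketch, assuming the same threshold of 42; temps and is_fahrenheit are illustrative names.

```r
library(purrr)

fahrenheit.to.celsius <- function(x) (x - 32) / 1.8

f_to_c <- function(df) {
  df[["XH_warmed_air_1m"]]  <- fahrenheit.to.celsius(df[["XH_warmed_air_1m"]])
  df[["XH_ambient_air_1m"]] <- fahrenheit.to.celsius(df[["XH_ambient_air_1m"]])
  df
}

mock_f <- data.frame(XH_warmed_air_1m = c(60, 70, 80, 90, 100),
                     XH_ambient_air_1m = c(60, 70, 80, 90, 100))
mock_c <- data.frame(XH_warmed_air_1m = c(0, 10, 20, 30, 40),
                     XH_ambient_air_1m = c(0, 10, 20, 30, 40))
temps <- list(df1 = mock_f, df2 = mock_f, df3 = mock_c)

# Heuristic predicate (an assumption): a maximum above 42 means Fahrenheit.
is_fahrenheit <- function(df) max(df$XH_ambient_air_1m) > 42

# map_if() converts only the elements for which the predicate is TRUE;
# the rest (here df3) are returned as they are.
converted <- map_if(temps, is_fahrenheit, f_to_c)
```

The threshold remains a guess based on the values alone, so it is worth checking against the real data before relying on it.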

Upvotes: 1

Allan Cameron

Reputation: 173793

You didn't include the fahrenheit.to.celsius function in your code, so I've added it here:

fahrenheit.to.celsius <- function(x) (x - 32) / 1.8

All you need to do is apply the function to a subset of your list and write the result back into that same subset:

list[1:2] <- lapply(list[1:2], f_to_c)

list
#> $df1
#>   XH_warmed_air_1m XH_ambient_air_1m
#> 1         15.55556          15.55556
#> 2         21.11111          21.11111
#> 3         26.66667          26.66667
#> 4         32.22222          32.22222
#> 5         37.77778          37.77778
#> 
#> $df2
#>   XH_warmed_air_1m XH_ambient_air_1m
#> 1         15.55556          15.55556
#> 2         21.11111          21.11111
#> 3         26.66667          26.66667
#> 4         32.22222          32.22222
#> 5         37.77778          37.77778
#> 
#> $df3
#>   XH_warmed_air_1m XH_ambient_air_1m
#> 1                0                 0
#> 2               10                10
#> 3               20                20
#> 4               30                30
#> 5               40                40
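For completeness, here is a sketch of why lapply(list$df1, f_to_c) fails, together with the same subset-and-assign idea using names instead of positions. A data frame is itself a list of columns, so lapply over a single data frame hands each numeric column to f_to_c, and looking up a column name inside a plain numeric vector raises the error. The names temps and fahrenheit_names are illustrative.

```r
fahrenheit.to.celsius <- function(x) (x - 32) / 1.8

f_to_c <- function(df) {
  df[["XH_warmed_air_1m"]]  <- fahrenheit.to.celsius(df[["XH_warmed_air_1m"]])
  df[["XH_ambient_air_1m"]] <- fahrenheit.to.celsius(df[["XH_ambient_air_1m"]])
  df
}

mock_f <- data.frame(XH_warmed_air_1m = c(60, 70, 80, 90, 100),
                     XH_ambient_air_1m = c(60, 70, 80, 90, 100))
mock_c <- data.frame(XH_warmed_air_1m = c(0, 10, 20, 30, 40),
                     XH_ambient_air_1m = c(0, 10, 20, 30, 40))
temps <- list(df1 = mock_f, df2 = mock_f, df3 = mock_c)

# temps$df1 is one data frame, so lapply() iterates over its columns
# and f_to_c() receives a numeric vector; capture the resulting error.
attempt <- tryCatch(lapply(temps$df1, f_to_c), error = function(e) e)
inherits(attempt, "error")

# Single brackets keep the list structure, so lapply() sees data frames.
fahrenheit_names <- c("df1", "df2")
temps[fahrenheit_names] <- lapply(temps[fahrenheit_names], f_to_c)
```

To convert just one element, calling the function directly, as in temps$df1 <- f_to_c(temps$df1), avoids lapply altogether.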

As a footnote, it's a really bad idea to have a list called list, since the name is shared with the base function list() and makes the code confusing to read.

Created on 2020-07-15 by the reprex package (v0.3.0)

Upvotes: 2
