Natalie
Natalie

Reputation: 33

Return rows with closest date to a given date for several dataframes in R

I have 9 dataframes of time series data of different lengths that are each for a different location. I am trying to find a way to input a date and get R to output the rows containing the dates closest to my input date for all dataframes at once.

Right now, I've been doing this one at a time:

#bring in data
MNT <- read_excel("MNT_temporal.xlsx")

#change date format
MNT$date <- ymd_hms(MNT$date)

#specifying input date
date <- ymd_hms("2017-09-11 15:47:35")

#find which row is closest to input date
which.min(abs(date - MNT$date))
[1] 6 #R returns row with closest date
MNT[6,] #view data
# A tibble: 1 x 4
  date                number.of.positives percent_inundated avg_depth_inundated
  <dttm>                            <dbl>             <dbl>               <dbl>
1 2017-09-11 15:45:00                   0                 0                  NA

Here is an example of my data for one location (MNT):

# A tibble: 9 x 4
  date                number.of.positives percent_inundated avg_depth_inundated
  <dttm>                            <dbl>             <dbl>               <dbl>
1 2017-09-11 14:05:00                   0                 0                  NA
2 2017-09-11 14:25:00                   0                 0                  NA
3 2017-09-11 14:45:00                   0                 0                  NA
4 2017-09-11 15:05:00                   0                 0                  NA
5 2017-09-11 15:25:00                   0                 0                  NA
6 2017-09-11 15:45:00                   0                 0                  NA
7 2017-09-11 16:05:00                   0                 0                  NA
8 2017-09-11 16:25:00                   0                 0                  NA
9 2017-09-11 16:45:00                   0                 0                  NA

The rest of my locations are named as follows:

#bring in data
MUT <- read_excel("MUT_temporal.xlsx")
MLT <- read_excel("MLT_temporal.xlsx")
MST <- read_excel("MST_temporal.xlsx")
MOL <- read_excel("MOL_temporal.xlsx")
PNT <- read_excel("PNT_temporal.xlsx")
PUT <- read_excel("PUT_temporal.xlsx")
PLT <- read_excel("PLT_temporal.xlsx")
PST <- read_excel("PST_temporal.xlsx")

Is there a way to do this for all locations at once?

Upvotes: 0

Views: 43

Answers (1)

dmuenzel
dmuenzel

Reputation: 101

How about like this?

my_function<-function(filename){

  #bring in data
  file <- read_excel(filename)
  
  #change date format
  file$date <- ymd_hms(file$date)
  
  #specifying input date
  date <- ymd_hms("2017-09-11 15:47:35")
  
  #find which row is closest to input date
  rowindex<-which.min(abs(date - file$date))
  
  return(file[rowindex,])
  
}

alllocations<-paste0(c("MNT","MUT","MLT","MST","MOL","PNT","PUT","PLT","PST"),"_temporal.xlsx")

savedates<-data.frame("date"=rep(0,length(alllocations)),"number.of.positives"=0, "percent_inundated"=0, "avg_depth_inundated"=0)

for(i in 1:length(alllocations)){
  
  savedates[i,]<-my_function(alllocations[i])
  
}

Upvotes: 1

Related Questions