alco

Reputation: 51

How to select files according to datetime filename in R?

I have a set of files named according to this pattern:

lineal_fit_coef_yymmddhhmmss.csv

I'd like to select only the file just before my starting date, the files between my starting and ending dates, and the file just after my ending date.

How would you do that in R? I've been thinking about it but can't work out how. With list.files? And how would you express the in-between-dates condition on the filenames?

For example, I have the files:

lineal_fit_coef_130220183448.csv

lineal_fit_coef_130223113802.csv

lineal_fit_coef_130226043153.csv

lineal_fit_coef_130306094439.csv

lineal_fit_coef_130307094011.csv

and my starting date is: 130223193927 and my ending date is 130227122246.

I'd like to select only these three files:

lineal_fit_coef_130223113802.csv

lineal_fit_coef_130226043153.csv

lineal_fit_coef_130306094439.csv

I hope you can help me somehow.

Upvotes: 0

Views: 3058

Answers (5)

Rcoster

Reputation: 3210

Try this (it only works for this specific case):

files <- c('lineal_fit_coef_130220183448.csv','lineal_fit_coef_130223113802.csv','lineal_fit_coef_130226043153.csv','lineal_fit_coef_130306094439.csv','lineal_fit_coef_130307094011.csv')
filesDATE <- as.double(gsub('[^0-9]', '', files))

files[filesDATE >= 130223193927 & filesDATE <= 130227122246]

(Is your example correct? I get different values.)
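A possible explanation for the difference: the expected output in the question also includes the last file before the starting date and the first file after the ending date, while the filter above keeps only files strictly inside the range. A minimal sketch of that reading, assuming the timestamps are always 12 digits so that numeric order matches chronological order (the names `stamps`, `inside`, `keep` and `selected` are just illustrative):

```r
files <- c('lineal_fit_coef_130220183448.csv',
           'lineal_fit_coef_130223113802.csv',
           'lineal_fit_coef_130226043153.csv',
           'lineal_fit_coef_130306094439.csv',
           'lineal_fit_coef_130307094011.csv')

# numeric yymmddHHMMSS stamp taken from each file name
stamps <- as.double(gsub('[^0-9]', '', files))
start <- 130223193927
end   <- 130227122246

inside <- stamps >= start & stamps <= end   # files within the range
before <- stamps < start                    # files earlier than the range
after  <- stamps > end                      # files later than the range

keep <- inside
# add the latest file before the start and the earliest file after the end, if any
if (any(before)) keep[stamps == max(stamps[before])] <- TRUE
if (any(after))  keep[stamps == min(stamps[after])]  <- TRUE

selected <- files[keep]
selected
```

With the five example files this keeps the three files listed in the question.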

Upvotes: 0

kith

Reputation: 5566

I think you're looking for the function "file.info".

Use it on your csv files and apply your selection to the mtime column:

files = list.files(pattern="csv$")
finfo = file.info(files)
finfo$mtime
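To make the mtime selection concrete, here is a rough sketch. It builds two throwaway files in a temporary directory and sets their modification times by hand, so the file names, the directory `mtime_demo` and the times are all made up for illustration; with real data you would skip straight to the `list.files` line.

```r
# Hypothetical demo files with hand-set modification times
demo_dir <- file.path(tempdir(), "mtime_demo")
dir.create(demo_dir, showWarnings = FALSE)
paths <- file.path(demo_dir, c("old.csv", "new.csv"))
file.create(paths)
Sys.setFileTime(paths[1], as.POSIXct("2013-02-20 18:34:48", tz = "UTC"))
Sys.setFileTime(paths[2], as.POSIXct("2013-02-26 04:31:53", tz = "UTC"))

files <- list.files(demo_dir, pattern = "csv$", full.names = TRUE)
finfo <- file.info(files)

# keep the files whose modification time falls between start and end
start <- as.POSIXct("2013-02-23 19:39:27", tz = "UTC")
end   <- as.POSIXct("2013-02-27 12:22:46", tz = "UTC")
in_range <- finfo$mtime >= start & finfo$mtime <= end

selected <- basename(files[in_range])
selected
```

Note that mtime changes whenever a file is rewritten or copied, so it may not match the timestamp in the file name.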

If you want to do the same thing but use the times in the file names, you first have to convert them to date-times; then you can perform your selection.

# extract the part of the filename that holds the date
chardates = gsub(x=files, pattern=".*_.*_.*_(.*)\\.csv$", replacement="\\1")
# convert it to a real R date-time (POSIXlt)
dates = strptime(chardates, format="%y%m%d%H%M%S")
#perform your selection
...

Upvotes: 1

Theodore Lytras

Reputation: 3965

You need to use list.files(), extract the date as a string and convert to a POSIXct. Here's how to get the dates:

fileDates <- as.POSIXct(substr(list.files(pattern="lineal_fit_coef_[0-9]*\\.csv"),17,28), format="%y%m%d%H%M%S")

And then you can compare these to your start and end dates, and use the result as an index vector to list.files():

startingDate <- as.POSIXct("130223193927", format="%y%m%d%H%M%S")
endingDate <- as.POSIXct("130227122246", format="%y%m%d%H%M%S")

list.files(pattern="lineal_fit_coef_[0-9]*\\.csv")[fileDates >= startingDate & fileDates <= endingDate]

Hope this helps!

Upvotes: 0

Elroch

Reputation: 142

How about obtaining the list of file names with dir, extracting the appropriate part of the strings with substr, coercing them to numeric with as.numeric, and finally comparing with < and > to choose the files you want to use?
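Those steps might look roughly like this. It is only a sketch: `select_between` is a made-up helper name, and the positions 17 to 28 assume every name starts with the 16-character prefix `lineal_fit_coef_`. A literal vector stands in for the result of `dir()` so the example runs without any files on disk.

```r
# Hypothetical helper: keep the names whose numeric stamp lies strictly
# between start and end
select_between <- function(files, start, end, first = 17, last = 28) {
  stamps <- as.numeric(substr(files, first, last))
  files[stamps > start & stamps < end]
}

# In practice: files <- dir(pattern = "^lineal_fit_coef_[0-9]{12}\\.csv$")
files <- c('lineal_fit_coef_130220183448.csv',
           'lineal_fit_coef_130223113802.csv',
           'lineal_fit_coef_130226043153.csv',
           'lineal_fit_coef_130306094439.csv',
           'lineal_fit_coef_130307094011.csv')

result <- select_between(files, 130223193927, 130227122246)
result
```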

Upvotes: 1

CHP

Reputation: 17189

You can write a custom function, something like the one below:

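list.files.by.date <- function(from, to, ...) {
  # extra arguments (path, pattern, etc.) are passed on to list.files()
  filelist <- list.files(...)
  # pull the 12-digit yymmddHHMMSS stamp out of each file name
  timestamps <- as.POSIXct(gsub('.*([0-9]{12}).*', '\\1', filelist), format='%y%m%d%H%M%S', tz='GMT')
  fromtime <- as.POSIXct(from, format='%y%m%d%H%M%S', tz='GMT')
  totime <- as.POSIXct(to, format='%y%m%d%H%M%S', tz='GMT')
  # which() drops names without a parsable stamp (NA timestamps)
  return(filelist[which(timestamps >= fromtime & timestamps <= totime)])
}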

This lets you get the files whose filename "timestamp" lies within the range defined by the from and to parameters (both given as yymmddHHMMSS strings).

Upvotes: 0
