Create a new field based on the file name in R

Question

I have a number of .csv files that all contain the same fields that are housed in the same directory, but the values in each file are for a specific date. However, the data in the .csv files does not contain the date - only the file names contain the date.

I'm currently reading in and merging the different .csv files using the following code:

group <- list.files(pattern="[.]csv")
fun <- function(x) {
  df <- read.csv(x , header = TRUE)
}
big_df <- ldply(group, fun)

The output then looks something like this:

print(big_df)
  value_a value_b value_c
1       1     444      10
2       5    7364       9
3      32     573      14
4       7     253      16

and the files that it is reading from look like this:

print(group)
[1] "PSM_filter_report_04242015_1.csv" "PSM_filter_report_04242015_2.csv" "PSM_filter_report_04252015_1.csv"
[4] "PSM_filter_report_04272015_1.csv"

What I'd like to do as part of the process is use substr(), or some comparable function, to add the date from the file name as a new field in the file that I'm creating. By doing this, the new data frame would look like this:

big_df
  value_a value_b value_c     date
1       1     444      10 04242015
2       5    7364       9 04242015
3      32     573      14 04252015
4       7     253      16 04272015

So long as the date is read in as a factor or character (so that I don't lose the leading zero), I'll be comfortable converting the dates with as.Date().

jazzurro · Accepted Answer

This would be one of the ways to get what you want. Since you have a systematic way to represent time information (i.e., there are eight digits for date information), you could do something like this.

library(stringr)

group <- c("PSM_filter_report_04242015_1.csv","PSM_filter_report_04242015_2.csv",
           "PSM_filter_report_04252015_1.csv", "PSM_filter_report_04272015_1.csv")

str_extract(group, "[0-9]{8}")

# [1] "04242015" "04242015" "04252015" "04272015"

Then, you can convert this character string to date later. Hope this helps you.

Create a new field based on the file name in R

Answers (2)

Related Questions