Reputation: 97
I have been trying to write a code that read from multiple excel file in folder match a name. I have been able to achieve that part using the code at the show some code. The columns in the dataframe is id and Date.
My issue is I want to add another column called Code that will hold a code extracted from the files list to differentiate each row.
Initial dataframe after reaading the file and combinning that datasets
id Date
ExcelFile/CP1213_.xlsx 2013-05-09
ExcelFile/CP1213_.xlsx 2013-01-30
ExcelFile/CP1314_.xlsx 2013-02-14
ExcelFile/CP1314_.xlsx 2013-03-19
ExcelFile/CP1415_.xlsx 2013-02-22
ExcelFile/CP1415_.xlsx 2013-02-22
The table below shows what i want to achieve:
id Date Code
ExcelFile/CP1213_.xlsx 2013-05-09 CP1213
ExcelFile/CP1213_.xlsx 2013-01-30 CP1213
ExcelFile/CP1314_.xlsx 2013-02-14 CP1314
ExcelFile/CP1314_.xlsx 2013-03-19 CP1314
ExcelFile/CP1415_.xlsx 2013-02-22 CP1415
ExcelFile/CP1415_.xlsx 2013-02-22 CP1415
The output of the files is a list: "ExcelFile/CP1213_.xlsx" "ExcelFile/CP1314_.xlsx" "ExcelFile/CP1415_.xlsx"
files <- list.files(path = "ExcelFile/", pattern = "*.xlsx", full.names = T)
tbl <- sapply(files, read_excel, simplify=FALSE) %>% bind_rows(.id = "id")
Upvotes: 1
Views: 120
Reputation: 97
Base on Ronak Shah idea, you can use the mutate from the dplyr package then use the basename and then extract part of the filename using sub from the id.
files <- list.files(path = "ExcelFile/", pattern = "*.xlsx", full.names = T)
tbl <- sapply(files, read_excel, simplify=FALSE) %>% bind_rows(.id = "id")
tbl <- tbl %>% mutate(Code = sub("_.*", "", basename(tbl$id)))
Upvotes: 1
Reputation: 389047
You can use basename
and then extract part of the filename using sub
df$Code <- sub("_.*", "", basename(as.character(df$id)))
df
# id Date Code
#1 ExcelFile/CP1213_.xlsx 2013-05-09 CP1213
#2 ExcelFile/CP1213_.xlsx 2013-01-30 CP1213
#3 ExcelFile/CP1314_.xlsx 2013-02-14 CP1314
#4 ExcelFile/CP1314_.xlsx 2013-03-19 CP1314
#5 ExcelFile/CP1415_.xlsx 2013-02-22 CP1415
#6 ExcelFile/CP1415_.xlsx 2013-02-22 CP1415
Or if you want to extract it from files
df$Code <- sub("_.*", "", basename(files))
Upvotes: 0