Reputation: 397
I have a series of data files whose names look like
sale20160101.txt,
sales20160102.txt,...,
sales20171231.
Now I want to read them all and combine them, but the result also needs a date variable that identifies when each row occurred, so the date variable will be 20160101, 20160102, ..., 20161231.
My idea is:
split each filename into sale + "time"
repeat the time once for every row of the file as I read it
cbind the data and the time.
Thanks a lot.
Upvotes: 1
Views: 981
Reputation: 887951
We could do this with fread and rbindlist from data.table
library(data.table)
#find the files whose names start with 'sale', followed by numbers,
#and end with the .txt extension
files <- list.files(pattern = "^sale.*\\d+\\.txt$", full.names = TRUE)
#get the dates from the file names (needs the readr package)
dates <- readr::parse_number(basename(files))
#read the files into a named list and rbind it; the names fill the 'date' column
dt <- rbindlist(setNames(lapply(files, fread), dates), idcol = 'date')
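The date column created through idcol comes from the list names, so it should be stored as text such as 20160101 rather than as a real date. If a proper Date is needed afterwards, a minimal sketch in base R could look like this (date_id is a made-up stand-in for that column, not part of the code above):

```r
# Stand-in for the text values that end up in the 'date' column
date_id <- c("20160101", "20160102", "20171231")

# Parse the yyyymmdd strings into Date objects
parsed <- as.Date(date_id, format = "%Y%m%d")

# Date objects sort chronologically and support date arithmetic
span_days <- as.numeric(diff(range(parsed)))
```

With the data.table from above, the same conversion would presumably be dt[, date := as.Date(date, format = "%Y%m%d")].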
Upvotes: 1
Reputation: 35392
I usually would do a variation of the following:
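# find the files (named 'files' rather than 'ls' to avoid masking base::ls)
files <- list.files(pattern = '^sales?\\d+\\.txt$')
# Get the dates by dropping the extension and the 'sale'/'sales' prefix
dates <- gsub('^sales?', '', tools::file_path_sans_ext(files))
# read in the data (add header = TRUE if the files have a header row)
dfs <- lapply(files, read.table)
# match the dates
names(dfs) <- dates
# bind all data together and include the date as a column
df <- dplyr::bind_rows(dfs, .id = 'date')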
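For comparison, the three steps listed in the question (split the file name, repeat the date per row, cbind) can also be sketched in base R without extra packages. This is only an illustration under assumptions: the demo folder, the two tiny files, their item and amount columns, and the helper read_with_date are all made up here, and the files are assumed to have a header row.

```r
# Demo setup (hypothetical data): write two tiny files into a temporary folder
demo_dir <- file.path(tempdir(), "sales_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.table(data.frame(item = c("a", "b"), amount = c(10, 20)),
            file.path(demo_dir, "sales20160101.txt"), row.names = FALSE)
write.table(data.frame(item = "c", amount = 30),
            file.path(demo_dir, "sales20160102.txt"), row.names = FALSE)

# Step 1: split each file name into the prefix and the 8-digit date
files <- list.files(demo_dir, pattern = "^sales?[0-9]{8}\\.txt$",
                    full.names = TRUE)
dates <- sub("^sales?([0-9]{8})\\.txt$", "\\1", basename(files))

# Steps 2 and 3: read a file, repeat its date once per row, and cbind
# (read_with_date is a hypothetical helper name)
read_with_date <- function(file, date) {
  dat <- read.table(file, header = TRUE, stringsAsFactors = FALSE)
  cbind(date = rep(date, nrow(dat)), dat, stringsAsFactors = FALSE)
}

# Stack the per-file results into one data frame
combined <- do.call(rbind, Map(read_with_date, files, dates))
rownames(combined) <- NULL
```

For many or large files, bind_rows or rbindlist will likely be faster than do.call(rbind, ...), but the logic is the same.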
Upvotes: 1