Reputation: 77
In my global environment I have multiple data frames read from different .csv files. Each data frame holds one day of observed trading activity. Because of performance problems, I had to do some data preprocessing while reading each .csv file into R. The result is shown in the following image:
Now I would like to combine the data frames in consecutive order, for example:
masterDataFrame <- rbind(durData_IBM_AskSide1, durData_IBM_AskSide2)
masterDataFrame <- rbind(masterDataFrame, durData_IBM_AskSide3)
masterDataFrame <- rbind(masterDataFrame, durData_IBM_AskSide20)
My first approach was to combine all the csv files into one and then run my data preprocessing steps. This produced a very large data frame, and the preprocessing took over 10 minutes. The time is not my main concern. The problem was that between two consecutive days (the end of one day and the start of the next) my code produced an "event" with wrong values. So I had to treat each day separately and start anew for each trading day. The workaround was an if statement that checked whether date[i] == date[i+1]. By the way, my data frame is sorted in ascending order within each trading day, and it has a time column of type POSIXlt.
With that check, the overall process took over 20 minutes, whereas performing my preprocessing steps while reading each file took just one minute. Now I would really like to know:
a) how can I combine my resulting data frames into one master data frame with a loop approach?
b) while combining them how can I delete the already used data frames?
Upvotes: 0
Views: 510
Reputation: 388982
Here is one non-loop approach.
#Create name of dataframes
datanames <- paste0('durData_IBM_AskSide', 1:20)
#Combine them into one
combine_data <- do.call(rbind, mget(datanames))
#Remove them from global environment
rm(list = datanames)
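To see what these three calls do, here is a minimal sketch on toy data. The three small data frames and their `day` and `duration` columns are made up for illustration; only the `durData_IBM_AskSide` naming comes from the question.

```r
# Toy stand-ins for the per-day data frames (hypothetical columns)
for (i in 1:3) {
  assign(paste0("durData_IBM_AskSide", i),
         data.frame(day = i, duration = c(0.5, 1.2) * i))
}

datanames <- paste0("durData_IBM_AskSide", 1:3)

# mget() looks the names up and returns a named list of data frames;
# do.call() then passes every list element to a single rbind() call
combine_data <- do.call(rbind, mget(datanames))

# rbind() builds row names from the list names; reset them if unwanted
rownames(combine_data) <- NULL

# Remove the per-day objects now that they have been combined
rm(list = datanames)

nrow(combine_data)               # number of rows across all days
exists("durData_IBM_AskSide1")   # checks whether the original is gone
```

Because `paste0()` generates the names in numeric order, the rows are stacked day 1, day 2, day 3, which keeps the consecutive order the question asks for.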
Upvotes: 2
Reputation: 1053
library(tidyverse)
# List the .csv files in the working directory (pattern is a regular expression, not a glob)
dir(pattern = "\\.csv$") %>%
  purrr::map_dfr(read_csv)  # read each file and row-bind the results into one data frame
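Since the question needs each day preprocessed separately before combining, the same read-and-bind idea can be sketched in base R with a per-file step. This is only an illustration under assumptions: `preprocess_day`, the `time` and `price` columns, and the two temporary files are hypothetical stand-ins, not the asker's actual data or preprocessing.

```r
# Write two small example files to a temporary folder (stand-ins for daily .csv files)
tmp <- file.path(tempdir(), "daily_csv_demo")
dir.create(tmp, showWarnings = FALSE)
write.csv(data.frame(time = c("2020-01-02 09:30:00", "2020-01-02 09:30:05"),
                     price = c(100, 101)),
          file.path(tmp, "day1.csv"), row.names = FALSE)
write.csv(data.frame(time = c("2020-01-03 09:30:00", "2020-01-03 09:30:02"),
                     price = c(102, 103)),
          file.path(tmp, "day2.csv"), row.names = FALSE)

# Hypothetical per-day preprocessing: seconds elapsed since the previous event
preprocess_day <- function(df) {
  df$time <- as.POSIXct(df$time, tz = "UTC")
  df$duration <- c(NA, diff(as.numeric(df$time)))
  df
}

# Read and preprocess one file at a time, then bind all days in one call
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
days <- lapply(files, function(f) {
  preprocess_day(read.csv(f, stringsAsFactors = FALSE))
})
master <- do.call(rbind, days)
```

Each day's first row gets `NA` for `duration`, so no value is ever computed across the boundary between two trading days, and no intermediate per-day objects are left in the global environment.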
Upvotes: 0