Reputation: 169
I am loading data in using the list.files function that concatenates multiple .csv files into a single data frame. However, I then have to run another line that finds and removes rows that meet a certain criteria. I then run another line that combines the date and time columns into a Date_time coulmn. What I'm working with is below:
#Identifies all csv files in a folder and combines them into one continuous data frame.
exports_15E<- {list.files(path = "Z:/fishproj/Cambodia Dai project/Analytic/TS2020_Data_Exports",
pattern = "^15E.*FullWaterColumn_100m.sv.csv", full.names = TRUE) %>%
lapply(read_csv) %>%
bind_rows}
#Removes zero values form the data frame
exports_15E<- exports_15E[exports_15E$Layer !=0,]
#Combines date and time columns into one column.
exports_15E$Date_time <- as.POSIXct(paste(as.Date(as.character(exports_15E$Date_M),"%Y%m%d"), exports_15E$Time_M, sep=" "),format = "%Y-%m-%d %H:%M", tz="Asia/Bangkok")
What I'm trying to do is just combine those 3 functions or just write the syntax in a way that it will just run all of the above and create a single data frame out of it.
Example data frame:
Layer Time_M Date_M Sv_mean
1 1 18:22:22 20201201 -60.2
2 1 19:22:23 20201201 -59.8
3 1 20:22:23 20201201 -60.3
4 1 21:22:23 20201201 -62.3
5 1 22:22:23 20201201 -64.2
6 0 00:00:00 00000000 9999
7 0 00:00:00 00000000 9999
8 0 00:00:00 00000000 9999
9 0 00:00:00 00000000 9999
10 0 00:00:00 00000000 9999
Desired output:
Layer Time_M Date_M Sv_mean Date_time
1 1 18:22:22 20201201 -60.2 2020-12-01 18:22:22
2 1 19:22:23 20201201 -59.8 2020-12-01 19:22:22
3 1 20:22:23 20201201 -60.3 2020-12-01 20:22:22
4 1 21:22:23 20201201 -62.3 2020-12-01 21:22:22
5 1 22:22:23 20201201 -64.2 2020-12-01 22:22:22
The code is meant to create this from my .csv files (which is simple enough), then I want the code to remove the rows with layers that read "0", and also combine the "Date_M" and "Time_M" columns into a "Date_time" column. The code I have works and does all this, I just want it to run all the strings at once.
Upvotes: 1
Views: 75
Reputation: 9858
I would use a single pipe %>%
chain. I think dplyr and purrr are most helpful here. Specially, these packages allow the awesome map_dfr()
("map-data.frame-rowwise") and strightforward filter()
functions, which make data wrangling a fluid process.
Something like this may work (hard to tell without a proper reproducible example).
library(dplyr)
library(purrr)
exports_15E<- list.files(path = "Z:/fishproj/Cambodia Dai project/Analytic/TS2020_Data_Exports",
pattern = "^15E.*FullWaterColumn_100m.sv.csv", full.names = TRUE) %>%
map_dfr(read_csv) %>%
filter(Layer !=0) %>%
mutate(Date_time = as.POSIXct(paste(as.Date(as.character(Date_M),"%Y%m%d"), Time_M, sep=" "),format = "%Y-%m-%d %H:%M", tz="Asia/Bangkok"))
Upvotes: 1