Reputation: 17
I have around 170 parquet files in a folder and am trying to select the rows that match a value in a column, but I get the error below after a few iterations of the loop. The code works if I manually pass in a few files, but not for all of them. Can someone please help?
Error: Problem with `filter()` input `..1`.
i Input `..1` is `pq$service_id %in% serviceroutes`.
x Input `..1` must be of size 1, not size 0.
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
Unknown or uninitialised column: `service_id`.
Code:
MTWs<-list.files(path=filepath)
serviceroutes<-unique(servicesumGWY$service_id)
outData1 <-data.table()
for (file in MTWs) {
fp<-paste0(filepath,file)
pq<- read_parquet(fp)
dataT <- pq %>% filter(pq$service_id %in% serviceroutes)
outData1<- rbind(outData1,dataT,fill=TRUE)
}
Upvotes: 0
Views: 39
Reputation: 389012
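The warning suggests that some of the files do not contain a service_id column. For those files pq$service_id is NULL, so the %in% test returns a zero-length vector, which filter() rejects. Filter the rows only if the service_id column is present in the data.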
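library(arrow)  # read_parquet() is assumed to come from arrow
library(dplyr)
library(purrr)
# full.names = TRUE returns complete paths, so no paste0() is needed
MTWs <- list.files(path = filepath, full.names = TRUE)
serviceroutes <- unique(servicesumGWY$service_id)
outData <- map_df(MTWs, ~{
  tmp <- read_parquet(.x)
  # Files without service_id yield NULL, which map_df() drops when binding rows
  if ('service_id' %in% colnames(tmp))
    tmp %>% filter(service_id %in% serviceroutes)
})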
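To see why the guard matters, here is a minimal base R sketch. The two data frames are made-up stand-ins for a file with the column and a file without it, not your actual data, and the values in serviceroutes are placeholders.

```r
# Hypothetical stand-ins for two parquet files: one with service_id, one without.
with_col    <- data.frame(service_id = c("S1", "S9"), trips = c(10, 20))
without_col <- data.frame(route = c("A", "B"), trips = c(5, 6))
serviceroutes <- c("S1", "S2")  # placeholder values

# `$` on a missing column gives NULL, and NULL %in% x is a zero-length logical,
# which is the "size 0" input that filter() complains about.
mask <- without_col$service_id %in% serviceroutes
length(mask)

# The guard detects the missing column before any filtering is attempted.
has_col <- "service_id" %in% colnames(without_col)

# With the column present, the same test returns one TRUE/FALSE per row.
kept <- with_col[with_col$service_id %in% serviceroutes, ]
```

Referring to the bare column name inside filter(), rather than pq$service_id, is also the usual dplyr style, but it does not by itself avoid the failure when the column is absent, hence the explicit check.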
Upvotes: 1