lescobedo21
lescobedo21

Reputation: 35

RPostgreSQL loading multiple CSV files into an Postgresql table

I'm new at using Postgresql, and I'm having trouble populating a table I created with multiple *.csv files. I was working first in pgAdmin4, then I decide to work on RPostgreSQL as R is my main language.

Anyway, I am dealing (for now) with 30 csv files located in one folder. All have the same headers and general structure, for instance:

Y:/Clickstream/test1/video-2016-04-01_PARSED.csv
Y:/Clickstream/test1/video-2016-04-02_PARSED.csv
Y:/Clickstream/test1/video-2016-04-03_PARSED.csv

... and so on.

I tried to load all csv files by using a following the RPostgresql specific answer from Parfait. Sadly, it didn't work. My code is specified below:

library(RPostgreSQL)

dir = list.dirs(path = "Y:/Clickstream/test1")
num = (length(dir))

psql.connection <- dbConnect(PostgreSQL(), 
                  dbname="coursera", 
                  host="127.0.0.1", 
                  user = "postgres", 
                  password="xxxx")

for (d in dir){
  filenames <- list.files(d)

for (f in filenames){  
  csvfile <- paste0(d, '/', f)

# IMPORT USING COPY COMMAND
sql <- paste("COPY citl.courses FROM '", csvfile , "' DELIMITER ',' CSV ;")
dbSendQuery(psql.connection, sql)

  }
}

# CLOSE CONNNECTION
dbDisconnect(psql.connection)

I'm not understanding the error I got:

Error in postgresqlExecStatement(conn, statement, ...) : 
RS-DBI driver: (could not Retrieve the result : ERROR:  could not open file
" Y:/Clickstream/test1/video-2016-04-01_PARSED.csv " for reading: Invalid
argument
)

If I'm understanding correctly, there is an invalid argument in the name of my first file. I'm not very sure about it, but again I am recently using PostgreSQL and this RPostgreSQL in R. Any help will be much appreciated.

Thanks in advance!

Edit: I found the problem, but cannot solve it for some reason. When I copy the path while in the for loop:

# IMPORT USING COPY COMMAND
sql <- paste("COPY citl.courses FROM '",csvfile,"' DELIMITER ',' CSV ;")

I have the following result:

sql 
[1] "COPY citl.courses FROM ' Y:/Clickstream/test1/video-2016-04-01_PARSED.csv ' DELIMITER ',' CSV ;"

This means that the invalid argument is the blank space between the file path. I've tried to change this unsuccessfully. Any help will be deeply appreciated!

Upvotes: 2

Views: 1903

Answers (1)

JackStat
JackStat

Reputation: 1653

Try something like this

Files <- list.files("Y:/Clickstream/test1", pattern = "*.csv", full.names = TRUE)

CSVs <- lapply(Files, read.csv)

psql.connection <- dbConnect(PostgreSQL(), 
                  dbname="coursera", 
                  host="127.0.0.1", 
                  user = "postgres", 
                  password="xxxx")


for(i in 1:length(Files)){

  dbWriteTable(psql.connection
    # schema and table
    , c("citl", "courses")
    , CSVs[i]
    , append = TRUE # add row to bottom
    , row.names = FALSE
    )

}

Upvotes: 1

Related Questions