Cyang
Cyang

Reputation: 379

What is the fastest way to write to a db table from R?

I have a data table in R with 1.5M rows. I want to export this to a MS SQL db table.

I know I can do it this way:

dbWriteTable(conn,"benefit_custom.Trial_set",trial_set )

But its very slow.

The other option I've tried is to write to a flat file and then create an SSIS pkg to transfer it to the db. This is not a problem, but the issue is that I have string and numeric data in my data table, and when R writes to the file, everything is varchar and is enclosed within quotes.

FileLocation <-"\\Benefit_Analysis_Input.dat"

 FileName<- paste( bcpWorkspace,FileLocation,sep = "") 

write.table(trial_set,file =FileName,append = FALSE, sep = "\t",col.names = T, row.names = F) 

The 1st method preserves the data types like I want to, but the performance is very bad. Does anyone have anything else I can try?

So I guess the data types cant be preserved if I'm writing to a flat file, so I have to go with choosing the data types when I'm importing the flat file into the db

Upvotes: 4

Views: 4927

Answers (2)

jangorecki
jangorecki

Reputation: 16697

Answering your question: the fastest seems to be rsqlserver

As of now I know about:

  • rsqlserver: Using System.Data.SqlClient drivers, only on win OS
  • RSQLServer: Using java drivers to SQLserver from any OS using RJDBC
  • RODBC: using ODBC drivers, easy setup only on win OS

Still microsoft sql server looks to be quite poorly supported from the R session perspective.

Here is interesting benchmark by rsqlserver project: https://github.com/agstudy/rsqlserver/wiki/benchmarking

Also important to note related to rsqlserver: A linux version using mono is planned.

Finally, my recent presentation on Data Warehousing with R covers DBI, RJDBC, RODBC examples.

Upvotes: 2

Mike Honey
Mike Honey

Reputation: 15027

I think @rhealitycheck is on the right track - I would use the SQL Import and Export Data wizard to generate an SSIS package. I would save it and customize it later, for example adding an upstream Execute Process Task to call R and write the text file out.

The performance and flexibility of this solution is hard to beat.

Upvotes: -1

Related Questions