sgp667
sgp667

Reputation: 1875

DBI dbWriteTable

I really like the dbWriteTable function from DBI(I usually use RSQLite or ROracle as backend).

I use that function to import a lot of excel spreadsheets, problem is that if these spreadsheets were created over long term columns change are added/deleted or change name from one document to another.

So my question is does anyone have a relatively quick way to add data to database without having to match filed list perfectly?

Here is sample script that I use

require(RSQLite)
require(readxl)

# Create database file
conn <- dbConnect(drv=SQLite(),dbname = "path to database")

# Define import function

excel2sqltable <- function(conn, file, table) {
  source.df <-  read_excel(path=file,col_names = TRUE) %>%
  cbind("SourceFile" = file, .)

  names(source.df) <- source.df %>%
    data.frame(check.names = TRUE) %>%
    {gsub("[.]",x=names(.),replacement="_")}

  print(paste("Importing ", file))

  setOldClass(c("tbl_df", "data.frame"))
  dbWriteTable(conn = conn, name = table, value = source.df, append=TRUE)
}

With that function I can do: sapply(list.files(),FUN = function(x){excel2sqltable(conn,x,"Imports")})

Upvotes: 0

Views: 2003

Answers (1)

Abdou
Abdou

Reputation: 13274

You can use this as a guide:

library(RSQLite)

sqlite_conn <- dbConnect(drv = SQLite(),dbname = 'data_dump.sqlite')

excel2sqltable <- function(conn, file, table) {
  source.df <-  readxl::read_excel(path=file,col_names = TRUE) %>%
    cbind("SourceFile" = file, .)

  names(source.df) <- source.df %>%
    data.frame(check.names = TRUE) %>%
    {gsub("[.]",x=names(.),replacement="_")}

  if(!dbExistsTable(conn, table)) {
    dbWriteTable(conn = conn, name = table, value = source.df)
  } else {
    # Get both dataframe columns and table columns
    df_cols <- colnames(source.df)
    tbl_cols <- dbListFields(conn, table)

    # Check if there are columns in the dataframe
    # that are not in the destination table
    # Loop through the missing columns and add
    # them to the database table
    if (length(setdiff(df_cols, tbl_cols)) > 0) {
      missing_cols <- setdiff(df_cols, tbl_cols)
      for (col_name in missing_cols) {
        dbSendStatement(conn, sprintf('ALTER TABLE %s ADD %s VARCHAR', table, col_name))
      }
    }

    setOldClass(c("tbl_df", "data.frame"))


    dbWriteTable(conn = conn, name = table, value = source.df, append=TRUE)
  }
}

lapply(list.files(), function(x) {
  excel2sqltable(sqlite_conn, x, "Imports")
})
dbDisconnect(sqlite_conn)

I hope it serves a purpose.

Upvotes: 1

Related Questions