I have a data table in R with about 25 columns and roughly 16 million rows. I'm using the following function to batch-insert it into SQL Server:
library(DBI)

# Function to insert data in batches inside a single transaction
batch_insert <- function(data, con, table_name, batch_size = 10000) {
  n_batches <- ceiling(nrow(data) / batch_size)
  dbBegin(con)  # Start one transaction covering all batches
  tryCatch({
    for (i in 1:n_batches) {
      start_row <- (i - 1) * batch_size + 1
      end_row   <- min(i * batch_size, nrow(data))
      batch     <- data[start_row:end_row, ]
      # Append the current batch to the target table
      dbWriteTable(con, table_name, batch, append = TRUE, row.names = FALSE)
      cat("Inserted rows ", start_row, " to ", end_row, "\n")
    }
    dbCommit(con)  # Commit the transaction if every batch succeeded
  }, error = function(e) {
    dbRollback(con)  # Roll back the whole transaction on error
    message("Error during batch insert: ", e$message)
  })
}
# Write the data to the database in batches
batch_insert(my_table, my_db, "my_super_cool_db")
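For context, my_db is a DBI connection to SQL Server. A minimal sketch of how such a connection might be created (the odbc backend and the driver/server/database names below are placeholders, not necessarily what I'm actually running):
library(DBI)
library(odbc)

# Sketch of a SQL Server connection via DBI/odbc -- all names are placeholders
my_db <- dbConnect(
  odbc::odbc(),
  Driver   = "ODBC Driver 17 for SQL Server",
  Server   = "my_server",
  Database = "my_database",
  Trusted_Connection = "Yes"
)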
Unfortunately, in RStudio it gets to about 2 million rows and then hangs, as if it's frozen.
Any help appreciated, thanks.