Reputation: 343
I have a dataframe (df) with three columns (a,b,c)
I am inserting these values into SQL database using
df <- data.frame(a=1:10, b=10:1, c=11:20)
values <- paste("(",df$a,",", df$b,",",df$c,")", sep="", collapse=",")
cmd <- paste("insert into MyTable values ", values)
result <- sqlQuery(con, cmd, as.is=TRUE)
Source: How to insert a dataframe into a SQL Server table?
My question is what is the update equivalent of this example ? How can i use
cmd <- paste("update MyTable values ", values)
result <- sqlQuery(con, cmd, as.is=TRUE)
I tried sqlSave and sqlUpdate but it got sticky too soon with error messages like missing columns, index etc...so I am falling back to this example above and trying to learn how to use update statements based on that toy dataset. Any help is much appreciated folks.
Upvotes: 0
Views: 3869
Reputation: 1594
I realize this post is several years old, but if anyone stumbles upon this now, I figured I'd just write a better answer. The method the OP was trying to use is quite painful and very inefficient. You can simply use the dbWriteTable commands. You can insert your data directly from a dataframe into a new table that doesn't previously exist or you can append an existing table by setting the append = TRUE option. See help(dbWriteTable) for additional options as well such as specifying column data types. Note, in the example below I have defined the SQL Server name as well as the database name, but I did not show their values here, for obvious security reasons. I've included the command and corresponding output.
> #Connect to Azure SQL Server database
> con <- DBI::dbConnect(odbc::odbc(),
+ Driver = "ODBC Driver 17 for SQL Server",
+ Server = SQL_SERVER_NAME,
+ Database = SQL_DATABASE_NAME,
+ Authentication = "ActiveDirectoryIntegrated")
>
>
> # Setup example, create two dataframes
> # df1: dataframe to insert into new SQL table that doesn't exist yet
> # df2: dataframe for appending
> df1 <- data.frame(a=1:10, b=10:1, c=11:20)
> df2 <- data.frame(a=31:30, b=30:31, c=41:50)
>
> # dbWriteTable sends your df1 dataframe to SQL Server and builds the table
> DBI::dbWriteTable(conn = con,
+ name = "my_table_in_sql_server",
+ value = df1
+ )
>
> # If the table already exists, you can append it using the append = TRUE argument
> DBI::dbWriteTable(conn = con,
+ name = "my_table_in_sql_server",
+ value = df2,
+ append = TRUE
+ )
>
> # You can verify your data was inserted now by simply querying the table
> DBI::dbGetQuery(conn = con, statement = "select * from my_table_in_sql_server")
a b c
1 1 10 11
2 2 9 12
3 3 8 13
4 4 7 14
5 5 6 15
6 6 5 16
7 7 4 17
8 8 3 18
9 9 2 19
10 10 1 20
11 31 30 41
12 30 31 42
13 31 30 43
14 30 31 44
15 31 30 45
16 30 31 46
17 31 30 47
18 30 31 48
19 31 30 49
20 30 31 50
Upvotes: 0
Reputation: 880
I know this question was posted over 4 years ago but I hope this will help out other userRs who are searching for an answer to this.
table <- [NAME OF THE TABLE YOU WANT TO UPDATE]
x <- [YOUR DATA SET]
# We'll need the column names of the table for our INSERT/UPDATE statement
rs <- dbSendQuery(con, paste0('SHOW COLUMNS FROM ', table, ';'))
col_names <- dbFetch(rs)
dbClearResult(rs)
# Find which columns are primary keys as we shouldn't need to update these
pri <- which(col_names$Key == "PRI")
# For each row in your data frame, build an UPDATE statement and query your db
for(i in 1:nrow(x)) {
# Transform ith row of dataset into character vector
values <- sapply(x[i, ], as.character)
# Build the INSERT/UPDATE query
myquery <- paste0("INSERT INTO ",
table,
"(", paste(col_names$Field, collapse = ", "), ") ", # column names
"VALUES",
"('", paste(values, collapse = "', '"), "') ", # new records
"ON DUPLICATE KEY UPDATE ",
paste(col_names$Field[-pri], values[-pri], sep = " = '", collapse = "', "), # everything minus primary keys
"';")
# Show full query for clarity
cat("Performing query", i, "of", nrow(x), ":\n", myquery, "\n\n")
# Send query to database
dbSendQuery(con, myquery)
}
I just posted my solution to this problem on GitHub if you're looking for a more guided walkthrough.
Upvotes: 1
Reputation: 4568
So, firstly you have the wrong syntax for UPDATE
. In general,
UPDATE table_name
SET column1 = value1, column2 = value2...., columnN = valueN
WHERE [condition];
so you can't build up the values as a concatenated vector as you have done. If you don't select a particular element with the WHERE
you will update the value value1
across all the values in column1
.
EDIT: If you can't match the condition, then you aren't actually updating, you are inserting, which has the forms:
It is possible to write the INSERT INTO statement in two forms.
The first form does not specify the column names where the data will be inserted, only their values:
INSERT INTO table_name
VALUES (value1,value2,value3,...);
The second form specifies both the column names and the values to be inserted:
INSERT INTO table_name (column1,column2,column3,...)
VALUES (value1,value2,value3,...);
If you want to do anything more complicated, you will need to build up the query with SQL, probably in something other than R first, at least to learn. A possibility for experimentation could be SQL fiddle if you aren't comfortable with SQL yet.
Upvotes: 3