Reputation: 10173
I have the following function written in R that (I think) is doing a poor job of updating my MongoDB collections.
library(mongolite)
con <- mongolite::mongo(collection = "mongo_collection_1", db = 'mydb', url = 'myurl')
myRdataframe1 <- con$find(query = '{}', fields = '{}')
rm(con)
con <- mongolite::mongo(collection = "mongo_collection_2", db = 'mydb', url = 'myurl')
myRdataframe2 <- con$find(query = '{}', fields = '{}')
rm(con)
... code to update my dataframes (rbind additional rows onto each of them) ...
# write dataframes to database
write.dfs.to.mongodb.collections <- function() {
  collections <- c("mongo_collection_1", "mongo_collection_2")
  my.dataframes <- c("myRdataframe1", "myRdataframe2")
  # loop over dataframes, write collections
  for (i in seq_along(collections)) {
    # connect and add data to this collection
    con <- mongo(collection = collections[i], db = 'mydb', url = 'myurl')
    con$remove('{}')
    con$insert(get(my.dataframes[i]))
    con$count()
    rm(con)
  }
}
write.dfs.to.mongodb.collections()
My dataframes myRdataframe1 and myRdataframe2 are very large dataframes, currently ~100K rows and ~50 columns. Each time my script runs, it:
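- reads the collection from MongoDB into myRdataframe1
- rbinds new rows onto myRdataframe1
- wipes the collection and writes myRdataframe1 back to it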
This last bullet point is iffy, because I run this R script daily as a cron job, and I don't like that each run wipes the entire MongoDB collection and re-inserts the R dataframe into it.
If I remove the con$remove() line, I receive an error that states I have duplicate _id keys. It appears I cannot simply append using con$insert().
Any thoughts on this are greatly appreciated!
Upvotes: 5
Views: 1762
Reputation: 2176
You can use an upsert, which looks for a document matching the query, updates it if one is found, and inserts a new one otherwise. First, separate the _id from the rest of the data:
df <- get(my.dataframes[i])
ids <- df$`_id`
updateData <- df
updateData$`_id` <- NULL
Then upsert row by row, building the JSON with jsonlite (this assumes _id is stored as a plain string; there might be an easier way to build the strings in R):
for (j in seq_along(ids)) {
  con$update(sprintf('{"_id":"%s"}', ids[j]), sprintf('{"$set":%s}', jsonlite::toJSON(as.list(updateData[j, , drop = FALSE]), auto_unbox = TRUE)), upsert = TRUE)
}
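If the collection's _id values are ObjectIds rather than plain strings (which is what MongoDB generates by default), a string match on _id will probably find nothing, and the query would need the {"$oid": ...} form instead. Below is a minimal sketch of building the query and update strings for one row so they can be inspected before anything is sent. The helper name build_upsert and the sample data are hypothetical, and it assumes _id comes back from con$find() as a 24-character hex string.

```r
library(jsonlite)

# Hypothetical helper: build the upsert query and update JSON for a single row.
# Assumes `_id` holds the hex string form of an ObjectId.
build_upsert <- function(row) {
  id <- row[["_id"]]
  # every column except `_id` goes into the $set document
  fields <- as.list(row[setdiff(names(row), "_id")])
  list(
    query  = sprintf('{"_id":{"$oid":"%s"}}', id),
    update = sprintf('{"$set":%s}', as.character(toJSON(fields, auto_unbox = TRUE)))
  )
}

# Made-up one-row dataframe standing in for a row of myRdataframe1
df <- data.frame(`_id` = "5f1a2b3c4d5e6f7a8b9c0d1e", name = "a", value = 1,
                 check.names = FALSE, stringsAsFactors = FALSE)
cmd <- build_upsert(df[1, , drop = FALSE])

# With a live mongolite connection `con` (not created here), the row would be sent as:
# con$update(cmd$query, cmd$update, upsert = TRUE)
```

Upserting ~100K rows one at a time is likely to be slow, so it may be worth restricting the loop to rows that are new or have actually changed.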
Upvotes: 0
Reputation: 10918
When you attempt to insert documents into MongoDB that already exist in the database as per their primary key, you will get the duplicate key exception. To work around that, you can simply unset the _id column before the con$insert, using something like this:
df <- get(my.dataframes[i])
df$`_id` <- NULL
This way, each newly inserted document will automatically get a new _id assigned.
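Note that dropping _id from the whole dataframe and inserting it without con$remove() would write the existing rows a second time under fresh ids. A sketch of one way around that is to insert only the rows that were appended since the last read. It assumes (this is not stated in the question) that the rbind-ed rows carry NA in the _id column; the helper name rows_to_insert and the sample data are hypothetical.

```r
# Hypothetical helper: keep only rows with no `_id` yet, and drop the `_id` column
# so MongoDB assigns a fresh one on insert.
rows_to_insert <- function(df) {
  is_new <- is.na(df[["_id"]])
  df[is_new, setdiff(names(df), "_id"), drop = FALSE]
}

# Made-up data: two rows previously read from the collection, one row appended in R
existing <- data.frame(`_id` = c("a1", "b2"), value = c(1, 2),
                       check.names = FALSE, stringsAsFactors = FALSE)
added <- data.frame(`_id` = NA_character_, value = 3,
                    check.names = FALSE, stringsAsFactors = FALSE)
combined <- rbind(existing, added)

new_rows <- rows_to_insert(combined)

# With a live mongolite connection `con` (not created here):
# con$insert(new_rows)
```

This only appends; rows that were modified in R after being read would still need an update or upsert.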
Upvotes: 2