Canovice

Reputation: 10173

Use R's mongolite to correctly (insert? update?) add data to an existing collection

I have the following function written in R that (I think) is doing a poor job of updating the collections in my MongoDB database.

library(mongolite) 

con <- mongolite::mongo(collection = "mongo_collection_1", db = 'mydb', url = 'myurl')
myRdataframe1 <- con$find(query = '{}', fields = '{}')
rm(con)

con <- mongolite::mongo(collection = "mongo_collection_2", db = 'mydb', url = 'myurl')
myRdataframe2 <- con$find(query = '{}', fields = '{}')
rm(con)

... code to update my dataframes (rbind additional rows onto each of them) ...

# write dataframes to database
write.dfs.to.mongodb.collections <- function() {

  collections <- c("mongo_collection_1", "mongo_collection_2") 
  my.dataframes <- c("myRdataframe1", "myRdataframe2")

  # loop over dataframes, write collections
  for(i in 1:length(collections)) {

    # connect and add data to this table
    con <- mongo(collection = collections[i], db = 'mydb', url = 'myurl')
    con$remove('{}')
    con$insert(get(my.dataframes[i]))
    con$count()

    rm(con)
  }
}
write.dfs.to.mongodb.collections()

My dataframes myRdataframe1 and myRdataframe2 are very large, currently ~100K rows and ~50 columns each. Each time my script runs, it reads both collections into dataframes, rbinds additional rows onto them, and then wipes each collection and re-inserts the whole dataframe.

That last step is iffy, because I run this R script daily in a cronjob and I don't like that each time I am entirely wiping the MongoDB collection and re-inserting the R dataframe into it.

If I remove the con$remove() line, I receive an error that states I have duplicate _id keys. It appears I cannot simply append using con$insert().

Any thoughts on this are greatly appreciated!

Upvotes: 5

Views: 1762

Answers (2)

Mohammed Essehemy

Reputation: 2176

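You can use an upsert, which looks for a document matching the query, updates it if one is found, and inserts a new one otherwise. First you need to separate the _id from the rest of each document:

 df <- get(my.dataframes[i])   # my.dataframes holds names, so fetch the data frame itself
 ids <- df[["_id"]]
 updateData <- df
 updateData[["_id"]] <- NULL

Then call update with upsert = TRUE, one row at a time (there might be an easier way to build the JSON strings in R; this assumes _id is stored as a plain string):

 for (k in seq_len(nrow(df))) {
   set <- jsonlite::toJSON(as.list(updateData[k, , drop = FALSE]), auto_unbox = TRUE)
   con$update(paste0('{"_id":"', ids[k], '"}'), paste0('{"$set":', set, '}'), upsert = TRUE)
 }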
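As a rough sketch of the upsert idea above, the query and update documents can be built with jsonlite instead of pasted together by hand. The helper names build_upsert and upsert_rows and the oid flag are made up for illustration and are not part of mongolite. The sketch assumes _id is either a plain string or an ObjectId hex string, and that con is an existing mongolite connection.

```r
# Hypothetical helpers; names and arguments are illustrative, not part of mongolite.

# Build the query and update JSON for a single-row data frame.
# oid = TRUE assumes `_id` holds ObjectId hex strings; FALSE assumes plain strings.
build_upsert <- function(row, oid = FALSE) {
  id <- as.character(row[["_id"]])
  fields <- as.list(row[, setdiff(names(row), "_id"), drop = FALSE])
  id_value <- if (oid) list(`$oid` = id) else id
  list(
    query = as.character(
      jsonlite::toJSON(list(`_id` = id_value), auto_unbox = TRUE)
    ),
    # digits = NA keeps full numeric precision (the jsonlite default is 4 digits)
    update = as.character(
      jsonlite::toJSON(list(`$set` = fields), auto_unbox = TRUE, digits = NA)
    )
  )
}

# Upsert every row of `df` through `con`, one update call per row.
upsert_rows <- function(con, df, oid = FALSE) {
  for (k in seq_len(nrow(df))) {
    u <- build_upsert(df[k, , drop = FALSE], oid = oid)
    con$update(u$query, u$update, upsert = TRUE)
  }
  invisible(nrow(df))
}

# Usage (needs a reachable MongoDB server, so it is left commented out):
# con <- mongolite::mongo(collection = "mongo_collection_1", db = "mydb", url = "myurl")
# upsert_rows(con, myRdataframe1)
```

One update call per row means one round trip per row, so for ~100K rows this may be slow. It would probably make sense to pass only the rows that were added or changed since the last run.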

Upvotes: 0

dnickless

Reputation: 10918

When you attempt to insert documents whose primary key (_id) already exists in the collection, MongoDB raises a duplicate key exception. To work around that, you can simply unset the _id column before the con$insert, using something like this:

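df <- get(my.dataframes[i])   # my.dataframes holds names, not the data frames themselves
df[["_id"]] <- NULL           # then call con$insert(df)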

This way, the newly inserted document will automatically get a new _id assigned.
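Note that dropping _id from every row and inserting the whole data frame without con$remove would also re-insert the rows that are already stored, each under a fresh _id. One possible way around this is to insert only the rows that are not in the collection yet. This is a minimal sketch under assumptions: the function name new_rows_only is made up, and it assumes the rbind-ed rows either have NA in _id or an _id that does not occur in the collection.

```r
# Hypothetical helper; not part of mongolite.
# Return the rows of `df` whose `_id` is NA or not found in `existing_ids`,
# with the `_id` column dropped so MongoDB assigns fresh ids on insert.
new_rows_only <- function(df, existing_ids) {
  if (!"_id" %in% names(df)) {
    return(df)  # nothing to compare against; treat every row as new
  }
  ids <- df[["_id"]]
  is_new <- is.na(ids) | !(ids %in% existing_ids)
  out <- df[is_new, , drop = FALSE]
  out[["_id"]] <- NULL
  out
}

# Usage (needs a reachable MongoDB server, so it is left commented out):
# con <- mongolite::mongo(collection = "mongo_collection_1", db = "mydb", url = "myurl")
# existing_ids <- con$find(query = '{}', fields = '{"_id": 1}')[["_id"]]
# con$insert(new_rows_only(myRdataframe1, existing_ids))
```

This only appends. Rows that were modified in R but already exist in the collection are left untouched, so changed rows would still need an update or upsert.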

Upvotes: 2
