Reputation: 904
I work with Kotlin and the following dependencies:
id("io.realm.kotlin") version "1.7.0"
implementation("org.jetbrains.kotlinx:kotlinx-coroutines-core:1.6.4")
implementation("io.realm.kotlin:library-base:1.7.0")
General use case:
I download multiple CSV files, convert them to RealmObjects, and then try to save the list of RealmObjects. Of course, in this case it is possible that RealmObjects with the same PrimaryKey are saved multiple times: e.g. QuantityRealmObject is used in several RealmObjects within the parent (or root) object. I thought the UpdatePolicy.Modified, which does not exist for Kotlin (?), would do exactly that.
As Jay described, upserting is probably the way to go in this case, however I am not sure if modified policy: check for each RealmObject with a PrimaryKey within another RealmObject if it is already managed (Realm.query()); if yes, use the managed object, else insert it.
I am trying to save data to a Realm DB, which works, but I have nested RealmObjects with @PrimaryKeys. Currently my code only works when the UpdatePolicy is set to ALL, which probably leads to a lot of unnecessary updates (and possibly a bigger file size?) but less actual data in the DB than when working with EmbeddedRealmObjects.
EDIT:
My problem is that my Example objects have references to other RealmObjects with PrimaryKeys (e.g. QuantityRealmObject) or references to other RealmObjects which also have references to QuantityRealmObjects. With UpdatePolicy.ALL I have the luxury that I can just call copyToRealm(exampleObject) and all references are saved correctly. If there is a duplicate primary key for a "nested" quantity object reference, it just updates it with the same values, but the references are still OK. If I want to upsert like you suggest, which of course works, I would have to check lots of "nested" realm object references for each copyToRealm(exampleObject) call:
val exampleObject
//query if exampleObject.field1.quantityRealmObject already exists, if not create it, if yes set this field to the already managed instance
//query if exampleObject.quantityRealmObject already exists, if not create it, if yes set this field to the already managed instance
//query if exampleObject.field2.field3.quantityRealmObject already exists, if not create it, if yes set this field to the already managed instance
// ... do that for lots of references
realm.copyToRealm(exampleObject)
//vs.
realm.copyToRealm(exampleObject, UpdatePolicy.ALL)
I like the idea with error handling, however I am not sure how I could set the correct references in the ExampleObject in an error case.
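import io.realm.kotlin.Realm
import io.realm.kotlin.RealmConfiguration
import io.realm.kotlin.UpdatePolicy

fun createRealm(dbName: String, data: List<DataRealmObject>, schema: String) {
    val config = RealmConfiguration.Builder(
        setOf(
            // a few RealmObject classes
        )
    )
        .compactOnLaunch()
        .build()
    val realm = Realm.open(config)
    realm.writeBlocking {
        // insert each object, overwriting any object with the same PrimaryKey
        data.forEach {
            this.copyToRealm(it, UpdatePolicy.ALL)
        }
    }
    realm.close()
}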
When I do not set the UpdatePolicy to ALL, of course I get exceptions stating that an object with the PrimaryKey already exists. Is there a good solution to deal with this without setting the UpdatePolicy to ALL? Ideal would be something like: if an object with the given PrimaryKey does not exist, insert it, else use the already existing object.
I suspect that the massive number of updates on already existing objects has a negative effect on the file size of the Realm DB.
How could I solve this problem? I could query before each copy call if each nested RealmObject already exists, however this would be very complex since there are some basic types which occur in a lot of different fields.
EDIT:
An example object could look like this:
class Example : RealmObject {
    var field1: String = ""
    var anotherRealmObjectRef: Quantity? = null
    var anotherRealmObjectRef2: Another? = null
    // other fields which can contain references to objects with PrimaryKeys
}

class Quantity : RealmObject {
    @PrimaryKey
    var id: String = ""
    var value: Double = 0.0
    var unit: String = ""
    // constructor sets id to e.g. value_unit
}

class Another : RealmObject {
    // other fields
    var price: Quantity? = null
}
So as I said, I download CSV files with data and convert each row to, in this case, Example realm objects. For each of those objects I must also create Quantity objects in multiple fields. I added an id field as PrimaryKey to Quantity because in reality I create maybe 1 million Example objects, but there will be only 10k unique Quantity objects. So I only want unique Quantity instances in my Realm DB to save space and keep the file size small. I could potentially check, before I create each Quantity object, whether there is currently another Example object which already contains Quantity objects with this PrimaryKey, like you showed in your code example. Due to the somewhat complex class structure this would result in a lot of code and I am not sure if that is really feasible or good to do.
UpdatePolicy.ALL basically solves this for me, because the resulting Realm DB only consists of unique Quantity objects. However, it probably does a lot of unnecessary updates on those objects.
The only real problem for me currently is that the resulting Realm DB has an unexpected file size (currently around 400-500 MB). A comparable Realm DB created with the Swift SDK has around 200 MB. If this is due to the mass updates (resulting in a lot of object versions?) it would be worth it for me to solve the issue.
Upvotes: 1
Views: 1016
Reputation: 35657
There are several questions within the question so let me try to tackle them all. I prefer including code in answers but perhaps some clarity about how Realm works would be more beneficial. Some of this answer is IMO so evaluate accordingly.
TL;DR - skip to the Edit
The code in the question doesn't work as is because it's trying to brute force add an object that has the same primary key as an existing object; primary keys must be unique, so having two objects with the same primary key is not allowed.
The difference between .all and .modified is in how the data is written (keep reading: Upsert, which may be an answer).
.all forces all properties of an object to be rewritten, whether they have changed or not. That's a whole lot of data to push around and I would find use cases for this kinda rare.
.modified only writes out fields that have been modified, so in general it's far less data and the preferred option. It will also allow for an upsert, which is what you're attempting to do.
.error prevents updating an existing object: it will throw an error if an object with the same primary key already exists.
Upserting is the process where, if an object exists, it will be updated; if it does not exist, it will be inserted. To cause this behavior, when an object is being manipulated, set the update flag to .modified and it will magically be inserted if needed, otherwise just the modified fields will be updated on the existing object. Note that you can also partially update an object by passing the primary key and a subset of the values to update.
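For the Kotlin SDK used in the question, a minimal upsert sketch could look like the following. It assumes, as the question itself suspects, that the Kotlin `UpdatePolicy` only offers `ERROR` and `ALL`, so `ALL` stands in for `.modified` here. The `Quantity` model mirrors the one in the question and `upsertQuantity` is a hypothetical helper name; treat this as an untested illustration rather than the definitive way to do it.

```kotlin
import io.realm.kotlin.Realm
import io.realm.kotlin.UpdatePolicy
import io.realm.kotlin.types.RealmObject
import io.realm.kotlin.types.annotations.PrimaryKey

// Model mirroring the Quantity class from the question.
class Quantity : RealmObject {
    @PrimaryKey
    var id: String = ""
    var value: Double = 0.0
    var unit: String = ""
}

// Hypothetical helper: insert the object if its primary key is new,
// otherwise overwrite the properties of the existing object.
fun upsertQuantity(realm: Realm, quantity: Quantity) {
    realm.writeBlocking {
        copyToRealm(quantity, UpdatePolicy.ALL)
    }
}
```

Calling `upsertQuantity` twice with the same `id` should leave a single `Quantity` in the realm, with the values of the second call.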
The question mentions "nested objects" and that's a bit ambiguous (IMO) when it comes to Realm. Unfortunately the documentation kind of mixes 'nested' in, so that can lead to confusion.
Nested: In a tree sits a bird's nest with eggs. The eggs are nested; they are part of the nest and exist within and as part of the nest; they do not exist in other nests, only that nest.
Objects that are managed (and have a primary key) and are added to another object are not really "nested" - they do not become part of the parent object, as they are stored by reference. The two objects are both managed and independent of each other, can exist without the other and, in the case of a referenced object, can be referenced from multiple other objects (so, not really nested).
Embedded objects on the other hand are more akin to a 'nested' object; they are not managed separately, do not/can not contain a primary key and are part of the parent object's graph.
Updating an embedded (nested) object is done through dot notation starting with the parent, e.g. parentObject.embeddedChildToUpdate.fieldToUpdate, and is not done using .modified or .all (in this context) since the field is being written directly. (Embedded objects would never be upserted, since they cannot exist without the parent.)
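As a rough Kotlin illustration of that dot-notation update, here is a small sketch. The `Contact` and `Address` classes and the `updateStreet` helper are made up for this example and do not come from the question; the query uses the `query(KClass, String, vararg args)` form available inside a write block.

```kotlin
import io.realm.kotlin.Realm
import io.realm.kotlin.types.EmbeddedRealmObject
import io.realm.kotlin.types.RealmObject
import io.realm.kotlin.types.annotations.PrimaryKey

// Hypothetical embedded type: no primary key, lives only inside its parent.
class Address : EmbeddedRealmObject {
    var street: String = ""
}

// Hypothetical parent type that owns the embedded object.
class Contact : RealmObject {
    @PrimaryKey
    var id: String = ""
    var address: Address? = null
}

fun updateStreet(realm: Realm, contactId: String, newStreet: String) {
    realm.writeBlocking {
        // Look up the live parent inside the write transaction.
        val contact = query(Contact::class, "id == $0", contactId).first().find()
        // Write the embedded field directly; no UpdatePolicy is involved.
        contact?.address?.street = newStreet
    }
}
```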
It doesn't appear you're using embedded objects - everything seems to be by reference so a bit OT but I hope that helps.
Edit
If the goal is to persist objects that have a unique primary key and to ignore those that are duplicates, this should do it. Attempt to read an object with a given primary key, if it does not exist, persist a new object; if it does exist, ignore it and move on to the next one.
// widgetList is assumed to be the list of unmanaged widgets to import
for (incoming in widgetList) {
    realm.write {
        // fetch the widget via its primary key
        val existing = query(WidgetClass::class, "_id == $0", incoming._id).first().find()
        // if there is no widget with that primary key, persist it
        if (existing == null) {
            copyToRealm(incoming)
        }
        // otherwise a widget with that primary key exists,
        // so don't persist it (e.g. ignore it)
    }
}
This process does not need .all or .modified or even an upsert.
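The same read-then-persist idea can be stretched to the referenced Quantity objects from the question without writing one query per field by hand. The sketch below is untested and makes a few assumptions: the model classes are trimmed copies of the ones in the question, `managedQuantity` and `saveExamples` are hypothetical helper names, and an in-memory map is used so each primary key is queried at most once per transaction. It relies on `copyToRealm` linking to a reference that is already managed in the same write transaction instead of copying it again.

```kotlin
import io.realm.kotlin.MutableRealm
import io.realm.kotlin.Realm
import io.realm.kotlin.types.RealmObject
import io.realm.kotlin.types.annotations.PrimaryKey

class Quantity : RealmObject {
    @PrimaryKey
    var id: String = ""
    var value: Double = 0.0
    var unit: String = ""
}

class Another : RealmObject {
    var price: Quantity? = null
}

class Example : RealmObject {
    var field1: String = ""
    var anotherRealmObjectRef: Quantity? = null
    var anotherRealmObjectRef2: Another? = null
}

// Hypothetical helper: return the managed Quantity for this primary key,
// reusing a cached or stored instance and inserting only on first sight.
fun MutableRealm.managedQuantity(
    quantity: Quantity?,
    cache: MutableMap<String, Quantity>
): Quantity? {
    if (quantity == null) return null
    return cache.getOrPut(quantity.id) {
        query(Quantity::class, "id == $0", quantity.id).first().find()
            ?: copyToRealm(quantity)
    }
}

fun saveExamples(realm: Realm, examples: List<Example>) {
    realm.writeBlocking {
        val cache = HashMap<String, Quantity>()
        for (example in examples) {
            // Swap every Quantity reference for its managed counterpart.
            example.anotherRealmObjectRef =
                managedQuantity(example.anotherRealmObjectRef, cache)
            example.anotherRealmObjectRef2?.let {
                it.price = managedQuantity(it.price, cache)
            }
            // Default policy (UpdatePolicy.ERROR): no existing object is rewritten.
            copyToRealm(example)
        }
    }
}
```

Each additional field that holds a Quantity needs one more line in `saveExamples`, so the amount of code grows with the number of reference paths, not with the number of objects.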
The other option is to attempt to write each object - if an object with that primary key already exists, an error will be thrown. Handle the error elegantly (pretty much do nothing) and then move on to the next object.
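A sketch of that error-handling route in Kotlin might look like this. It assumes that `copyToRealm` with the default `UpdatePolicy.ERROR` reports a duplicate primary key as an `IllegalArgumentException`; the `Widget` class and `insertIgnoringDuplicates` are hypothetical names. It is probably best kept to flat objects: if the duplicate sits in a referenced object, the parent may already be partly written when the exception is raised.

```kotlin
import io.realm.kotlin.Realm
import io.realm.kotlin.types.RealmObject
import io.realm.kotlin.types.annotations.PrimaryKey

// Hypothetical flat model with a primary key and no object references.
class Widget : RealmObject {
    @PrimaryKey
    var id: String = ""
    var name: String = ""
}

fun insertIgnoringDuplicates(realm: Realm, widgets: List<Widget>) {
    realm.writeBlocking {
        for (widget in widgets) {
            try {
                // Default policy is UpdatePolicy.ERROR.
                copyToRealm(widget)
            } catch (e: IllegalArgumentException) {
                // Assumed to mean "primary key already exists":
                // skip this widget and move on to the next one.
            }
        }
    }
}
```

Note that the catch block is broad: the same exception type can signal other invalid arguments, so logging the message during development helps confirm that only duplicates are being swallowed.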
Upvotes: 1