Reputation: 11
I want to insert unique objects into OrientDB. To avoid duplicates, I currently run a query first and create the object only if it does not already exist. I have billions of objects to insert, and this is taking a very long time. How can I avoid duplicate objects on insertion and still get good performance?
Here is a sample of my code (I'm using pyorient, by the way):
# creation object Address src
query_ip_src = client.query("select @rid from `Address` where address_value = '" + log_value[2] + "' parallel")
if len(query_ip_src) == 0:
    ip_src = Address()
    ip_src.address_value = log_value[2]
    ip_src_record = client.record_create(clusters[b'address'], ip_src.to_dict())
    ip_src_rid = str(ip_src_record._rid)
else:
    ip_src_rid = "#" + str(query_ip_src[0].rid.get())
Upvotes: 0
Views: 330
Reputation: 2814
OrientDB has an UPDATE ... UPSERT SQL statement for this, e.g.:
UPDATE Address SET address_value = ?, otherField = ? UPSERT WHERE address_value = ?
Just make sure you have a unique index on the relevant unique fields (address_value in this case). UPSERT only works atomically when such an index exists, and the index itself prevents duplicate data.
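As a rough sketch of how this could look from pyorient, the lookup and the insert can be collapsed into one round trip per address. The helper names (sql_quote, build_upsert_sql, upsert_address) are made up for illustration, the quoting is a simple backslash escape that I am assuming is sufficient for your values, and I have not checked what shape pyorient returns for RETURN AFTER @rid, so inspect the result before relying on it.

```python
# One-time setup: a unique index on the field that must not repeat.
UNIQUE_INDEX_SQL = "CREATE INDEX Address.address_value UNIQUE"


def sql_quote(value):
    """Return value as a single-quoted SQL string literal (simple escaping)."""
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def build_upsert_sql(address_value):
    """Build an UPDATE ... UPSERT statement keyed on address_value."""
    literal = sql_quote(address_value)
    return (
        "UPDATE Address SET address_value = " + literal
        + " UPSERT RETURN AFTER @rid WHERE address_value = " + literal
    )


def upsert_address(client, address_value):
    """Send the upsert through the client's command() method.

    `client` is expected to be a connected pyorient client with the
    database already opened; the raw result is returned unchanged.
    """
    return client.command(build_upsert_sql(address_value))
```

With a connected client you would run client.command(UNIQUE_INDEX_SQL) once, then call upsert_address(client, log_value[2]) in place of the query-then-create block from the question.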
Upvotes: 1