Peter
Peter

Reputation: 1253

Bulk upsert with Django

bulk_create with ignore_conflicts=True insert new records only and doesn't update existing one. bulk_update updates only existing records and doesn't insert new one.

Now I see only one variant to make upsert is a raw query:

from catalog.models import Product

with connection.cursor() as cursor:
    cursor.executemany(
        """
        INSERT INTO app_table (pk, col1, col2) 
        VALUES (%s, %s, %s)
        ON DUPLICATE KEY UPDATE 
            col1 = VALUES(col1), 
            col2 = VALUES(col2); 
        """,
        [
            ('1', 'val1', 'val2'),
            ('2', 'val1', 'val2'),
        ]
    )

Is there another way to perform bulk upsert in Django?

Upvotes: 10

Views: 6655

Answers (3)

João Carvalho
João Carvalho

Reputation: 1

Mysql have REPLACE INTO function, very useful for replacing data and avoid constraints violation https://dev.mysql.com/doc/refman/8.0/en/replace.html

Upvotes: 0

Lord Elrond
Lord Elrond

Reputation: 16072

As of Django 4.1, the bulk_create method supports upserts via update_conflicts:

MyModel.objects.bulk_create(
    queryset, 
    update_conflicts=True, 
    update_fields=['col1', 'col2'],
    unique_fields=['col3'],
)

Upvotes: 6

abdullahalali
abdullahalali

Reputation: 406

There isn't a traditional method to use create and update in a single query in MySQL. I see that you've used INSERT ... ON DUPLICATE KEY UPDATE which is a common approach in these cases. An alternative lengthy approach:

  • Update a portion of your data in a temporary table, essentially COPY to your temp table -> cursor.copy_from(), then UPDATE to your original table.

Also, you could optimize performance by increasing your cache size, and setting DEBUG=False if you don't need metadeta in your logs Ref.

Perhaps also test the same data using PostgreSQL, and check if it results in better performance.

Upvotes: 1

Related Questions