binny
binny

Reputation: 729

How do I remove duplicate records in an existing database? (no unique constraint defined)

I'm working on a project where improvements to database schema, query speed, and template inheritance are required. As an example, there is a Menu model that lacks a unique constraint (See below).

To improve the model's data integrity, I'm planning to add a migration by adding a unique=True constraint to the season field.

Before applying the migration, I checked the database for all Menu instances to see if an integrity error could potentially occur. As a result of checking, there are 3 model instances with the same value assigned to season.

I want to remove all but 1 of the Menu instances from the existing database in this case, and it doesn't matter which one is kept. What would be some approaches to accomplishing this?

class Menu(models.Model):
    season = models.CharField(max_length=20)
    items = models.ManyToManyField('Item', related_name='items')
    created_date = models.DateTimeField(
            default=timezone.now)
    expiration_date = models.DateTimeField(
            blank=True, null=True)

    def __str__(self):
        return self.season

Upvotes: 1

Views: 405

Answers (1)

MohammadReza
MohammadReza

Reputation: 407

You can below code if you are using Postgresql:

all_unique_season = Menu.objects.distinct('season').values_list('id', flat=True)
Menu.objects.exclude(id__in=all_unique_season).delete()

Also if you are using other databases, you can use below code:

used_ids = list()
for i in Menu.objects.values('id', 'season'):
    used_ids.append(i['id'])
    Menu.objects.filter(season=i['season']).exclude(id__in=used_ids).delete()

Upvotes: 1

Related Questions