Mark Asbach
Mark Asbach

Reputation: 87

How to denormalize hierarchical Django model data for rendering CSV or Excel with Djange REST Framework?

Assume, we are building an address book application with Django REST Framework and want to render an endpoint that exports all persons. Each person can have one or multiple phone numbers.

Exemplary data could look like this:

[
  {
    'name': 'Jon Doe',
    'phone': 
    [
      {
        'type': 'home',
        'number': '+1 234 5678'
      }
    ]
  },
  {
    'name': 'Jane Doe',
    'phone':
    [
      {
        'type': 'home',
        'number': '+2 345 6789'
      },
      {
        'type': 'work',
        'number': '+3 456 7890'
      }
    ]
  }
]   

As we want to export CSV or Excel tables, we want to denormalize the data so that each phone number gets its own line.

A result could look like this:

name,phone.type,phone.number
Jon Doe,home,+1 234 5678
Jane Doe,home,+2 345 6789
Jane Doe,work,+3 456 7890

The question is, where exactly I would do the denormalization. I see two options:

  1. Write a custom Serializer that does the denormalization. On the upside, this would result in a single change that works for every Renderer, so I could have the endpoint export CSV and Excel using, e.g. djangorestframework-csv and drf-renderer-xlsx. On the downside, this would interfere with renderers that do not benefit from denormalization like JSON or XML.
  2. Derive each Renderer that needs denormalization and override the process_data() method to first denormalize and then invoke the super-class implementation.
  3. Write a custom View that does the denormalization based on the negotiated renderer, like described in https://www.django-rest-framework.org/api-guide/renderers/#varying-behavior-by-media-type.

This seems like an issue that many people could have, as tabular data export is a very common feature. Any hints on where I should start or what would be the best alternative?

Upvotes: 0

Views: 88

Answers (2)

Seth
Seth

Reputation: 46453

Assuming you have models such as:

class Person(models.Model):
    name = models.CharField(max_length=100)
    def __str__(self):
        return self.name

class Phone(models.Model):
    person = models.ForeignKey(Person, on_delete=models.CASCADE)
    type = models.CharField(max_length=100)
    number = models.CharField(max_length=100)
    def __str__(self):
        return self.number

Then create a serializer and viewset for Phone (rather than Person) like so:

class PhoneSerializer(serializers.HyperlinkedModelSerializer):
    person = serializers.StringRelatedField()
    class Meta:
        model = Phone
        fields = ['person', 'type', 'number']

class PhoneViewset(viewsets.ModelViewSet):
    queryset = Phone.objects.all()
    serializer_class = PhoneSerializer

router = routers.DefaultRouter()
router.register(r'phone', PhoneViewset)

Then DRF will produce something like this with its default json renderer:

[
    {"person":"Mufune Toshirō","type":"home","number":"000-123-4567"},
    {"person":"Mufune Toshirō","type":"work","number":"000-345-6789"},
    {"person":"Tōno Eijirō","type":"home","number":"000-234-4567"},
    {"person":"Nakadai Tatsuya","type":"home","number":"000-234-6789"},
    {"person":"Tsukasa Yōko","type":"cell","number":"000-987-6543"}
]

To get CSV output, you can install djangorestframework-csv. It turns the above into this:

number,person,type
000-123-4567,Mufune Toshirō,home
000-345-6789,Mufune Toshirō,work
000-234-4567,Tōno Eijirō,home
000-234-6789,Nakadai Tatsuya,home
000-987-6543,Tsukasa Yōko,cell

On a real database with lots of tables and complicated relationships, I would create a database view from a query that contained all of the fields denormalized how I wanted them to be. Then, I'd create an unmanaged model (something like this) so it could be exported from DRF.

In the above example, the view could be something like this:

CREATE VIEW phonelist AS
    SELECT a.name, b.type, b.number  
    FROM Person a left join Phone b on (a.id = b.person);

Upvotes: 0

Juanjo Salvador
Juanjo Salvador

Reputation: 1093

Assuming you have a model Contact (replace this for whatever model you have), using Pandas you can return a CSV file from a Django ORM QuerySet.

import pandas as pd

from .models import Contact

def export_contacts(self, *args, **kwargs):
    queryset = Contact.objects.all()
    df = pd.DataFrame(list(queryset))
    
    return df.to_csv()

You can add this into a separated module you can call directly from the view, new view for this, or wherever you need it.

Upvotes: 0

Related Questions