Reputation: 22826
Django's QuerySet
has two methods, annotate
and aggregate
. The documentation says that:
Unlike aggregate(), annotate() is not a terminal clause. The output of the annotate() clause is a QuerySet. https://docs.djangoproject.com/en/4.1/topics/db/aggregation/#generating-aggregates-for-each-item-in-a-queryset
Is there any other difference between them? If not, then why does aggregate
exist?
Upvotes: 196
Views: 73891
Reputation: 1
aggregate() can calculates all values of the current model's column. *A dictionary is returned.
annotate() can calculates all values of the child model's column accessed by foreign key. *A QuerySet is returned.
*Avg(), Count(), Max(), Min(), Sum() and so on can be used with aggregate()
and annotate()
.
For example, there are Category
and Product
models below:
# "models.py"
from django.db import models
class Category(models.Model):
name = models.CharField(max_length=20)
class Product(models.Model):
category = models.ForeignKey(Category, on_delete=models.CASCADE)
name = models.CharField(max_length=50)
price = models.DecimalField(decimal_places=2, max_digits=5)
And, there are Category
and Product
admins below:
# "admin.py"
from django.contrib import admin
from .models import Category, Product
@admin.register(Category)
class CategoryAdmin(admin.ModelAdmin):
list_display = ('id', 'name')
ordering = ('id',)
@admin.register(Product)
class ProductAdmin(admin.ModelAdmin):
list_display = ('id', 'category_id', 'category', 'name', 'price')
ordering = ('id',)
And, there are 2 categories below:
And, there are 5 products below:
And, there is test
view below:
# "views.py"
from .models import Category, Product
from django.http import HttpResponse
from django.db.models import Avg
from django.db.models import Count
from django.db.models import Max
from django.db.models import Min
from django.db.models import Sum
def test(request):
return HttpResponse("Test")
First, I explain about aggregate()
.
Now, I run test
view which has the columns id
, category
and price
in Avg(), Count(), Max(), Min() and Sum() in aggregate()
as shown below:
# "views.py"
# ...
def test(request):
print(Product.objects.aggregate(Avg('id')))
print(Product.objects.aggregate(Count('id')))
print(Product.objects.aggregate(Max('id')))
print(Product.objects.aggregate(Min('id')))
print(Product.objects.aggregate(Sum('id')))
print()
print(Product.objects.aggregate(Avg('category')))
print(Product.objects.aggregate(Count('category')))
print(Product.objects.aggregate(Max('category')))
print(Product.objects.aggregate(Min('category')))
print(Product.objects.aggregate(Sum('category')))
print()
print(Product.objects.aggregate(Avg('price')))
print(Product.objects.aggregate(Count('price')))
print(Product.objects.aggregate(Max('price')))
print(Product.objects.aggregate(Min('price')))
print(Product.objects.aggregate(Sum('price')))
return HttpResponse("Test")
Then, these dictionaries below are outputted on console:
{'id__avg': 3.0}
{'id__count': 5}
{'id__max': 5}
{'id__min': 1}
{'id__sum': 15}
{'category__avg': 1.4}
{'category__count': 5}
{'category__max': 2}
{'category__min': 1}
{'category__sum': 7}
{'price__avg': Decimal('30.0000000000000000')}
{'price__count': 5}
{'price__max': Decimal('50.00')}
{'price__min': Decimal('10.00')}
{'price__sum': Decimal('150.00')}
And, aggregate()
can accept multiple kinds of columns and functions in any order, the multiple same kind of columns and functions and no columns and functions as shown below. *The multiple same kind of columns and functions are made into one and no columns and functions get an empty dictionary:
# "views.py"
# ...
def test(request):
# Multiple kinds of columns and functions in any order
print(
Product.objects.aggregate(
Max('price'), Max('category'), Sum('id'), Min('id')
)
)
# The multiple same kind of columns and functions
print(
Product.objects.aggregate(
Sum('price'), Sum('price'), Sum('price')
)
)
# No columns and functions
print(Product.objects.aggregate())
return HttpResponse("Test")
Then, these dictionaries below are outputted on console:
{'price__max': Decimal('50.00'), 'category__max': 2, 'id__sum': 15, 'id__min': 1}
{'price__sum': Decimal('150.00')}
{}
And, Max()
and Min()
below can accept non-numeric types:
# "views.py"
# ...
def test(request):
print(Product.objects.aggregate(Count('name')))
print(Product.objects.aggregate(Max('name')))
print(Product.objects.aggregate(Min('name')))
return HttpResponse("Test")
Then, these dictionaries below are outputted on console:
{'name__count': 5}
{'name__max': 'Tea'}
{'name__min': 'Apple'}
But, Avg()
and Sum()
below cannot accept non-numeric types:
# "views.py"
# ...
def test(request):
print(Product.objects.aggregate(Avg('name')))
print(Product.objects.aggregate(Sum('name')))
return HttpResponse("Test")
So, the errors below occur:
django.db.utils.ProgrammingError: function avg(character varying) does not exist
django.db.utils.ProgrammingError: function sum(character varying) does not exist
And, you can change the default key names as shown below:
# "views.py"
# ...
def test(request):
print(Product.objects.aggregate(priceAve=Avg('price')))
print(Product.objects.aggregate(priceCount=Count('price')))
print(Product.objects.aggregate(priceMax=Max('price')))
print(Product.objects.aggregate(priceMin=Min('price')))
print(Product.objects.aggregate(priceSum=Sum('price')))
return HttpResponse("Test")
Then, the default key names are changed as shown below:
{'priceAve': Decimal('30.0000000000000000')}
{'priceCount': 5}
{'priceMax': Decimal('50.00')}
{'priceMin': Decimal('10.00')}
{'priceSum': Decimal('150.00')}
Next, I explain about annotate()
.
Now, I run test
view which has the columns product__id
, product__category
and product__price
in Avg()
, Count()
, Max()
, Min()
and Sum()
in annotate()
as shown below. *You need to put __avg
, __count
, __max
, __min
and __sum
to product__id
, product__category
and product__price
for Avg()
, Count()
, Max()
, Min()
and Sum()
respectively:
# "views.py"
# ...
def test(request):
qs = Category.objects.annotate(
Avg('product__id'),
Count('product__id'),
Max('product__id'),
Min('product__id'),
Sum('product__id')
).order_by('pk')
for obj in qs:
print(
obj.id,
obj.name,
obj.product__id__avg,
obj.product__id__count,
obj.product__id__max,
obj.product__id__min,
obj.product__id__sum
)
print()
qs = Category.objects.annotate(
Avg('product__category'),
Count('product__category'),
Max('product__category'),
Min('product__category'),
Sum('product__category')
).order_by('pk')
for obj in qs:
print(
obj.id,
obj.name,
obj.product__category__avg,
obj.product__category__count,
obj.product__category__max,
obj.product__category__min,
obj.product__category__sum
)
print()
qs = Category.objects.annotate(
Avg('product__price'),
Count('product__price'),
Max('product__price'),
Min('product__price'),
Sum('product__price')
).order_by('pk')
for obj in qs:
print(
obj.id,
obj.name,
obj.product__price__avg,
obj.product__price__count,
obj.product__price__max,
obj.product__price__min,
obj.product__price__sum
)
return HttpResponse("Test")
Then, these below are outputted on console:
1 Food 2.0 3 3 1 6
2 Drink 4.5 2 5 4 9
1 Food 1.0 3 1 1 3
2 Drink 2.0 2 2 2 4
1 Food 20.0000000000000000 3 30.00 10.00 60.00
2 Drink 45.0000000000000000 2 50.00 40.00 90.00
And, the query without order_by('pk') below makes the order descendant:
# "views.py"
# ...
def test(request):
qs = Category.objects.annotate(
Avg('product__price'),
Count('product__price'),
Max('product__price'),
Min('product__price'),
Sum('product__price')
) # Without ".order_by('pk')"
for obj in qs:
print(
obj.id,
obj.name,
obj.product__price__avg,
obj.product__price__count,
obj.product__price__max,
obj.product__price__min,
obj.product__price__sum
)
return HttpResponse("Test")
Then, the order is descendant as shown below:
2 Drink 4.5 2 5 4 9
1 Food 2.0 3 3 1 6
And, empty annotate()
below has id
and name
attributes:
# "views.py"
# ...
def test(request):
# Empty "annotate()"
qs = Category.objects.annotate().order_by('pk')
for obj in qs:
print(obj.id, obj.name)
return HttpResponse("Test")
Then, these below are outputted on console:
1 Food
2 Drink
But, empty annotate()
below doesn't have __avg
, __count
, __max
, __min
and __sum
attributes as shown below:
# "views.py"
# ...
def test(request):
# Empty "annotate()"
qs = Category.objects.annotate().order_by('pk')
for obj in qs:
print(
obj.product__price__avg,
obj.product__price__count,
obj.product__price__max,
obj.product__price__min,
obj.product__price__sum,
)
return HttpResponse("Test")
So, the errors below occur:
AttributeError: 'Category' object has no attribute 'product__price__avg'
AttributeError: 'Category' object has no attribute 'product__price__count'
AttributeError: 'Category' object has no attribute 'product__price__max'
AttributeError: 'Category' object has no attribute 'product__price__min'
AttributeError: 'Category' object has no attribute 'product__price__sum'
And, Max()
and Min()
below can accept non-numeric types:
# "views.py"
# ...
def test(request):
qs = Category.objects.annotate(
Count('product__name'),
Max('product__name'),
Min('product__name'),
).order_by('pk')
for obj in qs:
print(
obj.product__name__count,
obj.product__name__max,
obj.product__name__min,
)
return HttpResponse("Test")
Then, these below are outputted on console:
3 Orange Apple
2 Tea Milk
But, Avg()
and Sum()
below cannot accept non-numeric types:
# "views.py"
# ...
def test(request):
qs = Category.objects.annotate(
Avg('product__name'),
Sum('product__name')
).order_by('pk')
for obj in qs:
print(
obj.product__name__avg,
obj.product__name__sum
)
return HttpResponse("Test")
So, the errors below occur:
django.db.utils.ProgrammingError: function avg(character varying) does not exist
django.db.utils.ProgrammingError: function sum(character varying) does not exist
And, you can change the default attribute names as shown below:
# "views.py"
# ...
def test(request):
qs = Category.objects.annotate(
productPriceAvg=Avg('product__price'),
productPriceCount=Count('product__price'),
productPriceMax=Max('product__price'),
productPriceMin=Min('product__price'),
productPriceSum=Sum('product__price')
).order_by('pk')
for obj in qs:
print(
obj.id,
obj.name,
obj.productPriceAvg,
obj.productPriceCount,
obj.productPriceMax,
obj.productPriceMin,
obj.productPriceSum
)
return HttpResponse("Test")
Then, these below are outputted on console:
1 Food 20.0000000000000000 3 30.00 10.00 60.00
2 Drink 45.0000000000000000 2 50.00 40.00 90.00
Upvotes: 9
Reputation: 1307
Aggregate Aggregate generate result (summary) values over an entire QuerySet. Aggregate operate over the rowset to get a single value from the rowset.(For example sum of all prices in the rowset). Aggregate is applied on entire QuerySet and it generate result (summary) values over an entire QuerySet.
In Model:
class Books(models.Model):
name = models.CharField(max_length=100)
pages = models.IntegerField()
price = models.DecimalField(max_digits=5, decimal_places=3)
In Shell:
>>> Books.objects.all().aggregate(Avg('price'))
# Above code will give the Average of the price Column
>>> {'price__avg': 34.35}
Annotate Annotate generate an independent summary for each object in a QuerySet.(We can say it iterate each object in a QuerySet and apply operation)
In Model:
class Video(models.Model):
name = models.CharField(max_length=52, verbose_name='Name')
video = models.FileField(upload_to=document_path, verbose_name='Upload
video')
created_by = models.ForeignKey(User, verbose_name='Created by',
related_name="create_%(class)s")
user_likes = models.ManyToManyField(UserProfile, null=True,
blank=True, help_text='User can like once',
verbose_name='Like by')
In View:
videos = Video.objects.values('id', 'name','video').annotate(Count('user_likes',distinct=True)
In view it will count the likes for each video
Upvotes: 47
Reputation: 309099
I would focus on the example queries rather than your quote from the documentation. Aggregate
calculates values for the entire queryset. Annotate
calculates summary values for each item in the queryset.
>>> Book.objects.aggregate(average_price=Avg('price'))
{'average_price': 34.35}
Returns a dictionary containing the average price of all books in the queryset.
>>> q = Book.objects.annotate(num_authors=Count('authors'))
>>> q[0].num_authors
2
>>> q[1].num_authors
1
q
is the queryset of books, but each book has been annotated with the number of authors.
Upvotes: 299
Reputation: 239430
That's the main difference, but aggregates also work on a grander scale than annotations. Annotations are inherently related to individual items in a queryset. If you run an Count
annotation on a something like a many-to-many field, you'll get a separate count for each member of the queryset (as an added attribute). If you were to do the same with an aggregation, however, it would attempt to count every relationship on every member of the queryset, even duplicates, and return that as just one value.
Upvotes: 23