RadiantHex
RadiantHex

Reputation: 25547

Merging rows when counting - Django/SQL

I have the following model:

class Item(models.Model):

    unique_code = models.CharField(max_length=100)
    category_code = models.CharField(max_length=100)
    label = models.CharField(max_length=100)

I would like to get:


Any ideas?

Upvotes: 1

Views: 260

Answers (3)

vls
vls

Reputation: 2319

Django/SQL solution as requested:

the count of the different category_codes used:

category_codes_cnt = Item.objects.values('category_codes').distinct().count()

count of the different unique_codes used:

unique_codes_cnt = Item.objects.values('unique_codes').distinct().count()

count of the different combination of category_code and unique_code used:

codes_cnt = Item.objects.values('category_codes', 'unique_codes').distinct().count()

Upvotes: 3

araqnid
araqnid

Reputation: 133412

select count(distinct unique_code) as unique_code_count,
       count(distinct category_code) as category_code_count,
       count(*) as combination_count
from (select unique_code, category_code, count(*) as combination_count
      from item
      group by unique_code, category_code) combination

Upvotes: 0

S.Lott
S.Lott

Reputation: 391818

Don't waste too much time trying finesse a cool SQL solution.

from collections import defaultdict
count_cat_code = defaultdict(int)
count_unique_code = defaultdict(int)
count_combo_code = defaultdict(int)
for obj in Item.objects.all():
    count_cat_code[obj.category_code] += 1
    count_unique_code[obj.unique_code] += 1
    count_combo_code[obj.category_code,obj.unique_code] += 1

That will do it. And it will work reasonably quickly. Indeed, if you do some benchmarking, you may find that -- sometimes -- it's as fast as a "pure SQL" statement.

[Why? Because RDBMS must use a fairly inefficient algorithm for doing GROUP BY and Counts. In Python we have the luxury of assuming some things based on our application and our knowledge of the data. In this case, for example, I assumed that it would all fit in memory. An assumption that cannot be made by the RDBMS internal algorithms.]

Upvotes: 1

Related Questions