learningstudent
learningstudent

Reputation: 577

how to print data using the condition in python based on mysql result data

i have a mysql sample data resulting like below table

main_cat| sub_cat | number | org_id
Career  | school  | 5      | A
Career  | college | 3      | A
Career  | higher  | 4      | A 
Job     | Blr     | 6      | A
Job     | Hyd     | 11     | A
Job     | Chennai | 12     | A 
Career  | school  | 15     | B
Career  | college | 30     | B
Career  | higher  | 5      | B 
Job     | Blr     | 5      | B
Career  | college | 8      | C
Job     | Chennai | 4      | C 

I want to print top 2 main_cat for every organization. For every top 2 main_cat, I want to print top 2 sub_cat for every organization. So every organization should have 4 or less records with top 2 main_cat and top 2 sub_cat for each main_cat

Please help me.

Upvotes: 1

Views: 45

Answers (2)

mata
mata

Reputation: 69032

For grouping python provides itertools.groupby(), which groups sorted input by a given key function.

In this case the input needs to be sorted first by org_id, then by main_cat then by number in descending order, e.g. if your data as a list looks like this:

data = [
 ['Career', 'school', 5, 'A'],
 ['Career', 'college', 3, 'A'],
 ['Career', 'higher', 4, 'A'],
 ['Job', 'Blr', 6, 'A'],
 ['Job', 'Hyd', 11, 'A'],
 ['Job', 'Chennai', 12, 'A'],
 ['Career', 'school', 15, 'B'],
 ['Career', 'college', 30, 'B'],
 ['Career', 'higher', 5, 'B'],
 ['Job', 'Blr', 5, 'B'],
 ['Career', 'college', 8, 'C'],
 ['Job', 'Chennai', 4, 'C']
]

then you'd sort it like this:

data.sort(key = lambda x: (x[3], x[0], -x[2]))

or by altering your sql statement to include ORDER BY main_cat, sub_cat, number DESC, then you'll aready get it in the right order from the database.

Now you can use groupby to group, and islice to limit the number of results per grouped category:

from itertools import groupby, islice
from operator import itemgetter

# already sorted data
data = [
 ['Career', 'school', 5, 'A'],
 ['Career', 'higher', 4, 'A'],
 ['Career', 'college', 3, 'A'],
 ['Job', 'Chennai', 12, 'A'],
 ['Job', 'Hyd', 11, 'A'],
 ['Job', 'Blr', 6, 'A'],
 ['Career', 'college', 30, 'B'],
 ['Career', 'school', 15, 'B'],
 ['Career', 'higher', 5, 'B'],
 ['Job', 'Blr', 5, 'B'],
 ['Career', 'college', 8, 'C'],
 ['Job', 'Chennai', 4, 'C']
]

data.sort(key = lambda x: (x[3], x[0], -x[2]))

for org, by_org in groupby(data, key=itemgetter(3)):
    print("org:", org)
    for cat, by_cat in islice(groupby(by_org, key=itemgetter(0)), 2):
        print("  cat:", cat)
        for subcat, by_subcat in islice(groupby(by_cat, key=itemgetter(1)), 2):
            print("    subcat:", subcat, " = ", list(by_subcat))

Output:

org: A
  cat: Career
    subcat: school  =  [['Career', 'school', 5, 'A']]
    subcat: higher  =  [['Career', 'higher', 4, 'A']]
  cat: Job
    subcat: Chennai  =  [['Job', 'Chennai', 12, 'A']]
    subcat: Hyd  =  [['Job', 'Hyd', 11, 'A']]
org: B
  cat: Career
    subcat: college  =  [['Career', 'college', 30, 'B']]
    subcat: school  =  [['Career', 'school', 15, 'B']]
  cat: Job
    subcat: Blr  =  [['Job', 'Blr', 5, 'B']]
org: C
  cat: Career
    subcat: college  =  [['Career', 'college', 8, 'C']]
  cat: Job
    subcat: Chennai  =  [['Job', 'Chennai', 4, 'C']]

Upvotes: 1

zipa
zipa

Reputation: 27869

You can use pandas to process your query into dataframe with read_sql:

import pandas as pd
df = pd.read_sql(connection,query)
result = df.groupby(['org_id', 'main_cat', 'sub_cat'])['number'].head(2)

Variable connection is your connection to db and query is your SELECT string.

Upvotes: 0

Related Questions