Reputation: 53

Working with list of tuples embedded in a list of dictionaries in python

I am in a beginning coding class and I can not seem to turn the basics I'm taught into a working program with a list this complicated. What functions should I be using to do this?

At this point we have not discussed importing any extra features (numpy etc) and I know people use lambda a lot (though I don't really understand what it does), but that has not been introduced in this class.

#This is an example of the structure of a student dictionary
#They have an id number
#They have a first name, last name and a list of assignments
#Assignments are tuples of an assignment name and grade
#The grade is a 4 point scale from 0 to 4
student_list = [{'id': 12341, 'first_name': 'Alice', 'last_name': 'Anderson',
     'assignments': [('assignment_1', 0), ('assignment_2', 2), ('assignment_3', 4)]},

 {'id': 12342, 'first_name': 'Boris', 'last_name': 'Bank',
   'assignments': [('assignment_1', 1), ('assignment_2', 3), ('assignment_3', 0)]},

 {'id': 12343, 'first_name': 'Carl', 'last_name': 'Cape',
   'assignments': [('assignment_1', 2), ('assignment_2', 4), ('assignment_3', 1)]},

 {'id': 12344, 'first_name': 'Didi', 'last_name': 'Dawson',
   'assignments': [('assignment_1', 3), ('assignment_2', 0), ('assignment_3', 2)]},

 {'id': 12345, 'first_name': 'Ed', 'last_name': 'Enders',
   'assignments': [('assignment_1', 4), ('assignment_2', 1), ('assignment_3', 3)]}]

#This function should return a list of the n student dictionaries with the
#highest grades on the assignment passed in as assignment name
#If there is a tie then it is broken by returning the student(s) with the
#lowest id number(s)
def highest_n_grades(students, assignment_name, n):

Edit

Sorry, I'm not trying to get an answer. I see how that looks. I feel like I've written out and deleted a million things and that's my problem. I'm having trouble even getting started.

I was hoping for a pointer in the right direction, maybe which commands can grab the highest grades, etc. All I really have so far is something like:

def highest_n_grades(student_list):
  for s in student_list:
    for assignment_name, grade in s['assignments']:
        if int(grade) >= 4:
            print(assignment_name, grade)

highest_n_grades(student_list)

But I know that's not even really getting me started. It doesn't have three inputs, it's not looking for the max (it's looking for the manually entered value 4), and it's not even coming close to tying it back to student names or making another list.

Edit 2

I also tried the following, which gave an error. I was trying to sort the dictionary rather than the list.

def highest_n_grades(student_list, assignment_name):
  for s in student_list:
    for assignment_name in s['assignments'][1]:
      s['assignments'][1] = assignment_name
      s.sort(key=assignment_name)
    print(student_list)

highest_n_grades(student_list, assignment_name='assignment_1' )

Edit 3

OK, I've maybe made a little headway?

newlist2 = sorted(newlist, key=lambda k: k['assignments'][0], reverse = True)
newlist3 = sorted(newlist, key=lambda k: k['assignments'][1], reverse = True)
newlist4 = sorted(newlist, key=lambda k: k['assignments'][2], reverse = True)

These seem to be sorting by assignment. I don't understand what lambda is doing, but I at least can generate a list with the highest grade coming up first. I think that's a baby step.

Edit 4

Here is a function I created. It seems to get me what I want, it outputs the highest 3 students, but it prints it 5 times? and I know this isn't really flexible but it's a start.

def highest_n_grades(student_list,  n):
  for s in student_list:
    newlist = sorted(student_list, key=lambda k: k['assignments'][0], reverse=True)
    print(newlist[:n])

highest_n_grades(student_list, 3)

output:

[{'id': 12345, 'first_name': 'Ed', 'last_name': 'Enders', 'assignments': [('assignment_1', 4), ('assignment_2', 1), ('assignment_3', 3)]}, {'id': 12344, 'first_name': 'Didi', 'last_name': 'Dawson', 'assignments': [('assignment_1', 3), ('assignment_2', 0), ('assignment_3', 2)]}, {'id': 12343, 'first_name': 'Carl', 'last_name': 'Cape', 'assignments': [('assignment_1', 2), ('assignment_2', 4), ('assignment_3', 1)]}]
[{'id': 12345, 'first_name': 'Ed', 'last_name': 'Enders', 'assignments': [('assignment_1', 4), ('assignment_2', 1), ('assignment_3', 3)]}, {'id': 12344, 'first_name': 'Didi', 'last_name': 'Dawson', 'assignments': [('assignment_1', 3), ('assignment_2', 0), ('assignment_3', 2)]}, {'id': 12343, 'first_name': 'Carl', 'last_name': 'Cape', 'assignments': [('assignment_1', 2), ('assignment_2', 4), ('assignment_3', 1)]}]
[{'id': 12345, 'first_name': 'Ed', 'last_name': 'Enders', 'assignments': [('assignment_1', 4), ('assignment_2', 1), ('assignment_3', 3)]}, {'id': 12344, 'first_name': 'Didi', 'last_name': 'Dawson', 'assignments': [('assignment_1', 3), ('assignment_2', 0), ('assignment_3', 2)]}, {'id': 12343, 'first_name': 'Carl', 'last_name': 'Cape', 'assignments': [('assignment_1', 2), ('assignment_2', 4), ('assignment_3', 1)]}]
[{'id': 12345, 'first_name': 'Ed', 'last_name': 'Enders', 'assignments': [('assignment_1', 4), ('assignment_2', 1), ('assignment_3', 3)]}, {'id': 12344, 'first_name': 'Didi', 'last_name': 'Dawson', 'assignments': [('assignment_1', 3), ('assignment_2', 0), ('assignment_3', 2)]}, {'id': 12343, 'first_name': 'Carl', 'last_name': 'Cape', 'assignments': [('assignment_1', 2), ('assignment_2', 4), ('assignment_3', 1)]}]
[{'id': 12345, 'first_name': 'Ed', 'last_name': 'Enders', 'assignments': [('assignment_1', 4), ('assignment_2', 1), ('assignment_3', 3)]}, {'id': 12344, 'first_name': 'Didi', 'last_name': 'Dawson', 'assignments': [('assignment_1', 3), ('assignment_2', 0), ('assignment_3', 2)]}, {'id': 12343, 'first_name': 'Carl', 'last_name': 'Cape', 'assignments': [('assignment_1', 2), ('assignment_2', 4), ('assignment_3', 1)]}]

Upvotes: 0

Answers (2)

Kim

Reputation: 1684

This is a difficult assignment for a beginner's course. The difficulties are lambdas, multiple-key sorting, lists, list slices and tuples, dictionaries and even ordered versus unordered data types. I've been programming in Python for 10 years and didn't find it straightforward.

A lambda is a tiny function that you define on the fly. sorted() accepts a function through its key argument and calls it once for each student to generate a sort key. The sort then compares the sort keys of two students to decide which student goes first.

A good place to start with lambdas is to remember that:

id_key = lambda x: x[0]

is equivalent to:

def id_key(x):
    return x[0]

Furthermore

sorted(students, key=lambda x: x[0])

is equivalent to:

sorted(students, key=id_key)

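For sorting on multiple values I'd look at stable sorts and their properties. A stable sort keeps items with equal keys in their original relative order, which makes it well suited to sorting on more than one value: sort by the secondary key first, then by the primary key. Python's built-in sorted() and list.sort() are both stable.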
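As a minimal sketch of why stability matters here (the sample pairs are made up for illustration): sort by the tie-breaker first, then by the main key. Because the second sort is stable, pairs with equal grades keep the id order from the first pass.

```python
# Hypothetical (id, grade) pairs, deliberately out of id order.
pairs = [(3, 4), (1, 2), (2, 4), (4, 2)]

# Pass 1: sort by the tie-breaker (id, ascending).
by_id = sorted(pairs, key=lambda pair: pair[0])

# Pass 2: sort by the main key (grade, descending). sorted() is stable,
# so pairs with equal grades stay in the id order from pass 1.
by_grade_then_id = sorted(by_id, key=lambda pair: pair[1], reverse=True)

print(by_grade_then_id)  # [(2, 4), (3, 4), (1, 2), (4, 2)]
```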

Here's a solution using the present structure:

def sort_by_grade_then_id(grades):
    # sort (id, grade) tuples high grades, low ids first
    sorted_by_id = sorted(grades, key=lambda student: student[0])
    sorted_by_id_and_assignment_grade = sorted(sorted_by_id,
        key=lambda student: student[1], reverse=True)
    return sorted_by_id_and_assignment_grade


def highest_n_grades(students, assignment_name, n):
    grades = []
    for student in students:
        for assignment, grade in student['assignments']:
            if assignment_name == assignment:
                grades.append((student['id'], grade))
    return sort_by_grade_then_id(grades)[:n]

>>> print(highest_n_grades(student_list, 'assignment_2', 2))
[(12343, 4), (12342, 3)]

But if you now want the student's name rather than his/her id, you'll have to do another serial search to get it.
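That extra search could look something like the sketch below. The helper name find_student_by_id is made up for illustration, and the two-student list is a cut-down copy of the question's data so that the example runs on its own.

```python
# Cut-down copy of the question's data, so the sketch runs on its own.
student_list = [
    {'id': 12342, 'first_name': 'Boris', 'last_name': 'Bank',
     'assignments': [('assignment_1', 1), ('assignment_2', 3)]},
    {'id': 12343, 'first_name': 'Carl', 'last_name': 'Cape',
     'assignments': [('assignment_1', 2), ('assignment_2', 4)]},
]


def find_student_by_id(students, student_id):
    # Hypothetical helper: walk the list until the id matches.
    for student in students:
        if student['id'] == student_id:
            return student
    return None  # no student has that id


top_grades = [(12343, 4), (12342, 3)]  # (id, grade) pairs like those printed above
for student_id, grade in top_grades:
    student = find_student_by_id(student_list, student_id)
    print(student['first_name'], student['last_name'], grade)
```

Each lookup scans the list from the start, which is fine for a handful of students but adds up as the list grows.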

As a different approach, the following copies the original list-based student database into a dictionary-based one.

from copy import deepcopy

# deepcopy so that changing 'assignments' below leaves student_list untouched
students_dict = {student['id']: student for student in deepcopy(student_list)}
for student in students_dict.values():
    student['assignments'] = dict(student['assignments'])

Listing the top grades becomes:

def highest_n_grades_dict(students, assignment_name, n):
    grades = [
        (id, student['assignments'][assignment_name])
        for id, student
        in students.items()
    ]
    return sort_by_grade_then_id(grades)[:n]

It doesn't matter with just a few students, but with many students and many assignments this version would be faster, because a dictionary lookup by assignment name avoids scanning each student's assignment list. You can also use the student database to look things up directly by id now, rather than having to search and match.

As an example:

print('Highest grades dict version...')
grades = highest_n_grades_dict(students_dict, 'assignment_2', 2)
print(grades)
print("...and dict structure easily allows us to get other student details")
names_and_grades = [
    (students_dict[id]['first_name'] + ' ' + students_dict[id]['last_name'], grade)
    for id, grade
    in grades]
print(names_and_grades)
$ python grades.py
Highest grades dict version...
[(12343, 4), (12342, 3)]
...and dict structure easily allows us to get other student details
[('Carl Cape', 4), ('Boris Bank', 3)]

Side note: if you deal with tuples a lot, you might be interested in named tuples, as they often make tuple-related code (including lambda functions) easier to read, write and understand. Have a look at my recent answer to this question for an example.
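As a rough sketch of what that could look like here (the Grade type and its field names are made up for this example, not taken from the question):

```python
from collections import namedtuple

# Hypothetical record type; the field names are chosen for this sketch.
Grade = namedtuple('Grade', ['student_id', 'grade'])

grades = [Grade(12341, 2), Grade(12342, 3), Grade(12343, 4), Grade(12344, 3)]

# Fields are read by name instead of by position (g.grade rather than g[1]).
# Negating the grade sorts high grades first; student_id breaks ties ascending.
ranked = sorted(grades, key=lambda g: (-g.grade, g.student_id))

print(ranked[:2])
```

The lambda now says what it sorts on, which is harder to see with g[1] and g[0].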

Upvotes: 1

vash_the_stampede

Reputation: 4606

This can be done using lambda and sorted. sorted takes a key function, which we write as key=lambda x: .... Here x stands for each element of the list in turn, that is, one student dictionary. To sort by assignment_1, x['assignments'] takes us to that student's list of assignments, and since assignment_1 is at index 0 of that list, the key is key=lambda x: x['assignments'][0]. That key is the tuple ('assignment_1', grade); because the name is the same for every student, the comparison is decided by the grade. We can also add a secondary sort value as the tie-breaker: x['id'], placed in a tuple after the primary value. We use reverse = True to get descending scores, but since the tie-breaker should be in ascending order, we offset the reverse on the id by using -(x['id']).

Altogether the sort looks like this:

lista = sorted(students, key=lambda x: (x['assignments'][0], -(x['id'])), reverse = True)
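To see the tie-breaker at work, here is a small sketch with made-up records in which two students share the top grade. The ids and grades are invented for illustration only.

```python
# Made-up records with a tie on the first assignment.
students = [
    {'id': 3, 'assignments': [('assignment_1', 4)]},
    {'id': 1, 'assignments': [('assignment_1', 2)]},
    {'id': 2, 'assignments': [('assignment_1', 4)]},
]

# Tuples compare element by element: first the (name, grade) tuple, then -id.
# With reverse=True the largest grade comes first, and among equal grades
# the largest -id, which is the smallest id.
ranked = sorted(students,
                key=lambda x: (x['assignments'][0], -x['id']),
                reverse=True)

print([s['id'] for s in ranked])  # [2, 3, 1]
```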

The tricky part is choosing the proper assignment index for the assignment name passed in. For that you could use .split('_')[1]: calling .split('_') on 'assignment_1' produces the list ['assignment', '1'], so index [1] is the string '1'. Converting it with int() and subtracting 1 gives 0, the corresponding index. The same works for the other assignments, since each number is one more than its index.

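def highest_n_grades(students, assignment_name, n):
    # 'assignment_1' -> index 0, 'assignment_2' -> index 1, ...
    y = int(assignment_name.split('_')[1]) - 1
    # grade descending; negating the id makes ties come out lowest id first
    lista = sorted(students, key=lambda x: (x['assignments'][y], -x['id']), reverse=True)
    return lista[:n]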

print(highest_n_grades(student_list, 'assignment_1', 3))
# [{'id': 12345, 'first_name': 'Ed', 'last_name': 'Enders', 'assignments': [('assignment_1', 4), ('assignment_2', 1), ('assignment_3', 3)]}, {'id': 12344, 'first_name': 'Didi', 'last_name': 'Dawson', 'assignments': [('assignment_1', 3), ('assignment_2', 0), ('assignment_3', 2)]}, {'id': 12343, 'first_name': 'Carl', 'last_name': 'Cape', 'assignments': [('assignment_1', 2), ('assignment_2', 4), ('assignment_3', 1)]}]

Demonstration of the tie-breaker case, using made-up scores in which Didi and Ed both have 4 on assignment_1:

print(highest_n_grades(student_list, 'assignment_1', 3))
# [{'id': 12344, 'first_name': 'Didi', 'last_name': 'Dawson', 'assignments': [('assignment_1', 4), ('assignment_2', 0), ('assignment_3', 2)]}, {'id': 12345, 'first_name': 'Ed', 'last_name': 'Enders', 'assignments': [('assignment_1', 4), ('assignment_2', 1), ('assignment_3', 3)]}, {'id': 12342, 'first_name': 'Boris', 'last_name': 'Bank', 'assignments': [('assignment_1', 2), ('assignment_2', 3), ('assignment_3', 0)]}]
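The index computed from the name relies on every student's assignments being stored in numeric order with none missing. A sketch that looks the grade up by name instead is below. The function name highest_n_grades_by_name and the sample data are made up for illustration, and it assumes every student has the requested assignment.

```python
def highest_n_grades_by_name(students, assignment_name, n):
    # Hypothetical variant: find the grade by assignment name, not position.
    def sort_key(student):
        grade = dict(student['assignments'])[assignment_name]
        # Negate the grade so high grades sort first; id ascending breaks ties.
        return (-grade, student['id'])

    return sorted(students, key=sort_key)[:n]


# Made-up data: assignments stored in a different order for each student.
sample = [
    {'id': 2, 'assignments': [('assignment_2', 1), ('assignment_1', 4)]},
    {'id': 1, 'assignments': [('assignment_1', 4), ('assignment_2', 3)]},
    {'id': 3, 'assignments': [('assignment_1', 0), ('assignment_2', 2)]},
]

top = highest_n_grades_by_name(sample, 'assignment_1', 2)
print([s['id'] for s in top])  # [1, 2]
```

Negating the grade inside the key removes the need for reverse=True, so the id can stay positive.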

Further reading

on .split()

https://docs.python.org/3/library/stdtypes.html

on using sorted

https://docs.python.org/3/library/functions.html
https://wiki.python.org/moin/HowTo/Sorting