Reputation: 123
I am struggling to create a nested dictionary with the following data:
Team, Group, ID, Score, Difficulty
OneTeam, A, 0, 0.25, 4
TwoTeam, A, 1, 1, 10
ThreeTeam, A, 2, 0.64, 5
FourTeam, A, 3, 0.93, 6
FiveTeam, B, 4, 0.5, 7
SixTeam, B, 5, 0.3, 8
SevenTeam, B, 6, 0.23, 9
EightTeam, B, 7, 1.2, 4
Once imported as a pandas DataFrame, I turn each column into one of these lists: teams, group, id, score, diff.
Using the Stack Overflow answer "Create a complex dictionary using multiple lists", I can create the following dictionary:
{'EightTeam': {'diff': 4, 'id': 7, 'score': 1.2},
'FiveTeam': {'diff': 7, 'id': 4, 'score': 0.5},
'FourTeam': {'diff': 6, 'id': 3, 'score': 0.93},
'OneTeam': {'diff': 4, 'id': 0, 'score': 0.25},
'SevenTeam': {'diff': 9, 'id': 6, 'score': 0.23},
'SixTeam': {'diff': 8, 'id': 5, 'score': 0.3},
'ThreeTeam': {'diff': 5, 'id': 2, 'score': 0.64},
'TwoTeam': {'diff': 10, 'id': 1, 'score': 1.0}}
using the code:
{team: {'id': i, 'score': s, 'diff': d} for team, i, s, d in zip(teams, id, score, diff)}
But what I'm after is having 'Group' as the main key, then team, and then id, score and difficulty within the team (as above).
I have tried:
{g: {team: {'id': i, 'score': s, 'diff': d}} for g, team, i, s, d in zip(group, teams, id, score, diff)}
but this doesn't work and results in only one team per group within the dictionary:
{'A': {'FourTeam': {'diff': 6, 'id': 3, 'score': 0.93}},
'B': {'EightTeam': {'diff': 4, 'id': 7, 'score': 1.2}}}
Below is how the dictionary should look, but I'm not sure how to get there - any help would be much appreciated!
{'A': {'FourTeam': {'diff': 6, 'id': 3, 'score': 0.93},
'OneTeam': {'diff': 4, 'id': 0, 'score': 0.25},
'ThreeTeam': {'diff': 5, 'id': 2, 'score': 0.64},
'TwoTeam': {'diff': 10, 'id': 1, 'score': 1.0}},
'B': {'EightTeam': {'diff': 4, 'id': 7, 'score': 1.2},
'FiveTeam': {'diff': 7, 'id': 4, 'score': 0.5},
'SevenTeam': {'diff': 9, 'id': 6, 'score': 0.23},
'SixTeam': {'diff': 8, 'id': 5, 'score': 0.3}}}
Upvotes: 2
Views: 787
Reputation: 777
A dict comprehension may not be the best way of solving this if your data is stored in a table like this.
Try something like:
from collections import defaultdict
groups = defaultdict(dict)
for g, team, i, s, d in zip(group, teams, id, score, diff):
    groups[g][team] = {'id': i, 'score': s, 'diff': d}
With defaultdict, if groups[g] already exists, the new team is simply added to it as a key; if it doesn't, an empty dict is created automatically and the new team is inserted into that.
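For reference, here is a self-contained sketch of the same loop that can be run as-is, using the lists from the question. The name `ids` (instead of `id`, to avoid shadowing the built-in) and the final conversion to a plain `dict` are my own choices, not something the question requires.

```python
from collections import defaultdict

# Column lists as described in the question (`ids` is renamed from `id`).
teams = ['OneTeam', 'TwoTeam', 'ThreeTeam', 'FourTeam',
         'FiveTeam', 'SixTeam', 'SevenTeam', 'EightTeam']
group = ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B']
ids = [0, 1, 2, 3, 4, 5, 6, 7]
score = [0.25, 1, 0.64, 0.93, 0.5, 0.3, 0.23, 1.2]
diff = [4, 10, 5, 6, 7, 8, 9, 4]

groups = defaultdict(dict)
for g, team, i, s, d in zip(group, teams, ids, score, diff):
    # groups[g] is created as an empty dict the first time g is seen.
    groups[g][team] = {'id': i, 'score': s, 'diff': d}

# Optional: convert to a plain dict so missing groups raise KeyError again.
groups = dict(groups)

print(sorted(groups))        # ['A', 'B']
print(sorted(groups['A']))   # ['FourTeam', 'OneTeam', 'ThreeTeam', 'TwoTeam']
```

Each row is visited exactly once, no matter how many groups there are.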
Edit: you edited your question to say that your data is in a pandas DataFrame. In that case you can skip the step of turning the columns into lists and instead do, for example:
from collections import defaultdict
groups = defaultdict(dict)
for row in df.itertuples():
    groups[row.Group][row.Team] = {'id': row.ID, 'score': row.Score, 'diff': row.Difficulty}
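If the DataFrame is not needed for anything else, the same row-by-row pattern should also work directly on the CSV text with only the standard library. This is a rough sketch under assumptions: the inline `raw` string stands in for your file, `nest_by_group` is a hypothetical helper name, and the numeric conversions are my guess at the types you want.

```python
import csv
import io
from collections import defaultdict

# Hypothetical inline copy of the question's table; normally this is a file.
raw = """Team, Group, ID, Score, Difficulty
OneTeam, A, 0, 0.25, 4
TwoTeam, A, 1, 1, 10
ThreeTeam, A, 2, 0.64, 5
FourTeam, A, 3, 0.93, 6
FiveTeam, B, 4, 0.5, 7
SixTeam, B, 5, 0.3, 8
SevenTeam, B, 6, 0.23, 9
EightTeam, B, 7, 1.2, 4
"""

def nest_by_group(text):
    """Build {group: {team: {'id', 'score', 'diff'}}} from CSV text."""
    groups = defaultdict(dict)
    # skipinitialspace drops the blank after each comma in headers and values.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    for row in reader:
        groups[row['Group']][row['Team']] = {
            'id': int(row['ID']),
            'score': float(row['Score']),
            'diff': int(row['Difficulty']),
        }
    return dict(groups)

nested = nest_by_group(raw)
print(nested['B']['FiveTeam'])   # {'id': 4, 'score': 0.5, 'diff': 7}
```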
Upvotes: 3
Reputation: 2211
If you absolutely want to use a comprehension, then this should work:
z = list(zip(teams, group, id, score, diff))  # a list, so it can be scanned once per group
groups = set(group)
result = {  # outer dict, one entry for each distinct group
    grp: {  # inner dict, one entry per team, filtered by group
        team: {'id': i, 'score': s, 'diff': d}
        for team, g, i, s, d in z
        if g == grp
    }
    for grp in groups
}
I added line breaks for clarity.
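One thing to watch for with a nested comprehension like this: in Python 3, `zip` returns a one-shot iterator, so an inner loop that walks the same `zip` object once per group only sees rows on the first pass. A small sketch of the effect with made-up two-row data (the names below are purely illustrative), and of materialising the rows with `list(...)`:

```python
names = ['x', 'y']
labels = ['A', 'B']

# A bare zip object can be consumed only once.
pairs = zip(names, labels)
lazy_a = [n for n, lab in pairs if lab == 'A']
lazy_b = [n for n, lab in pairs if lab == 'B']   # pairs is already exhausted
print(lazy_a, lazy_b)   # ['x'] []

# A list of the zipped rows can be looped over as often as needed.
pairs = list(zip(names, labels))
list_a = [n for n, lab in pairs if lab == 'A']
list_b = [n for n, lab in pairs if lab == 'B']
print(list_a, list_b)   # ['x'] ['y']
```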
EDIT:
After the comment, to better clarify my intention and out of curiosity, I ran a comparison:
from collections import defaultdict
import timeit

teams = ['OneTeam', 'TwoTeam', 'ThreeTeam', 'FourTeam', 'FiveTeam', 'SixTeam', 'SevenTeam', 'EightTeam']
group = ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B']
id = [0, 1, 2, 3, 4, 5, 6, 7]
score = [0.25, 1, 0.64, 0.93, 0.5, 0.3, 0.23, 1.2]
diff = [4, 10, 5, 6, 7, 8, 9, 4]

def no_comprehension():
    # Single pass over the rows; defaultdict creates each group's dict on demand.
    groups = defaultdict(dict)
    for g, team, i, s, d in zip(group, teams, id, score, diff):
        groups[g][team] = {'id': i, 'score': s, 'diff': d}
    return groups

def comprehension():
    # list(...) so the rows can be scanned once per group (zip is a one-shot iterator in Python 3).
    z = list(zip(teams, group, id, score, diff))
    return {grp: {team: {'id': i, 'score': s, 'diff': d} for team, g, i, s, d in z if g == grp} for grp in set(group)}

print("no comprehension:")
print(timeit.timeit(no_comprehension, number=10000))
print("comprehension:")
print(timeit.timeit(comprehension, number=10000))
Output:
no comprehension:
0.027287796139717102
comprehension:
0.028979241847991943
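On this eight-row input the two timings are practically the same. Keep in mind that the comprehension scans every row once per group, so the gap may widen on data with many groups. With my sentence above, I was just highlighting this as an alternative solution to the one already posted by @JohnO.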
Upvotes: 2