Reputation: 67
I'm trying to iterate through each line of a csv and bring back the top 3 highscores. There's only 2 columns, one named 'users' and the other 'highscores'. I know what I have so far isn't much but I'm completely stumped. I feel like I could get the highest score by storing the value and iterate over each line and then replace it if it's less than a number, but I'm not sure what to do If I want the top three lines.
This is how I began:
import csv
a = open('highscoreslist.csv')
spreadsheet = csv.DictReader(a)
names = []
scores = []
for row in speadsheet:
names.append(row['users'])
scores.append(row['highscores'])
And now I just don't know what direction to take. I was going to put them in two lists and then find the highest that way, but they're already in a dictionary so that my be pointless. I'm also trying to learn this concept, so I would prefer not to do it in Pandas.
Upvotes: 2
Views: 732
Reputation: 1670
This answer illustrates a Python solution to your problem and how things roll with Python.
sorted([(row['users'], row['highscores']) for row in csv.DictReader(a)], key=lambda t: t[1], reverse=True)[:3]
You are asking for the three highest scores, which presumably you wish to also know the user otherwise this becomes trivially easy.
Given your code, the primary issue you are having is that you have placed names and scores in two independent data structures.
So get the data into the same datastructure. For that, you can loop over the dict rows of the CSV from the DictReader. Change those dicts into tuples. And don't use a loop, use a list comprehension.
[(row['users'], row['highscores']) for row in csv.DictReader(a)]
You will get [(1,2),(6,5),(6,7)]
etc.
Then use a 'lambda' this is a small function that can be passed into other functions, like sorted()
.
sorted()
works on a collection that is passed in, such as the above list of tuples. And, it takes some very useful arguments, not the least of which is a key to sort by. So, we can get fancy and use the lambda to specify which key we wish sorting to occur by.
sorted([(1,2),(6,5),(6,7)], key= the lambda)
and lambda t: t[1]
says return the second element of each tuple (the user's high score). So then sorted()
sorts by highscore.
And then you slice your result by list slicing, another super cool python thing. [:3]
says give me the first three elements - and sorted from high to low gives you the top three scores, and their users, because you sorted(...,reverse=True)
.
You can than access the resultant list of tuples to show the high scores of a game you are hoping to make/automate!
Upvotes: 1
Reputation: 27495
This can be done with a collections.Counter
very easily as well.
import csv
from collections import Counter
players = Counter()
with open('highscoreslist.csv') as a:
spreadsheet = csv.DictReader(a)
for row in speadsheet:
name, score = row['users'], row['score']
players[name] = score
top_3 = players.most_common(3)
print(*(f'{name}: {score}' for name, score in top_3), sep='\n'))
This prints the top 3 in format:
name: score
name: score
name: score
Upvotes: 0
Reputation: 887
If you want to do it on the run, you could simply store the top three highscores and corresponding users in order. On each iteration, check if the new higscore is larger than any of the three you have; if it's larger than the smallest of the three, replace it. If it's larger than the biggest, replace the first one and shift down the other two etc. perhaps this pseudocode makes it more understandable.
score1 = 0
score2 = 0
score3 = 0
#score3 >= score2 >= score 1
for score in file:
if score > score3:
if score <= score2:
score1 = score
elif score <= score3:
score1 = score2
score2 = score
elif score > score3:
score1 = score2
score2 = score3
score3 = score
Of course you could then keep the users in separate variables and changed them together with the values.
Upvotes: 1
Reputation: 122
I did something similar to @Kristopher Ives, by keeping it as a OrderedDict you can still pull the username of the user.
scores = OrderedDict(sorted(spreadsheet.items(), key=lambda row:row[1]["highscores"]))
Upvotes: 0
Reputation: 14847
This is a task that is well-suited for pandas, and honestly, it's almost never worth bothering with Python's stock csv library:
In [36]: df = pd.read_csv("highscoreslist.csv")
In [37]: df.sort_values("highscores", ascending=False).head(3)
Out[37]:
users highscores
6 G 99032
5 F 89584
1 B 73201
(here's how that frame was created also):
In [32]: import pandas as pd
In [33]: import numpy as np
In [34]: df = pd.DataFrame.from_dict({'users': [chr(ord("A") + i) for i in range(10)],
...: 'highscores': np.random.randint(0, 100000, 10)})
In [35]: df.to_csv("highscoreslist.csv", index=False)
Upvotes: 2
Reputation: 6025
You can sort the spreadsheet by high score:
scores = sorted(scores, key=lambda row: row['highscores'], reverse=True)
Now the top three scores are in scores[0]
, scores[1]
and scores[2]
My answer includes NO PANDAS
Upvotes: 2