Reputation: 41
I'm trying to improve my Python skills and my coding basics in general. I have a CSV file; its first 7 rows (including the header) are shown below:
HomeTeam AwayTeam HomeTeamWin AwayTeamWin
AV MU 1 0
BR QPR 1 0
C E 0 1
MU BR 1 0
QPR C 0 1
E AV 0 1
I am trying to implement the following code such that an output file will be generated that shows, based on the result from their most recent game, if the home team was / was not coming off a win. I am stuck at the section marked with ******
#start loop
for row in file:
    #create an empty list to put the value we will find into
    observation_list=[]
    #define variable a as being row[0], i.e. the cell
    #in the current row that contains the 'hometeam'
    a=row[0]
    #*****stuck here*******#
    #call the last row to contain variable a i.e. where toprow = the most recent row
    #above the current row to have contained variable a i.e. the value from row[0]
    for toprow in file:
        #*****stuck here*******#
        if a in (toprow[0], toprow[1]):
            #implement the following if statement
            #where toprow[0] is the 1st column containing the value
            #of the hometeam from the toprow
            if toprow[0]==a:
                #implement the following to generate an output file showing
                #1 or 0 for home team coming off a win
                b=toprow[2]
                observation_list.append(b)
                with open(Output_file, "ab") as resultFile:
                    writer = csv.writer(resultFile, lineterminator='\n')
                    writer.writerow(observation_list)
            elif toprow[1]==a:
                #implement the following if statement
                #where toprow[1] is the 2nd column containing the value
                #of the awayteam from the toprow
                b=toprow[3]
                observation_list.append(b)
                #implement the following to generate an output file showing
                #1 or 0 for home team coming off a win
                with open(Output_file, "ab") as resultFile:
                    writer = csv.writer(resultFile, lineterminator='\n')
                    writer.writerow(observation_list)
From what I have done and read thus far I can see there being two problems:
Problem 1: How can I get the second for loop, marked with ****, to iterate over the previously read rows until it reaches the most recent row that contains the value held in 'a'?
Problem 2: How do I start the code block from the 3rd row? This is needed to avoid A. reading the header and, more importantly, B. trying to read a non-existent / negative row, i.e. row1 - 1 = row0, and row0 doesn't exist!?
NB the desired output file would be as follows:
-blank- #first cell would be empty as there is no data to fill it
-blank- #second cell would be empty as there is no data to fill it
-blank- #third cell would be empty as there is no data to fill it
0 #fourth cell contains 0 as MU lost their most recent game
0 #fifth cell contains 0 as QPR lost their most recent game
1 #sixth cell contains 1 as E won their most recent game
Upvotes: 0
Views: 260
Reputation: 23753
A good thing to do is to write down, in words, the steps you think you need to take to solve the problem. For this problem I want to:
While the file is being read, store the result of each team's most recently played game so it can be looked up later. Dictionaries are made for this: {team1 : result_of_last_game, team2 : result_of_last_game, ...}.
When looking up each team's first game, there won't be a previous game, so the dictionary will throw a KeyError. The KeyError can be handled with a try/except block, or collections.defaultdict could be used to account for this.
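A minimal sketch of the try/except variant, using a plain dict instead of a defaultdict. The sample rows are copied from the question; the function name previous_results is just an illustrative choice.

```python
# Sample rows from the question: (home, away, home_win, away_win)
rows = [
    ("AV", "MU", "1", "0"),
    ("BR", "QPR", "1", "0"),
    ("C", "E", "0", "1"),
    ("MU", "BR", "1", "0"),
    ("QPR", "C", "0", "1"),
    ("E", "AV", "0", "1"),
]

def previous_results(rows):
    """For each row, return the home team's result in its previous game ('' if none)."""
    last_game = {}  # team -> result of that team's most recent game
    results = []
    for home, away, home_win, away_win in rows:
        try:
            results.append(last_game[home])
        except KeyError:
            # first appearance of this team: no previous game to report
            results.append("")
        # record this game's result for both teams
        last_game[home] = home_win
        last_game[away] = away_win
    return results

print(previous_results(rows))  # ['', '', '', '0', '0', '1']
```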
I like to use operator.itemgetter when extracting items from a sequence. It makes the code a bit more readable when I look at it later.
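As a small illustration of what itemgetter returns, here is a sketch applied to one data row from the question. With several indices it returns a tuple; with a single index it returns the item itself.

```python
import operator

home = operator.itemgetter(0, 2)  # (home team, home result)
away = operator.itemgetter(1, 3)  # (away team, away result)
team = operator.itemgetter(0)     # single index: returns the item, not a tuple

line = "AV MU 1 0".split()        # ['AV', 'MU', '1', '0']

print(home(line))        # ('AV', '1')
print(away(line))        # ('MU', '0')
print(team(home(line)))  # AV
```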
import operator, collections

home = operator.itemgetter(0,2) #first and third item
away = operator.itemgetter(1,3) #second and fourth item
team = operator.itemgetter(0) #first item

#dictionary to hold the previous game's result
#default will be a blank string
last_game = collections.defaultdict(str)

#string to format the output
out = '{}\t{}'

with open('data.txt') as f:
    #skip the header (next(f) works in both Python 2 and 3)
    next(f)
    for line in f:
        #split the line into its relevant parts
        line = line.strip()
        line = line.split()
        #extract the team and game result
        #--> (team1, result), (team2, result)
        h, a = home(line), away(line)
        home_team = team(h)
        #print the result of the last game
        print(out.format(home_team, last_game[home_team]))
        #update the dictionary with the results of this game
        last_game.update([h,a])
Instead of printing the results, you could easily write them to a file or collect them in a container and write them to a file later.
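A rough sketch of the write-to-file variant, assuming Python 3 and a whitespace-separated input file like the one above. The function name write_previous_results and the two-column output layout (team, previous result) are illustrative choices, not something the question specifies.

```python
import collections
import csv

def write_previous_results(in_path, out_path):
    """Write one row per game: home team and the result of its previous game."""
    last_game = collections.defaultdict(str)  # missing team -> ''
    # newline='' lets the csv module control line endings (Python 3)
    with open(in_path) as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst, lineterminator="\n")
        next(src)  # skip the header
        for line in src:
            fields = line.split()
            if not fields:
                continue  # ignore blank lines
            home_team, away_team, home_win, away_win = fields
            writer.writerow([home_team, last_game[home_team]])
            # remember this game's result for both teams
            last_game[home_team] = home_win
            last_game[away_team] = away_win
```

Calling write_previous_results('data.txt', 'output.csv') would then produce the file in one pass, opening the output once rather than re-opening it in append mode for every row.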
If you want something other than an empty string as the default for your defaultdict, you could do something like this:
class Foo(object):
    def __init__(self, foo):
        self.__foo = foo
    def __call__(self):
        #defaultdict calls this with no arguments to get the default value
        return self.__foo

blank = Foo('-blank-')
last_game = collections.defaultdict(blank)
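A shorter alternative, offered as a sketch rather than a replacement for the class above: defaultdict accepts any zero-argument callable as its default factory, so a lambda works too.

```python
import collections

# any zero-argument callable can serve as the default factory
last_game = collections.defaultdict(lambda: '-blank-')

print(last_game['AV'])  # -blank-  (no previous game recorded)
last_game['AV'] = '1'
print(last_game['AV'])  # 1
```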
Upvotes: 2