Reputation: 51
I am new to Python and have just started learning. I am building a sports prediction based on past scores. I have two CSV files: one with all matches from the current year, and one with the standings (final tournament results and rankings). The standings file contains only unique teams, so it has just 14 rows. The problem is with the standings CSV, which looks like this:
Squad,Rk,MP,W,D,L,GF,GA,GD,Pts,Pts/G,MP,W,D,L,GF,GA,GD,Pts,Pts/G
CFR Cluj,1,18,13,5,0,24,5,19,44,2.44,18,10,5,3,30,14,16,35,1.94
The following code raises a KeyError for the first row of my CSV (the one sampled above):
def home_team_ranks_higher(row):
    home_team = row["Home"]
    visitor_team = row["Away"]
    home_rank = standings.loc[home_team]["Rk"]
    visitor_rank = standings.loc[visitor_team]["Rk"]
    return home_rank < visitor_rank

dataset["HomeTeamRanksHigher"] = dataset.apply(home_team_ranks_higher, axis = 1)
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-112-d3a62e1e7d32> in <module>
6 return home_rank < visitor_rank
7
----> 8 dataset["HomeTeamRanksHigher"] = dataset.apply(home_team_ranks_higher, axis = 1)
9
10 #dataset["HomeTeamRanksHigher"] = 0
~\anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, raw, result_type, args, **kwds)
7546 kwds=kwds,
7547 )
-> 7548 return op.get_result()
7549
7550 def applymap(self, func) -> "DataFrame":
~\anaconda3\lib\site-packages\pandas\core\apply.py in get_result(self)
178 return self.apply_raw()
179
--> 180 return self.apply_standard()
181
182 def apply_empty_result(self):
~\anaconda3\lib\site-packages\pandas\core\apply.py in apply_standard(self)
269
270 def apply_standard(self):
--> 271 results, res_index = self.apply_series_generator()
272
273 # wrap results
~\anaconda3\lib\site-packages\pandas\core\apply.py in apply_series_generator(self)
298 for i, v in enumerate(series_gen):
299 # ignore SettingWithCopy here in case the user mutates
--> 300 results[i] = self.f(v)
301 if isinstance(results[i], ABCSeries):
302 # If we have a view on v, we need to make a copy because
<ipython-input-112-d3a62e1e7d32> in home_team_ranks_higher(row)
2 home_team = row["Home"]
3 visitor_team = row["Away"]
----> 4 home_rank = standings.loc[home_team]["Rk"]
5 visitor_rank = standings.loc[visitor_team]["Rk"]
6 return home_rank < visitor_rank
~\anaconda3\lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
877
878 maybe_callable = com.apply_if_callable(key, self.obj)
--> 879 return self._getitem_axis(maybe_callable, axis=axis)
880
881 def _is_scalar_access(self, key: Tuple):
~\anaconda3\lib\site-packages\pandas\core\indexing.py in _getitem_axis(self, key, axis)
1108 # fall thru to straight lookup
1109 self._validate_key(key, axis)
-> 1110 return self._get_label(key, axis=axis)
1111
1112 def _get_slice_axis(self, slice_obj: slice, axis: int):
~\anaconda3\lib\site-packages\pandas\core\indexing.py in _get_label(self, label, axis)
1057 def _get_label(self, label, axis: int):
1058 # GH#5667 this will fail if the label is not present in the axis.
-> 1059 return self.obj.xs(label, axis=axis)
1060
1061 def _handle_lowerdim_multi_index_axis0(self, tup: Tuple):
~\anaconda3\lib\site-packages\pandas\core\generic.py in xs(self, key, axis, level, drop_level)
3489 loc, new_index = self.index.get_loc_level(key, drop_level=drop_level)
3490 else:
-> 3491 loc = self.index.get_loc(key)
3492
3493 if isinstance(loc, np.ndarray):
~\anaconda3\lib\site-packages\pandas\core\indexes\range.py in get_loc(self, key, method, tolerance)
356 except ValueError as err:
357 raise KeyError(key) from err
--> 358 raise KeyError(key)
359 return super().get_loc(key, method=method, tolerance=tolerance)
360
KeyError: 'CFR Cluj'
Note: I tried swapping the 'Rk' and 'Squad' columns, but that only produced different errors.
What I want is to look up the rank of every home team and visitor team from my match history in the final table (standings) and store them in the 'home_rank' / 'visitor_rank' variables.
PS: I tried other ideas to access the rank but none of them got me any result.
Any ideas or solutions are great! Thank you :)
Upvotes: 0
Views: 1280
Reputation: 734
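The KeyError occurs because standings.loc[...] looks up row labels in the index of the standings dataframe, and that index is the default integer range (note range.py at the bottom of the traceback), not the team names stored in the Squad column. You can instead select the rank by filtering on the Squad column, for example for home_rank (and similarly for visitor_rank) with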
home_rank = standings['Rk'][standings['Squad'] == 'CFR Cluj'].iloc[0]
#home_rank = standings.loc[standings['Squad'] == 'CFR Cluj', 'Rk'].iloc[0]
Step by step, this is equivalent to
boolean_indices = standings['Squad'] == 'CFR Cluj'  # True for the matching row(s)
standings_ranks = standings['Rk']
home_ranks = standings_ranks[boolean_indices]
home_rank = home_ranks.iloc[0]  # first match by position; a plain [0] would look up the label 0
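If you need the rank for every match rather than a single team, one possible alternative is to index the standings by Squad and map the Home and Away columns onto it. This is only a sketch: the miniature CSV contents, the placeholder teams "Team B" and "Team C", and the HomeRank / AwayRank column names are assumptions made up for illustration, and only the Squad, Rk, Home and Away column names come from the question.

```python
import io

import pandas as pd

# Hypothetical miniature stand-ins for the two CSV files in the question.
standings_csv = io.StringIO(
    "Squad,Rk\n"
    "CFR Cluj,1\n"
    "Team B,2\n"
    "Team C,3\n"
)
matches_csv = io.StringIO(
    "Home,Away\n"
    "CFR Cluj,Team B\n"
    "Team C,CFR Cluj\n"
    "Team B,Team C\n"
)

standings = pd.read_csv(standings_csv)
dataset = pd.read_csv(matches_csv)

# Series whose index is the squad name and whose values are the ranks.
# With this index, label lookups such as rank_by_squad.loc["CFR Cluj"] work.
rank_by_squad = standings.set_index("Squad")["Rk"]

# Look up the rank of every home and away team in one step.
# Teams missing from the standings would get NaN instead of raising KeyError.
dataset["HomeRank"] = dataset["Home"].map(rank_by_squad)
dataset["AwayRank"] = dataset["Away"].map(rank_by_squad)

# A lower rank number means a higher position in the table.
dataset["HomeTeamRanksHigher"] = dataset["HomeRank"] < dataset["AwayRank"]

print(dataset)
```

This avoids calling apply row by row, and it relies on the Squad values being unique, which matches the 14-row standings file described in the question.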
Upvotes: 1