Reputation: 925
I'm trying to iterate through a few .csv files, and if a value is the same, add up the related entries. If a value is not the same, add that as a new row.
My current code is:
import pandas
s1011 = pandas.read_csv('goals/1011.csv', sep=',', usecols=(1,3,5))
s1011.dropna(how="all", inplace=True)
print s1011
s1112 = pandas.read_csv('goals/1112.csv', sep=',', usecols=(1,3,5))
s1112.dropna(how="all", inplace=True)
print s1112
s1213 = pandas.read_csv('goals/1213.csv', sep=',', usecols=(1,3,5))
s1213.dropna(how="all", inplace=True)
print s1213
Currently this doesn't do much I know. It prints out 3 headings, Team, For & Against. It the prints out 20 football teams, how many goals they've scored, and how many conceded. I've tried using merge in python but it isn't suitable as it just makes one big table.
What I'm trying to do is open the multiple csv files, iterate through the list, if the Team is the same, add together goals for and goals against from each file. If Team has not been entered before, add a new row with new entries.
Is this possible using python?
Edit:
It currently prints
Team For Against
Man Utd 86 43
Man City 83 45
Chelsea 89 39
etc for 20 teams. So I want to update the For and Against entries by adding up the number of goals for each team over a number of seasons. Since it won't be the same 20 teams each season, I want to add a new row of entries if a team hasn't been in the league before.
Upvotes: 0
Views: 352
Reputation: 1343
Assume you have the following csv's:
df1
Team For Against
Man Utd 86 43
Man City 83 45
Chelsea 89 39
df2
Team For Against
Man Utd 88 45
Man City 85 47
ICantNameATeam 91 41
You can first stack them using pandas.concat:
df_concat = pandas.concat([df1, df2], axis=0)
which would give you:
df_concat
Team For Against
Man Utd 86 43
Man City 83 45
Chelsea 89 39
Man Utd 88 45
Man City 85 47
ICantNameATeam 91 41
Then you can use dataframe.groupby to take the sum:
df_sum = df_concat.groupby('Team').sum().reset_index()
This will group the dataframe according to unique team names and take the sum of each column.
Upvotes: 1
Reputation: 27
Here's one approach:
df[For-total] = df[For-1011] + df[For-1112] + df[For-1213]
Upvotes: 0