Reputation: 35
I'm trying to create a difference column in Pandas based on comparing stats from one county to another, but the problem is that I have multiple years of data and I don't want to just subtract each entry by the overall average but by the comparison county's stat for that year.
Here's a picture I had in mind. It's easy (though tedious) to do on excel, but you just subtract county 1 for that year from the other county values... I'm just wondering if there's a way to do this on pandas? I've looked into the diff method, but it doesn't seem useful and it's not like I can use the typical code like this (because we're going by year, not some aggregate statistic).
cols I have in my example: "Year" "County" "Stat 1" and I want to create col of "Stat diff b/t county 1"
This is code I've tried that dosen't work:
df["Diff Between County 1"] = df.loc[df.County == 'County 1', ['Year', "Stat 1"] ]- df["Stat 1"]
I can't figure out how to also group by year or something ... because County 1 has a different value every year.
I've also tried to just select the stats and years for the comparison county but I'm just not sure how to then take the difference by year. There must be some way to match up multiple conditions or something
df_wantedcounty = df.loc[df.County =='County 1', ['Stat 1', 'Year'] ]
df_wantedcounty = df_la.reset_index(drop = True)
Any help?
Upvotes: 0
Views: 48
Reputation: 3128
Here you go, as a one liner!
### Recreate Dataframe ####
Year = [2010, 2010, 2010, 2022, 2022, 2022]
County = ["County 1", "County 2", "County 3", "County 1", "County 2", "County 3", ]
Stat_1 = [8312, 7816, 8185, 10176, 10666, 8145]
df = pd.DataFrame({"Year":Year, "County":County, "Stat_1":Stat_1})
############################
# Code to make new column
df["Stat diff b/t county 1"] = df.groupby('Year')['Stat_1'].apply(lambda x: x.iloc[0]-x)
Output:
Year County Stat_1 Stat diff b/t county 1
0 2010 County 1 8312 0
1 2010 County 2 7816 496
2 2010 County 3 8185 127
3 2022 County 1 10176 0
4 2022 County 2 10666 -490
5 2022 County 3 8145 2031
Upvotes: 2