Reputation: 179
This is similar to some other questions posted, but I can't find an answer that fits my needs.
I have a DataFrame with the following:
   RK  PLAYER            SCHOOL    YEAR  POS   POS RK  HT   WT   2019  2018  2017  2016
0  1   Nick Bosa         Ohio St.  Jr    EDGE  1       6-4  266  Jr
1  2   Quinnen Williams  Alabama   Soph  DL    1       6-3  303  Soph
2  3   Josh Allen        Kentucky  Sr    EDGE  2       6-5  262  Sr
3  4   Ed Oliver         Houston   Jr    DL    2       6-2  287  Jr
The 2018, 2017, and 2016 columns hold np.NaN values, but I can't format this table correctly with them in it.
I also have a separate list containing the following:
season = ['Sr', 'Jr', 'Soph', 'Fr']
The 2019 column gives each player's current status, and I would like the 2018 column to show their status as of the prior year. So if 2019 is 'Sr', 2018 should be 'Jr'. Essentially, I want to look up each value in the season list, move one index ahead, and write the value found there into the column. The result for 2018 should be:
   RK  PLAYER            SCHOOL    YEAR  POS   POS RK  HT   WT   2019  2018  2017  2016
0  1   Nick Bosa         Ohio St.  Jr    EDGE  1       6-4  266  Jr    Soph
1  2   Quinnen Williams  Alabama   Soph  DL    1       6-3  303  Soph  Fr
2  3   Josh Allen        Kentucky  Sr    EDGE  2       6-5  262  Sr    Jr
3  4   Ed Oliver         Houston   Jr    DL    2       6-2  287  Jr    Soph
I can think of a way to do this with a "for k, v in iteritems()" loop that checks each value, but I'm wondering if there's a better way.
Upvotes: 4
Views: 58
Reputation: 111
Another possible solution is to write a function that accepts a row, slices the season list starting from the row's '2019' value, and returns that slice as a pandas.Series. We can then apply that function to each row using apply() with axis=1. I used part of your input DataFrame for testing.
In [3]: df
Out[3]:
    WT  2019 2018 2017 2016
0  266    Jr  NaN  NaN  NaN
1  303  Soph  NaN  NaN  NaN
2  262    Sr  NaN  NaN  NaN
3  287    Jr  NaN  NaN  NaN
In [4]: def fill_row(row):
   ...:     season = ['Sr', 'Jr', 'Soph', 'Fr']
   ...:     data = season[season.index(row['2019']):]
   ...:     return pd.Series(data)
In [5]: cols_to_update = ['2019', '2018', '2017', '2016']
In [6]: df[cols_to_update] = df[cols_to_update].apply(fill_row, axis=1)
In [7]: df
Out[7]:
    WT  2019  2018  2017 2016
0  266    Jr  Soph    Fr  NaN
1  303  Soph    Fr   NaN  NaN
2  262    Sr    Jr  Soph   Fr
3  287    Jr  Soph    Fr  NaN
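A variation on the same idea, as a minimal sketch rather than a tested drop-in: build a lookup that maps each status to the one a year earlier, then fill each year column from the column after it with Series.map. The names previous and years are only illustrative, and the small DataFrame stands in for your real one. Statuses with no earlier entry ('Fr') come out as NaN.

```python
import pandas as pd

season = ['Sr', 'Jr', 'Soph', 'Fr']

# Map each status to the status one year earlier: Sr -> Jr, Jr -> Soph, Soph -> Fr.
# 'Fr' has no entry, so Series.map turns it into NaN.
previous = dict(zip(season, season[1:]))

# Stand-in data (assumption: the real DataFrame has a populated '2019' column).
df = pd.DataFrame({'2019': ['Jr', 'Soph', 'Sr', 'Jr']})

# Fill each older year from the year just after it.
years = ['2019', '2018', '2017', '2016']
for newer, older in zip(years, years[1:]):
    df[older] = df[newer].map(previous)

print(df)
```

This works on whole columns at once, so it avoids calling a Python function for every row.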
Upvotes: 1
Reputation: 2889
I'm not sure if this is much smarter than what you already have, but it's a suggestion:
import pandas as pd
def get_season(curr_season, curr_year, prev_year):
    season = ['Sr', 'Jr', 'Soph', 'Fr']
    try:
        # Step forward in the list by the number of years we go back.
        return season[season.index(curr_season) + (curr_year - prev_year)]
    except IndexError:
        # Return some meaningful message perhaps?
        return '-'
df = pd.DataFrame({'2019': ['Jr', 'Soph', 'Sr', 'Jr']})
df['2018'] = [get_season(s, 2019, 2018) for s in df['2019']]
df['2017'] = [get_season(s, 2019, 2017) for s in df['2019']]
df['2016'] = [get_season(s, 2019, 2016) for s in df['2019']]
df
Out[18]:
   2019  2018  2017 2016
0    Jr  Soph    Fr    -
1  Soph    Fr     -    -
2    Sr    Jr  Soph   Fr
3    Jr  Soph    Fr    -
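If the real '2019' column can contain NaN or a status that isn't in the list, season.index() raises ValueError rather than IndexError, so the function above would fail on those rows. Here is a hedged sketch of a more defensive variant that also loops over the years instead of repeating the assignment. The name get_season_safe and its default parameter are my own, not anything from pandas.

```python
import pandas as pd

SEASON = ['Sr', 'Jr', 'Soph', 'Fr']

def get_season_safe(curr_season, years_back, default='-'):
    """Return the status `years_back` years before `curr_season`, or `default`."""
    # Unknown statuses (including NaN) and negative offsets fall back to the default.
    if curr_season not in SEASON or years_back < 0:
        return default
    idx = SEASON.index(curr_season) + years_back
    return SEASON[idx] if idx < len(SEASON) else default

df = pd.DataFrame({'2019': ['Jr', 'Soph', 'Sr', 'Jr']})

# One pass per earlier year, always measured back from 2019.
for year in (2018, 2017, 2016):
    df[str(year)] = [get_season_safe(s, 2019 - year) for s in df['2019']]

print(df)
```

For the sample data this gives the same table as above. The explicit checks also stop a negative offset from silently wrapping around to the end of the list.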
Upvotes: 2