Reputation: 364
I have following structure of the data in the dataframes. I want to create a new dataframe which represent in a way that
newDF would have columns Year, DF1, DF2, DF3
The columns should contain the data under the W
I have done : pd.concat([DF1['W'], DF2['W'], DF3['W']], axis=1, keys=['DF1', 'DF2','DF3'])
I got the results, but not sure how can I get the year data in it.
DF1
Year Conf W L T Pct SRS SOS AP Pre AP High AP Post Coach(es) Bowl
0 2017 SEC 13 2 0 0.867 22.47 7.01 15.0 2.0 2.0 Kirby Smart (13-2) Rose Bowl-W, College Football Championship-L
1 2016 SEC 8 5 0 0.615 3.64 2.57 18.0 9.0 NaN Kirby Smart (8-5) Liberty Bowl-W
2 2015 SEC 10 3 0 0.769 8.98 1.83 9.0 7.0 NaN Bryan McClendon (1-0), Mark Richt (9-3) TaxSlayer Bowl-W
3 2014 SEC 10 3 0 0.769 18.84 5.07 12.0 6.0 9.0 Mark Richt (10-3) Belk Bowl-W
4 2013 SEC 8 5 0 0.615 12.82 7.59 5.0 5.0 NaN Mark Richt (8-5) Gator Bowl-L
DF2
Year Conf W L T Pct SRS SOS AP Pre AP High AP Post Coach(es) Bowl
0 2017 Big Ten 8 5 0 0.615 13.44 6.98 11.0 7.0 NaN Jim Harbaugh (8-5) Outback Bowl-L
1 2016 Big Ten 10 3 0 0.769 17.56 4.79 7.0 2.0 10.0 Jim Harbaugh (10-3) Orange Bowl-L
2 2015 Big Ten 10 3 0 0.769 16.34 4.57 NaN 12.0 12.0 Jim Harbaugh (10-3) Citrus Bowl-W
3 2014 Big Ten 5 7 0 0.417 1.82 3.65 NaN NaN NaN Brady Hoke (5-7) NaN
4 2013 Big Ten 7 6 0 0.538 5.53 3.30 17.0 11.0 NaN Brady Hoke (7-6) Buffalo Wild Wings Bowl-L
DF3
Year Conf W L T Pct SRS SOS AP Pre AP High AP Post Coach(es) Bowl Unnamed: 13
0 2017 ACC 7 6 0 0.538 8.07 5.76 3.0 3.0 NaN Jimbo Fisher (5-6), Odell Haggins (2-0) Independence Bowl-W NaN
1 2016 ACC 10 3 0 0.769 15.01 6.16 4.0 2.0 8.0 Jimbo Fisher (10-3) Orange Bowl-W NaN
2 2015 ACC 10 3 0 0.769 13.59 1.97 10.0 9.0 14.0 Jimbo Fisher (10-3) Peach Bowl-L NaN
3 2014 ACC 13 1 0 0.929 14.48 5.13 1.0 1.0 5.0 Jimbo Fisher (13-1) Rose Bowl-L NaN
4 2013 ACC 14 0 0 1.000 23.36 1.29 11.0 1.0 1.0 Jimbo Fisher (14-0) BCS Championship-W NaN
RESULTS I GOT : When I did this : pd.concat([DF1['W'], DF2['W']], axis=1, keys=['DF1', 'DF2'])
how can I get year in the dataframe.
DF1 DF2
0 13 8
1 8 10
2 10 10
3 10 5
thanks for the help 4 8 7
Upvotes: 0
Views: 1610
Reputation: 863226
I think need set_index
with concat
and last reset_index
for column from index
:
df = pd.concat([DF1.set_index('Year')['W'],
DF2.set_index('Year')['W'],
DF3.set_index('Year')['W']], axis=1, keys=['DF1', 'DF2','DF3']).reset_index()
Another more dynamic solution with list comprehension
:
dfs = [x.set_index('Year')['W'] for x in [DF1,DF2,DF3]]
df = pd.concat(dfs, axis=1, keys=['DF1', 'DF2','DF3']).reset_index()
print (df)
Year DF1 DF2 DF3
0 2017 13 Ten 7
1 2016 8 Ten 10
2 2015 10 Ten 10
3 2014 10 Ten 13
4 2013 8 Ten 14
Upvotes: 1