Chinmay
Chinmay

Reputation: 364

Create new dataframe using existing dataframe columns pandas

I have following structure of the data in the dataframes. I want to create a new dataframe which represent in a way that

newDF would have columns Year, DF1, DF2, DF3

The columns should contain the data under the W

I have done : pd.concat([DF1['W'], DF2['W'], DF3['W']], axis=1, keys=['DF1', 'DF2','DF3'])

I got the results, but not sure how can I get the year data in it.

DF1

    Year    Conf    W   L   T   Pct     SRS     SOS     AP Pre  AP High     AP Post     Coach(es)   Bowl
0   2017    SEC     13  2   0   0.867   22.47   7.01    15.0    2.0     2.0     Kirby Smart (13-2)  Rose Bowl-W, College Football Championship-L
1   2016    SEC     8   5   0   0.615   3.64    2.57    18.0    9.0     NaN     Kirby Smart (8-5)   Liberty Bowl-W
2   2015    SEC     10  3   0   0.769   8.98    1.83    9.0     7.0     NaN     Bryan McClendon (1-0), Mark Richt (9-3)     TaxSlayer Bowl-W
3   2014    SEC     10  3   0   0.769   18.84   5.07    12.0    6.0     9.0     Mark Richt (10-3)   Belk Bowl-W
4   2013    SEC     8   5   0   0.615   12.82   7.59    5.0     5.0     NaN     Mark Richt (8-5)    Gator Bowl-L

DF2

    Year    Conf    W   L   T   Pct     SRS     SOS     AP Pre  AP High     AP Post     Coach(es)   Bowl
0   2017    Big Ten     8   5   0   0.615   13.44   6.98    11.0    7.0     NaN     Jim Harbaugh (8-5)  Outback Bowl-L
1   2016    Big Ten     10  3   0   0.769   17.56   4.79    7.0     2.0     10.0    Jim Harbaugh (10-3)     Orange Bowl-L
2   2015    Big Ten     10  3   0   0.769   16.34   4.57    NaN     12.0    12.0    Jim Harbaugh (10-3)     Citrus Bowl-W
3   2014    Big Ten     5   7   0   0.417   1.82    3.65    NaN     NaN     NaN     Brady Hoke (5-7)    NaN
4   2013    Big Ten     7   6   0   0.538   5.53    3.30    17.0    11.0    NaN     Brady Hoke (7-6)    Buffalo Wild Wings Bowl-L

DF3

    Year    Conf    W   L   T   Pct     SRS     SOS     AP Pre  AP High     AP Post     Coach(es)   Bowl    Unnamed: 13
0   2017    ACC     7   6   0   0.538   8.07    5.76    3.0     3.0     NaN     Jimbo Fisher (5-6), Odell Haggins (2-0)     Independence Bowl-W     NaN
1   2016    ACC     10  3   0   0.769   15.01   6.16    4.0     2.0     8.0     Jimbo Fisher (10-3)     Orange Bowl-W   NaN
2   2015    ACC     10  3   0   0.769   13.59   1.97    10.0    9.0     14.0    Jimbo Fisher (10-3)     Peach Bowl-L    NaN
3   2014    ACC     13  1   0   0.929   14.48   5.13    1.0     1.0     5.0     Jimbo Fisher (13-1)     Rose Bowl-L     NaN
4   2013    ACC     14  0   0   1.000   23.36   1.29    11.0    1.0     1.0     Jimbo Fisher (14-0)     BCS Championship-W  NaN

RESULTS I GOT : When I did this : pd.concat([DF1['W'], DF2['W']], axis=1, keys=['DF1', 'DF2'])

how can I get year in the dataframe.

    DF1     DF2
0   13  8
1   8   10
2   10  10
3   10  5

thanks for the help 4 8 7

Upvotes: 0

Views: 1610

Answers (1)

jezrael
jezrael

Reputation: 863226

I think need set_index with concat and last reset_index for column from index:

df = pd.concat([DF1.set_index('Year')['W'], 
                DF2.set_index('Year')['W'], 
                DF3.set_index('Year')['W']], axis=1, keys=['DF1', 'DF2','DF3']).reset_index()

Another more dynamic solution with list comprehension:

dfs = [x.set_index('Year')['W'] for x in [DF1,DF2,DF3]]
df = pd.concat(dfs, axis=1, keys=['DF1', 'DF2','DF3']).reset_index()

print (df)
   Year  DF1  DF2  DF3
0  2017   13  Ten    7
1  2016    8  Ten   10
2  2015   10  Ten   10
3  2014   10  Ten   13
4  2013    8  Ten   14

Upvotes: 1

Related Questions