Jeroen

Reputation: 841

Pandas concatenate two dataframes

I have the following training scores from a linear regressor and a decision tree regressor.

lin_rmse_scores = np.array([65000.67382615, 70960.56056304, 67122.63935124, 66089.63153865,
 68402.54686442, 65266.34735288, 65218.78174481, 68525.46981754,
 72739.87555996, 68957.34111906])
tree_rmse_scores = np.array([65312.86044031, 70581.69865676, 67849.75809965, 71460.33789358,
 74035.29744574, 65562.42978503, 67964.10942543, 69102.89388457,
 66876.66473025, 69735.84760006])

I want to compare some statistics of the two regressors using Pandas describe(). For the linear regressor I do this as follows:

df = pd.Series(lin_rmse_scores).describe()

I want to give this Series the column name 'Lin Regr' and append a second column for the decision tree regressor. The result should look like:

           'Lin Regr'    'Dec Tree'
count       10.000000     10.000000
mean     67828.386774  68848.189796
std       2601.596761   2719.219956
min      65000.673826  65312.860440
25%      65472.168399  67119.938073
50%      67762.593108  68533.501655
75%      68849.373294  70370.235893
max      72739.875560  74035.297446

Upvotes: 0

Views: 29

Answers (1)

BENY

Reputation: 323226

Combine them into a single DataFrame before calling describe():

s = pd.DataFrame({'lin_rmse_scores': lin_rmse_scores, 'tree_rmse_scores': tree_rmse_scores}).describe()
       lin_rmse_scores  tree_rmse_scores
count        10.000000         10.000000
mean      67828.386774      68848.189796
std        2601.596761       2719.219956
min       65000.673826      65312.860440
25%       65472.168399      67119.938073
50%       67762.593108      68533.501655
75%       68849.373294      70370.235893
max       72739.875560      74035.297446
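
To get the column labels asked for in the question, the same idea can be sketched as below. This is a minimal example rather than the only way to do it: the variable names `stats` and `stats_concat` are made up for illustration, and the second variant (describing each Series separately and joining the results with `pd.concat`) is an alternative that was not part of the approach above.

```python
import numpy as np
import pandas as pd

lin_rmse_scores = np.array([65000.67382615, 70960.56056304, 67122.63935124, 66089.63153865,
                            68402.54686442, 65266.34735288, 65218.78174481, 68525.46981754,
                            72739.87555996, 68957.34111906])
tree_rmse_scores = np.array([65312.86044031, 70581.69865676, 67849.75809965, 71460.33789358,
                             74035.29744574, 65562.42978503, 67964.10942543, 69102.89388457,
                             66876.66473025, 69735.84760006])

# Variant 1: use the desired labels as dict keys, then describe once.
stats = pd.DataFrame({'Lin Regr': lin_rmse_scores,
                      'Dec Tree': tree_rmse_scores}).describe()

# Variant 2: describe each Series on its own and concatenate side by side;
# with axis=1, the entries in `keys` become the column names.
stats_concat = pd.concat(
    [pd.Series(lin_rmse_scores).describe(),
     pd.Series(tree_rmse_scores).describe()],
    axis=1,
    keys=['Lin Regr', 'Dec Tree'],
)

print(stats)
```

Variant 1 is shorter when both score arrays have the same length. Variant 2 may be handier when the summaries have already been computed separately.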

Upvotes: 2
