Reputation: 1088
Suppose I have the following dataframes:
f1 = pd.DataFrame({
'feature1': ['a', 'b','c'],
'col2': [1,2,3],
'col3': [0.1,0.2,0.3]
})
f2 = pd.DataFrame({
'feature2': ['x', 'y','z'],
'col2': [4,5,6],
'col3': [0.4,0.5,0.6]
})
f3 = pd.DataFrame({
'feature2': ['i', 'j','k'],
'col2': [7,8,9],
'col3': [0.7,0.8,0.9]
})
I want to make a new dataframe to stack the three dataframes on top of each other such that I get:
Feature Col1 Col2
0 feature1 col2 col3
1 a 1 0.1
2 b 2 0.2
3 c 3 0.3
4 feature2 col2 col3
5 x 4 0.4
6 y 5 0.5
7 z 6 0.6
8 feature2 col2 col3
9 i 7 0.7
10 j 8 0.8
11 k 9 0.9
So far I have been doing this by exporting each dataframe into excel and then copy and paste them into a new sheet manually (hence when I import the final excel file into Python I can get the desired result). But I am sure there should be a way to do this efficiently within Python itself?
Upvotes: 0
Views: 788
Reputation: 15872
You can use a function, using pandas.DataFrame.T
, pandas.DataFrame.reset_index
and pandas.DataFrame.set_axis
to preprocess the data, and then pandas.concat
:
def preprocess(df, new_cols=['Feature', 'Col1', 'Col2']):
"""
Make columns the first row of the dataframe.
And replace column names with `new_cols`.
"""
return df.T.reset_index().T.set_axis(new_cols, axis='columns')
>>> pd.concat(map(preprocess, [f1, f2, f3]), ignore_index=True)
Feature Col1 Col2
0 feature1 col2 col3
1 a 1 0.1
2 b 2 0.2
3 c 3 0.3
4 feature2 col2 col3
5 x 4 0.4
6 y 5 0.5
7 z 6 0.6
8 feature2 col2 col3
9 i 7 0.7
10 j 8 0.8
11 k 9 0.9
Here, preprocess(f1)
gives:
>>> preprocess(f1)
Feature Col1 Col2
index feature1 col2 col3
0 a 1 0.1
1 b 2 0.2
2 c 3 0.3
The index looks different, thus we pass ignore_index=True
to pandas.concat
argument, which converts the resulting index to a pandas.RangeIndex
starting from 0
.
Upvotes: 2
Reputation: 323226
Do small adjust with self-def function
def ff(x):
x = x.T.reset_index().T
x.columns = ['feature','col1','col2']
return x
out = pd.concat([ff(f1),ff(f2),ff(f3)]).reset_index(drop=True)
out
Out[96]:
feature col1 col2
0 feature1 col2 col3
1 a 1 0.1
2 b 2 0.2
3 c 3 0.3
4 feature2 col2 col3
5 x 4 0.4
6 y 5 0.5
7 z 6 0.6
8 feature2 col2 col3
9 i 7 0.7
10 j 8 0.8
11 k 9 0.9
Upvotes: 2