Reputation: 23
I have a dataframe and need to split it into two equal dataframes. The first dataframe should contain the top half of the rows and the second should contain the remaining rows.
How can I achieve this in Python? It needs to work for both an even and an odd number of rows (with an odd number of rows, I would drop the last row so that the two halves are equal).
Upvotes: 1
Views: 6479
Reputation: 262
Here is a simple example you can try. It splits the underlying list before building the two DataFrames:
import pandas as pd

data = [['Alex',10],['Bob',12],['Clarke',13],['Tom',20],['Jerry',25]]
#data = [['Alex',10],['Bob',12],['Clarke',13],['Tom',20]]

half = len(data) // 2
data1 = data[:half]
if len(data) % 2 == 0:
    data2 = data[half:]
else:
    # Odd number of rows: drop the last row so both halves are equal
    data2 = data[half:-1]

# Cast only the numeric column; the 'Name' strings cannot be converted to float
df1 = pd.DataFrame(data1, columns=['Name', 'Age']).astype({'Age': float})
df2 = pd.DataFrame(data2, columns=['Name', 'Age']).astype({'Age': float})
print("1st half:")
print(df1)
print("2nd Half:")
print(df2)
Output:
D:\Python>python temp.py
1st half:
   Name   Age
0  Alex  10.0
1   Bob  12.0
2nd Half:
     Name   Age
0  Clarke  13.0
1     Tom  20.0
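Since the question starts from an existing dataframe rather than a list, the same slicing idea can be applied to the DataFrame itself with `iloc`. The following is a minimal sketch, not part of the answer above: the helper name `split_in_half` and the sample frame `people` are made up for illustration.

```python
import pandas as pd


def split_in_half(df):
    """Return the top and bottom halves of df.

    With an odd number of rows, the last row is dropped so that
    both halves have the same length.
    """
    half = len(df) // 2
    # Rows [0, half) and [half, 2*half); a trailing odd row is left out
    return df.iloc[:half], df.iloc[half:2 * half]


# Hypothetical sample data with an odd number of rows
people = pd.DataFrame(
    [["Alex", 10], ["Bob", 12], ["Clarke", 13], ["Tom", 20], ["Jerry", 25]],
    columns=["Name", "Age"],
)

top, bottom = split_in_half(people)
print(top)
print(bottom)
```

Because `2 * half` equals `len(df)` when the row count is even, the same two slices cover both cases without an explicit `if`.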
Upvotes: 1
Reputation: 34086
Consider df:
In [122]: df
Out[122]:
    id  days  sold  days_lag
0    1     1     1         0
1    1     3     0         2
2    1     3     1         2
3    1     8     1         5
4    1     8     1         5
5    1     8     0         5
6    2     3     0         0
7    2     8     1         5
8    2     8     1         5
9    2     9     2         1
10   2     9     0         1
11   2    12     1         3
12   3     4     5         6
Use numpy.array_split():
In [127]: import numpy as np
In [128]: def split_df(df):
     ...:     if len(df) % 2 != 0:  # Handle a `df` with an odd number of rows
     ...:         df = df.iloc[:-1, :]
     ...:     df1, df2 = np.array_split(df, 2)
     ...:     return df1, df2
     ...:
In [130]: df1, df2 = split_df(df)
In [131]: df1
Out[131]:
   id  days  sold  days_lag
0   1     1     1         0
1   1     3     0         2
2   1     3     1         2
3   1     8     1         5
4   1     8     1         5
5   1     8     0         5
In [133]: df2
Out[133]:
    id  days  sold  days_lag
6    2     3     0         0
7    2     8     1         5
8    2     8     1         5
9    2     9     2         1
10   2     9     0         1
11   2    12     1         3
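Note that the second half above keeps its original index labels (6 to 11). If each half should start again at 0, `reset_index(drop=True)` can be applied afterwards. A small sketch of that step, using a made-up five-row frame and plain `iloc` slicing instead of the `split_df` function above:

```python
import pandas as pd

# Hypothetical frame with an odd number of rows
df = pd.DataFrame({"id": [1, 1, 2, 2, 3], "days": [1, 3, 3, 8, 4]})

half = len(df) // 2
second = df.iloc[half:2 * half]                 # keeps the original labels
second_reset = second.reset_index(drop=True)    # relabelled from 0

print(list(second.index))
print(list(second_reset.index))
```

Passing `drop=True` discards the old labels instead of adding them back as a new column.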
Upvotes: 3