Reputation: 141
I'm trying to split a DataFrame into train, validation, and test DataFrames based on row index: observation 1 goes into training, 2 into validation, 3 into test, and so on. However, I'm hitting a roadblock. Here's my code so far:
climbingTngDataset = pd.DataFrame([])
climbingValDataset = pd.DataFrame([])
climbingTestDataset = pd.DataFrame([])
for i in range(len(dfClimbing)):
    if i % 2 == 0:
        climbingValDataset.append(i)
    if i % 3 == 0:
        climbingTestDataset.append(i)
    else:
        climbingTngDataset.append(i)
Upvotes: 0
Views: 389
Reputation: 402553
Use groupby to split your DataFrame:
train, test, val = [
    g for _, g in dfClimbing.groupby(dfClimbing.index % 3)
]
Demo
(With two splits instead of 3)
print(df)
   Record ID Para Tag
0          1    A   x
1          1    A   y
2          2    B   x
3          2    B   y
4          1    A   z
i, j = [g for _, g in df.groupby(df.index % 2)]
print(i)
   Record ID Para Tag
0          1    A   x
2          2    B   x
4          1    A   z
print(j)
   Record ID Para Tag
1          1    A   y
3          2    B   y
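For completeness, here is a minimal sketch of the same idea with three splits. The sample frame, its `grade` column, and the letter index are made up for illustration and stand in for the real dfClimbing. It groups on row position (`np.arange(len(df)) % 3`) rather than on `df.index % 3`, which is an assumption on my part that the split should still work when the index is not the default 0..n-1 range. The train/val/test order follows the question (1st row to train, 2nd to val, 3rd to test) and is only a convention.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dfClimbing, with a non-default index on purpose.
dfClimbing = pd.DataFrame(
    {"grade": [5, 6, 7, 8, 9, 10, 11]},
    index=list("abcdefg"),
)

# Row positions 0..n-1, independent of whatever the index holds.
positions = np.arange(len(dfClimbing))

# Group rows by position % 3 and collect each group under its remainder.
parts = {int(key): group for key, group in dfClimbing.groupby(positions % 3)}

# Remainder 0 -> train, 1 -> val, 2 -> test (order chosen to match the question).
train, val, test = parts[0], parts[1], parts[2]

print(train.index.tolist())  # ['a', 'd', 'g']
print(val.index.tolist())    # ['b', 'e']
print(test.index.tolist())   # ['c', 'f']
```

Looking groups up by remainder, instead of unpacking the list directly, keeps the assignment explicit and avoids depending on the iteration order of the groups.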
Upvotes: 1