Jay Py

Reputation: 141

Python Split Up A Dataframe Based On Its Row Index

I'm trying to split a dataframe into train, val, and test dataframes based on its row index: observation 1 goes into training, observation 2 into val, observation 3 into test, and so on. However, I'm hitting a roadblock. Here's my code so far:

climbingTngDataset = pd.DataFrame([])
climbingValDataset = pd.DataFrame([])
climbingTestDataset = pd.DataFrame([])

for i in range(len(dfClimbing)):
    if i % 2 == 0:
        climbingValDataset.append(i)
    if i % 3 == 0:
        climbingTestDataset.append(i)
    else:
        climbingTngDataset.append(i)

Upvotes: 0

Views: 389

Answers (1)

cs95

Reputation: 402553

Use groupby to split your DataFrame on the row index modulo 3:

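# index % 3 yields keys 0, 1, 2; groupby returns groups in sorted key order,
# so rows 0, 3, 6, ... go to train, 1, 4, 7, ... to val, 2, 5, 8, ... to test.
# Assumes the default integer index (0, 1, 2, ...).
train, val, test = [
    g for _, g in dfClimbing.groupby(dfClimbing.index % 3)
]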
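
If the index is not the default 0, 1, 2, ... (for example after filtering, or with string labels), grouping on index % 3 will not follow row order. A minimal sketch of one way around that, grouping on row position instead; the sample frame and the "grade" column are made up for illustration and only stand in for dfClimbing:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dfClimbing, deliberately given a string index
dfClimbing = pd.DataFrame(
    {"grade": [5, 6, 7, 8, 9, 10, 11]},
    index=list("abcdefg"),
)

# Row positions 0, 1, 2, ... regardless of what the index labels are
position = np.arange(len(dfClimbing))

# Groups come back in sorted key order: 0 -> train, 1 -> val, 2 -> test
train, val, test = [g for _, g in dfClimbing.groupby(position % 3)]
```

Note that the unpacking needs at least three rows, otherwise fewer than three groups come back.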

Demo
(with two splits instead of three)

print(df)
   Record ID Para Tag
0          1    A   x
1          1    A   y
2          2    B   x
3          2    B   y
4          1    A   z

i, j = [g for _, g in df.groupby(df.index % 2)]

print(i)
   Record ID Para Tag
0          1    A   x
2          2    B   x
4          1    A   z

print(j)
   Record ID Para Tag
1          1    A   y
3          2    B   y
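
As an alternative to groupby, the same every-n-th-row split can probably be done with strided positional slicing. This is a sketch of a different technique, not the method above; the frame is the demo data rebuilt by hand, and n_splits is a name introduced here for illustration:

```python
import pandas as pd

# The demo frame from above, rebuilt by hand
df = pd.DataFrame({
    "Record ID": [1, 1, 2, 2, 1],
    "Para": ["A", "A", "B", "B", "A"],
    "Tag": ["x", "y", "x", "y", "z"],
})

n_splits = 2  # hypothetical name; use 3 for train/val/test

# iloc[k::n_splits] takes every n_splits-th row starting at position k
i, j = [df.iloc[k::n_splits] for k in range(n_splits)]
```

Because iloc works on position rather than labels, this does not depend on the index being the default integer range.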

Upvotes: 1
