Cutting a dataframe in a loop

Question

I have a dataset that consists of a single column. I want to cut that column into multiple dataframes.

I use a for loop to build a list that holds the number of pixels in each row, which determines the positions at which I want to cut the dataframe.

import pandas as pd

df = pd.read_csv("column.csv", delimiter=";", header=0, index_col=0)

number_of_pixels = len(df.index)

print(f"You have {number_of_pixels} pixels in your file")
number_of_rows = int(input("Enter number of rows you want to create: "))
pixels_per_row_list = []  # number of pixels per row (named so it does not shadow the built-in list)

for i in range(number_of_rows):  # fill the list with the number of pixels per row
    pixels_per_row = int(input(f"Enter number of pixels in row {i}: "))
    pixels_per_row_list.append(pixels_per_row)

print(pixels_per_row_list)

After cutting the column into multiple dataframes, I want to transpose each dataframe and concatenate them all back together using:

df1=df1.reset_index(drop=True) 
df1=df1.T 

df2=df2.reset_index(drop=True)
df2=df2.T

frames = [df1,df2]

result = pd.concat(frames, axis=0)

print(result)

So I want to create a loop that cuts my dataframe into multiple frames at the positions stored in my list.

Thank you!

ALollz · Accepted Answer

This problem is better solved with NumPy. I'll start from the point where you have received the list from your user input. The idea is to use numpy.split to separate the values at the cumulative number of pixels requested, and then create a new DataFrame from the pieces.

Setup

import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame({'val': np.random.randint(1,10,50)})

lst = [4,10,2,1,15,8,9,1]

Code

pd.DataFrame(np.split(df.val.values, np.cumsum(lst)[:-1]))

Output

    0    1    2    3    4    5    6    7    8    9   10   11   12   13   14
0   3  3.0  7.0  2.0  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
1   4  7.0  2.0  1.0  2.0  1.0  1.0  4.0  5.0  1.0  NaN  NaN  NaN  NaN  NaN
2   1  5.0  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
3   2  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
4   8  4.0  3.0  5.0  8.0  3.0  5.0  9.0  1.0  8.0  4.0  5.0  7.0  2.0  6.0
5   7  3.0  2.0  9.0  4.0  6.0  1.0  3.0  NaN  NaN  NaN  NaN  NaN  NaN  NaN
6   7  3.0  5.0  5.0  7.0  4.0  1.0  7.0  5.0  NaN  NaN  NaN  NaN  NaN  NaN
7   8  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
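
To see why the cumulative sum is needed, here is a minimal sketch on a toy array (the array and the counts are made up for illustration and are not the df from the Setup). numpy.split expects the indices at which to cut, not the chunk sizes, so the sizes are turned into running totals; the last total is dropped because it would only mark the end of the data.

```python
import numpy as np

values = np.arange(10)   # toy data, assumed for illustration
counts = [3, 5, 2]       # hypothetical number of pixels per row

# Running totals are [3, 8, 10]; dropping the last leaves the cut points [3, 8].
cut_points = np.cumsum(counts)[:-1]

# Each chunk holds the values between two consecutive cut points.
chunks = np.split(values, cut_points)
for chunk in chunks:
    print(chunk.tolist())
```

The chunks have lengths 3, 5 and 2, matching counts, which is the same mechanism the one-liner above relies on.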

If your lst sums to more pixels than there are rows in your initial DataFrame, you'll get extra all-NaN rows in the output. If your lst sums to fewer than the total number of pixels, the leftover values are all added to the last row. Since your question doesn't say how either case should be handled, I'm not sure what you'd want there.
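
If you would rather reject a mismatched list than silently get NaN rows or an overlong last row, one option is to check the sum before splitting. This is only a sketch under that assumption: split_into_rows is a hypothetical helper name, and raising ValueError is one possible policy, not something the question asks for.

```python
import numpy as np
import pandas as pd


def split_into_rows(values, counts):
    """Split a 1-D array into rows of the given lengths (hypothetical helper)."""
    values = np.asarray(values)
    total = int(np.sum(counts))
    # Refuse to split when the requested pixels do not match the data length.
    if total != len(values):
        raise ValueError(
            f"counts sum to {total}, but there are {len(values)} values"
        )
    return pd.DataFrame(np.split(values, np.cumsum(counts)[:-1]))


df = pd.DataFrame({'val': np.arange(6)})   # toy data, assumed for illustration
result = split_into_rows(df['val'].to_numpy(), [2, 3, 1])
print(result)
```

With the interactive script, the same check could instead prompt the user again until the counts add up to the number of pixels.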
