Omer Davidi

Reputation: 55

Iterating over a dataframe and getting columns as new dataframes

I'm trying to create a set of dataframes from one big dataframe. These dataframes consist of the columns of the original dataframe in this manner: the 1st dataframe is the 1st column of the original one, the 2nd dataframe is the 1st and 2nd columns of the original one, and so on. I use this code to iterate over the dataframe:

for i, data in enumerate(x):
    data = x.iloc[:,:i]
    print(data)

This works, but I also get an empty dataframe at the beginning and an index vector I don't need. Any suggestions on how to remove those two?

thanks

Upvotes: 2

Views: 66

Answers (2)

ashkangh

Reputation: 1624

You can also do something like this:

import numpy as np
import pandas as pd

data = {
    'col_1': np.random.randint(0, 10, 5),
    'col_2': np.random.randint(10, 20, 5),
    'col_3': np.random.randint(0, 10, 5),
    'col_4': np.random.randint(10, 20, 5),
}
df = pd.DataFrame(data)

all_df = {col: df.iloc[:, :i] for i, col in enumerate(df, start=1)}

# For example we can print the last one
print(all_df['col_4'])
   col_1  col_2  col_3  col_4
0      1     13      5     10
1      8     16      1     18
2      6     11      5     18
3      3     11      1     10
4      7     14      8     12
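
As a rough check of what the dictionary holds, here is a minimal sketch that swaps the random data for a small fixed frame (the values are made up for illustration) and lists the columns stored under each key:

```python
import pandas as pd

# Hypothetical fixed data so the result is reproducible
df = pd.DataFrame({"col_1": [1, 2], "col_2": [3, 4], "col_3": [5, 6]})

# Iterating over a DataFrame yields its column labels;
# start=1 makes the first slice one column wide instead of empty
all_df = {col: df.iloc[:, :i] for i, col in enumerate(df, start=1)}

for col, sub in all_df.items():
    # Each entry holds every column up to and including `col`
    print(col, list(sub.columns))
```

Keying by column name makes each prefix easy to look up later, though it assumes the column labels are unique.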

Upvotes: 0

ThePyGuy

Reputation: 18406

Since you only use the index from enumerate and not the value it yields, you don't need to enumerate the dataframe at all. Iterate over range(1, df.shape[1] + 1) instead, i.e. from 1 through the number of columns, and take the slice df.iloc[:, :i] for each value of i. Starting at 1 also avoids the empty dataframe: enumerate starts at 0, and df.iloc[:, :0] selects no columns. A list comprehension does this:

>>> [df.iloc[:, :i] for i in range(1,df.shape[1]+1)]
[  A
0  1
1  2
2  3,
   A  B
0  1  2
1  2  4
2  3  6]

The equivalent traditional loop would look something like this:

for i in range(1,df.shape[1]+1):
    print(df.iloc[:, :i])
    
   A
0  1
1  2
2  3
   A  B
0  1  2
1  2  4
2  3  6
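
To tie this back to the two things the question asks about, here is a minimal sketch. It assumes the "index vector" means the row labels shown on the left of the printed output, and it uses a small made-up frame named x as in the question:

```python
import pandas as pd

# Hypothetical sample data, named x as in the question
x = pd.DataFrame({"A": [1, 2, 3], "B": [2, 4, 6]})

# enumerate(x) counts from 0, so the first slice x.iloc[:, :0] has no columns
assert x.iloc[:, :0].shape == (3, 0)

# Starting the range at 1 skips that empty slice
frames = [x.iloc[:, :i] for i in range(1, x.shape[1] + 1)]

for frame in frames:
    # index=False leaves the row labels out of the printed text only;
    # the dataframe itself still has its index
    print(frame.to_string(index=False))
```

If the row labels are not the issue and the aim is different, for example dropping a column that holds an old index, the slicing would need to start from a later column instead.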

Upvotes: 1
