Reputation: 55
I'm trying to create a set of dataframes from one big dataframe. These dataframes consist of the columns of the original dataframe in this manner: the 1st dataframe is the 1st column of the original one, the 2nd dataframe is the 1st and 2nd columns of the original one, and so on. I use this code to iterate over the dataframe:
for i, data in enumerate(x):
    data = x.iloc[:,:i]
    print(data)
This works, but I also get an empty dataframe at the beginning and an index vector I don't need. Any suggestions on how to remove those two?
Thanks
Upvotes: 2
Views: 66
Reputation: 1624
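You can also do something like this:
import numpy as np
import pandas as pd

# Random sample data, so the printed values differ from run to run
data = {
    'col_1': np.random.randint(0, 10, 5),
    'col_2': np.random.randint(10, 20, 5),
    'col_3': np.random.randint(0, 10, 5),
    'col_4': np.random.randint(10, 20, 5),
}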
df = pd.DataFrame(data)
all_df = {col: df.iloc[:, :i] for i, col in enumerate(df, start=1)}
# For example we can print the last one
print(all_df['col_4'])
   col_1  col_2  col_3  col_4
0      1     13      5     10
1      8     16      1     18
2      6     11      5     18
3      3     11      1     10
4      7     14      8     12
Upvotes: 0
Reputation: 18406
Since you only use the index value and not the item that enumerate yields, there is no need to enumerate the dataframe. Iterate over the range from 1 through the number of columns plus one, and take the slice df.iloc[:, :i] for each value of i. A list comprehension does this in one line:
>>> [df.iloc[:, :i] for i in range(1,df.shape[1]+1)]
[ A
0 1
1 2
2 3,
A B
0 1 2
1 2 4
2 3 6]
The equivalent traditional loop would look something like this:
for i in range(1, df.shape[1] + 1):
    print(df.iloc[:, :i])
A
0 1
1 2
2 3
A B
0 1 2
1 2 4
2 3 6
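For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample frame is an assumption, chosen to match the output shown above, and the names empty and prefixes are only illustrative. It also shows where the empty dataframe in the question comes from (enumerate starts at 0, and a slice ending at 0 selects no columns) and one possible way to print without the row index, using to_string(index=False).

```python
import pandas as pd

# Hypothetical sample data, chosen to match the output shown above.
df = pd.DataFrame({"A": [1, 2, 3], "B": [2, 4, 6]})

# enumerate() starts counting at 0, and df.iloc[:, :0] selects no columns.
# That zero-column slice is the empty dataframe seen in the question.
empty = df.iloc[:, :0]

# Starting the range at 1 skips the empty slice; the upper bound of
# df.shape[1] + 1 makes sure the last slice includes every column.
prefixes = [df.iloc[:, :i] for i in range(1, df.shape[1] + 1)]

for part in prefixes:
    # to_string(index=False) leaves the row index out of the printed text.
    print(part.to_string(index=False))
```

The index is still part of each dataframe; to_string(index=False) only hides it when printing.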
Upvotes: 1