Dance Party2

Reputation: 7536

Modifying DataFrames in loop

Given this data frame:

import pandas as pd
df=pd.DataFrame({'A':[1,2,3],'B':[4,5,6],'C':[7,8,9]})
df
    A   B   C
0   1   4   7
1   2   5   8
2   3   6   9

I'd like to create 3 new data frames; one from each column. I can do this one at a time like this:

a=pd.DataFrame(df[['A']])
a
    A
0   1
1   2
2   3

But instead of doing this for each column, I'd like to do it in a loop.

Here's what I've tried:

a=b=c=df.copy()
dfs=[a,b,c]
fields=['A','B','C']
for d,f in zip(dfs,fields):
    d=pd.DataFrame(d[[f]])

...but when I then print each one, I get the whole original data frame as opposed to just the column of interest.

a
    A   B   C
0   1   4   7
1   2   5   8
2   3   6   9

Update: My actual data frame will have some columns that I do not need and the columns will not be in any sort of order, so I need to be able to get the columns by name.

Thanks in advance!

Upvotes: 0

Views: 1063

Answers (3)

BENY

Reputation: 323226

Alternatively, you can try this. Instead of creating copies of df, this method assigns each result to its own variable (a, b, c) as a single DataFrame, not as an element of a list. However, I think saving the DataFrames in a list is better.

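dfs=['a','b','c']
fields=['A','B','C']
variables = locals()  # at module level (or in IPython) this is the global namespace
for d,f in zip(dfs,fields):
    variables[d] = df[[f]]  # binds the names a, b, c to one-column DataFrames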

a
Out[743]: 
   A
0  1
1  2
2  3
b
Out[744]: 
   B
0  4
1  5
2  6
c
Out[745]: 
   C
0  7
1  8
2  9
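
For context, here is a small sketch of why the loop in the question left a unchanged, assuming the same df as above. The chained assignment a=b=c=df.copy() makes one copy with three names, and d=... inside the loop only rebinds the loop variable. Writing the result back into the list by index (the enumerate-based loop here is an illustration, not part of the question's code) does change the list, but still not the name a:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

a = b = c = df.copy()        # one copy, three names for the same object
print(a is b and b is c)     # True

dfs = [a, b, c]
for i, f in enumerate(['A', 'B', 'C']):
    # Assigning by index replaces the list element; the name `a` is untouched.
    dfs[i] = dfs[i][[f]]

print(list(dfs[0].columns))  # ['A']
print(list(a.columns))       # ['A', 'B', 'C']
```

This is one reason keeping the results in a list or dict tends to be easier to reason about than creating variables by name.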

Upvotes: 1

cs95

Reputation: 402483

A simple list comprehension should be enough.

In [68]: df_list = [df[[x]] for x in df.columns]

Printing out the list, this is what you get:

In [69]: for d in df_list:
    ...:     print(d)
    ...:     print('-' * 5)
    ...:     
   A
0  1
1  2
2  3
-----
   B
0  4
1  5
2  6
-----
   C
0  7
1  8
2  9
-----

Each element in df_list is its own data frame, corresponding to one column of the original. Furthermore, you don't even need fields; use df.columns instead.
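
If only some columns are needed and they must be picked by name, as the question's update describes, the same idea can be sketched with a dict keyed by column name. The fields list and the name frames below are illustrative assumptions, not part of the original code:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Hypothetical subset of wanted columns, given by name in any order.
fields = ['C', 'A']

# Double brackets keep each result a one-column DataFrame rather than a Series.
frames = {name: df[[name]] for name in fields}

print(frames['A'])
```

Each frame can then be looked up by its column name, for example frames['C'], without relying on column order.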

Upvotes: 3

Jon Deaton

Reputation: 4379

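You should use iloc here, since loc selects by label and iloc selects by integer position

a = df.iloc[:, [0]]  # the list [0] keeps the result a DataFrame, not a Series

and then loop through like

dfs = []
for i in range(df.columns.size):
    dfs.append(df.iloc[:, [i]])

To select by column name instead, use loc with the label, e.g. df.loc[:, ['A']].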
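
As a rough sketch of the difference between label-based and position-based selection, assuming the df from the question (the names by_label and by_position are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

by_label = df.loc[:, ['B']]     # select by column name
by_position = df.iloc[:, [1]]   # select by integer position

print(by_label.equals(by_position))  # True

# With string column labels, an integer passed to loc is treated as a label
# and is not found.
try:
    df.loc[:, 0]
except KeyError:
    print("KeyError: 0 is not a column label")
```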

Upvotes: 1
