Nelson Chung

Reputation: 87

How do I subset columns in a Pandas dataframe based on criteria using a loop?

I have a Pandas dataframe called "bag" with columns called beans1, beans2, and beans3:

bag = pd.DataFrame({'beans1': [3,1,2,5,6,7], 'beans2': [2,2,1,1,5,6], 'beans3': [1,1,1,3,3,2]}) 
bag
Out[50]: 
   beans1  beans2  beans3
0       3       2       1
1       1       2       1
2       2       1       1
3       5       1       3
4       6       5       3
5       7       6       2

I want to use a loop to subset each column to the observations greater than 1, so that I get:

   beans1
0       3
2       2
3       5
4       6
5       7

   beans2
0       2
1       2
4       5
5       6

   beans3
3       3
4       3
5       2

The way to do it manually is:

beans1 = bag.loc[bag['beans1'] > 1, ['beans1']]
beans2 = bag.loc[bag['beans2'] > 1, ['beans2']]
beans3 = bag.loc[bag['beans3'] > 1, ['beans3']]

But I need to employ a loop, with something like:

for i in range(1,4):
    beans+str(i).loc[beans.loc[bag['beans'+i]>1,['beans'+str(i)]]

But it didn't work. I need a Python equivalent of R's eval(parse(text="")). Any help appreciated. Thanks much!

Upvotes: 1

Views: 190

Answers (1)

jezrael

Reputation: 862751

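It is possible, but not recommended, with globals. Either of these two loops creates the variables beans1, beans2 and beans3:

# build each name from its numeric suffix
for i in range(1,4):
    globals()['beans' + str(i)] = bag.loc[bag['beans'+str(i)]>1,['beans'+str(i)]]

# or loop over the column names directly
for c in bag.columns:
    globals()[c] = bag.loc[bag[c]>1,[c]]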

print (beans1)
   beans1
0       3
2       2
3       5
4       6
5       7
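
One reason to avoid globals is that the loop silently overwrites any module-level name that happens to match a column name. Here is a minimal sketch of that pitfall, assuming a shortened version of bag and a hypothetical existing variable called beans1:

```python
import pandas as pd

# Shortened version of the question's frame, for illustration only
bag = pd.DataFrame({'beans1': [3, 1, 2], 'beans2': [2, 2, 1]})

# Hypothetical pre-existing module-level value under the same name
globals()['beans1'] = "something important"

for c in bag.columns:
    # Each assignment replaces whatever was stored under that name
    globals()[c] = bag.loc[bag[c] > 1, [c]]

print(type(globals()['beans1']).__name__)  # DataFrame
```

The original string is gone without any warning, and the new names never appear in the source, so they are hard to find when reading the code.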

A better approach is to create a dictionary of DataFrames keyed by column name:

d = {c: bag.loc[bag[c]>1, [c]] for c in bag}

print (d['beans1'])
   beans1
0       3
2       2
3       5
4       6
5       7
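
Here is a self-contained sketch of the dictionary approach, assuming the bag frame from the question. The threshold variable and the loop over d.items() are illustrative additions, not part of the original code:

```python
import pandas as pd

bag = pd.DataFrame({'beans1': [3, 1, 2, 5, 6, 7],
                    'beans2': [2, 2, 1, 1, 5, 6],
                    'beans3': [1, 1, 1, 3, 3, 2]})

threshold = 1  # hypothetical name for the cutoff used in the question

# One single-column DataFrame per column, keeping rows above the threshold
d = {c: bag.loc[bag[c] > threshold, [c]] for c in bag.columns}

# Iterate over the results instead of using separately named variables
for name, sub in d.items():
    print(name, sub.index.tolist())
# beans1 [0, 2, 3, 4, 5]
# beans2 [0, 1, 4, 5]
# beans3 [3, 4, 5]
```

The keys play the role of the dynamically built variable names, so d['beans' + str(i)] does what the loop in the question was trying to do.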

Upvotes: 1
