PyBoss

Reputation: 631

How to loop through multiple data frames to select a data frame based on row criteria?

I have a few pandas DataFrames in Python. I want to loop through them, find out which DataFrame meets my row criteria, and save it in a new DataFrame.

d = {'Count' : ['10', '11', '12', '13','13.4','12.5']}
df_1= pd.DataFrame(data=d)
df_1

d = {'Count' : ['10', '-11', '-12', '13','16','2']}
df_2= pd.DataFrame(data=d)
df_2

Here is the logic I want to use, although the syntax is not valid:

for df in (df_1,df_2)
    if df['Count'][0] >0 and df['Count'][1] >0 and df['Count'][2]>0 and df['Count'][3]>0 
    and (df['Count'][4] is between df['Count'][3]+0.5 and df['Count'][3]-0.5) is True:
        df.save

The correct output is df_1, because it is the only one that meets my condition. How do I create a new DataFrame or list to save the result as well?


Upvotes: 0

Views: 1123

Answers (1)

Max Power

Reputation: 8954

Let me know in the comments if you have any questions. The main updates I made to your code were:

  1. Replacing your chained indexing with .loc
  2. Consolidating your first four separate and'd comparisons into one comparison on a slice of the series, reduced to a single True/False with .all(). Note that .loc slices include both endpoints, so 0:3 covers rows 0 through 3.

Code below:

import pandas as pd 

# df_1 & df_2 input taken from you
d = {'Count' : ['10', '11', '12', '13','13.4','12.5']}
df_1= pd.DataFrame(data=d)

d = {'Count' : ['10', '-11', '-12', '13','16','2']}
df_2= pd.DataFrame(data=d)

# my solution here
df_1['Count'] = df_1['Count'].astype('float')
df_2['Count'] = df_2['Count'].astype('float')

my_dataframes = {'df_1': df_1, 'df_2': df_2}
good_dataframes = []
for df_name, df in my_dataframes.items():
    if (df.loc[0:3, 'Count'] > 0).all() and (df.loc[3,'Count']-0.5 <= df.loc[4, 'Count'] <= df.loc[3, 'Count']+0.5):
        good_dataframes.append(df_name)

good_dataframes_df = pd.DataFrame({'good': good_dataframes})

TEST:

>>> print(good_dataframes_df)
   good
0  df_1

Upvotes: 1
