Reputation: 1272
Building on the question "Select column with only one negative value", I'm trying to adapt its solution to a list of dataframes and select the one that qualifies, but I cannot make it work.
In the example below I want to return the dataframe that has at most one negative value in column 'Z'.
In this case that is df1.
Example:
import numpy as np
import pandas as pd

N = 5
np.random.seed(0)
df1 = pd.DataFrame(
    {'X': np.random.uniform(-3, 3, N),
     'Y': np.random.uniform(-3, 3, N),
     'Z': np.random.uniform(-3, 3, N),
     })
df2 = pd.DataFrame(
    {'X': np.random.uniform(-3, 3, N),
     'Y': np.random.uniform(-3, 3, N),
     'Z': np.random.uniform(-3, 3, N),
     })
          X         Y         Z
0  0.292881  0.875365  1.750350
1  1.291136 -0.374477  0.173370
2  0.616580  2.350638  0.408267
3  0.269299  2.781977  2.553580
4 -0.458071 -0.699351 -2.573784
----------------
          X         Y         Z
0 -2.477224  2.871710  0.839526
1 -2.878690  1.794951 -2.139880
2  1.995719 -0.231124  2.668014
3  1.668941  1.683175  0.131090
4  2.220073 -2.290353 -0.512028
How could I accomplish this? Thanks in advance.
Upvotes: 1
Views: 971
Reputation: 109546
You could just use a conditional list comprehension:
>>> dfs = [df1, df2]
>>> [df for df in dfs if df['Z'].lt(0).sum() <= 1]
[          X         Y         Z
0  0.292881  0.875365  1.750350
1  1.291136 -0.374477  0.173370
2  0.616580  2.350638  0.408267
3  0.269299  2.781977  2.553580
4 -0.458071 -0.699351 -2.573784]
The result is a list of each dataframe that satisfies your condition.
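To see why the condition works, here is a minimal sketch that breaks it into steps. It assumes only column 'Z', with the values copied from the question's printed output; the names mask and neg_counts are just for illustration.

```python
import pandas as pd

# Column 'Z' values copied from the question's printed output.
df1 = pd.DataFrame({'Z': [1.750350, 0.173370, 0.408267, 2.553580, -2.573784]})
df2 = pd.DataFrame({'Z': [0.839526, -2.139880, 2.668014, 0.131090, -0.512028]})

# .lt(0) is the method form of `< 0`: a boolean Series, True where Z is negative.
mask = df1['Z'].lt(0)

# Summing a boolean Series counts the True entries (True counts as 1).
neg_counts = [int(df['Z'].lt(0).sum()) for df in (df1, df2)]
print(neg_counts)  # [1, 2]
```

Only df1 has a count of 1 or less, so it is the only dataframe the comprehension keeps.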
Upvotes: 2
Reputation: 402433
Count the number of items under 0 using sum and just yield them.
def foo(df_list):
    for df in df_list:
        if (df['Z'] < 0).sum(0) <= 1:
            yield df

df_list = [df1, df2]
for df in foo(df_list):
    print(df)
          X         Y         Z
0  0.292881  0.875365  1.750350
1  1.291136 -0.374477  0.173370
2  0.616580  2.350638  0.408267
3  0.269299  2.781977  2.553580
4 -0.458071 -0.699351 -2.573784
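If you only want a single dataframe rather than looping over every match, one option is to pull the first item from the generator with next and a default. This is a sketch, not part of the answer above: the function name at_most_one_negative and its column parameter are hypothetical, and the 'Z' values are copied from the question's output.

```python
import pandas as pd

def at_most_one_negative(df_list, column='Z'):
    """Yield each dataframe with at most one negative value in `column`."""
    for df in df_list:
        if (df[column] < 0).sum() <= 1:
            yield df

# Column 'Z' values copied from the question's printed output.
df1 = pd.DataFrame({'Z': [1.750350, 0.173370, 0.408267, 2.553580, -2.573784]})
df2 = pd.DataFrame({'Z': [0.839526, -2.139880, 2.668014, 0.131090, -0.512028]})

# next() takes the first match; the second argument is returned if nothing qualifies.
first_match = next(at_most_one_negative([df1, df2]), None)
no_match = next(at_most_one_negative([df2]), None)
```

Here first_match is df1, and no_match is None because df2 has two negative values in 'Z'.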
Upvotes: 5
Reputation: 12607
This would do:
def func(dataframe_list, on_column):
    returned_list = []
    for df in dataframe_list:
        # Keep the dataframe if the column has at most one negative value.
        if (df[on_column] < 0).sum() <= 1:
            returned_list.append(df)
    return returned_list
In your case, call func([df1, df2], on_column='Z')
Upvotes: 0