Reputation: 649
I have a DataFrame that I load from a CSV file using:
df = pd.read_csv('data.csv')
I want to select some of the rows of this DataFrame and create a new DataFrame from them, but the logic for selecting those rows is complex and needs to live inside a function. The filter logic uses data from the current row only, not from any other rows in the DataFrame. How can I use this filter function to select rows and build the new DataFrame?
Upvotes: 0
Views: 47
Reputation: 191
Why not just use a boolean mask, like this:
# Index with the boolean mask directly; no need to go through the index labels.
df_slice = df[df['foo'] == 'bar'].copy()
Alternatively, use query:
df_slice = df.query('col1 > 2 and col2 .....').copy()
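As a minimal sketch of both options on made-up data (the column names col1, col2 and foo and their values are assumptions for illustration, not taken from the question's CSV):

```python
import pandas as pd

# Hypothetical sample data standing in for data.csv.
df = pd.DataFrame({
    "col1": [1, 3, 5],
    "col2": [10, 20, 30],
    "foo": ["bar", "baz", "bar"],
})

# Boolean mask: combine conditions with & and wrap each one in parentheses.
mask = (df["col1"] > 2) & (df["foo"] == "bar")
by_mask = df[mask].copy()

# The same filter written as a query string.
by_query = df.query("col1 > 2 and foo == 'bar'").copy()
```

Both select the same rows; .copy() makes the result independent of df, so later assignments to it do not trigger a SettingWithCopyWarning.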
If you really need to apply a function to each row, I would do it like this:
# Define your function here; it receives one row as a Series.
def check_condition(s):
    # Replace `condition` with your own test on the values in s.
    if condition:
        return 1
    return 0

df['matches_cond'] = df[['foo', 'bar'...]].apply(
    lambda x: check_condition(x), axis=1)
And then you can slice again using loc or query.
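Here is a rough end-to-end sketch of the row-wise approach, assuming made-up columns col1 and foo and a hypothetical rule inside keep_row. It returns True/False instead of 1/0 so the result can be used directly as a mask:

```python
import pandas as pd

def keep_row(row):
    """Hypothetical filter: decides using values from this row only."""
    if row["foo"] == "bar" and row["col1"] > 2:
        return True
    return False

# Hypothetical sample data standing in for data.csv.
df = pd.DataFrame({
    "col1": [1, 3, 5],
    "foo": ["bar", "baz", "bar"],
})

# axis=1 passes each row to keep_row as a Series; the result is a boolean Series.
mask = df.apply(keep_row, axis=1)

# Keep only the rows where the function returned True.
df_slice = df[mask].copy()
```

Note that apply with axis=1 calls the function once per row in Python, so it is much slower than a vectorised mask on large frames. It is worth using only when the logic cannot be expressed with column operations.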
If you need something different, please add a short example of your data and the desired output.
Upvotes: 1