Reputation: 49
Consider the following dataframe with different types of elements in its columns (int and list).
>>> df
index A B C
1 0 11 [13, 16, 18]
2 1 0 nan
3 34 1 [13, 16]
4 11 nan [10, 16]
I want to iterate over the rows and for every row after dropping the NaNs, I want to check if any value of the columns is of the type "list". If no value has the type list (e.g. index 2), I create a query containing the pairs "column name" and "data (value) for that column":
for i in df.index:
line = df.loc[i].dropna()
*if no value in df.loc[i] is of type list:*
q1 = ' and '.join([f'`{variable}`=={int(value)}' for variale, value in line.items()])
For index 2 we have: q1 = 'A==1' and 'B==0'
But if at least one value has the type list, I need to modify the query by adding a new condition for values of type list. Something like this:
q2 = q1 + 'variable of type list IN (list of values)'
For example, for index 3 we want to have: q2 = 'A==34' and 'B==1' and 'C in (13, 16)'
Is it possible to create such query? Any help would be greatly appreciated.
Upvotes: 2
Views: 120
Reputation: 195653
Try:
x = df.apply(
lambda x: " and ".join(
[
f"'{i} in {tuple(v)}'" if isinstance(v, list) else f"'{i} == {v}'"
for i, v in zip(x[x.notna()].index, x[x.notna()])
]
),
axis=1,
)
print(x)
Prints:
0 'A == 0' and 'B == 11.0' and 'C in (13, 16, 18)'
1 'A == 1' and 'B == 0.0'
2 'A == 34' and 'B == 1.0' and 'C in (13, 16)'
3 'A == 11' and 'C in (10, 16)'
dtype: object
Upvotes: 1