Stina V

Reputation: 29

How to evaluate "any" condition against a nested list in a dataframe?

I have a dataframe like so:

Child Parent Roots
2 1 [1,3,4]
2 3 [1,3,4]
2 4 [1,3,4]
5 2 [1,3,4]
6 5 [1,3,4]

The first two fields represent a parent-child relationship, while the Roots field holds the root parents for each row; it is stored as a list because there can be several.

I am trying to create a new field that indicates whether the parent ID is one of the root parents, like so:

Child Parent Roots RootParent
2 1 [1,3,4] True
2 3 [1,3,4] True
2 4 [1,3,4] True
5 2 [1,3,4] False
6 5 [1,3,4] False

However, I am not sure how to apply the "any" logic correctly in a list comprehension. Here are the two methods I have tried so far:

  1. Method 1:
    dummy = []
    
    for row in x.itertuples():
        child, roots= row[1], row[3]
        for i in roots:
            if any(child==i):
                test = True
                dummy.append(test)
            else:
                test = False
                dummy.append(test)
    
    x['rootparent'] = dummy
    
  2. Method 2:
    def test1 (a,b):
        for i in a:
            if b==i:
                return True
            else:
                return False
    dummy = []
    
    for row in x.itertuples():
        child, root = row[1], row[3]
        dummy.append(any(test1(root ,child)))
    
    x['rootparent'] = dummy
    

Is there any way to evaluate if the parent is within the root list for each row?

Upvotes: 1

Views: 90

Answers (2)

F. Strothmann

Reputation: 177

The shortest way to reach your goal would be to use .apply().

df["RootParent"] = df.apply(lambda row: row["Parent"] in row["Roots"], axis=1)

With axis=1, the lambda receives each row of the df. It returns True if the parent is in the roots list and False otherwise.
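Since the question mentions a list comprehension, the same membership test can also be written without .apply(). This is only a minimal sketch, assuming the column names Parent and Roots from the question's table and rebuilding the sample data by hand:

```python
import pandas as pd

# Sample data rebuilt from the question's table (an assumption for illustration).
df = pd.DataFrame({
    "Child": [2, 2, 2, 5, 6],
    "Parent": [1, 3, 4, 2, 5],
    "Roots": [[1, 3, 4], [1, 3, 4], [1, 3, 4], [1, 3, 4], [1, 3, 4]],
})

# Pair each Parent with its own Roots list and test membership row by row.
df["RootParent"] = [
    parent in roots for parent, roots in zip(df["Parent"], df["Roots"])
]
print(df)
```

Python's `in` operator already does the "any" check over the list, so no explicit any() call is needed.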

Upvotes: 3

Henry Ecker

Reputation: 35676

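Another option via Series.explode + Series.eq + a groupby on level=0 with any (older pandas allowed Series.any(level=0); that argument was deprecated in 1.3 and removed in 2.0):

df['RootParent'] = df['Roots'].explode().eq(df['Parent']).groupby(level=0).any()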
   Child  Parent      Roots  RootParent
0      2       1  [1, 3, 4]        True
1      2       3  [1, 3, 4]        True
2      2       4  [1, 3, 4]        True
3      5       2  [1, 3, 4]       False
4      6       5  [1, 3, 4]       False

Complete Working Example:

import pandas as pd

df = pd.DataFrame({
    'Child': [2, 2, 2, 5, 6],
    'Parent': [1, 3, 4, 2, 5],
    'Roots': [[1, 3, 4], [1, 3, 4], [1, 3, 4], [1, 3, 4], [1, 3, 4]]
})

df['RootParent'] = df['Roots'].explode().eq(df['Parent']).groupby(level=0).any()
print(df)
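To see what the chained call is doing, it may help to split it into its intermediate steps. The sketch below uses hypothetical variable names (exploded, matches, root_parent) and collapses with groupby(level=0).any() rather than any(level=0), on the assumption that a recent pandas is installed where the level argument is no longer available:

```python
import pandas as pd

df = pd.DataFrame({
    "Child": [2, 2, 2, 5, 6],
    "Parent": [1, 3, 4, 2, 5],
    "Roots": [[1, 3, 4], [1, 3, 4], [1, 3, 4], [1, 3, 4], [1, 3, 4]],
})

# Step 1: one row per root value; the original row label is repeated.
exploded = df["Roots"].explode()

# Step 2: .eq aligns on the index, so each root is compared with
# the Parent value of the row it came from.
matches = exploded.eq(df["Parent"])

# Step 3: collapse back to one value per original row:
# True if any of that row's roots matched its Parent.
root_parent = matches.groupby(level=0).any()

df["RootParent"] = root_parent
print(df)
```

The index alignment in step 2 is what makes this work: the exploded Series keeps the original row labels, so no explicit loop over rows is needed.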

Upvotes: 1
