Reputation: 29
I have a dataframe like so:
Child | Parent | Roots |
---|---|---|
2 | 1 | [1,3,4] |
2 | 3 | [1,3,4] |
2 | 4 | [1,3,4] |
5 | 2 | [1,3,4] |
6 | 5 | [1,3,4] |
The first two fields represent a parent-child relationship, while the Roots field holds the root parents for each row. It is stored as a list because there can be several.
I am trying to create a new field that indicates whether the parent ID is one of the root parents, like so:
Child | Parent | Roots | RootParent |
---|---|---|---|
2 | 1 | [1,3,4] | True |
2 | 3 | [1,3,4] | True |
2 | 4 | [1,3,4] | True |
5 | 2 | [1,3,4] | False |
6 | 5 | [1,3,4] | False |
However, I am not sure how to apply the "any" logic correctly to each row's list. Here are the two methods I have tried so far:
dummy = []
for row in x.itertuples():
    child, roots = row[1], row[3]
    for i in roots:
        if any(child==i):
            test = True
            dummy.append(test)
        else:
            test = False
            dummy.append(test)
x['rootparent'] = dummy

def test1(a, b):
    for i in a:
        if b==i:
            return True
        else:
            return False

dummy = []
for row in x.itertuples():
    child, root = row[1], row[3]
    dummy.append(any(test1(root, child)))
x['rootparent'] = dummy
Is there any way to evaluate if the parent is within the root list for each row?
Upvotes: 1
Views: 90
Reputation: 177
The shortest way to reach your goal is to use .apply():
df["RootParent"] = df.apply(lambda row: row["Parent"] in row["Roots"], axis=1)
With axis=1, the lambda function receives each row of the df. It returns True if the value in Parent is in that row's Roots list, and False otherwise.
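As a self-contained sketch of the row-wise check, assuming the column names Child, Parent and Roots from the question's table: the first assignment uses .apply() as described above, and the second is an equivalent list comprehension over zip, added here only for comparison (the column name RootParentZip is made up for this illustration).

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "Child": [2, 2, 2, 5, 6],
    "Parent": [1, 3, 4, 2, 5],
    "Roots": [[1, 3, 4]] * 5,
})

# Row-wise membership test: is this row's Parent in this row's Roots list?
df["RootParent"] = df.apply(lambda row: row["Parent"] in row["Roots"], axis=1)

# Same test as a list comprehension (hypothetical column name, for comparison)
df["RootParentZip"] = [p in r for p, r in zip(df["Parent"], df["Roots"])]

print(df[["Child", "Parent", "RootParent", "RootParentZip"]])
```

Both columns should hold the same values. The list comprehension avoids building a Series for every row, so it is often the quicker of the two on larger frames.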
Upvotes: 3
Reputation: 35676
Another option is to combine Series.explode and Series.eq, then reduce with any per original row by grouping on index level 0 (the level argument of Series.any was removed in pandas 2.0, so groupby(level=0).any() is used here):
df['RootParent'] = df['Roots'].explode().eq(df['Parent']).groupby(level=0).any()
   Child  Parent      Roots  RootParent
0      2       1  [1, 3, 4]        True
1      2       3  [1, 3, 4]        True
2      2       4  [1, 3, 4]        True
3      5       2  [1, 3, 4]       False
4      6       5  [1, 3, 4]       False
Complete Working Example:
import pandas as pd

df = pd.DataFrame({
    'Child': [2, 2, 2, 5, 6],
    'Parent': [1, 3, 4, 2, 5],
    'Roots': [[1, 3, 4], [1, 3, 4], [1, 3, 4], [1, 3, 4], [1, 3, 4]]
})
# Explode the lists, compare each root with the row's Parent, then collapse back per row
df['RootParent'] = df['Roots'].explode().eq(df['Parent']).groupby(level=0).any()
print(df)
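To see why the chain works, the intermediate steps can be inspected one at a time. This is only a sketch on the same sample data. The variable names exploded, matches and root_parent are made up for illustration, and the final reduction uses groupby(level=0).any(), which should behave the same on older and newer pandas releases.

```python
import pandas as pd

df = pd.DataFrame({
    "Child": [2, 2, 2, 5, 6],
    "Parent": [1, 3, 4, 2, 5],
    "Roots": [[1, 3, 4]] * 5,
})

# Step 1: one row per list element; the original row label is repeated
exploded = df["Roots"].explode()

# Step 2: compare each root with the Parent that shares its index label
matches = exploded.eq(df["Parent"])

# Step 3: collapse back to one value per original row (True if any root matched)
root_parent = matches.groupby(level=0).any()

df["RootParent"] = root_parent
print(df)
```

The key point is that explode keeps the original index, so the comparison and the group-wise any both line up with the rows of df without any explicit loop.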
Upvotes: 1