james

Reputation: 570

Need to determine if a group contains only one category in a pandas DataFrame

I currently have the following DataFrame with an "id" column and a column called "childOrParent". An id group cannot have children without a parent.

+----+---------------+
| id | childOrParent |
+----+---------------+
|  1 | Parent        |
|  1 | child         |
|  2 | Parent        |
|  3 | child         |
|  3 | child         |
|  3 | Parent        |
+----+---------------+

How do I check whether the DataFrame is valid? If there is an id group where there are only children, then I need to know the id.

For example, the following DataFrame would be invalid, and I need to know that the offending id is 3:

+----+---------------+
| id | childOrParent |
+----+---------------+
|  1 | Parent        |
|  1 | child         |
|  2 | Parent        |
|  3 | child         |
|  3 | child         |
|  3 | child         |
+----+---------------+

I've tried getting the counts of children and parents within each group and then merging the two DataFrames, but that doesn't seem right.

Upvotes: 0

Views: 31

Answers (1)

BENY

Reputation: 323326

Use groupby with filter + all to keep only the groups in which every row is a child, then take the unique ids:

df.groupby('id').filter(lambda x : (x['childOrParent']=='child').all())
Out[383]: 
   id childOrParent
3   3         child
4   3         child
5   3         child
df.groupby('id').filter(lambda x : (x['childOrParent']=='child').all()).id.unique()
Out[384]: array([3], dtype=int64)
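
For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the invalid example from the question and lists the offending ids. It is a variation on the approach above, not the same code: it flags ids that have no "Parent" row, which should match the "all rows are child" test as long as the column only ever holds those two values. The names has_parent, invalid_ids and is_valid are made up for illustration.

```python
import pandas as pd

# Invalid example from the question: id 3 contains only children.
df = pd.DataFrame({
    "id": [1, 1, 2, 3, 3, 3],
    "childOrParent": ["Parent", "child", "Parent", "child", "child", "child"],
})

# One boolean per id: True if the group has at least one "Parent" row.
has_parent = df["childOrParent"].eq("Parent").groupby(df["id"]).any()

# ids whose group has no "Parent" row at all (hypothetical variable names).
invalid_ids = has_parent[~has_parent].index.tolist()

# The DataFrame is valid only when no such id exists.
is_valid = not invalid_ids

print(invalid_ids)  # [3]
print(is_valid)     # False
```

This avoids calling a Python lambda once per group, which may matter on large frames, though for a handful of rows either version should be fine.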

Upvotes: 2
