Reputation: 1097
I have a dataframe df
with about 1 million rows:
HOUSEID PERSONID ENDTIME STRTTIME TRVLCMIN
0 20000017 1 1020 955 25
1 20000017 1 1132 1130 2
2 20000017 1 1400 1330 30
3 20000017 2 1020 955 25
4 20000017 2 1025 1022 3
5 20000017 2 -9 1120 2
6 20000017 2 2035 2000 35
7 20000231 1 952 945 7
8 20000231 1 1000 -9 5
9 20000231 2 2013 2002 -9
Each row is a trip made by a person. And each unique combination of HOUSEID
and PERSONID
is a person.
I want to delete a person entirely if there is a "-9" in even one of the trips of these columns ENDTIME
, STRTTIME
and TRVLCMIN
.
Here is the output of my dataframe by this explanation:
HOUSEID PERSONID ENDTIME STRTTIME TRVLCMIN
0 20000017 1 1020 955 25
1 20000017 1 1132 1130 2
2 20000017 1 1400 1330 30
Can you help me with that?
I very much appreciate your time.
Upvotes: 0
Views: 49
Reputation: 150805
You could check for -9
in any of those columns, then groupby().transform
to identify the combination HOUSEID, PERSONID
that has some -9
:
# rows having -9 in one of the specified columns
s = df[['ENDTIME','STRTTIME','TRVLCMIN']].eq(-9).any(1)
# combination of `HOUSEID`, `PERSONID` having some `-9`
mask = s.groupby([df['HOUSEID'],df['PERSONID']]).transform('any')
# output
df[~mask]
Output:
HOUSEID PERSONID ENDTIME STRTTIME TRVLCMIN
0 20000017 1 1020 955 25
1 20000017 1 1132 1130 2
2 20000017 1 1400 1330 30
Upvotes: 1