Reputation: 303
I am looking to tidy up some rows in my dataset:
output = {
'id1': ['False', 'False', 'False', 'False'],
'id2': ['True', 'False', 'False', 'False'],
'id3': ['True', 'False', 'True', 'False'],
'id4': ['True', 'False', 'True', 'True'],
}
In the data above, row 2 contains only False values.
I won't need that row, so I would like to remove it:
newoutput = {
'id1': ['False', 'False', 'False'],
'id2': ['True', 'False', 'False'],
'id3': ['True', 'True', 'False'],
'id4': ['True', 'True', 'True'],
}
I got as far as checking for rows that contain False:
output.drop(output[output != False].index, inplace=True)
But that matches rows where ANY value is False, not rows where ALL values are False.
Upvotes: 3
Views: 5332
Reputation: 862406
Use Boolean values, not strings. Use .replace to convert the strings to Boolean type.
df = pd.DataFrame({'id1': [True, False, False, False], 'id2': [True, False, False, False], 'id3': [True, False, True, False], 'id4': [True, False, True, True]})
id1 id2 id3 id4
0 True True True True
1 False False False False
2 False False True True
3 False False False True
Use DataFrame.any with boolean indexing to match rows with at least one True, which removes rows that are all False.
df = df[df.any(axis=1)]
id1 id2 id3 id4
0 True True True True
2 False False True True
3 False False False True
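As a minimal sketch of the two steps above applied to the data exactly as posted in the question (strings rather than booleans): the names bool_df and filtered are illustrative only, and the comparison with eq('True') is one possible way to get booleans, swapped in here for .replace. It assumes every cell is exactly the string 'True' or 'False'.

```python
import pandas as pd

# Data as posted in the question: the values are strings, not booleans.
output = {
    'id1': ['False', 'False', 'False', 'False'],
    'id2': ['True', 'False', 'False', 'False'],
    'id3': ['True', 'False', 'True', 'False'],
    'id4': ['True', 'False', 'True', 'True'],
}
df = pd.DataFrame(output)

# Comparing against the string 'True' yields a real boolean DataFrame.
# (Assumption: every cell is exactly 'True' or 'False'.)
bool_df = df.eq('True')

# Keep rows with at least one True, i.e. drop rows that are all False.
filtered = bool_df[bool_df.any(axis=1)]
print(filtered)
```

The original index labels are kept, so the dropped row shows up as a gap in the index; filtered.reset_index(drop=True) would renumber the rows if that matters.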
Alternatively, to delete rows that are all True, use .all() and negate with ~.
df[~df.all(axis=1)]
id1 id2 id3 id4
1 False False False False
2 False False True True
3 False False False True
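The two filters are easy to mix up, so here is a small side-by-side sketch on the same frame. The variable names (has_any_true, is_all_true, kept_any, kept_not_all) are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'id1': [True, False, False, False],
    'id2': [True, False, False, False],
    'id3': [True, False, True, False],
    'id4': [True, False, True, True],
})

has_any_true = df.any(axis=1)  # True where a row has at least one True
is_all_true = df.all(axis=1)   # True where every value in a row is True

kept_any = df[has_any_true]      # drops the all-False row (index 1)
kept_not_all = df[~is_all_true]  # drops the all-True row (index 0)

print(kept_any)
print(kept_not_all)
```

In both cases axis=1 makes the reduction run across the columns of each row; with the default axis=0 the result would be one value per column instead.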
Upvotes: 10
Reputation: 78650
I constructed df via df = pd.DataFrame(output).
You can use
>>> df[df.replace({'False': False, 'True': True}).any(axis=1)]
id1 id2 id3 id4
0 False True True True
2 False False True True
3 False False False True
I strongly suggest using booleans instead of strings to indicate True and False. In that case the solution is simply df = df[df.any(axis=1)].
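Since the question shows the desired result as a dict of lists, here is a rough sketch of a full round trip from output to newoutput. It assumes every cell is exactly the string 'True' or 'False'; the mask name keep is illustrative, and eq('True') is used here in place of .replace to build the boolean mask.

```python
import pandas as pd

# Data as posted in the question (string values).
output = {
    'id1': ['False', 'False', 'False', 'False'],
    'id2': ['True', 'False', 'False', 'False'],
    'id3': ['True', 'False', 'True', 'False'],
    'id4': ['True', 'False', 'True', 'True'],
}
df = pd.DataFrame(output)

# Boolean mask: True for rows holding at least one 'True' string.
# (Assumption: every cell is exactly 'True' or 'False'.)
keep = df.eq('True').any(axis=1)

# Filter the original string data and turn it back into a dict of lists.
newoutput = df[keep].to_dict(orient='list')
print(newoutput)
```

Filtering the original df with the mask keeps the values as strings; filtering a converted boolean frame instead would give real booleans in the resulting dict.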
Upvotes: 2