Reputation: 125
Let's say I have a DataFrame with 6 columns and 4 rows, and a separate list called boollist. I want to loop through all of the rows and, if 2 or more of the cells in a row contain blank strings, append False to boollist. Conversely, if the row contains fewer than 2 blank-string cells, append True. At the end of the process, boollist should have the same length as the number of rows so that it can be added as a new column.
      column0  column1  column2  column3  column4  column5
row0  'data'   'data'   'data'   'data'   'data'   'data'
row1  'data'   ''       'data'   'data'   'data'   'data'
row2  ''       ''       ''       ''       'data'   ''
row3  'data'   'data'   'data'   'data'   'data'   'data'
In this example, boollist should end up containing [True, True, False, True].
Thanks in advance for any help.
Upvotes: 2
Views: 188
Reputation: 676
The great thing about pandas is you don't need to loop through anything.
If you don't want to edit your data to count strings as null then you can use applymap to go through your data.
applymap applies a function elementwise across your DataFrame. Within applymap you can use a lambda function that returns True if the cell holds an empty string. You then sum those flags across each row with axis=1 (the default, axis=0, would count per column instead).
df.applymap(lambda x: x == '').sum(axis=1) < 2
returns a boolean Series that is True for rows with fewer than two empty strings.
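As a rough sketch of the elementwise approach on the sample data from the question: the frame is rebuilt by hand here, and the `elementwise` helper name is only illustrative. It assumes the blanks are exact empty strings. Since pandas 2.1, `applymap` is deprecated in favor of `DataFrame.map`, so the sketch picks whichever one the installed version provides.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand.
df = pd.DataFrame(
    [
        ["data", "data", "data", "data", "data", "data"],
        ["data", "", "data", "data", "data", "data"],
        ["", "", "", "", "data", ""],
        ["data", "data", "data", "data", "data", "data"],
    ],
    index=["row0", "row1", "row2", "row3"],
    columns=[f"column{i}" for i in range(6)],
)

# DataFrame.map is the newer name for applymap (pandas 2.1+);
# fall back to applymap on older versions.
elementwise = df.map if hasattr(df, "map") else df.applymap

# True where a cell is exactly the empty string.
is_blank = elementwise(lambda x: x == "")

# axis=1 sums across each row, giving the number of blanks per row.
boollist = (is_blank.sum(axis=1) < 2).tolist()
print(boollist)  # [True, True, False, True]
```

The resulting list has one entry per row, so it can be assigned directly as a new column.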
Upvotes: 2
Reputation: 294228
Blanks resolve to False in a bool context.
(~df.astype(bool)).sum(1) < 2
To be more explicit:
df.eq('').sum(1) < 2
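For illustration, here is how the explicit `eq('')` version might be used to attach the result as a new column, which is what the question was ultimately after. The sample frame is rebuilt by hand, and the column name `keep` is a hypothetical choice, not something from the original post.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand.
df = pd.DataFrame(
    [
        ["data", "data", "data", "data", "data", "data"],
        ["data", "", "data", "data", "data", "data"],
        ["", "", "", "", "data", ""],
        ["data", "data", "data", "data", "data", "data"],
    ],
    index=["row0", "row1", "row2", "row3"],
    columns=[f"column{i}" for i in range(6)],
)

# Count empty strings per row (axis=1) before adding any new column,
# so the new column itself is not included in the comparison.
blank_counts = df.eq("").sum(axis=1)

# "keep" is a hypothetical column name: True when a row has fewer than 2 blanks.
df["keep"] = blank_counts < 2

print(df["keep"].tolist())  # [True, True, False, True]
```

Because the result is a Series aligned on the DataFrame's index, no intermediate list or explicit loop is needed.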
Upvotes: 2