gseelig

Reputation: 125

Looping through rows in a Pandas DataFrame to check values in two separate columns

Let's say I have a DataFrame with 6 columns and 4 rows, and a separate list called boollist. I want to loop through all of the rows and, if 2 or more of the cells in a row contain blank strings, append False to boollist. Conversely, if the row contains fewer than 2 blank-string cells, append True. At the end of the process boollist should have the same length as the number of rows, so that it can be added as a new column.

         column0    column1    column2    column3    column4    column5
row0     'data'     'data'      'data'    'data'      'data'     'data'
row1     'data'       ''        'data'    'data'      'data'     'data'
row2       ''         ''         ''          ''       'data'       ''
row3     'data'      'data'     'data'     'data'     'data'     'data'

In this example, boollist should end up containing [True, True, False, True].

Thanks in advance for any help.

Upvotes: 2

Views: 188

Answers (2)

Tkanno

Reputation: 676

The great thing about pandas is you don't need to loop through anything.

If you don't want to edit your data so that empty strings count as null, you can use applymap to go through it.

applymap applies a function elementwise across your DataFrame. Within applymap you can use a lambda function that returns True if the cell holds an empty string. You then sum those True values along each row to count its empty strings.

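df.applymap(lambda x: x == '').sum(axis=1) < 2

This returns a boolean Series that is True for each row with fewer than two empty strings. Note the axis=1: without it, sum() adds down each column instead of across each row.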
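For illustration, here is a minimal sketch of that approach on the sample data from the question. The frame construction and the names `elementwise` and `blank_counts` are my own assumptions, not part of the answer. Newer pandas releases rename `applymap` to `DataFrame.map`, so the sketch uses whichever one is available.

```python
import pandas as pd

# Sample frame rebuilt from the table in the question
df = pd.DataFrame(
    [
        ["data", "data", "data", "data", "data", "data"],
        ["data", "", "data", "data", "data", "data"],
        ["", "", "", "", "data", ""],
        ["data", "data", "data", "data", "data", "data"],
    ],
    columns=[f"column{i}" for i in range(6)],
    index=[f"row{i}" for i in range(4)],
)

# Use DataFrame.map where it exists, otherwise fall back to applymap
elementwise = df.map if hasattr(df, "map") else df.applymap

# True where a cell is an empty string, then count the Trues in each row
blank_counts = elementwise(lambda x: x == "").sum(axis=1)

# Rows with fewer than two blanks become True
boollist = (blank_counts < 2).tolist()
print(boollist)  # [True, True, False, True]
```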

Upvotes: 2

piRSquared

Reputation: 294228

Blanks resolve to False in a bool context.

(~df.astype(bool)).sum(1) < 2

To be more explicit

df.eq('').sum(1) < 2
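As a rough check, the sketch below runs both expressions on the question's sample data and attaches the result as a new column. The frame construction and the column name `enough_data` are hypothetical, chosen only for this example. One caveat: the `astype(bool)` version treats any falsy value (such as 0) as blank, whereas `eq('')` matches only empty strings.

```python
import pandas as pd

# Sample frame rebuilt from the table in the question
df = pd.DataFrame(
    [
        ["data", "data", "data", "data", "data", "data"],
        ["data", "", "data", "data", "data", "data"],
        ["", "", "", "", "data", ""],
        ["data", "data", "data", "data", "data", "data"],
    ],
    columns=[f"column{i}" for i in range(6)],
    index=[f"row{i}" for i in range(4)],
)

# Empty strings are falsy, so negating the bool cast marks the blanks
via_bool = (~df.astype(bool)).sum(axis=1) < 2

# Explicit comparison against the empty string
via_eq = df.eq("").sum(axis=1) < 2

# Both approaches agree on this all-string data
agree = via_bool.equals(via_eq)

# Attach the result as a new column (hypothetical name)
df["enough_data"] = via_eq
print(df["enough_data"].tolist())  # [True, True, False, True]
```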

Upvotes: 2
