jon

Reputation: 359

Checking dataframe cells to see if they contain a value

Let's say I have some fairly simple code such as

import pandas
df_import=pandas.read_excel("dataframe.xlsx")
df_import['Company'].str.contains('value',na=False,case=False)

So this obviously imports pandas, creates a dataframe from an Excel document, and then searches the column titled Company for some value. It returns a boolean Series saying whether each cell contains that value (True or False).

However, I want to test 3 cases. Case 1, no results were found (all False); case 2, exactly 1 result was found (only 1 True); and case 3, more than 1 result was found (# of True > 1).

My thought is that I could set up a for loop, iterating through the column, and if the value of a cell is True, I add 1 to a variable (let's call it count). Then at the end, I have an if/elif/elif statement based on the value of count, whether it is 0, 1, or >1.

Now, maybe there is a better way to check this but if not, I figured the for loop would look something like

for i in range(len(df_import.index)):
    if df_import.iloc[i, 0].str.contains('value', na=False, case=False):
        count += 1

First of all, I'm not sure if I should use .iloc or .iat but both give me the error

AttributeError: 'str' object has no attribute 'str'

and I wasn't able to find a correction for this.

Upvotes: 0

Views: 214

Answers (1)

cs95

Reputation: 402333

Your current code does not work because iloc[i, 0] returns a scalar value (here, a plain Python str), and scalars do not have the .str accessor. That accessor exists only on a Series or an Index.
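To make the difference concrete, here is a minimal sketch. The small DataFrame is made-up sample data standing in for the Excel file, and the loop is only one possible way to repair the cell-by-cell approach: it swaps the .str accessor for Python's own `in` operator on each cell.

```python
import pandas as pd

# Made-up sample data standing in for the Excel file (an assumption).
df_import = pd.DataFrame({"Company": ["Value Corp", "Acme", None]})

first_cell = df_import.iloc[0, 0]   # a plain Python str, so no .str accessor
column = df_import["Company"]       # a Series, which does have .str

# Cell-by-cell counting with ordinary string methods.
count = 0
for cell in column:
    # Skip missing values (None/NaN), then compare case-insensitively.
    if isinstance(cell, str) and "value" in cell.lower():
        count += 1
```

This works, but it does by hand what the Series-level call already does for the whole column.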


A quick and easy fix is to call sum() on the result of the Series-level str.contains call. Each True counts as 1, so the sum is the number of matching cells.

count = df_import['Company'].str.contains('value', na=False, case=False).sum()

Now count holds the number of matches in that column, and you can branch on whether it is 0, 1, or greater than 1.
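From there, the three cases in the question reduce to a plain if/elif/else on the count. The following is a rough sketch only: the helper name classify_matches, its return labels, and the sample data are hypothetical and not part of pandas or the original code. Note that str.contains treats the pattern as a regular expression by default, which makes no difference for a plain word like 'value'.

```python
import pandas as pd

def classify_matches(series, needle):
    """Hypothetical helper: report whether 0, 1, or several cells contain needle."""
    mask = series.str.contains(needle, na=False, case=False)
    count = int(mask.sum())  # True counts as 1, False as 0
    if count == 0:
        return "none"
    elif count == 1:
        return "one"
    else:
        return "many"

# Made-up sample data standing in for the Excel file (an assumption).
df_import = pd.DataFrame({"Company": ["Value Corp", "Acme", None, "Best value Ltd"]})
result = classify_matches(df_import["Company"], "value")
```

If you also need the matching rows themselves, the same boolean mask can be used to filter: df_import[mask].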

Upvotes: 2
