Korzak

Reputation: 385

Can't evaluate column for empty values

I have read 20+ threads on this, and am still coming up empty (no pun intended).

I have a pandas dataframe df_s, which has a column that contains dates at iloc[:,8]. I am trying to add a new column to the dataframe with a value (yes/no) based on whether there is a value in the other column or not.

This is what I have been trying:

CDRFormUp = []
for row in df_s.iloc[:, 8]:
    if row == "":
        CDRFormUp.append('No')
    else:
        CDRFormUp.append('Yes')
df_s['CDR Form Up'] = CDRFormUp

CDRFormUp would be the new column. I'm looping over every row in the dataframe and checking whether the column holds any value.

I have tried...

if row <>"":
if row == "":
if row is None:
if row:
if row>0:

Nothing is working. The column contains dates and empty cells and text. For example, the value in this column in the first row is "CDF Form", in the second row it is blank, in the third row it is "4865" or something like that.

If I set the iloc to a different column that just contains Country names, and set the condition to row == "Italy", it properly adds the "Yes" or "No" to the new column for each row, so it's not a wrong iloc or something else.

Any help would be incredibly appreciated. Thanks!

Upvotes: 2

Views: 449

Answers (2)

piRSquared

Reputation: 294338

I suspect you have elements with white space.

Consider the dataframe df_s

import pandas as pd

df_s = pd.DataFrame([
    [1, 'a', 'Yes'],
    [2, '', 'No'],
    [3, ' ', 'No']
])

df_s

   0  1    2
0  1  a  Yes
1  2      No
2  3      No

Both rows 1 and 2 in column 1 have what look like blank strings, but only row 1 is actually empty. Row 2 holds a single space:

df_s.iloc[:, 1] == ''

0    False
1     True
2    False
Name: 1, dtype: bool

You may want to check whether the entire value is white space, or strip white space first.

Option 1
all white space

df_s.iloc[:, 1].str.match(r'^\s*$')

0    False
1     True
2     True
Name: 1, dtype: bool

Which we can convert to yes/no with

df_s.iloc[:, 1].str.match(r'^\s*$').map({True: 'no', False: 'yes'})

0    yes
1     no
2     no
Name: 1, dtype: object

Add a new column

df_s.assign(
    CDRFormUp=df_s.iloc[:, 1].str.match(r'^\s*$').map({True: 'no', False: 'yes'})
)

   0  1    2 CDRFormUp
0  1  a  Yes       yes
1  2      No        no
2  3      No        no

Option 2
strip white space then check if empty

df_s.iloc[:, 1].str.strip() == ''

0    False
1     True
2     True
Name: 1, dtype: bool

Add new column

df_s.assign(
    CDRFormUp=df_s.iloc[:, 1].str.strip().eq('').map({True: 'no', False: 'yes'})
)

   0  1    2 CDRFormUp
0  1  a  Yes       yes
1  2      No        no
2  3      No        no
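
If the real column also contains missing values (NaN or None) or non-string entries such as dates and numbers, the .str methods return NaN for those rows rather than True or False. Below is a minimal sketch that treats both missing values and whitespace-only strings as empty. The sample frame and the column name col are made up for illustration and are not taken from the question's data.

```python
import numpy as np
import pandas as pd

# Made-up sample data: text, empty string, whitespace, NaN, and a number.
df = pd.DataFrame({'col': ['CDF Form', '', '  ', np.nan, 4865]})

# Missing values are caught with isna(); everything else is converted
# to text so that .str.strip() works on every row.
is_blank = df['col'].isna() | df['col'].astype(str).str.strip().eq('')

# 'No' where the cell is empty or missing, 'Yes' otherwise.
df['CDR Form Up'] = np.where(is_blank, 'No', 'Yes')
print(df)
```

Whether missing cells should count as empty depends on the data, so the isna() term can be dropped if that is not wanted.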

Upvotes: 0

Scott Boston

Reputation: 153460

Use np.where on the pandas column instead of looping over the rows.

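import numpy as np
import pandas as pd

# object dtype so that a string and NaN can be assigned into the integer data
df_s = pd.DataFrame(np.random.randint(1, 10, (5, 10))).astype(object)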

df_s.iloc[1,8] = ''

df_s.iloc[3,8] = np.nan

df_s['CDRFormUp'] = np.where(df_s.iloc[:, 8].mask(df_s.iloc[:, 8].str.len() == 0).isnull(), 'No', 'Yes')

print(df_s)

Output:

   0  1  2  3  4  5  6  7    8  9 CDRFormUp
0  6  5  5  5  9  3  3  5    3  9       Yes
1  5  4  7  3  9  6  8  9       9        No
2  5  2  2  7  7  6  3  2    5  2       Yes
3  8  2  1  9  7  3  7  8  NaN  8        No
4  4  4  1  5  3  5  9  4    4  9       Yes
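
As background on why the checks listed in the question may all have failed: if the blank cells were read in as NaN (an assumption, since the question does not show how the data was loaded), none of those comparisons singles them out. A small sketch of how each check behaves on NaN:

```python
import numpy as np
import pandas as pd

blank = np.nan  # assumed representation of an empty cell

checks = {
    'blank == ""': blank == "",        # NaN never equals a string
    'blank is None': blank is None,    # NaN is a float, not None
    'bool(blank)': bool(blank),        # NaN is truthy, so "if row:" passes
    'blank > 0': blank > 0,            # ordering comparisons with NaN are False
    'pd.isna(blank)': pd.isna(blank),  # the check that does detect it
}
for name, result in checks.items():
    print(name, '->', result)
```

Only pd.isna (or the equivalent Series method isnull used above) flags the value as missing.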

Upvotes: 1
