johnnydoe

Reputation: 402

Checking if the cell values in a dataframe are strings

I have the following dataframe:

ID   Image   
a    None
b    ushfkf.jpg
c    ihfskjd.jpg
d    None

The .jpg values are strings. I want to check whether each row contains an image. I tried:

df['hasimage'] = np.where(df['Image']==None, True, False)

But I only get an extra column of False values. How can I simply check whether a cell holds a string, without having to deal with None?

Upvotes: 0

Views: 345

Answers (2)

FredMaster

Reputation: 1459

You could check whether the string contains an image extension such as .png or .jpg.

Example:

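import pandas as pd
df = pd.DataFrame({"ID": list("abcd"), "Image": [None, "ushfkf.png", "ihfskjd.jpg", "None"]})
# Escape the dot so it matches a literal "." and anchor the pattern at the end of the string
df["hasimage"] = df["Image"].str.contains(r"\.(?:png|jpg)$", na=False)
df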
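
One thing to watch with a regex pattern here: an unescaped dot matches any character, so ".png|.jpg" also accepts values that merely end in "png" or "jpg" after some other character. Below is a minimal sketch comparing that pattern with an escaped, anchored, case-insensitive variant. The sample file names ("ushfkf.PNG", "xjpg") are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data: an upper-case extension and a value without a real extension
df = pd.DataFrame({"ID": list("abcd"),
                   "Image": [None, "ushfkf.PNG", "ihfskjd.jpg", "xjpg"]})

# Unescaped dot: "." matches any character, so "xjpg" is accepted,
# while "ushfkf.PNG" is rejected because the match is case-sensitive
loose = df["Image"].str.contains(".png|.jpg", na=False)

# Escaped dot, anchored at the end of the string, case-insensitive
strict = df["Image"].str.contains(r"\.(?:png|jpg)$", case=False, na=False)

print(pd.DataFrame({"Image": df["Image"], "loose": loose, "strict": strict}))
```

If only a single fixed extension matters, Series.str.endswith avoids regular expressions altogether.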

Upvotes: 1

jezrael

Reputation: 862511

If the missing values are real None objects (NoneType), test for strings with isinstance, or test for non-missing values (not NaN or None) with Series.notna:

df = pd.DataFrame({"ID": list("abcd"), "Image": [None, "ushfkf.jpg", "ihfskjd.jpg",None]})

df['hasimage1'] = df['Image'].apply(lambda x: isinstance(x, str))
df['hasimage2'] = df['Image'].notna()

print(df)
  ID        Image  hasimage1  hasimage2
0  a         None      False      False
1  b   ushfkf.jpg       True       True
2  c  ihfskjd.jpg       True       True
3  d         None      False      False
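
As for why the attempt in the question returns only False: as far as I know, an elementwise == comparison against None never matches in pandas, because None is treated as a missing value. A small sketch of the difference, assuming the same sample frame as above (the names eq_none and is_missing are only for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"ID": list("abcd"),
                   "Image": [None, "ushfkf.jpg", "ihfskjd.jpg", None]})

# Elementwise comparison with None: every row comes back False
eq_none = df["Image"] == None  # noqa: E711

# isna() is the supported way to detect None/NaN
is_missing = df["Image"].isna()

# The np.where call from the question, with the test replaced and the
# True/False branches swapped so that True means "has an image"
df["hasimage"] = np.where(df["Image"].isna(), False, True)

print(df)
```

The np.where wrapper is not really needed here, since df["Image"].notna() already returns the same boolean column.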

EDIT:

If the None values are actually the string "None", compare against that string or test the extension with Series.str.endswith:

df = pd.DataFrame({"ID": list("abcd"), "Image": ["None", "ushfkf.jpg", "ihfskjd.jpg","None"]})

df['hasimage1'] = df['Image'] != 'None'
df["hasimage2"] = df["Image"].str.endswith(".jpg", na=False)

print(df)
  ID        Image  hasimage1  hasimage2
0  a         None      False      False
1  b   ushfkf.jpg       True       True
2  c  ihfskjd.jpg       True       True
3  d         None      False      False
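
If you are not sure which kind of None the column holds, both checks can be combined. This is only a sketch: has_image is a hypothetical helper name, and the mixed sample data is invented to cover real None, NaN and the string "None" in one column.

```python
import numpy as np
import pandas as pd

def has_image(images: pd.Series) -> pd.Series:
    """Return True where the value is neither missing nor the string 'None'."""
    return images.notna() & images.ne("None")

# Hypothetical mixed column: real None, a file name, the string "None", and NaN
df = pd.DataFrame({"ID": list("abcd"),
                   "Image": [None, "ushfkf.jpg", "None", np.nan]})

df["hasimage"] = has_image(df["Image"])
print(df)
```

Unlike the endswith test, this version does not care about the file extension, so it also accepts names such as "photo.png".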

Upvotes: 1
