David542

Reputation: 110153

Substring function in pandas

What is the correct way to check whether a string is contained in a field in pandas? For example, I have:

np.where('DIGITAL_SOURCE' in df['file_name'], 1, 0)

But I get the following complaint from Pandas:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

What would be the proper way to do a `substr in str` check? I believe the correct answer is to use str.contains, but I was having some trouble with the syntax.
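For context, a minimal sketch of why the plain `in` operator does not do an element-wise check: on a Series, `in` looks at the index labels rather than the values. The sample file names below are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question
s = pd.Series(["DIGITAL_SOURCE_01.mov", "other.mov"], name="file_name")

# `in` on a Series tests the index labels, not the values
label_hit = 0 in s              # 0 is an index label
value_hit = "other.mov" in s    # values are not searched

# Element-wise substring test via the .str accessor
mask = s.str.contains("DIGITAL_SOURCE", regex=False)
flags = mask.astype(int).tolist()
```

Here `label_hit` is true and `value_hit` is false, while `flags` holds one 0/1 entry per row.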

Upvotes: 1

Views: 296

Answers (3)

adhg

Reputation: 10863

You can also apply a lambda row by row:

 df['new_column'] = df.apply(lambda x: 1 if 'DIGITAL_SOURCE' in x['file_name'] else 0, axis=1 )

Example:

df = pd.DataFrame({"LOCATION":["USA","USA","USA","USA","JAPAN","JAPAN"],"file_name":["DIGITAL","DIGITAL","DIGITAL","DIGITAL","DIGITAL_SOURCE","DIGITAL_SOURCE"]})
 

    LOCATION    file_name
0   USA        DIGITAL
1   USA        DIGITAL
2   USA        DIGITAL
3   USA        DIGITAL
4   JAPAN      DIGITAL_SOURCE
5   JAPAN      DIGITAL_SOURCE


df['new_cl'] = df.apply(lambda x: 1 if 'DIGITAL_SOURCE' in x['file_name'] else 0, axis=1 )


    LOCATION    file_name      new_cl
0   USA          DIGITAL        0
1   USA          DIGITAL        0
2   USA          DIGITAL        0
3   USA          DIGITAL        0
4   JAPAN        DIGITAL_SOURCE 1
5   JAPAN        DIGITAL_SOURCE 1

Upvotes: -1

Andrej Kesely

Reputation: 195428

As stated in the comments, you can use .str.contains (note regex=False, so that the string is not treated as a regular expression):

df = pd.DataFrame({'file_name': ['DIGITAL_SOURCE', 'Other1', 'Other3']})

df['contains'] = df['file_name'].str.contains('DIGITAL_SOURCE', regex=False).astype(int)
print(df)

Prints:

        file_name  contains
0  DIGITAL_SOURCE         1
1          Other1         0
2          Other3         0
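One thing to watch for, sketched here under the assumption that the column may contain missing values (the sample frame is hypothetical): without the `na` argument, a missing file name yields a missing result, and the integer conversion may then fail. Passing `na=False` treats missing entries as "no match". The same boolean mask also plugs into the `np.where` form from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing file name
df = pd.DataFrame({"file_name": ["DIGITAL_SOURCE_a", None, "Other"]})

# na=False makes missing values count as "no match", so the mask is purely boolean
mask = df["file_name"].str.contains("DIGITAL_SOURCE", regex=False, na=False)

df["contains"] = mask.astype(int)          # boolean -> 0/1
df["contains_np"] = np.where(mask, 1, 0)   # equivalent np.where form
```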

Upvotes: 3

BENY

Reputation: 323226

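You can use isin, but note that it matches whole values exactly rather than substrings:

import numpy as np
np.where(df['file_name'].isin(['DIGITAL_SOURCE']), 1, 0)
# equivalent: df['file_name'].isin(['DIGITAL_SOURCE']).astype(int)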
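A small sketch comparing the two approaches side by side, using made-up file names: `isin` only flags rows whose whole value equals one of the listed strings, whereas `.str.contains` flags any row in which the string appears.

```python
import pandas as pd

# Hypothetical file names: one exact match, one that only contains the string
names = pd.Series(["DIGITAL_SOURCE", "DIGITAL_SOURCE_v2.mov", "Other1"])

# isin: whole-value equality against the listed values
exact = names.isin(["DIGITAL_SOURCE"]).astype(int).tolist()

# str.contains: literal substring test
substring = names.str.contains("DIGITAL_SOURCE", regex=False).astype(int).tolist()
```

With this data, `exact` flags only the first row, while `substring` flags the first two.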

Upvotes: 1
