Reputation: 168

Perform a string split under a certain condition

I have a list of strings that I want to split, but only if a string contains a certain substring. For example, say I have these strings:

'Take a look at this text: It is recommended to',
'Take a look at this text: blah blah blah'

I want the result to be like this:

'It is recommended to',
'Take a look at this text: blah blah blah'

I want to only remove the 'Take a look at this text: ' if we also have 'It is recommended to'.

I tried this:

if df['resolution_modified'].str.contains('It is recommended to ') == False:
    if df['resolution_modified'].str.contains('Take a look at this text: '):
        df['resolution_modified'] = df['resolution_modified'].str.split(':').str[0]
return df

But I get this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Upvotes: 0

Answers (5)

Nick Becker

Reputation: 4214

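You're getting this error because an if statement needs a single True or False, but df['resolution_modified'].str.contains(...) returns a Series of Booleans, one per row, and pandas cannot reduce that to one truth value. Instead, create a Boolean mask for your condition and use it to index your data.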

I want to only remove the 'Take a look at this text: ' if we also have 'It is recommended to'.

import pandas as pd

df = pd.DataFrame({
    "col": ['Take a look at this text: It is recommended to', 'Take a look at this text: blah blah blah']
})

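# Boolean mask: True for rows containing the phrase (regex=False does a plain substring match, which is faster)
contains_mask = df['col'].str.contains("It is recommended to", regex=False)

# Split on ": " (colon plus space) so the kept part has no leading space
df.loc[contains_mask, "col"] = df.loc[contains_mask, 'col'].str.split(": ").str[1]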
df
col
0   It is recommended to
1   Take a look at this text: blah blah blah
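The question's nested if statements check two conditions. A minimal sketch of how both could be expressed as one combined mask follows; the variable names (prefix, has_phrase, has_prefix, mask) are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "col": [
        "Take a look at this text: It is recommended to",
        "Take a look at this text: blah blah blah",
    ]
})

prefix = "Take a look at this text: "  # prefix taken from the question

# Element-wise conditions; each is a Boolean Series with one value per row
has_phrase = df["col"].str.contains("It is recommended to", regex=False)
has_prefix = df["col"].str.startswith(prefix)

# Combine with & (not `and`), which works element-wise on Series
mask = has_phrase & has_prefix

# Slice off the prefix only in the matching rows
df.loc[mask, "col"] = df.loc[mask, "col"].str[len(prefix):]
print(df["col"].tolist())
```

Using & instead of and avoids the "truth value of a Series is ambiguous" error, because no single True/False is ever requested from the Series.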

Upvotes: 2

Fariliana Eri

Reputation: 301

You can also use apply:

import pandas as pd
import numpy as np
df = pd.DataFrame({'A':['Take a look at this text: It is recommended to','Take a look at this text: blah blah blah']})
df['value'] = df['A'].apply(lambda x: x.split(': ')[1] if x.split(': ')[1]=='It is recommended to' else np.nan)

Here is the result:

A                                                 value
Take a look at this text: It is recommended to    It is recommended to
Take a look at this text: blah blah blah          NaN
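The result above puts NaN in rows that do not match, whereas the question wants those rows left unchanged. A possible variant of the apply approach that keeps the original text is sketched here; the helper name strip_prefix_if_recommended is hypothetical.

```python
import pandas as pd

def strip_prefix_if_recommended(text):
    # Hypothetical helper: keep only the part after ": " when it is the target phrase
    head, sep, tail = text.partition(": ")
    if sep and tail == "It is recommended to":
        return tail
    return text  # leave every other string unchanged

df = pd.DataFrame({"A": [
    "Take a look at this text: It is recommended to",
    "Take a look at this text: blah blah blah",
]})

df["value"] = df["A"].apply(strip_prefix_if_recommended)
print(df["value"].tolist())
```

str.partition returns an empty separator when ": " is missing, so strings without a colon pass through without raising an IndexError.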

Upvotes: 0

yulGM

Reputation: 1094

You can use np.where:


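# The third positional argument of str.replace is n (a replacement count), so np.nan does not belong there
df['resolution_modified'] = np.where(df['resolution_modified'].str.contains('It is recommended to', regex=False),
                                     df['resolution_modified'].str.replace('Take a look at this text: ', '', regex=False),
                                     df['resolution_modified'])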

input:

resolution_modified
0   Take a look at this text: It is recommended to
1   Take a look at this text: blah blah blah

output:

resolution_modified
0   It is recommended to
1   Take a look at this text: blah blah blah

Upvotes: 1

Tim Biegeleisen

Reputation: 521178

We could use str.replace here for a regex option:

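# regex=True is required in recent pandas versions, where str.replace defaults to literal matching
df['resolution_modified'] = df['resolution_modified'].str.replace(
    r'^.*?:\s*(It is recommended to)$', r'\1', regex=True)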
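To see which rows the pattern touches, here is a small self-contained sketch on the question's two sample strings. The Series name s and the variable names are assumptions made for illustration.

```python
import pandas as pd

s = pd.Series([
    "Take a look at this text: It is recommended to",
    "Take a look at this text: blah blah blah",
])

# ^.*?:\s* consumes everything up to the first colon plus any whitespace after it;
# the capture group keeps the phrase, and $ requires the string to end right there
pattern = r"^.*?:\s*(It is recommended to)$"

result = s.str.replace(pattern, r"\1", regex=True)
print(result.tolist())
```

Because of the trailing $, a string with extra text after the phrase would not match and would be left unchanged; drop the $ if that case should also be rewritten.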

Upvotes: 1

sourab maity

Reputation: 1045

s='Take a look at this text: It is recommended to'
ind=s.find(":")
parts=[s[0:ind],s[ind+1:]]  # named parts to avoid shadowing the built-in list
print(parts)

Output:

['Take a look at this text', ' It is recommended to']

I think this will be helpful for you.
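The snippet above splits a single string unconditionally. A rough sketch of how the same find-and-slice idea might be applied to a plain Python list with the question's condition is shown here; the names strings, phrase, and result are illustrative.

```python
strings = [
    "Take a look at this text: It is recommended to",
    "Take a look at this text: blah blah blah",
]
phrase = "It is recommended to"

result = []
for s in strings:
    ind = s.find(":")
    # Split only when a colon exists and the target phrase is present
    if ind != -1 and phrase in s:
        result.append(s[ind + 1:].strip())  # strip removes the space after the colon
    else:
        result.append(s)

print(result)
```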

Upvotes: 0
