Snorrlaxxx

Reputation: 168

Perform a string split under a certain condition

I have a list of strings that I want to split, but only if a string contains a certain substring. For example, say I have these strings:

'Take a look at this text: It is recommended to',
'Take a look at this text: blah blah blah'

I want the result to be like this:

'It is recommended to',
'Take a look at this text: blah blah blah'

I want to only remove the 'Take a look at this text: ' if we also have 'It is recommended to'.

I tried this:

if df['resolution_modified'].str.contains('It is recommended to ') == False:
    if df['resolution_modified'].str.contains('Take a look at this text: '):
        df['resolution_modified'] = df['resolution_modified'].str.split(':').str[0]
return df

But I get this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Upvotes: 0

Views: 54

Answers (5)

Nick Becker

Reputation: 4214

You're getting an error because an `if` statement needs a single True or False, but `str.contains` returns a Boolean Series with one value per row, so its truth value is ambiguous. Instead, build a Boolean mask for your condition and use it to select the rows you want to change.
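To make the cause concrete, here is a minimal sketch that reproduces the error in isolation. The Series `s` and the variable names are illustrative, not taken from the question's DataFrame.

```python
import pandas as pd

s = pd.Series([
    "Take a look at this text: It is recommended to",
    "Take a look at this text: blah blah blah",
])

# str.contains returns one Boolean per row, not a single True/False
mask = s.str.contains("It is recommended to", regex=False)
print(mask.tolist())

# Using the whole Series as an `if` condition raises the ValueError
error_message = None
try:
    if mask:
        pass
except ValueError as err:
    error_message = str(err)
    print(error_message)

# Reducing to one Boolean is allowed, but it answers a different question
# (does any row match / do all rows match), not "which rows match"
any_match = bool(mask.any())
all_match = bool(mask.all())
print(any_match, all_match)
```

Because the goal is a per-row decision, `.any()` and `.all()` are usually not what is wanted here; the mask itself is.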

I want to only remove the 'Take a look at this text: ' if we also have 'It is recommended to'.
import pandas as pd

df = pd.DataFrame({
    "col": ['Take a look at this text: It is recommended to', 'Take a look at this text: blah blah blah']
})

# Boolean mask: True for rows containing the phrase (regex off, plain substring match)
contains_mask = df['col'].str.contains("It is recommended to", regex=False)

# Only for those rows, keep the text after the first ": " (no leading space left behind)
df.loc[contains_mask, "col"] = df.loc[contains_mask, 'col'].str.split(": ", n=1).str[1]
print(df)
col
0   It is recommended to
1   Take a look at this text: blah blah blah

Upvotes: 2

Fariliana Eri

Reputation: 301

You can also use apply. Note that rows that do not match become NaN:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['Take a look at this text: It is recommended to', 'Take a look at this text: blah blah blah']})
df['value'] = df['A'].apply(lambda x: x.split(': ')[1] if x.split(': ')[1] == 'It is recommended to' else np.nan)

Here is the result:

A                                                 value
Take a look at this text: It is recommended to    It is recommended to
Take a look at this text: blah blah blah          NaN
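The question asks for non-matching rows to stay unchanged rather than become NaN. A minimal sketch of an apply-based variant that does this is below; `strip_prefix_if_phrase`, `PREFIX`, and `PHRASE` are hypothetical names introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [
    "Take a look at this text: It is recommended to",
    "Take a look at this text: blah blah blah",
]})

PREFIX = "Take a look at this text: "
PHRASE = "It is recommended to"

def strip_prefix_if_phrase(text):
    # Hypothetical helper: drop the prefix only when the phrase is also present
    if PHRASE in text and text.startswith(PREFIX):
        return text[len(PREFIX):]
    # Otherwise return the string untouched
    return text

df["value"] = df["A"].apply(strip_prefix_if_phrase)
print(df["value"].tolist())
```

apply calls the function once per row in Python, so on large frames the vectorized `.str` methods are likely to be faster.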

Upvotes: 0

yulGM

Reputation: 1094

You can use np.where:

import numpy as np

df['resolution_modified'] = np.where(
    df['resolution_modified'].str.contains('It is recommended to', regex=False),
    df['resolution_modified'].str.replace('Take a look at this text: ', '', regex=False),
    df['resolution_modified'])

input:

resolution_modified
0   Take a look at this text: It is recommended to
1   Take a look at this text: blah blah blah

output:

resolution_modified
0   It is recommended to
1   Take a look at this text: blah blah blah

Upvotes: 1

Tim Biegeleisen

Reputation: 521178

We could use str.replace here for a regex option:

# Raw strings avoid escape issues; regex=True is required on recent pandas versions
df['resolution_modified'] = df['resolution_modified'].str.replace(
    r'^.*?:\s*(It is recommended to)$', r'\1', regex=True)
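To see what the pattern does on the sample strings from the question, here is a small self-contained sketch. It assumes `regex=True` is passed explicitly and builds its own two-row DataFrame for illustration.

```python
import pandas as pd

df = pd.DataFrame({"resolution_modified": [
    "Take a look at this text: It is recommended to",
    "Take a look at this text: blah blah blah",
]})

# ^.*?:   lazily match everything up to the first colon
# \s*     skip whitespace after the colon
# (...)$  capture the phrase, which must run to the end of the string
pattern = r"^.*?:\s*(It is recommended to)$"

# Rows that match are replaced by the captured group; others are left as-is
df["resolution_modified"] = df["resolution_modified"].str.replace(
    pattern, r"\1", regex=True
)
print(df["resolution_modified"].tolist())
```

Because of the `$` anchor, this only rewrites rows where the phrase is the entire remainder of the string; a row with extra text after the phrase would be left unchanged.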

Upvotes: 1

sourab maity

Reputation: 1045

s = 'Take a look at this text: It is recommended to'
ind = s.find(":")  # index of the first colon
parts = [s[0:ind], s[ind+1:]]  # named `parts` to avoid shadowing the built-in `list`
print(parts)

Output:

['Take a look at this text', ' It is recommended to']

I think this will be helpful for you.
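The snippet above splits every string and leaves a leading space. A sketch of how the same plain-Python idea could be made conditional is below, using `str.partition`; the function name `keep_after_colon_if` and its default argument are hypothetical.

```python
def keep_after_colon_if(text, phrase="It is recommended to"):
    # Hypothetical helper: split at the first colon, but only use the
    # right-hand side when it contains the phrase
    head, sep, tail = text.partition(":")
    if sep and phrase in tail:
        return tail.strip()  # strip() removes the space after the colon
    return text

samples = [
    "Take a look at this text: It is recommended to",
    "Take a look at this text: blah blah blah",
]
result = [keep_after_colon_if(s) for s in samples]
print(result)
```

For a DataFrame column, a function like this could be passed to apply, at the cost of a Python-level call per row.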

Upvotes: 0
