Reputation: 168
I have a list of strings that I want to split, but only if a string contains a certain substring. For example, say I have these strings:
'Take a look at this text: It is recommended to',
'Take a look at this text: blah blah blah'
I want the result to be like this:
'It is recommended to',
'Take a look at this text: blah blah blah'
I only want to remove 'Take a look at this text: ' if the string also contains 'It is recommended to'.
I tried this:
if df['resolution_modified'].str.contains('It is recommended to ') == False:
if df['resolution_modified'].str.contains('Take a look at this text: '):
df['resolution_modified'] = df['resolution_modified'].str.split(':').str[0]
return df
But I get this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Upvotes: 0
Views: 54
Reputation: 4214
You're getting an error because a Series of Booleans cannot be used directly as the condition of an if statement: pandas cannot reduce many True/False values to a single one. Instead, you should create a Boolean mask for your condition and use that to index your data.
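To make the cause concrete, here is a minimal sketch (the sample Series and the variable names are made up for illustration) showing that putting a Boolean Series in an if statement raises the same ValueError, while reducing it with any() or all() gives a single value:

```python
import pandas as pd

# Hypothetical sample data, only for illustration.
s = pd.Series(["Take a look at this text: It is recommended to",
               "no separator here"])

# str.contains returns one Boolean per row, not a single Boolean.
mask = s.str.contains(":", regex=False)

message = ""
try:
    if mask:  # pandas refuses to collapse the Series to one truth value
        pass
except ValueError as exc:
    message = str(exc)

print(message)

# Explicit reductions give a single Boolean.
any_match = bool(mask.any())
all_match = bool(mask.all())
```

any() and all() only answer a question about the whole column, so for a row-by-row change the mask itself should be used for indexing.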
I want to only remove the 'Take a look at this text: ' if we also have 'It is recommended to'.
import pandas as pd
df = pd.DataFrame({
"col": ['Take a look at this text: It is recommended to', 'Take a look at this text: blah blah blah']
})
contains_mask = df['col'].str.contains("It is recommended to", regex=False) # turn off regex to save time
df.loc[contains_mask, "col"] = df.loc[contains_mask, "col"].str.split(": ").str[1]  # split on ": " so no leading space is left behind
df
col
0 It is recommended to
1 Take a look at this text: blah blah blah
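If the prefix should only be removed when both strings are present, the two conditions can be combined into one mask. The following is a sketch rather than a definitive solution; the names prefix and marker and the third sample row are my own additions for illustration:

```python
import pandas as pd

prefix = "Take a look at this text: "   # assumed exact prefix
marker = "It is recommended to"         # assumed marker text

df = pd.DataFrame({
    "resolution_modified": [
        "Take a look at this text: It is recommended to",
        "Take a look at this text: blah blah blah",
        "It is recommended to do nothing",  # extra row: marker but no prefix
    ]
})

col = df["resolution_modified"]
# Combine element-wise conditions with &, not with `and` or nested ifs.
mask = col.str.startswith(prefix) & col.str.contains(marker, regex=False)
# On matching rows only, drop the prefix by slicing off its length.
df.loc[mask, "resolution_modified"] = col[mask].str[len(prefix):]

result = df["resolution_modified"].tolist()
print(result)
```

Slicing by the prefix length avoids splitting on ":" at all, so any later colons in the text are left untouched.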
Upvotes: 2
Reputation: 301
You can also use apply:
import numpy as np
import pandas as pd
df = pd.DataFrame({'A':['Take a look at this text: It is recommended to','Take a look at this text: blah blah blah']})
df['value'] = df['A'].apply(lambda x: x.split(': ')[1] if x.split(': ')[1]=='It is recommended to' else np.nan)
Here is the result:
A value
Take a look at this text: It is recommended to It is recommended to
Take a look at this text: blah blah blah NaN
Upvotes: 0
Reputation: 1094
You can use np.where:
df['resolution_modified'] = np.where(df['resolution_modified'].str.contains('It is recommended to'),
                                     df['resolution_modified'].str.replace('Take a look at this text: ', '', regex=False),
                                     df['resolution_modified'])
input:
resolution_modified
0 Take a look at this text: It is recommended to
1 Take a look at this text: blah blah blah
output:
resolution_modified
0 It is recommended to
1 Take a look at this text: blah blah blah
Upvotes: 1
Reputation: 521178
We could use str.replace here for a regex option:
df['resolution_modified'] = df['resolution_modified'].str.replace(
    r'^.*?:\s*(It is recommended to)$', r'\1', regex=True)
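As a quick check of the regex approach, here is a small self-contained sketch on the two sample strings from the question (the Series s and the name out are just for illustration). Only a row that ends in the marker text after the colon is rewritten; a row where the pattern does not match is returned unchanged:

```python
import pandas as pd

s = pd.Series([
    "Take a look at this text: It is recommended to",
    "Take a look at this text: blah blah blah",
])

# regex=True is passed explicitly because recent pandas versions
# treat the pattern as a literal string by default.
out = s.str.replace(r"^.*?:\s*(It is recommended to)$", r"\1", regex=True)
print(out.tolist())
```

Note that the trailing $ means the marker must be the last thing in the string; if more text can follow it, the pattern would need to be loosened.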
Upvotes: 1
Reputation: 1045
s='Take a look at this text: It is recommended to'
ind=s.find(":")  # index of the first colon
parts=[s[0:ind],s[ind+1:]]  # avoid naming a variable "list", which shadows the built-in
print(parts)
Output:
['Take a look at this text', ' It is recommended to']
I think this will be helpful for you.
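For a plain Python list rather than a DataFrame, the same idea can be made conditional. This is only a sketch: the helper name strip_prefix_if_marker and its default argument are hypothetical, and it assumes the prefix always ends with ": ":

```python
def strip_prefix_if_marker(text, marker="It is recommended to"):
    """Return the part after the first ': ' only if it contains marker."""
    head, sep, tail = text.partition(": ")
    # sep is empty when ': ' does not occur, so such strings pass through.
    if sep and marker in tail:
        return tail
    return text


strings = [
    "Take a look at this text: It is recommended to",
    "Take a look at this text: blah blah blah",
]
cleaned = [strip_prefix_if_marker(s) for s in strings]
print(cleaned)
```

str.partition splits on the first occurrence only and never raises when the separator is missing, which makes it a little safer than indexing the result of split.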
Upvotes: 0