python regex string split

remus2232

Reputation: 87

Split or partition string after certain words

Let me start by saying I googled extensively for several hours before asking here, so I'm fairly desperate at this point.

I have a few strings with the following format (approximated):

"firstword text ONE lastword"
"firstword text TWO lastword"

I need to extract the text after the 'firstword' and before 'ONE' or 'TWO'.

So my output for the aforementioned strings would have to be:

"text"

How do I split or partition the string so I can:

remove the first word (I already know how to do this with str.split(' '))
retain the text which comes before either 'ONE' or 'TWO'. (I thought it would look something like str.split('ONE' | 'TWO'), but that obviously doesn't work, and I haven't managed to find a solution so far.)

If possible, I would like to solve it with split() or partition(), but regex would be fine as well.

Thank you for your help and sorry if this is a dumb question.

Upvotes: 2

Answers (5)

OSA413

Reputation: 476

Actually there's no need to use regex. You can store the required separators in a list and then check whether they occur in the text.

orig_text = "firstword text ONE lastword"

first_separator = "firstword"
#Place all "end words" here
last_separators = ["ONE", "TWO"]

output = []

#Splitting the original text into list
orig_text = orig_text.split(" ")

#Checking if there's the "firstword" just in case
if first_separator in orig_text:
    #Here we check if there's "ONE" or "TWO" in the text
    for i in last_separators:
        if i in orig_text:
            #taking everything between "firstword" and "ONE"/"TWO"
            output = orig_text[orig_text.index(first_separator)+1 : orig_text.index(i)]
            break

#Converting to string
output = " ".join(output)

print(output)

Here are some example inputs and their outputs:

"firstword text TWO lastword" -> "text"
"firstword hello world ONE" -> "hello world"
"first text ONE" -> ""
"firstword text" -> ""
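
Since the question asked about partition(), the same idea could also be sketched with str.partition(). This is only an illustration under assumptions: the helper name extract_between is made up, and it matches the markers as raw substrings rather than whole words, so it can behave differently from the list-based version on inputs such as "firstword textONE".

```python
def extract_between(text, first="firstword", enders=("ONE", "TWO")):
    """Hypothetical helper: text after `first` and before the earliest ender.

    Returns "" when `first` or all of the enders are missing.
    """
    # partition() returns (before, separator, after); separator is "" if absent
    _, found, rest = text.partition(first)
    if not found:
        return ""
    candidates = []
    for ender in enders:
        before, sep, _ = rest.partition(ender)
        if sep:
            candidates.append(before)
    if not candidates:
        return ""
    # The shortest prefix belongs to the ender that occurs first
    return min(candidates, key=len).strip()


print(extract_between("firstword text TWO lastword"))
print(extract_between("firstword hello world ONE"))
```

With the sample strings this prints "text" and "hello world"; inputs lacking either marker give an empty string.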

Upvotes: 1

Franco Piccolo

Reputation: 7410

You can use regex like:

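import re
string = "firstword text TWO lastword"
# (?:ONE|TWO) is an alternation group; [ONE|TWO] would be a character class matching one character
re.search(r'firstword\s+(\w+)\s+(?:ONE|TWO)', string).group(1)
'text'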
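
One detail worth checking in patterns like this: square brackets create a character class, so [ONE|TWO] matches a single character out of O, N, E, |, T, W, whereas a group such as (?:ONE|TWO) matches one of the whole words. A small sketch of the difference, using a made-up input that contains neither end word:

```python
import re

char_class = re.compile(r'firstword\s+(\w+)\s+[ONE|TWO]')
alternation = re.compile(r'firstword\s+(\w+)\s+(?:ONE|TWO)')

sample = "firstword text Other lastword"  # hypothetical input without ONE/TWO

# The character class is satisfied by the single letter "O" of "Other"
class_matches = char_class.search(sample) is not None
# The group needs the full word ONE or TWO, so nothing matches here
group_matches = alternation.search(sample) is not None

print(class_matches, group_matches)
print(alternation.search("firstword text TWO lastword").group(1))
```

The first pattern reports a match on the sample even though no end word is present, while the second does not.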

Upvotes: 1

Narendra Kamatham

Reputation: 340

Try this:

str_list = ["firstword text ONE lastword", "firstword text TWO lastword",
            "any text u entered before firstword text ONE", "firstword text TWO any text After"]
end_key_lst = ['ONE', 'TWO']
# Python 3: print is a function and map returns an iterator, so wrap it in list()
print(list(map(lambda x: x.split('firstword')[-1].strip(),
               [''.join(val.split(end_key)[:-1]) for val in str_list for end_key in end_key_lst if end_key in val.split()])))

Result: ['text', 'text', 'text', 'text']

How I did this: you may have many strings like these, so I kept them in a list and put the end keys (ONE, TWO) in a second list. I then used a list comprehension and the map function to build the target list.

Upvotes: 1

Ali Kargar

Reputation: 189

When you split the string on spaces, you get a list of all the words, and you can then pick the word you want by index:

s = "firstword text TWO lastword"
l = s.split(" ") # l = ["firstword" , "text" , "TWO" , "lastword"]
print(l[1]) # l[1] = "text"

s = "firstword text TWO lastword"
print(s.split(" ")[1])
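
Indexing with [1] assumes the wanted text is exactly one word. If it can span several words, one option is to split on the end words themselves with re.split, which accepts a pattern where str.split takes only a single literal separator. A sketch, assuming ONE and TWO appear as whole words; the helper name text_before_end_word is invented for illustration:

```python
import re

def text_before_end_word(s, first="firstword"):
    # Split once at the first whole-word ONE or TWO and keep the left part
    head = re.split(r'\b(?:ONE|TWO)\b', s, maxsplit=1)[0]
    # Drop the leading first word, if present, then trim spaces
    if head.startswith(first):
        head = head[len(first):]
    return head.strip()

print(text_before_end_word("firstword text TWO lastword"))
print(text_before_end_word("firstword hello world ONE lastword"))
```

Note that when neither end word is present, this sketch returns everything after the first word instead of an empty string.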

Upvotes: 1

Pushpesh Kumar Rajwanshi

Reputation: 18357

You can use this regex, which combines a positive lookbehind with a positive lookahead:

(?<=firstword)\s*(.*?)\s*(?=ONE|TWO)

Explanation:

(?<=firstword) --> Positive lookbehind to ensure the matched text is preceded by firstword
\s* --> Eats any white space
(.*?) --> Captures your intended data
\s* --> Eats any white space
(?=ONE|TWO) --> Positive lookahead to ensure the matched text is followed by ONE or TWO
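
In Python, a pattern like this could be applied with re.search and the captured text read from group(1). The snippet below is a usage sketch rather than something taken from the answer. Python's re module requires lookbehinds to be fixed-width, which the literal firstword satisfies.

```python
import re

pattern = re.compile(r'(?<=firstword)\s*(.*?)\s*(?=ONE|TWO)')

samples = ["firstword text ONE lastword", "firstword hello world TWO lastword"]
results = []
for s in samples:
    match = pattern.search(s)
    # group(1) is the lazily captured text, without the surrounding whitespace
    results.append(match.group(1) if match else None)

print(results)
```

For the two sample strings this gives 'text' and 'hello world'; a string with no ONE or TWO after firstword yields None.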

Upvotes: 5
