Jonathan

Reputation: 11365

How to get the string between two delimiters in Python

I have entries like the following:

"<![CDATA[Lorem ipsum feed for an interval of 30 seconds]]>"

How do I get the string between the innermost square brackets, i.e. 'Lorem ipsum feed for an interval of 30 seconds'?

Some of the entries are plain strings and some are delimited by [] as above.

Upvotes: 4

Views: 13780

Answers (3)

Sudhanshu Dev

Reputation: 388

Use the split() method of str, as in the snippet below:

string = "<![CDATA[[[[[Lorem ipsum feed for an interval of 30 seconds]]]]]]]>"
# Take the text after the last '[' and cut it at the first ']'
inner_str = string.split('[')[-1].split(']')[0]
print(inner_str)
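
Since the question mentions that some entries are plain strings, here is a minimal sketch of the same split() idea wrapped in a function and run on both kinds of input. The name innermost and the sample "plain entry" string are made up for illustration. When there are no brackets, split() returns a one-element list, so the entry comes back unchanged.

```python
def innermost(text):
    # Text after the last '[' and before the next ']'.
    # With no brackets present, both splits return the whole string.
    return text.split('[')[-1].split(']')[0]

bracketed = "<![CDATA[Lorem ipsum feed for an interval of 30 seconds]]>"
plain = "plain entry"  # hypothetical unbracketed entry

print(innermost(bracketed))
print(innermost(plain))
```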

Upvotes: 10

Hossein

Reputation: 2111

Use regular expressions:

import re
string = '<![CDATA[Lorem ipsum feed for an interval of 30 seconds]]>'
reverse = string[::-1]
start = len(string)-re.search(r'\[', reverse).start()
end = re.search(r'\]', string).start()
print(string[start:end])

The goal is the text between the last [ and the first ]. re.search() returns the first match of a pattern, which is exactly what is needed for ]. To find the last [, I reverse the string and search for the first [ there; subtracting that position from len(string) converts it back into an index in the original string, pointing just past the last [. Note that re.search() returns None when an entry contains no brackets, so check the result before calling .start() on plain strings.
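
As an alternative to reversing the string, a single pattern can match a bracket pair that has no brackets inside it. This is only a sketch: the names INNER and between_brackets are hypothetical, and returning the entry unchanged when nothing matches is an assumption about how plain strings should be handled.

```python
import re

# Matches '[...]' where the content contains neither '[' nor ']',
# i.e. the innermost bracket pair.
INNER = re.compile(r'\[([^\[\]]*)\]')

def between_brackets(text):
    match = INNER.search(text)
    # Assumption: entries without brackets are returned as they are.
    return match.group(1) if match else text

print(between_brackets('<![CDATA[Lorem ipsum feed for an interval of 30 seconds]]>'))
print(between_brackets('no brackets here'))
```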

Upvotes: 1

Mayazcherquoi

Reputation: 474

You can use the approach from the answer to this question, but to get the innermost string you have to apply it recursively.

Modifying the accepted answer, you can achieve it using the following:

def find_inner(s):
    temp = s.partition('[')[-1].rpartition(']')[0]
    if not temp:
        return s

    return find_inner(temp)
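
If you would rather avoid recursion, the same partition()/rpartition() step can be repeated in a loop. This is a sketch of an equivalent loop, not part of the accepted answer; find_inner_iter is a hypothetical name.

```python
def find_inner_iter(s):
    while True:
        # Drop everything up to the first '[' and from the last ']' onward.
        temp = s.partition('[')[-1].rpartition(']')[0]
        if not temp:
            # Nothing left to strip: s is the innermost text (or a plain string).
            return s
        s = temp

print(find_inner_iter("<![CDATA[Lorem ipsum feed for an interval of 30 seconds]]>"))
print(find_inner_iter("plain entry"))
```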

Upvotes: 2
