Bin Chen

Reputation: 63309

Python string parsing using regular expressions

Given a string such as #abcde#jfdkjfd, how can I get the substring between the two # characters? I also want the function to return None if there is no # pair (that is, no # at all or only one #).

Upvotes: 2

Views: 6489

Answers (3)

Duncan

Reputation: 95652

If you don't insist on regular expressions and are willing to accept an empty list instead of None when there are no results, then the easy way is:

>>> "#abcde#jfdkjfd".split('#')[1:-1]
['abcde']

Note that the result really has to be a list, as you could have more than one result.
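To illustrate why a list is the natural return type, here is a minimal sketch that wraps the slice in a function. The name between_markers and its marker parameter are hypothetical, introduced only for this example:

```python
def between_markers(s, marker="#"):
    """Return every piece of s that lies between two markers.

    Hypothetical helper: same split-and-slice idea as above.
    """
    # split() yields the text before the first marker, the pieces in
    # between, and the text after the last marker; drop both ends.
    return s.split(marker)[1:-1]


print(between_markers("#abcde#jfdkjfd"))  # ['abcde']
print(between_markers("a#b#c#d"))         # ['b', 'c']
print(between_markers("no markers"))      # []
print(between_markers("only#one"))        # []
```

With fewer than two markers the slice is simply empty, which is why the empty list falls out for free.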

If you insist on getting None instead of an empty list (though not perfect as this would also turn any empty string into None):

>>> "#abcde#jfdkjfd".split('#')[1:-1] or None
['abcde']

If you only wanted the first marked string then you could do this:

>>> def first_marked(s):
    token = s.split('#')
    if len(token) >= 3:
        return token[1]
    else:
        return None


>>> first_marked("#abcde#jfdkjfd")
'abcde'

Upvotes: 1

fge

Reputation: 121720

Use the pattern (?<=#)(\w+)(?=#) and take the first capture group. You can also iterate over the matches in a string that contains several embedded strings, and it will still work.

This uses both a positive lookbehind and a positive lookahead.
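A minimal sketch of how that pattern might be used from Python, assuming the standard re module. The names PATTERN and first_between are hypothetical. Note that \w+ only matches letters, digits and underscores, so text containing spaces or punctuation between the # characters will not match:

```python
import re

# Lookbehind for '#', capture a run of word characters, lookahead for '#'.
PATTERN = re.compile(r"(?<=#)(\w+)(?=#)")


def first_between(s):
    """Return the first word between two '#' characters, or None."""
    m = PATTERN.search(s)
    return m.group(1) if m else None


print(first_between("#abcde#jfdkjfd"))     # abcde
print(first_between("abcde"))              # None
print(first_between("#abcde"))             # None
print(PATTERN.findall("abc#def#ghi#jkl"))  # ['def', 'ghi']
```

Because the # characters are only asserted and never consumed, the closing # of one match can serve as the opening # of the next, which is what makes cycling through several embedded strings work.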

Upvotes: 2

Tim Pietzcker

Reputation: 336158

>>> import re
>>> s = "abc#def#ghi#jkl"
>>> re.findall(r"(?<=#)[^#]+(?=#)", s)
['def', 'ghi']

Explanation:

(?<=#)  # Assert that the previous character is a #
[^#]+   # Match 1 or more non-# characters
(?=#)   # Assert that the next character is a #
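To get the None result the question asks for, the findall call can be wrapped in a small function. This is only a sketch; the name find_between_hashes is hypothetical:

```python
import re


def find_between_hashes(s):
    """Return all non-empty strings enclosed by '#' in s, or None if there are none."""
    matches = re.findall(r"(?<=#)[^#]+(?=#)", s)
    # findall returns an empty list when nothing matches; map that to None.
    return matches or None


print(find_between_hashes("abc#def#ghi#jkl"))  # ['def', 'ghi']
print(find_between_hashes("#abcde#jfdkjfd"))   # ['abcde']
print(find_between_hashes("only#one"))         # None
print(find_between_hashes("no hashes"))        # None
```

Since [^#]+ requires at least one character, two adjacent # characters such as "##" also produce None rather than an empty string.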

Upvotes: 9
