Soham

Reputation: 213

How to use * or + with brackets in regular expressions in Python?

The input contains multiple space-separated characters, e.g. string = "a b c d a s e "

What pattern should I use so that, after calling re.search on the input, .group(j) returns the j-th character together with the space that follows it?

I tried something like "^(([a-zA-Z])\s)+", but it does not work. What should I do?

EDIT: My actual question is the one in the title; the body describes only a special case of it. Here is the general version: if I need to extract every occurrence of a specific pattern (the initial question used "[a-zA-Z]\s") from a string, what should I do?

Upvotes: 2

Views: 129

Answers (3)

dawg

Reputation: 103774

You could do:

>>> import re
>>> string = "a b c d a s e "
>>> j=2
>>> re.search(r'([a-zA-Z]\s){%i}' % j, string).group(1)
'b '

Explanation:

  1. The pattern ([a-zA-Z]\s) captures a letter and the whitespace character after it;
  2. A repeated group keeps only its last repetition, so with {2} appended, group(1) holds the second pair. Here j counts from 1, not from 0.
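
To illustrate the point about repetition, here is a minimal sketch, assuming the sample string from the question. It shows that the asker's original pattern leaves only the final repetition in its groups, and it wraps the {j} idea in a small helper. The name jth_pair is hypothetical and not part of the re module.

```python
import re

string = "a b c d a s e "

# The asker's pattern: each repetition overwrites the group,
# so only the last letter/space pair is still available afterwards.
m = re.search(r"^(([a-zA-Z])\s)+", string)
last_pair = m.group(1)    # 'e '
last_letter = m.group(2)  # 'e'


def jth_pair(text, j):
    """Return the j-th (1-based) pair in the first run of at least j
    consecutive letter+whitespace pairs, or None if there is no such run."""
    found = re.search(r"([a-zA-Z]\s){%d}" % j, text)
    return found.group(1) if found else None


second = jth_pair(string, 2)   # 'b '
missing = jth_pair(string, 8)  # None: the string has only 7 pairs
```

Because re.search is not anchored, the count starts wherever a long enough run of pairs begins, which is the start of the string for this input.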


Upvotes: 1

alecxe

Reputation: 473833

Use findall() instead and get the j-th match by index:

>>> import re
>>> string = "a b c d a s e "
>>> j = 2
>>> re.findall(r"[a-zA-Z]\s", string)[j]
'c '

Here [a-zA-Z]\s matches one lowercase or uppercase ASCII letter followed by a single whitespace character. The list index is zero-based, so j = 2 selects the third match.
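
For the general version of the question (collecting every occurrence of a pattern), a sketch along these lines may help. It assumes the sample string from the question. One thing to watch with brackets: when the pattern contains a single capturing group, findall() returns the contents of that group rather than the whole match.

```python
import re

string = "a b c d a s e "

# No capturing group: findall() returns each whole match.
whole = re.findall(r"[a-zA-Z]\s", string)

# One capturing group: findall() returns only what the group captured.
letters = re.findall(r"([a-zA-Z])\s", string)

# finditer() yields match objects, so positions are available as well.
with_pos = [(m.group(), m.start()) for m in re.finditer(r"[a-zA-Z]\s", string)]

print(whole[2])     # 'c '
print(letters[2])   # 'c'
print(with_pos[2])  # ('c ', 4)
```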

Upvotes: 6

Kasravnd

Reputation: 107287

Why use a regex when you can simply call str.split() and access the characters by index?

>>> s = "a b c d a s e "
>>> new = s.split()
>>> new
['a', 'b', 'c', 'd', 'a', 's', 'e']

Upvotes: 5
