user2057841

Reputation: 213

Extracting text from a line: Regex in Python

I'm working with regular expressions in Python and I'm struggling with this. I have a file whose lines look like this one:

|person=[[Old McDonald]]

and I just want to be able to extract Old McDonald from this line.

I have been trying with this regular expression:

matchLine = re.match(r"\|[a-z]+=(\[\[)?[A-Z][a-z]*(\]\])", line)
print matchLine

but it doesn't work; None is the result each time.

Upvotes: 0

Views: 109

Answers (1)

Mikhail Vladimirov

Reputation: 13890

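The construct [A-Z][a-z]* does not match Old McDonald: it allows one uppercase letter followed only by lowercase letters, so the match fails at the space after Old (and it could not match the capital D in McDonald either). You should probably use something like [A-Z][A-Za-z ]*, wrapped in a capturing group so you can retrieve the name. Here is a code example: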

import re

line = '|person=[[Old McDonald]]'
# Raw string keeps the backslashes intact; group 1 captures the name.
matchLine = re.match(r'\|[a-z]+=(?:\[\[)?([A-Z][A-Za-z ]*)\]\]', line)
print(matchLine.group(1))

The output is Old McDonald for me. Note that re.match only matches at the beginning of the string. If you need to search in the middle of the string, use re.search instead of re.match:

import re

line = 'blahblahblah|person=[[Old McDonald]]blahblahblah'
# re.search scans the whole string for the first position where the pattern matches.
matchLine = re.search(r'\|[a-z]+=(?:\[\[)?([A-Z][A-Za-z ]*)\]\]', line)
print(matchLine.group(1))
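
Since the question mentions a file of such lines, some of them may not contain the field at all. In that case re.search returns None, and calling .group(1) on it raises AttributeError. Here is a minimal sketch of one way to guard against that. The names PERSON_RE and extract_name and the sample lines are made up for illustration, and the pattern is the same one used above:

```python
import re

# Same pattern as above, compiled once so it can be reused for every line.
PERSON_RE = re.compile(r'\|[a-z]+=(?:\[\[)?([A-Z][A-Za-z ]*)\]\]')


def extract_name(line):
    """Return the captured name, or None if the line has no matching field."""
    m = PERSON_RE.search(line)
    return m.group(1) if m else None


# Hypothetical sample data standing in for the lines of the file.
lines = [
    '|person=[[Old McDonald]]',
    'a line without the field',
    'blah|person=[[Jane Doe]]blah',
]
names = [extract_name(line) for line in lines]
print(names)  # ['Old McDonald', None, 'Jane Doe']
```

With a real file you would loop over the open file object instead of the list, and skip or log the lines for which extract_name returns None.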

Upvotes: 3
