JKodner

Reputation: 19

Python Regex returning Empty Strings instead of Results

I have been experimenting with Python's regex module, re.

I decided to write a simple expression that searches for links (href="url") in a file.

Here is my Regex: href *= *(\"|\').*\1

I first tried my expression on a site called GSkinner. The results are here, along with the code.

When I tried it with Python's re module, I used the following code:

import re

lines = """Code found in link"""
results = re.findall(r"href *= *(\"|\').*\1", lines)
print(results)  # Outputs: ['"', '"'] instead of the two provided links

Why do the results contain only the quote characters instead of the links?

Upvotes: 1

Views: 131

Answers (1)

Explosion Pills

Reputation: 191729

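When the pattern contains capturing groups, findall returns only what those groups captured, not the whole match (with no groups, it returns the whole match). The only group in your pattern is the quote character, so that is all you get back. You have to capture the value you want as well: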

r"href *= *(\"|\')(.*?)\1"
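
To see why the original pattern returned only quotes, it may help to compare how findall behaves with zero, one, and two capturing groups. This is a minimal sketch: the sample string and its URLs are made up for illustration, since the actual input from the question is not shown.

```python
import re

# Hypothetical sample input; the real file contents are not shown in the question.
sample = '<a href="http://example.com/a">A</a>\n<a href=\'http://example.com/b\'>B</a>'

# No capturing groups: findall returns the full text of each match.
# (This simplified pattern only handles double-quoted values.)
no_groups = re.findall(r"href *= *\"[^\"]*\"", sample)

# One capturing group (the quote): findall returns only that group.
one_group = re.findall(r"href *= *(\"|\')(?:.*?)\1", sample)

# Two capturing groups: findall returns one (quote, url) tuple per match.
two_groups = re.findall(r"href *= *(\"|\')(.*?)\1", sample)

print(no_groups)
print(one_group)
print(two_groups)
```

The second call reproduces the behavior from the question: the quote is the only group, so it is the only thing returned.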

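With two groups, findall returns a list of (quote, url) tuples, so pick the second element of each. Putting it together, you may want to use something like: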

results = [x[1] for x in re.findall(r"href *= *(\"|\')(.*?)\1", lines)]
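
If indexing into tuples feels fragile, one possible alternative is re.finditer, which yields match objects so you can ask for any group (or the whole match) by number or name. The sketch below assumes a made-up sample string and a group name, url, chosen purely for illustration.

```python
import re

# Hypothetical sample input; not the data from the question.
sample = '<a href="http://example.com/a">A</a> <a href=\'http://example.com/b\'>B</a>'

# Group 1 is the opening quote, reused by the \1 backreference.
# The second group is named "url" (an illustrative name) for readability.
pattern = re.compile(r"href *= *(\"|\')(?P<url>.*?)\1")

# Each match object exposes the groups individually.
urls = [m.group("url") for m in pattern.finditer(sample)]

# group(0) is the entire matched text, including href= and the quotes.
full_matches = [m.group(0) for m in pattern.finditer(sample)]

print(urls)
print(full_matches)
```

Note that the quote still needs to be a capturing group, because the backreference \1 depends on it.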

Upvotes: 1
