shyam

Reputation: 21

Grep data from file in Python

I need to process an HTML page and identify the hyperlinks present in it. I can do this when the markup looks like this:

<script type="text/javascript" src="/test/test.html">

I used a simple regex to match the text that is between double quotes and starts with /, and I got all the links of this type.

But I don't understand how to get the links when the script tag looks like this:

<script type="text/javascript" src="test/test.html">

because I cannot use the same regex, and if I use a regex that captures everything in double quotes, I also get "text/javascript" in the output, which I don't want. Can I use seek() to do this?

Thanks.

Upvotes: 0

Views: 477

Answers (1)

Andrew

Reputation: 3061

Try using:

import re

# findall scans the whole page and returns every captured src value
regex = re.compile(r'src="([^"]*)"')
result = regex.findall(html)
print(result)
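
For a self-contained illustration, here is a minimal sketch that runs the same pattern against the two script tags from the question. The sample string and the helper name extract_src are made up for this example, and it assumes the src values are always wrapped in double quotes. It uses findall, which searches the whole string rather than only its start, so every src value is returned.

```python
import re

# Sample markup based on the question: one src starting with "/" and one without.
html = '''
<script type="text/javascript" src="/test/test.html"></script>
<script type="text/javascript" src="test/test.html"></script>
'''

# Anchor on src=" and capture up to the next double quote,
# so type="text/javascript" is never picked up.
SRC_PATTERN = re.compile(r'src="([^"]*)"')


def extract_src(page):
    """Return every double-quoted src value in page, in document order.

    extract_src is a hypothetical helper name used only for this sketch.
    """
    return SRC_PATTERN.findall(page)


links = extract_src(html)
print(links)  # ['/test/test.html', 'test/test.html']
```

seek() would not help here, since it only moves the read position within a file. If the pages use single-quoted or unquoted attributes, a real HTML parser is likely to be more reliable than a regex.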

Upvotes: 1
