macvin9009

Reputation: 1

Parsing HTML with lxml in Python

I have the following HTML code:

<div class="test">
    "Test"
    <br>
    <script type="text/javascript"></script>
    <a href="mailto:[email protected]">[email protected]</a> 
    " "
</div>

How can I get the email address from this code using lxml?

Upvotes: 0

Views: 182

Answers (1)

unutbu

Reputation: 879481

import lxml.html as LH

text = '''\
<div class="test">
    "Test"
    <br>
    <script type="text/javascript"></script>
    <a href="mailto:[email protected]">[email protected]</a> 
    " "
</div>
'''

# Parse the fragment, then select the text of every <a> whose href
# starts with "mailto:" and print the first match.
doc = LH.fromstring(text)
print(doc.xpath('//a[starts-with(@href, "mailto:")]/text()')[0])
# [email protected]
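
The XPath above returns the link text, which works when the visible text is the address itself. If the link text could be something else (for example "Contact us"), a variation is to read the `href` attribute and strip the `mailto:` scheme. The following is only a sketch under assumptions: the sample markup, the `someone@example.com` address, and the `mailto_addresses` helper are hypothetical and not part of the question.

```python
import lxml.html as LH
from urllib.parse import unquote, urlsplit

# Hypothetical sample markup; the address is a placeholder, not the one
# from the question, and the link text deliberately differs from it.
SAMPLE = '''\
<div class="test">
    "Test"
    <br>
    <script type="text/javascript"></script>
    <a href="mailto:someone@example.com?subject=Hi">Contact us</a>
</div>
'''


def mailto_addresses(html):
    """Return the addresses found in mailto: hrefs (hypothetical helper)."""
    doc = LH.fromstring(html)
    addresses = []
    # Select the href attribute values rather than the link text.
    for href in doc.xpath('//a[starts-with(@href, "mailto:")]/@href'):
        # urlsplit() puts the part between "mailto:" and any "?query"
        # into .path; unquote() decodes percent-escapes such as %40.
        addresses.append(unquote(urlsplit(href).path))
    return addresses


print(mailto_addresses(SAMPLE))
```

Returning a list instead of indexing with `[0]` also avoids an `IndexError` when the page contains no `mailto:` link.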

Upvotes: 4
