Reputation: 797
Is it possible to extract data from within onclick attribute values like analysis(1644983), AsianOdds(1644983) and EuropeOdds(1644983)? I just want to show the number once, since it is the same in all three attributes in this HTML code.
HTML
<td style="word-spacing:-3px" align="left"> <a href="javascript:" onclick="analysis(1644983)">析</a><a href="javascript:" onclick="AsianOdds(1644983)" style="margin-left:3px;">亚</a> <a href="javascript:" onclick="EuropeOdds(1644983)" style="margin-left:3px;">欧</a></td>
Python Code
from bs4 import BeautifulSoup
soup=BeautifulSoup("""<td style="word-spacing:-3px" align="left"> <a href="javascript:" onclick="analysis(1644983)">析</a><a href="javascript:" onclick="AsianOdds(1644983)" style="margin-left:3px;">亚</a> <a href="javascript:" onclick="EuropeOdds(1644983)" style="margin-left:3px;">欧</a></td>""",'html.parser')
lines=soup.find_all('onclick')
for line in lines:
    print(line['analysis'])
Expected output
1644983
Upvotes: 4
Views: 555
Reputation: 11282
I tried to explain everything in the comments:
from bs4 import BeautifulSoup
html = '''<td style="word-spacing:-3px" align="left">
<a href="javascript:" onclick="analysis(1644983)">析</a>
<a href="javascript:" onclick="AsianOdds(1644983)" style="margin-left:3px;">亚</a>
<a href="javascript:" onclick="EuropeOdds(1644983)" style="margin-left:3px;">欧</a>
</td>'''
soup = BeautifulSoup(html, 'html.parser')
# Find all <a> elements
elements = soup.find_all('a')
# Loop over all found elements
for element in elements:
    # Disregard element if it doesn't contain onclick attribute
    if 'onclick' not in element.attrs:
        continue
    # Get attribute value
    value = element['onclick']
    # Disregard wrong elements
    if not value.startswith('analysis('):
        continue
    # Extract position of opening bracket
    position = value.index('(') + 1
    # Slice string so only content inside bracket is left
    value = value[position:-1]
    # Print result
    print(value)
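For context, find_all('onclick') in the question searches for tags named onclick, not for attributes, which is why it returns nothing. As a possible variation on the loop above, here is a minimal sketch that lets Beautiful Soup filter on the attribute and uses a regular expression to pull out the number. The names call_pattern and extract_ids are my own and purely illustrative, and the pattern assumes every onclick value looks like name(digits).

```python
import re
from bs4 import BeautifulSoup

html = '''<td style="word-spacing:-3px" align="left">
<a href="javascript:" onclick="analysis(1644983)">析</a>
<a href="javascript:" onclick="AsianOdds(1644983)" style="margin-left:3px;">亚</a>
<a href="javascript:" onclick="EuropeOdds(1644983)" style="margin-left:3px;">欧</a>
</td>'''

soup = BeautifulSoup(html, 'html.parser')

# Assumed format: a function name followed by digits in parentheses,
# e.g. analysis(1644983). The digits are captured in group 1.
call_pattern = re.compile(r'\w+\((\d+)\)')

def extract_ids(soup):
    """Return the distinct numbers found in onclick attributes, in document order."""
    ids = []
    # onclick=True keeps only <a> tags that actually have the attribute
    for element in soup.find_all('a', onclick=True):
        match = call_pattern.search(element['onclick'])
        # Skip values that don't match and numbers already collected
        if match and match.group(1) not in ids:
            ids.append(match.group(1))
    return ids

print(extract_ids(soup))  # ['1644983']
```

Because duplicates are skipped, the number is printed once even though it appears in all three links. If the page could contain other onclick handlers, the pattern would need to be tightened, for example to analysis only.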
Upvotes: 4