Mary

Reputation: 797

How can I extract data from within the brackets of `onclick` attribute values?

Is it possible to extract the data inside the brackets of onclick attribute values such as analysis(1644983), AsianOdds(1644983) and EuropeOdds(1644983)? I only want to print the number once, since it is the same in all three attributes in this HTML.

HTML

<td style="word-spacing:-3px" align="left"> <a href="javascript:" onclick="analysis(1644983)">析</a><a href="javascript:" onclick="AsianOdds(1644983)" style="margin-left:3px;">亚</a> <a href="javascript:" onclick="EuropeOdds(1644983)" style="margin-left:3px;">欧</a></td>

Python Code

from bs4 import BeautifulSoup

soup=BeautifulSoup("""<td style="word-spacing:-3px" align="left"> <a href="javascript:" onclick="analysis(1644983)">析</a><a href="javascript:" onclick="AsianOdds(1644983)" style="margin-left:3px;">亚</a> <a href="javascript:" onclick="EuropeOdds(1644983)" style="margin-left:3px;">欧</a></td>""",'html.parser')

lines=soup.find_all('onclick')
for line in lines:
    print(line['analysis'])

Expected output

1644983

Upvotes: 4

Views: 555

Answers (1)

finefoot

Reputation: 11282

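`find_all('onclick')` searches for tags named `onclick`, but `onclick` is an attribute of the `<a>` elements, so you have to find the `<a>` tags first and then read the attribute from each one. I tried to explain everything in the comments: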

from bs4 import BeautifulSoup

html = '''<td style="word-spacing:-3px" align="left">
    <a href="javascript:" onclick="analysis(1644983)">析</a>
    <a href="javascript:" onclick="AsianOdds(1644983)" style="margin-left:3px;">亚</a>
    <a href="javascript:" onclick="EuropeOdds(1644983)" style="margin-left:3px;">欧</a>
    </td>'''

soup = BeautifulSoup(html, 'html.parser')

# Find all <a> elements
elements = soup.find_all('a')

# Loop over all found elements
for element in elements:
    # Disregard element if it doesn't contain onclick attribute
    if 'onclick' not in element.attrs:
        continue
    # Get attribute value
    value = element['onclick']
    # Disregard wrong elements
    if not value.startswith('analysis('):
        continue
    # Extract position of opening bracket
    position = value.index('(') + 1
    # Slice string so only content inside bracket is left
    value = value[position:-1]
    # Print result
    print(value)
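
As an alternative to the string slicing above, a regular expression can pull the number out of any of the three onclick values and report it only once. This is a minimal sketch, assuming the argument is always a run of digits inside the brackets; the helper name `extract_onclick_ids` and the pattern `ONCLICK_ARG` are made up for illustration.

```python
import re
from bs4 import BeautifulSoup

html = '''<td style="word-spacing:-3px" align="left">
    <a href="javascript:" onclick="analysis(1644983)">析</a>
    <a href="javascript:" onclick="AsianOdds(1644983)" style="margin-left:3px;">亚</a>
    <a href="javascript:" onclick="EuropeOdds(1644983)" style="margin-left:3px;">欧</a>
    </td>'''

# Assumption: the argument is a run of digits inside the brackets
ONCLICK_ARG = re.compile(r'\((\d+)\)')


def extract_onclick_ids(markup):
    """Hypothetical helper: unique numeric onclick arguments, in document order."""
    soup = BeautifulSoup(markup, 'html.parser')
    ids = []
    # onclick=True keeps only <a> elements that have an onclick attribute
    for element in soup.find_all('a', onclick=True):
        match = ONCLICK_ARG.search(element['onclick'])
        # Skip values without a numeric argument and numbers already seen
        if match and match.group(1) not in ids:
            ids.append(match.group(1))
    return ids


print(extract_onclick_ids(html))
```

Because duplicates are dropped, the three links in the question yield a single entry. If you only care about the `analysis(...)` link, the `startswith('analysis(')` check from the loop above can be kept in front of the regex search.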

Upvotes: 4
