robots.txt

Reputation: 137

Can't scrape a Twitter link from a webpage

I've written a Python script to get the link to a player's Twitter account. The problem is that the Twitter link sits inside an iframe. I can parse it using Selenium, but I would like to know whether there is an alternative way to get the link with the requests module, for example from a script tag or something similar.

website link

If you scroll down that page, you can see the Twitter link in the area on the right-hand side.

I've tried with:

import requests
from bs4 import BeautifulSoup

link = "https://247sports.com/Player/JT-Tuimoloau-46048440/"

def get_links(link):
    res = requests.get(link, headers={"User-Agent": "Mozilla/5.0"})
    soup = BeautifulSoup(res.text, "lxml")
    twitter = soup.select_one("a.customisable-highlight").get("href")
    print(twitter)

if __name__ == '__main__':
    get_links(link)

Upvotes: 1

Views: 37

Answers (1)

trotta

Reputation: 1226

I don't know how to get at the iframe itself, but there may be another way for you to fetch the Twitter name (and build a link to the Twitter account afterwards).

It seems the information you need is in a div tag with class="tweets-comp". If you extract the value of its data-username attribute, you should end up with the name of the Twitter account:

import requests
from bs4 import BeautifulSoup

link = "https://247sports.com/Player/JT-Tuimoloau-46048440/"

# Fetch the static HTML; no browser or iframe rendering is involved.
res = requests.get(link, headers={"User-Agent": "Mozilla/5.0"})
soup = BeautifulSoup(res.text, "html.parser")

# The div with class "tweets-comp" carries the account name in data-username.
div = soup.find("div", {"class": "tweets-comp"})
print(div["data-username"])
# JT_tuimoloau
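
To turn that username into a link, something along these lines might work. It is only a sketch: the function name twitter_url_from_html and the inline sample_html are made up for illustration, and the https://twitter.com/<username> pattern is an assumption about how the profile URL is formed. The function also returns None instead of raising if the div or the attribute is missing, in case the page layout changes.

```python
from typing import Optional

from bs4 import BeautifulSoup


def twitter_url_from_html(html: str) -> Optional[str]:
    """Build a Twitter profile URL from the data-username attribute, if present.

    Hypothetical helper: the name and the URL pattern are assumptions.
    """
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", class_="tweets-comp")
    if div is None:
        # The div was not found (layout changed or request was blocked).
        return None
    username = div.get("data-username")
    if not username:
        # The div exists but has no usable data-username attribute.
        return None
    # Assumed profile URL pattern; a leading "@" is stripped just in case.
    return f"https://twitter.com/{username.lstrip('@')}"


# Stand-in for res.text, so the sketch runs without a network request.
sample_html = '<div class="tweets-comp" data-username="JT_tuimoloau"></div>'
print(twitter_url_from_html(sample_html))
# https://twitter.com/JT_tuimoloau
```

With the real page you would pass res.text from the request above instead of sample_html.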

Upvotes: 1
