Reputation: 29
I watched this video, in which the author scrapes articles from his own website: https://youtu.be/ng2o98k983k?t=2317 What the video doesn't explain is how to scrape one specific line (for example, a single article's video link) inside the loop.
from bs4 import BeautifulSoup
import requests
import csv

source = requests.get('http://coreyms.com').text
soup = BeautifulSoup(source, 'lxml')

for article in soup.find_all('article'):
    headline = article.h2.a.text
    print(headline)

    summary = article.find('div', class_='entry-content').p.text
    print(summary)

    try:
        vid_src = article.find('iframe', class_='youtube-player')['src']
        vid_id = vid_src.split('/')[4]
        vid_id = vid_id.split('?')[0]
        yt_link = f'https://youtube.com/watch?v={vid_id}'
    except Exception as e:
        yt_link = None

    print(yt_link)
    print()
I tried changing that line to vid_src = article.find('iframe', class_='youtube-player')['src'][1]
but it doesn't work.
Upvotes: 0
Views: 323
Reputation: 20088
Based on your comments, you can search for the tag based on its text. For example, here we search for an a tag
with the text "Python Tutorial: Zip Files – Creating and Extracting Zip Archives":
from bs4 import BeautifulSoup
import requests

source = requests.get("http://coreyms.com").text
soup = BeautifulSoup(source, "lxml")

my_tag = soup.find(
    lambda tag: tag.name == "a"
    and "Python Tutorial: Zip Files – Creating and Extracting Zip Archives"
    in tag.text.strip()
).text

print(my_tag)
Output:
Python Tutorial: Zip Files – Creating and Extracting Zip Archives
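As for why the ['src'][1] attempt doesn't work: ['src'] already returns a plain string, so [1] just picks its second character. If what you want is the video of one particular article, you probably need to index (or filter) the list of articles instead. Here is a minimal sketch of both options. To keep it runnable offline it uses a small made-up HTML snippet instead of the live page, the built-in "html.parser" instead of lxml, placeholder video IDs, and a hypothetical helper named find_article_by_headline, so treat all of those as assumptions rather than the real site's markup:

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the page so the example runs without a network call.
# The video IDs (abc123, def456) are placeholders, not real videos.
SAMPLE_HTML = """
<article>
  <h2><a href="/zip">Python Tutorial: Zip Files</a></h2>
  <iframe class="youtube-player"
          src="https://www.youtube.com/embed/abc123?version=3"></iframe>
</article>
<article>
  <h2><a href="/csv">Python Tutorial: CSV Module</a></h2>
  <iframe class="youtube-player"
          src="https://www.youtube.com/embed/def456?version=3"></iframe>
</article>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
articles = soup.find_all("article")

# ['src'] is a string, so ['src'][1] is only its second character.
first_src = articles[0].find("iframe", class_="youtube-player")["src"]
second_char = first_src[1]

# Option 1: pick an article by position, then read its iframe as before.
second_src = articles[1].find("iframe", class_="youtube-player")["src"]
vid_id = second_src.split("/")[4].split("?")[0]
yt_link = f"https://youtube.com/watch?v={vid_id}"


# Option 2 (hypothetical helper): pick an article by part of its headline.
def find_article_by_headline(soup, text):
    """Return the first <article> whose headline contains `text`, else None."""
    for article in soup.find_all("article"):
        if text in article.h2.a.text:
            return article
    return None


csv_article = find_article_by_headline(soup, "CSV Module")

print(repr(second_char))
print(yt_link)
print(csv_article.h2.a["href"])
```

Against the real page you would build soup from requests.get(...).text as in the code above, and keep the try/except around the iframe lookup, since not every article necessarily has a video.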
Upvotes: 1