Void S

Reputation: 802

How to print only the first link in each HTML section

My code collects links from different "sections" of a page.

It prints two links per section, but I only want the first one.

The expected output should not contain the links ending with "video", which my current code includes.

from selenium import webdriver
from bs4 import BeautifulSoup
import time
driver = webdriver.Chrome()
jam=[]
baseurl='https://meetinglibrary.asco.org'
driver.get('https://meetinglibrary.asco.org/results?meetingView=2020%20ASCO%20Virtual%20Scientific%20Program&page=1')
time.sleep(3)
page_source = driver.page_source
soup = BeautifulSoup(page_source,'html.parser')
productlist=soup.find_all('a',class_='ng-star-inserted')
for item in productlist:
    for link in item.find_all('a',href=True):
        jam.append(baseurl+link['href'])
print(jam)

Upvotes: 0

Views: 47

Answers (2)

crazy-coding

Reputation: 66

You can add a condition that checks each link before appending it to the list.

...
for item in productlist:
    ahrefs = item.find_all('a', href=True)
    # Keep only even-indexed links (the first of each pair) and skip video links
    for index, link in enumerate(ahrefs):
        if index % 2 == 0 and 'video' not in link['href']:
            jam.append(baseurl + link['href'])
print(jam)
...

Let me know after trying. Good luck
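
If the goal is exactly one link per section, the index-parity check above may be fragile when a section holds a different number of links. Here is a minimal sketch of just the selection step, using plain lists of href strings in place of the parsed tags. The helper name `first_non_video` and the sample hrefs are made up for illustration and are not taken from the page:

```python
def first_non_video(hrefs):
    """Return the first href that does not contain 'video', or None if there is none."""
    return next((h for h in hrefs if 'video' not in h), None)

# Hypothetical hrefs for two sections, standing in for
# [a['href'] for a in item.find_all('a', href=True)]
sections = [
    ['/record/1/abstract', '/record/1/video'],
    ['/record/2/video', '/record/2/slide'],
]

baseurl = 'https://meetinglibrary.asco.org'
jam = []
for hrefs in sections:
    first = first_non_video(hrefs)
    if first is not None:  # skip sections that only contain video links
        jam.append(baseurl + first)
print(jam)
```

In the real script, the list of hrefs for each section would come from the `find_all` call inside the loop.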

Upvotes: 1

jizhihaoSAMA

Reputation: 12672

Use os.path.basename to get the last segment of the URL, then use the in operator to check whether it contains "video":

from selenium import webdriver
from bs4 import BeautifulSoup
import time
import os

driver = webdriver.Chrome()
jam = []
baseurl = 'https://meetinglibrary.asco.org'
driver.get('https://meetinglibrary.asco.org/results?meetingView=2020%20ASCO%20Virtual%20Scientific%20Program&page=1')
time.sleep(3)  # give the page time to render its dynamic content
page_source = driver.page_source
soup = BeautifulSoup(page_source, 'html.parser')
productlist = soup.find_all('a', class_='ng-star-inserted')
for item in productlist:
    for link in item.find_all('a', href=True):
        url = link['href']
        # Skip links whose last path segment contains "video"
        if "video" not in os.path.basename(url):
            jam.append(baseurl + url)
print(jam)

Result:

['https://meetinglibrary.asco.org/record/185955/abstract',
 'https://meetinglibrary.asco.org/record/185955/slide',
 'https://meetinglibrary.asco.org/record/185954/abstract',
 'https://meetinglibrary.asco.org/record/186048/abstract',
 'https://meetinglibrary.asco.org/record/186048/slide',
 'https://meetinglibrary.asco.org/record/190197/slide',
 'https://meetinglibrary.asco.org/record/192623/slide',
 'https://meetinglibrary.asco.org/record/185414/abstract',
 'https://meetinglibrary.asco.org/record/185414/slide',
 'https://meetinglibrary.asco.org/record/185415/abstract',
 'https://meetinglibrary.asco.org/record/185415/slide',
 'https://meetinglibrary.asco.org/record/185473/abstract',
 'https://meetinglibrary.asco.org/record/185473/slide',
 'https://meetinglibrary.asco.org/record/187584/slide',
 'https://meetinglibrary.asco.org/record/188561/slide',
 'https://meetinglibrary.asco.org/record/186710/abstract',
 'https://meetinglibrary.asco.org/record/186710/slide',
 'https://meetinglibrary.asco.org/record/186699/abstract',
 'https://meetinglibrary.asco.org/record/186699/slide',
 'https://meetinglibrary.asco.org/record/186698/abstract',
 'https://meetinglibrary.asco.org/record/186698/slide',
 'https://meetinglibrary.asco.org/record/187720/slide',
 'https://meetinglibrary.asco.org/record/187480/abstract',
 'https://meetinglibrary.asco.org/record/187480/slide',
 'https://meetinglibrary.asco.org/record/191961/slide',
 'https://meetinglibrary.asco.org/record/192626/slide',
 'https://meetinglibrary.asco.org/record/186983/abstract',
 'https://meetinglibrary.asco.org/record/186983/slide',
 'https://meetinglibrary.asco.org/record/188580/abstract',
 'https://meetinglibrary.asco.org/record/188580/slide',
 'https://meetinglibrary.asco.org/record/189047/abstract',
 'https://meetinglibrary.asco.org/record/189047/slide',
 'https://meetinglibrary.asco.org/record/190223/slide',
 'https://meetinglibrary.asco.org/record/190273/slide',
 'https://meetinglibrary.asco.org/record/184812/abstract',
 'https://meetinglibrary.asco.org/record/184812/slide',
 'https://meetinglibrary.asco.org/record/184927/slide',
 'https://meetinglibrary.asco.org/record/184805/abstract',
 'https://meetinglibrary.asco.org/record/184805/slide',
 'https://meetinglibrary.asco.org/record/184811/abstract',
 'https://meetinglibrary.asco.org/record/184811/slide',
 'https://meetinglibrary.asco.org/record/185576/slide',
 'https://meetinglibrary.asco.org/record/190147/slide']
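
Note that os.path.basename is meant for filesystem paths; it works here because the hrefs use forward slashes and carry no query string. As a hedged alternative, the sketch below parses the URL first, so a query string or trailing slash would not affect the check. The helper name `ends_with_video` is hypothetical, and unlike the code above it tests for an exact match on the last path segment rather than a substring:

```python
import posixpath
from urllib.parse import urlsplit

def ends_with_video(url):
    """Return True if the last path segment of the URL is exactly 'video'."""
    path = urlsplit(url).path  # drops any query string or fragment
    return posixpath.basename(path.rstrip('/')) == 'video'

# Sample hrefs in the same shape as the ones scraped above
hrefs = [
    '/record/185955/abstract',
    '/record/185955/video',
    '/record/185955/slide',
]
kept = [h for h in hrefs if not ends_with_video(h)]
print(kept)
```

The filter could replace the `os.path.basename` condition inside the inner loop, with the rest of the script unchanged.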

Upvotes: 1
