Triliang123

Reputation: 73

How to get values from specific <dt> and <dd> elements that have no class name?

Target website: Coins Britská Guyana

The HTML File

<div class="i_d">
<dl>
<dt class="">Série:</dt>
<dd><a href="/cs/coins/list/country/349-Britsk%C3%A1_Guyana/series/101385-Britsk%C3%A1_Guyana_-_Standardn%C3%AD_ra%C5%BEba">Britská Guyana - Standardní ražba</a></dd>
<dt>Katalogové číslo:</dt>
<dd><strong>WCC:</strong>km22</dd>
<dt>Témata:</dt><dd><a href="/cs/coins/list/country/349-Britsk%C3%A1_Guyana/theme/641-Kr%C3%A1lov%C3%A9">Králové</a> | <a href="/cs/coins/list/country/349-Britsk%C3%A1_Guyana/theme/3134-V%C4%9Bnce">Věnce</a></dd>
...
</dl>

I am trying to get this output:

Série: Britská Guyana - Standardní ražba
Katalogové číslo: WCC:km22
Témata: Králové|Věnce
...Next Coin value

I tried this code:

vysledek = soup.find_all('div', attrs={'class':'pl-it'})
for hledani_dat in vysledek:
    nazev_mince = hledani_dat.find('h2', attrs={'class':'item_header'})
    nazev_mince_final = nazev_mince.text.strip()

    dd = hledani_dat.find('div', attrs={'class':'i_d'})
    dd_final = dd.text.strip()

    print(nazev_mince_final, dd_final)

This returns every value inside <div class="i_d"></div> for every coin (the text of all dt and dd elements).

How can I get only selected dt/dd values instead of all of them?

EXPECTED OUTPUT:

Témata: Králové|Věnce

Upvotes: 0

Views: 65

Answers (1)

QHarr

Reputation: 84465

You can use :contains (named :-soup-contains in newer Soup Sieve releases) to target the appropriate dt, then use the adjacent sibling combinator (+) to move to the dd that follows it. Add some handling for coins where the target, e.g. Témata:, is not present.

import requests
from bs4 import BeautifulSoup as bs

r = requests.get('https://colnect.com/cs/coins/list/country/349-Britsk%C3%A1_Guyana', headers = {'User-Agent':'Mozilla/5.0'})
soup = bs(r.content, 'lxml')

# One .pl-it block per coin
for coin in soup.select('.pl-it'):
    print('coin:' , coin.select_one('.item_header a').text)
    print('-' * 20)
    # dt whose text contains the label, then the dd immediately after it.
    # On Soup Sieve older than 2.1, use :contains instead of :-soup-contains.
    target = coin.select_one('dt:-soup-contains("Témata:") + dd')
    if target is None:
        print('Not present')
    else:
        print(target.get_text())
    print()
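
If you need more than one label per coin, another option is to collect every dt/dd pair into a dictionary first and then look up the labels you want. The following is only a sketch that runs offline against the markup from the question: the links are shortened to "#", the helper name dl_to_dict and the wanted list are made up for illustration, and the built-in html.parser is used so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (hrefs shortened to "#").
html = """
<div class="i_d">
<dl>
<dt class="">Série:</dt>
<dd><a href="#">Britská Guyana - Standardní ražba</a></dd>
<dt>Katalogové číslo:</dt>
<dd><strong>WCC:</strong>km22</dd>
<dt>Témata:</dt><dd><a href="#">Králové</a> | <a href="#">Věnce</a></dd>
</dl>
</div>
"""


def dl_to_dict(container):
    """Hypothetical helper: map each <dt> label to the text of the next <dd> sibling."""
    pairs = {}
    for dt in container.find_all("dt"):
        dd = dt.find_next_sibling("dd")
        if dd is not None:
            # strip=True trims each text fragment before joining them
            pairs[dt.get_text(strip=True)] = dd.get_text(strip=True)
    return pairs


soup = BeautifulSoup(html, "html.parser")
details = dl_to_dict(soup.select_one("div.i_d"))

# Labels to print; anything missing falls back to "Not present".
wanted = ["Témata:", "Katalogové číslo:"]
for label in wanted:
    print(label, details.get(label, "Not present"))
```

On the real page you would call dl_to_dict once per coin inside the loop over soup.select('.pl-it'), passing coin.select_one('.i_d'), and check that the result is not None first.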

Upvotes: 1
