Reputation: 65
I want to get only the country names, not the initials. How can I go about it? Here is the HTML code:
<div class="item_country cell small-4 medium-2 large-2">
<img class="theme-flat" src="/AD/flat/64.png"/>
<p class="mb0 bold">AD</p>
<p>Andorra</p>
</div>, <div class="item_country cell small-4 medium-2 large-2">
<img class="theme-flat" src="/AE/flat/64.png"/>
<p class="mb0 bold">AE</p>
<p>United Arab Emirates</p>
I am getting:
AD
Andorra
AE
United Arab Emirates
instead of:
Andorra
United Arab Emirates
Here is my Python code:
import requests
from bs4 import BeautifulSoup

page = requests.get('https://www.countryflags.io')
soup = BeautifulSoup(page.text, 'html.parser')
tables = soup.find_all(class_="item_country cell small-4 medium-2 large-2")
for table in tables:
    country = table.get_text()
    print(country)
Upvotes: 1
Views: 42
Reputation: 195603
You can use the CSS selector .item_country p:nth-of-type(2), which selects the second <p> tag under a tag with class="item_country":
from bs4 import BeautifulSoup
html_text = '''<div class="item_country cell small-4 medium-2 large-2">
<img class="theme-flat" src="/AD/flat/64.png"/>
<p class="mb0 bold">AD</p>
<p>Andorra</p>
</div>, <div class="item_country cell small-4 medium-2 large-2">
<img class="theme-flat" src="/AE/flat/64.png"/>
<p class="mb0 bold">AE</p>
<p>United Arab Emirates</p>'''
soup = BeautifulSoup(html_text, 'html.parser')
for p in soup.select('.item_country p:nth-of-type(2)'):
    print(p.text)
Prints:
Andorra
United Arab Emirates
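To see why the code in the question prints both lines, here is a minimal sketch that compares get_text() on the whole container with selecting the second <p> first. The sample markup is a single block copied from the question; the variable names all_text and names are only illustrative.

```python
from bs4 import BeautifulSoup

# One block copied from the question's HTML.
html_text = '''<div class="item_country cell small-4 medium-2 large-2">
<img class="theme-flat" src="/AD/flat/64.png"/>
<p class="mb0 bold">AD</p>
<p>Andorra</p>
</div>'''

soup = BeautifulSoup(html_text, 'html.parser')
div = soup.select_one('.item_country')

# get_text() on the container joins the text of every descendant,
# so the country code and the country name both come back.
all_text = div.get_text(separator='|', strip=True)

# Selecting the second <p> first narrows the result to the name alone.
names = [p.get_text(strip=True)
         for p in soup.select('.item_country p:nth-of-type(2)')]

print(all_text)
print(names)
```

Collecting the names into a list, rather than printing inside the loop, makes it easier to reuse them later.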
If you prefer the standard bs4 API:
countries = soup.find_all('div', class_="item_country cell small-4 medium-2 large-2")
for c in countries:
    # class_="" is meant to pick the <p> that has no class attribute
    print(c.find('p', class_="").text)
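If you would rather not depend on how an empty class filter is interpreted, a positional lookup is another option. The sketch below assumes every block holds the code in its first <p> and the name in its second, as in the question's markup (the closing </div> tags are added here so the sample is well-formed). The helper name country_names is hypothetical.

```python
from bs4 import BeautifulSoup

# Sample markup based on the question, with the divs closed.
html_text = '''<div class="item_country cell small-4 medium-2 large-2">
<img class="theme-flat" src="/AD/flat/64.png"/>
<p class="mb0 bold">AD</p>
<p>Andorra</p>
</div>
<div class="item_country cell small-4 medium-2 large-2">
<img class="theme-flat" src="/AE/flat/64.png"/>
<p class="mb0 bold">AE</p>
<p>United Arab Emirates</p>
</div>'''


def country_names(html):
    """Return the text of the second <p> in each item_country block."""
    soup = BeautifulSoup(html, 'html.parser')
    names = []
    # Matching on the single class "item_country" avoids depending on
    # the exact order of the other layout classes.
    for block in soup.find_all('div', class_='item_country'):
        paragraphs = block.find_all('p')
        # Skip blocks that do not follow the assumed code/name layout.
        if len(paragraphs) >= 2:
            names.append(paragraphs[1].get_text(strip=True))
    return names


print(country_names(html_text))
```

This relies on element order, so it would need adjusting if the page ever places the name before the code.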
Upvotes: 1