Retrieve all names from html tags using BeautifulSoup

Question

I managed to set up Beautiful Soup and find the tags that I needed. How do I extract all the names from the tags?

tags = soup.find_all("a")
print(tags)

After running the above code, I got the following output:

[Alfred the Great, Queen Elizabeth I, Family tree of Scottish monarchs, Kenneth MacAlpin]

How do I retrieve the names (Alfred the Great, Queen Elizabeth I, Kenneth MacAlpin, etc.)? Do I need to use a regular expression? Using .string gave me an error.

Md. Fazlul Hoque · Accepted Answer

No need to apply re. You can easily grab all the names by iterating over all the a tags and, for each one, reading its title attribute or calling get_text() or .find(text=True).

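html='''
<html>
 <body>
  <a title="Alfred the Great">
   Alfred the Great
  </a>
  ,
  <a title="Queen Elizabeth I">
   Queen Elizabeth I
  </a>
  ,
  <a title="Family tree of Scottish monarchs">
   Family tree of Scottish monarchs
  </a>
  ,
  <a title="Kenneth MacAlpin">
   Kenneth MacAlpin
  </a>
 </body>
</html>

'''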

from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'lxml')

# print(soup.prettify())

for name in soup.find_all('a'):
    # Read the name from the tag's title attribute...
    txt = name.get('title')
    # ...or take the tag's visible text instead:
    # txt = name.get_text(strip=True)
    print(txt)

Output:

Alfred the Great
Queen Elizabeth I
Family tree of Scottish monarchs
Kenneth MacAlpin
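
If you want the names in a list rather than printed one by one, the same idea can be sketched as a list comprehension. This is only a minimal sketch: the sample markup and its href values are made up for illustration, and it uses the built-in "html.parser" instead of lxml so it runs without extra dependencies. It also hints at one likely cause of the .string error in the question: find_all() returns a ResultSet (a list of tags), so text accessors have to be called on each tag rather than on the list itself.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the href values are invented for illustration.
html = """
<a href="/wiki/Alfred_the_Great" title="Alfred the Great">Alfred the Great</a>,
<a href="/wiki/Elizabeth_I" title="Queen Elizabeth I">Queen Elizabeth I</a>,
<a href="/wiki/Kenneth_MacAlpin" title="Kenneth MacAlpin">Kenneth MacAlpin</a>
"""

soup = BeautifulSoup(html, "html.parser")
tags = soup.find_all("a")

# find_all() returns a list-like ResultSet, so call get_text() on each tag,
# not on the ResultSet as a whole.
names = [tag.get_text(strip=True) for tag in tags]

# Tag.get() returns None when an attribute is missing, so fall back to the
# visible text for any tag that has no title attribute.
titles = [tag.get("title") or tag.get_text(strip=True) for tag in tags]

print(names)
print(titles)
```

Both lists should contain the same three names here, because every sample tag carries a title that matches its text. On real pages the title and the visible text can differ, so pick whichever one holds the value you need.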
