Reputation: 235
My test function should return a list of lists taken from items, keeping each inner list whose first string element is contained in the headers1 list. Please see the desired output below.
from bs4 import BeautifulSoup as soup
from selenium import webdriver
driver = webdriver.Chrome(executable_path=r'C:\Users\Main\Documents\Work\Projects\chromedriver')
my_url = "https://www.carehome.co.uk/carehome.cfm/searchazref/10001005FITA"
def make_soup(url):
    driver.get(url)
    m_soup = soup(driver.page_source, features='html.parser')
    return m_soup
main_page = make_soup(my_url)
headers1 = ['Group','Person in charge', 'Type of Service', 'Registered Care Categories*', 'Specialist Care Categories','Languages Spoken by Staff (other than English)','Single Rooms','Single Rooms', 'Shared Rooms','Facilities & Service']
tags = main_page.select(".profile-group-description.col-xs-12.col-sm-8>p")
items = [tag.text.replace('\n','').split('\n')[0].split(':') for tag in tags]
indexs = list(range(len(items)))
def test():
    selected = []
    for i in indexs:
        if any(x in headers1 for x in items[i]):
            selected.append(items[i])
        return(selected)
Current Output:
[['Group', 'Excelcare Holdings']]
Desired Output:
['Group', 'Excelcare Holdings']
['Person in charge', ' Denise Marks (Registered Manager)']
['Type of Service', '\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\tCare Home only (Residential Care)\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t– Privately Owned\t\t\t\t\t\t\t\t\t\t\t\t\t, Registered for a maximum of 44 Service Users \t\t\t\t\t\t\t\t\t\t\t\t']
['Registered Care Categories*', '\t\t\t\t\t\t\t\t\t\t\t\tDementia • Learning Disability • Mental Health Condition • Old Age \t\t\t\t\t\t\t\t\t\t\t']
['Specialist Care Categories', "\t\t\t\t\t\t\t\t\t\t\t\tAlzheimer's • Down Syndrome • Schizophrenia • Stroke \t\t\t\t\t\t\t\t\t\t\t\t\t"]
['Languages Spoken by Staff (other than English)', ' Bengali; Bangla, Polish']
['Single Rooms', ' 38']
['Shared Rooms', ' 3']
Upvotes: 0
Views: 740
Reputation: 643
Your test() function should look like this:
def test():
    selected = []
    for i in indexs:
        if any(x in headers1 for x in items[i]):
            selected.append(items[i])
    return(selected)
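Notice that the return statement is outside the for loop, not inside it. A return inside the loop exits the function the first time it is reached, so the remaining items are never checked.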
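To see the difference without running the scraper, here is a minimal sketch that uses a small hand-written items list in place of the scraped one. The function names select_early_return, select_all and select_by_first_element are made up for illustration, and the 'Opening hours' row is a hypothetical non-matching entry. The last function is an optional variant that checks only the first element of each list, which is what the question describes.

```python
# Hand-written stand-in for the scraped data (assumption: the real
# `items` list has the same [label, value] shape).
headers1 = ['Group', 'Person in charge', 'Single Rooms', 'Shared Rooms']
items = [
    ['Group', 'Excelcare Holdings'],
    ['Person in charge', ' Denise Marks (Registered Manager)'],
    ['Opening hours', ' 9-5'],  # hypothetical row that should be skipped
    ['Single Rooms', ' 38'],
    ['Shared Rooms', ' 3'],
]


def select_early_return(items, headers):
    """Return placed inside the loop: exits during the first iteration."""
    selected = []
    for item in items:
        if any(x in headers for x in item):
            selected.append(item)
        return selected  # reached on the first pass through the loop


def select_all(items, headers):
    """Return placed after the loop: every item is checked."""
    selected = []
    for item in items:
        if any(x in headers for x in item):
            selected.append(item)
    return selected


def select_by_first_element(items, headers):
    """Variant that compares only the first element of each list."""
    return [item for item in items if item and item[0] in headers]


print(select_early_return(items, headers1))
for row in select_all(items, headers1):
    print(row)
```

With this sample data, the first function returns only the 'Group' row, while the other two return the four rows whose label appears in headers1. Iterating over items directly also removes the need for the separate indexs list.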
Upvotes: 2