Stackcans

Reputation: 351

Append values to list that have been indexed

I want to split the scraped strings in a list, pick out values by index, and append them to a dictionary of lists. However, when I run this over the whole list, it returns only a single value, repeated, instead of one value per item.

For example:

href_url = ['https://www.nhs.uk/Services/Trusts/Overview/DefaultView.aspx?id=103',
 'https://www.nhs.uk/Services/Trusts/Overview/DefaultView.aspx?id=827']

storeit = {'phone':[],'hospital':[],'postcode':[],'link':[]}
strip = []
for i in range(0, 2, 1):
    r = requests.get(href_url[i])
    soup = BeautifulSoup(r.content, 'lxml')
    codes = soup.find('div',{'class':'panel-content'}).find_all('p')
    if codes!=None:
        for h in codes:
            strip.append(h.text.strip())
            list_data = [l.split(',') for l in strip[0].split('\n') if l]#problem starts here
            storeit['phone'].append(list_data[::4])
            storeit['hospital'].append(list_data[1::3][0][0])
            storeit['postcode'].append(list_data[2::3][0][3])
            storeit['link'].append(list_data[3::3])

When I print the elements of list_data using the indexing notations, I get:

[['01535 652511']]
[['01535 652511']]
Airedale General Hospital
Airedale General Hospital
 BD20 6TD
 BD20 6TD
[['http://www.airedale-trust.nhs.uk/']]
[['http://www.airedale-trust.nhs.uk/']]

The values just repeat, and I think the cause is the use of strip[0]. However, if I remove the index, I get this error:

'list' object has no attribute 'split'

I understand that split works only on a string, not on a list of strings. How can I overcome this?
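The behaviour described above can be reproduced without any network access. The sketch below is only an illustration: the two strings are simplified stand-ins for the scraped paragraph text (the real page text will differ), and split_lines is a hypothetical helper name. It shows that strip[0] always refers to the first entry, while looping over the list splits each string in turn and avoids calling split on the list itself.

```python
# Simplified, made-up stand-ins for the scraped paragraph text.
strip = [
    "01535 652511\nAiredale General Hospital, BD20 6TD",
    "0151 228 4811\nAlder Hey Children's Hospital, L12 2AP",
]


def split_lines(text):
    """Split one string into lines, then split each line on commas."""
    return [
        [part.strip() for part in line.split(",")]
        for line in text.split("\n")
        if line
    ]


# Indexing with strip[0] processes the first entry on every pass,
# so the same values come back each time.
first_only = [split_lines(strip[0]) for _ in strip]

# Looping over the list hands split_lines one string at a time,
# so every entry is processed and no list is ever passed to split().
all_entries = [split_lines(text) for text in strip]

print(first_only)
print(all_entries)
```

Calling strip.split(...) directly fails because strip is a list; the loop works because each element of the list is a string.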

Upvotes: 1

Views: 50

Answers (1)

Andrej Kesely

Reputation: 195563

You can use the following example to parse the information about the hospitals and add it to the storeit dictionary you have prepared:

import requests
from bs4 import BeautifulSoup

href_url = [
    "https://www.nhs.uk/Services/Trusts/Overview/DefaultView.aspx?id=103",
    "https://www.nhs.uk/Services/Trusts/Overview/DefaultView.aspx?id=827",
]

storeit = {"phone": [], "hospital": [], "postcode": [], "link": []}

for url in href_url:
    soup = BeautifulSoup(requests.get(url).content, "html.parser")

    storeit["phone"].append(soup.select_one('[property="telephone"]').text)
    txt = (
        soup.select_one('[typeof="PostalAddress"]')
        .get_text(strip=True)
        .split(",")
    )

    storeit["hospital"].append(txt[0])
    storeit["postcode"].append(txt[-1])
    storeit["link"].append(soup.select_one('[property="url"]')["href"])


print(storeit)

Prints:

{
    "phone": ["01535 652511", "0151 228 4811"],
    "hospital": ["Airedale General Hospital", "Alder Hey Children's Hospital"],
    "postcode": ["BD20 6TD", "L12 2AP"],
    "link": ["http://www.airedale-trust.nhs.uk/", "http://www.alderhey.nhs.uk"],
}
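Because storeit holds four parallel lists, the values at the same position in each list belong to the same hospital. If one record per hospital is more convenient, the lists can be zipped together. This is a minimal sketch that starts from the printed output above; the name rows is an assumption and not part of the original code.

```python
# The dictionary of parallel lists, copied from the printed output.
storeit = {
    "phone": ["01535 652511", "0151 228 4811"],
    "hospital": ["Airedale General Hospital", "Alder Hey Children's Hospital"],
    "postcode": ["BD20 6TD", "L12 2AP"],
    "link": ["http://www.airedale-trust.nhs.uk/", "http://www.alderhey.nhs.uk"],
}

# zip(*storeit.values()) groups the i-th item of every list together;
# zipping that group with the keys builds one dict per hospital.
rows = [dict(zip(storeit, values)) for values in zip(*storeit.values())]

for row in rows:
    print(row)
```

This only lines up correctly if every list received exactly one value per URL, which is what the loop in the answer does.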

Upvotes: 1
