Reputation: 55
I need to parse a price list that is stored in "span" tags with a specific "class". The numbers are formatted in a particular way (2 400 p), so I also need to remove the spaces and the "p" letter. This is my code:
from bs4 import BeautifulSoup
soup = BeautifulSoup(open("1.html"))
for link in soup.findAll("span", { "class" : "b-sbutton mod_price skin_product size_normal scheme_available" }):
    links = link.get_text()
    print(links)
    links_len = len(links)
    int(links_len)
    for links_len in links:
        a = links[links_len]
        a.replace(' ', '')
        a.replace('р', '')
        print(links)
But when I run the script, I get this error:
Traceback (most recent call last):
  File "get_data.py", line 9, in <module>
    a = links[links_len]
TypeError: string indices must be integers
What am I missing here?
Upvotes: 2
Views: 1737
Reputation: 473863
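You've mixed up lists, strings, and indexes. In "for links_len in links", the loop variable is each character of the string in turn, not an integer, so links[links_len] raises the TypeError. Also note that str.replace() returns a new string rather than modifying the original, so its result has to be assigned. You can do the whole thing with a list comprehension: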
from bs4 import BeautifulSoup
soup = BeautifulSoup(open("1.html"))
links = [link.text.replace(' ', '').replace('p', '')
         for link in soup.find_all("span",
                                   {"class": "b-sbutton mod_price skin_product size_normal scheme_available"})]
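If you want integers rather than cleaned-up strings, the string handling can be pulled out into a small helper. This is only a sketch under a few assumptions: clean_price and CURRENCY_MARKS are hypothetical names, the sample strings are made up, and I'm guessing the page may use either a Latin "p" or a Cyrillic "р" (they look identical but are different characters) and possibly non-breaking spaces as the thousands separator.

```python
import re

# Assumption: the currency marker is either Latin "p" or Cyrillic "р" (U+0440).
CURRENCY_MARKS = "p\u0440"


def clean_price(text):
    """Turn a string such as '2 400 p' into the integer 2400."""
    # Remove every kind of whitespace, including non-breaking spaces,
    # which replace(' ', '') would leave in place.
    compact = re.sub(r"\s+", "", text)
    # Strip the trailing currency letter (and a trailing dot, if any).
    compact = compact.rstrip(CURRENCY_MARKS + ".")
    return int(compact)


# Made-up sample strings standing in for link.text values from the page.
samples = ["2 400 p", "15\u00a0990 \u0440", "300 p."]
prices = [clean_price(s) for s in samples]
print(prices)
```

In the list comprehension above you could then write clean_price(link.text) instead of the chained replace() calls. Note that int() will raise ValueError if a span contains anything other than digits after cleaning, which may or may not be what you want.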
Upvotes: 2