Reputation: 69
I can't get the slicing to work properly. I have a list of strings looking like this:
['subdomain', 'name', 'url']
['https://www.pedidosya.com.ar/restaurantes/buenos-aires/recoleta/empanadas-delivery?bt=RESTAURANT&page=1', 'Cümen-Cümen Empanadas Palermo', 'https://www.pedidosya.com.ar/restaurantes/buenos-aires/cumen-cumen-empanadas-palermo-menu']
['https://www.pedidosya.com.ar/restaurantes/buenos-aires/recoleta/empanadas-delivery?bt=RESTAURANT&page=1', 'Cümen-Cümen Empanadas - Barrio Norte', 'https://www.pedidosya.com.ar/restaurantes/buenos-aires/cumen-cumen-empanadas-barrio-norte-menu']
What I need is to save the 'url' in a new list to further work on it.
This is what I'm trying:
for row[3:3] in reader:
    menus = []
    menus.append[row]
But this is what I get when I print():
['https://www.pedidosya.com.ar/restaurantes/buenos-aires/recoleta/empanadas-delivery?bt=RESTAURANT&page=5', 'La Pergola - Recoleta', 'https://www.pedidosya.com.ar/restaurantes/buenos-aires/la-pergola-recoleta-menu']
Which is the last part of the list. What I need is:
menus = ['https://www.pedidosya.com.ar/restaurantes/buenos-aires/cumen-cumen-empanadas-palermo-menu', 'https://www.pedidosya.com.ar/restaurantes/buenos-aires/cumen-cumen-empanadas-barrio-norte-menu']
I've added the rest of the code. The issue is that it's not a list of str as I thought; type() shows it's a '_csv.reader' object.
Here is the entire code:
urls = ["https://www.pedidosya.com.ar/restaurantes/buenos-aires/recoleta/empanadas-delivery",]
with open("output1.csv", 'w', newline='') as csvfile:
writer = csv.writer(csvfile, delimiter=',')
writer.writerow(['subdomain', 'name', 'url'])
for url in urls:
base = url+ "?bt=RESTAURANT&page="
page = 1
restaurants = []
while True:
soup = bs(requests.get(base + str(page)).text, "html.parser")
sections = soup.find_all("section", attrs={"class": "restaurantData"})
if not sections: break
for section in sections:
for elem in section.find_all("a", href=True, attrs={"class": "arrivalName"}):
restaurants.append({"name": elem.text, "url": elem["href"],})
writer.writerow([base+str(page),elem.text,elem["href"]])
page += 1
#reading
file = open("output1.csv", 'r')
reader = csv.reader(file)
Upvotes: 2
Views: 93
Reputation: 2298
Assuming you have a list of lists (i.e. an extra [] around your lists) and not 3 isolated lists as your question implies, you can loop through your list of lists, take the url element from each (element 2), and append it to a new list.
reader = csv.reader(file)  # or however you define your reader
menu = []
for n, i in enumerate(reader):
    if n != 0:  # skip the header row
        print(i[2])
        menu.append(i[2])
I have altered the code to work with the csv.reader object. Instead of my old way of ignoring the first element, we use enumerate, a fantastic function that counts which element of the reader we are on as n. So as long as n is not zero, we continue like before.
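As a quick standalone illustration of what enumerate yields (the rows here are shortened, hypothetical placeholders, just to show how n pairs up with each row):

rows = [
    ['subdomain', 'name', 'url'],                   # n = 0, the header
    ['...', 'Palermo', '/palermo-menu'],            # n = 1
    ['...', 'Barrio Norte', '/barrio-norte-menu'],  # n = 2
]
menu = []
for n, i in enumerate(rows):
    if n != 0:  # ignore the header row
        menu.append(i[2])
print(menu)  # ['/palermo-menu', '/barrio-norte-menu']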
Upvotes: 1
Reputation: 23139
Seems like you want this:
menus = []
for row in reader:
    menus.append(row[2])
I don't understand what you're trying to do by making row[3:3] the iterated variable of a for loop. I think you want to iterate over simple rows and then do something with each row inside the loop.
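One thing to watch for, assuming the reader has not been advanced yet: the loop above will also pick up the header row, so the literal string 'url' ends up first in menus. A single next(reader) call before the loop skips it:

next(reader)  # consume the header row ['subdomain', 'name', 'url'] before the loop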
Upvotes: 1
Reputation: 1688
The issue does not lie in the slicing (although you could also directly index with [2]). However, you reinitialize menus in the loop. So for every run of the loop, you overwrite what was previously there. This should fix it:
menus = []
for row in reader:
    menus.append(row[2])
A cleaner (and more pythonic) approach would be to use a list comprehension:
menus = [row[2] for row in reader]
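A minimal, self-contained sketch of that approach (assuming the output1.csv written above, and dropping the header row with next) could look like:

import csv

with open("output1.csv", newline='') as f:
    reader = csv.reader(f)
    next(reader)                        # skip ['subdomain', 'name', 'url']
    menus = [row[2] for row in reader]  # keep only the url column

print(menus)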
Upvotes: 0