CJ090

Reputation: 81

Writing a for loop to a CSV

I'm using BeautifulSoup to scrape reviews. I have the scraping part working and am ready to write the results to a CSV file. After looking at many examples online, I still don't understand how to write to a CSV file. My scraping code is:

import requests
from bs4 import BeautifulSoup

# Review pages are paginated in steps of 5 (or0, or5, or10, ...)
for i in range(0,200,5):
    url = "https://www.tripadvisor.com/Hotel_Review-g39143-d92240-Reviews-or" + str(i) + "-Hawthorn_Suites_by_Wyndham_Wichita_East-Wichita_Kansas"
    headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.90 Safari/537.36'}
    response = requests.get(url, headers=headers, verify=False).text
    soup = BeautifulSoup(response, "lxml")
    reviews = soup.find_all('div', 'reviewSelector')
    for r in reviews:
        print("Rating: ", int(r.find('span','ui_bubble_rating')['class'][1].split('_')[1])/10)
        print("Review snippet: ", r.p.text)

To write to a CSV file, I tried wrapping my code in a with block that creates a csv.writer:

with open('TA-reviews.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=',', quotechar='"')
    for i in range(0,200,5):
        url = "https://www.tripadvisor.com/Hotel_Review-g39143-d92240-Reviews-or" + str(i) + "-Hawthorn_Suites_by_Wyndham_Wichita_East-Wichita_Kansas"
        headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.90 Safari/537.36'}
        response = requests.get(url, headers=headers, verify=False).text
        soup = BeautifulSoup(response, "lxml")
        reviews = soup.find_all('div', 'reviewSelector')
        for r in reviews:
            print("Rating: ", int(r.find('span','ui_bubble_rating')['class'][1].split('_')[1])/10)
            print("Review snippet: ", r.p.text)
            writer.writerow((rating, review))

Which returns an error that rating is undefined yet one rating is printed out

Upvotes: 0

Views: 58

Answers (1)

bruno desthuilliers

Reputation: 77952

Which returns an error that rating is undefined

Of course rating is undefined. Where in your code do you have a statement binding anything to the name rating?

yet one rating is printed out

What you print out is the value of the expression int(r.find('span','ui_bubble_rating')['class'][1].split('_')[1])/10. Passing an expression to print() evaluates it and discards the result; it does not define any rating variable.
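A minimal sketch of the difference between printing an expression and binding it to a name. The classes list is a hypothetical stand-in for what r.find('span','ui_bubble_rating')['class'] might return; it is not taken from the real page.

```python
# Hypothetical stand-in for the class list on the rating <span>
classes = ["ui_bubble_rating", "bubble_40"]

# print() evaluates the expression and discards the result; no name is bound.
print("Rating: ", int(classes[1].split("_")[1]) / 10)

try:
    rating  # never assigned so far, so this lookup fails
    lookup_failed = False
except NameError:
    lookup_failed = True

# An assignment statement is what binds the name.
rating = int(classes[1].split("_")[1]) / 10
```

After the assignment, rating can be passed to writer.writerow() like any other value.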

You want:

for r in reviews:
    rating = int(r.find('span','ui_bubble_rating')['class'][1].split('_')[1])/10
    review = r.p.text
    writer.writerow((rating, review))
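For reference, here is a self-contained sketch of the same loop that runs without network access. It assumes the scraped data has already been reduced to (class list, review text) pairs; the scraped list, the write_reviews helper, and the header row are all hypothetical additions for illustration, and an in-memory buffer stands in for the open TA-reviews.csv file.

```python
import csv
import io

# Hypothetical (class list, review text) pairs standing in for scraped reviews
scraped = [
    (["ui_bubble_rating", "bubble_50"], "Great stay, friendly staff"),
    (["ui_bubble_rating", "bubble_20"], 'Room was "cozy", which means small'),
]

def write_reviews(csvfile, items):
    """Write one CSV row per review to an already-open text file object."""
    writer = csv.writer(csvfile, delimiter=",", quotechar='"')
    writer.writerow(("rating", "review"))  # optional header row (an addition)
    for classes, text in items:
        # Bind both names first, then pass them to writerow()
        rating = int(classes[1].split("_")[1]) / 10
        review = text.strip()
        writer.writerow((rating, review))

# In-memory buffer used in place of open('TA-reviews.csv', 'w', newline='')
buffer = io.StringIO(newline="")
write_reviews(buffer, scraped)

# Read the data back to check that commas and quotes survived the round trip
rows = list(csv.reader(io.StringIO(buffer.getvalue(), newline="")))
```

With a real file, the same helper could be called as write_reviews(csvfile, items) inside the with open(...) block from the question.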

Upvotes: 1
