SIM

Reputation: 22440

Can't store the scraped results in the third and fourth columns of a csv file

I've written a script that scrapes the address and phone number of certain shops based on Name and Lid. It reads Name and Lid from column A and column B of a csv file, respectively. After fetching the result for each search, I expected the parser to put the address and phone number in column C and column D, as shown in the second image. This is where I'm stuck: I don't know how to use the csv reading or writing methods so that the data ends up in the third and fourth columns. This is what I'm trying now:

import csv
import requests
from lxml import html
Names, Lids = [], []
with open("mytu.csv", "r") as f:
    reader = csv.DictReader(f)
    for line in reader:
        Names.append(line["Name"])
        Lids.append(line["Lid"])
with open("mytu.csv", "r") as f:
    reader = csv.DictReader(f)
    for entry in reader:
        Page = "https://www.yellowpages.com/los-angeles-ca/mip/{}-{}".format(entry["Name"].replace(" ","-"), entry["Lid"])
        response = requests.get(Page)
        tree = html.fromstring(response.text)
        titles = tree.xpath('//article[contains(@class,"business-card")]')
        for title in titles:
            Address= title.xpath('.//p[@class="address"]/span/text()')[0]
            Contact = title.xpath('.//p[@class="phone"]/text()')[0]
            print(Address,Contact)

What my csv file looks like now:

enter image description here

My desired output is something like:

enter image description here

Upvotes: 0

Views: 32

Answers (1)

Bill Bell

Reputation: 21643

You can do it like this. Create a fresh output csv file whose header is the input file's header plus the two new columns. Each row that csv.DictReader yields is a dictionary, called entry here. Add the values you scraped to that dictionary under the new column names, then write the row out to the new file.

import csv
import requests
from lxml import html
with open("mytu.csv", "r") as f, open('new_mytu.csv', 'w', newline='') as g:
    reader = csv.DictReader(f)
    newfieldnames = reader.fieldnames + ['Address', 'Phone']
    writer = csv.DictWriter(g, fieldnames=newfieldnames)
    writer.writeheader()
    for entry in reader:
        Page = "https://www.yellowpages.com/los-angeles-ca/mip/{}-{}".format(entry["Name"].replace(" ","-"), entry["Lid"])
        response = requests.get(Page)
        tree = html.fromstring(response.text)
        titles = tree.xpath('//article[contains(@class,"business-card")]')
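        # Use only the first matching business card instead of looping over all of them.
        # Note: this raises IndexError if the page has no matching article.
        title = titles[0]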
        Address= title.xpath('.//p[@class="address"]/span/text()')[0]
        Contact = title.xpath('.//p[@class="phone"]/text()')[0]
        print(Address,Contact)
        new_row = entry
        new_row['Address'] = Address
        new_row['Phone'] = Contact
        writer.writerow(new_row)
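
To try the read-extend-write pattern without hitting the network, here is a minimal sketch that works on in-memory files. The names add_columns and fake_lookup are hypothetical, and the address and phone values are made up for illustration; in the real script the lookup step would be the requests/lxml code above.

```python
import csv
import io


def add_columns(src, dst, lookup, new_fields=("Address", "Phone")):
    """Copy rows from src to dst, appending the values returned by lookup(row)."""
    reader = csv.DictReader(src)
    # The output header is the input header plus the new column names.
    fieldnames = list(reader.fieldnames) + list(new_fields)
    writer = csv.DictWriter(dst, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        # lookup is a hypothetical callback returning one value per new field.
        values = lookup(row)
        row.update(zip(new_fields, values))
        writer.writerow(row)


def fake_lookup(row):
    # Stand-in for the scraping step; these values are invented.
    return ("1 Example St", "555-0100")


source = io.StringIO("Name,Lid\nSome Shop,123\n")
target = io.StringIO()
add_columns(source, target, fake_lookup)
print(target.getvalue())
```

With real files, open the input with open("mytu.csv", "r", newline="") and the output with open("new_mytu.csv", "w", newline=""), as the csv module documentation recommends, and pass those two file objects in place of the StringIO objects.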

Upvotes: 1
