John_Muir
John_Muir

Reputation: 119

Taking Scraped data output and turning it into excel

Wondering how I could make the output for this code into a excel format.The code was written by Andrej Kesely who really helped me understand how scraping works.

import json
import requests
from bs4 import BeautifulSoup


url = "https://bopis.mastermindtoys.com/ajax/getstorelocation.php"

headers = {
    "Referer": "https://www.mastermindtoys.com/",
}

payload = {
    "doAction": "getNearStoreData",
    "latitude": "",
    "longitude": "",
    "flagDisplayFirstRecord": "Y",
    "shopifyCustomerId": "",
    "selectedStoreInCookies": "",
}

data = requests.post(url, headers=headers, data=payload).json()


print(json.dumps(data, indent=4))

for i, d in enumerate(data["details"], 1):
    print(
        "{:<3} {:<25} {:<30} {:<30} {:<30} {:<30} {:<30} {:<30} {:<30}".format(
            i, d["name"], d["address1"], d["address2"], d["city"], d["postCode"], d["province"], d["phoneNumber"], d["storeId"]
        )
    )
    

Upvotes: 0

Views: 47

Answers (2)

QHarr
QHarr

Reputation: 84465

Another way is to use pandas DataFrame and a list comprehension, dropping the enumerate and using the index to handle the numbering. Importing just DataFrame is a lighter way to go than bringing in all of pandas.

import json
import requests
from bs4 import BeautifulSoup
from pandas import DataFrame


url = "https://bopis.mastermindtoys.com/ajax/getstorelocation.php"

headers = {
    "Referer": "https://www.mastermindtoys.com/",
}

payload = {
    "doAction": "getNearStoreData",
    "latitude": "",
    "longitude": "",
    "flagDisplayFirstRecord": "Y",
    "shopifyCustomerId": "",
    "selectedStoreInCookies": "",
}

data = requests.post(url, headers=headers, data=payload).json()

df = DataFrame([(d["name"], d["address1"], d["address2"], d["city"], d["postCode"], d["province"], d["phoneNumber"], d["storeId"]) for d in data["details"]]
                 , columns = ["name", "address1", "address2", "city", "postCode", "province", "phoneNumber", "storeId"])
df.index = df.index + 1
df.to_csv('data.csv', encoding='utf-8-sig', index= True)

You could also just subset the columns of interest:

cols = ['name', 'address1', 'address2', 'city', 'postCode', 'province', 'phoneNumber', 'storeId']
df = DataFrame(data["details"])
df = df[cols]
df.index = df.index + 1
df.to_csv('data.csv', encoding='utf-8-sig', index= True)

Upvotes: 0

pullidea-dev
pullidea-dev

Reputation: 1823

You can export the result as a CSV file and open it with Excel.

import json
import csv
import requests
from bs4 import BeautifulSoup


url = "https://bopis.mastermindtoys.com/ajax/getstorelocation.php"

headers = {
    "Referer": "https://www.mastermindtoys.com/",
}

payload = {
    "doAction": "getNearStoreData",
    "latitude": "",
    "longitude": "",
    "flagDisplayFirstRecord": "Y",
    "shopifyCustomerId": "",
    "selectedStoreInCookies": "",
}

data = requests.post(url, headers=headers, data=payload).json()


print(json.dumps(data, indent=4))

with open("test.csv", 'w', encoding="UTF-8", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=['id','name','address1','address2','city','postCode','province','phoneNumber','storeId'])
    writer.writeheader()
    for i, d in enumerate(data["details"], 1):
        writer.writerow({'id':i, 'name':d["name"], 'address1':d["address1"], 'address2':d["address2"], 'city':d["city"], 'postCode':d["postCode"], 'province':d["province"], 'phoneNumber':d["phoneNumber"], 'storeId':d["storeId"]})

Upvotes: 1

Related Questions