Mr.D

Reputation: 151

Web scraping data from a JSON source: why do I get only 1 row?

I am trying to get some information from a webshop's website with Python.

I tried this:

def proba():

    my_url = requests.get('https://www.telekom.hu/shop/categoryresults/?N=10994&contractType=list_price&instock_products=1&Ns=sku.sortingPrice%7C0%7C%7Cproduct.displayName%7C0&No=0&Nrpp=9&paymentType=FULL')
    data = my_url.json()
    results = []
    products = data['MainContent'][0]['contents'][0]['productList']['products']
    for product in products:
        name = product['productModel']['displayName']
        try:
            priceGross = product['priceInfo']['priceItemSale']['gross']
        except:
            priceGross = product['priceInfo']['priceItemToBase']['gross']
        url = product['productModel']['url']
        results.append([name, priceGross, url])
    df = pd.DataFrame(results, columns = ['Name', 'Price', 'Url'])    
# print(df)  ## print df
    df.to_csv(r'/usr/src/Python-2.7.13/test.csv', sep=',', encoding='utf-8-sig',index = False )

while True:
    mytime=datetime.now().strftime("%H:%M:%S")
    while mytime < "23:59:59":
        print mytime
        proba()
        mytime=datetime.now().strftime("%H:%M:%S")

The webshop lists 9 items, but I see only 1 row in the CSV file.

Upvotes: 0

Views: 31

Answers (1)

QHarr

Reputation: 84455

I'm not entirely sure what end result you intend. Do you want to update an existing file, or collect the data and write it all out in one go? Below is an example of the latter: the function hands back each new dataframe with a return statement, and I concatenate it onto an overall dataframe that is written to CSV once at the end.

import requests
from datetime import datetime
import pandas as pd

def proba():
    my_url = requests.get('https://www.telekom.hu/shop/categoryresults/?N=10994&contractType=list_price&instock_products=1&Ns=sku.sortingPrice%7C0%7C%7Cproduct.displayName%7C0&No=0&Nrpp=9&paymentType=FULL')
    data = my_url.json()
    results = []
    products = data['MainContent'][0]['contents'][0]['productList']['products']
    for product in products:
        name = product['productModel']['displayName']
        try:
            priceGross = product['priceInfo']['priceItemSale']['gross']
        except (KeyError, TypeError):  # sale price missing: fall back to the base price
            priceGross = product['priceInfo']['priceItemToBase']['gross']
        url = product['productModel']['url']
        results.append([name, priceGross, url])
    df = pd.DataFrame(results, columns = ['Name', 'Price', 'Url'])  
    return df

headers = ['Name', 'Price', 'Url']
df = pd.DataFrame(columns = headers)

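# Poll until just before midnight, accumulating every snapshot in df.
mytime = datetime.now().strftime("%H:%M:%S")
while mytime < "23:59:59":
    print(mytime)
    dfCurrent = proba()
    mytime = datetime.now().strftime("%H:%M:%S")
    df = pd.concat([df, dfCurrent], ignore_index=True)

# Write once after the loop ends; wrapped in an outer "while True:" this line would never be reached.
df.to_csv(r"C:\Users\User\Desktop\test.csv", encoding='utf-8', index=False)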
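If the goal is instead the first option, updating an existing file, one possible approach is to append each snapshot to the CSV as it arrives rather than holding everything in memory. The following is only a sketch under assumptions: the helper names append_snapshot and poll, the FetchedAt timestamp column, and the pause between requests are all illustrative and not part of the original code.

```python
import os
import time
from datetime import datetime, timedelta

import pandas as pd


def append_snapshot(df, path):
    """Append df to the CSV at path, writing the header only for a new or empty file."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    stamped = df.copy()
    # Hypothetical extra column so rows from different polls can be told apart.
    stamped.insert(0, "FetchedAt", datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
    stamped.to_csv(path, mode="a", header=is_new, index=False, encoding="utf-8")


def poll(fetch, path, interval_seconds, stop_at):
    """Call fetch() and append its result until the clock reaches stop_at."""
    while datetime.now() < stop_at:
        append_snapshot(fetch(), path)
        # Pause between requests (assumed interval) so the server is not hammered.
        time.sleep(interval_seconds)
```

With the proba() function above, which returns a dataframe, a call might look like poll(proba, "test.csv", 60, datetime.now().replace(hour=23, minute=59, second=59)). Because every snapshot is written immediately, the data collected so far survives if the script is interrupted.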

Upvotes: 1
