ORCos
ORCos

Reputation: 99

How to write data in an excel file using python

Im making an API request to pull some articles bodies and ids. I would like to pull them into an excel file where on the first column the id's are displayed and on the next column the corresponding bodies are displayed.

I managed to bring it this far but for some reason the formatting in excel is not right. There is a lot of data that I'm trying to pull here but I would want to display each one of them in it's separate cell.

Can I specifically ask my script to transpose the data automatically?

import requests
import csv
import unicodedata
import getpass
import xlsxwriter 

url = 'https://.../articles.json'

workbook = xlsxwriter.Workbook('bodies.xlsx') 
worksheet = workbook.add_worksheet() 


output_1 = []

output_2 = []


listOf = ([output_1],[output_2])
 

while url:
    user = '[email protected]'
    pwd = '1234'
    response = requests.get(url, auth=(user, pwd))
    data = response.json()
    for article in data['articles']:
        article_id = article['id']
        body = article['body']
        decode_1 = int(article_id)
        decode_2 = unicodedata.normalize('NFKD', body)
        output_1.append(decode_1)
        output_1.append(decode_2)

    print(data['next_page'])
    url = 'https://.../articles.json' and data['next_page']

row = 0
col = 0
for output_1, output_2 in (listOf): 
    worksheet.write_row(row, col, output_1) 
    worksheet.write_column(col, col + 1, output_2) 
    col += 1
  
workbook.close()

Update:

So basically in my script I'm reading a json that looks something like this

"per_page": 30,
"previous_page": null,
"articles": [
        {
            "id": 360239848018,
            "url": "https://.../articles/360239848018.json",
            "html_url": "https://...",
            "author_id": 5201232,
            "body": "<div class....a lot of html" }

Upvotes: 1

Views: 911

Answers (1)

Paul P
Paul P

Reputation: 3927

It looks like all the data that you are scraping with the code you've shared ends up in output_1, while output_2 is empty.

Since you'd like to have the article_id in one column and the content in another column, I would suggest to store the content in output_2 instead of output_1.

Apart from that, you are using write_row() on output_1. As per documentation (emphasis mine):

Write a row of data starting from (row, col).

But it sounds like you'd like to write it as a column.

Another thing to keep in mind is that your listOf is a tuple containing two lists. Iterating it won't get you far.

With all of the above said, this is what should work:

import csv
import requests
import unicodedata
import xlsxwriter 

url = 'https://.../articles.json'

output_1 = []
output_2 = []

while url:
    user = '[email protected]'
    pwd = '1234'
    response = requests.get(url, auth=(user, pwd))
    data = response.json()
    for article in data['articles']:
        article_id = article['id']
        body = article['body']
        decode_1 = int(article_id)
        decode_2 = unicodedata.normalize('NFKD', body)
        output_1.append(decode_1)
        output_2.append(decode_2)
      # ^^^^^^^^
      # This is output_2 now.

    print(data['next_page'])
    url = 'https://.../articles.json' and data['next_page']


workbook = xlsxwriter.Workbook('bodies.xlsx') 
worksheet = workbook.add_worksheet()

row = 0
col = 0

worksheet.write_column(row, col, output_1)
#         ^^^^^^^^^^^^
#         use write_column instead of write_row
worksheet.write_column(row, col + 1, output_2)

workbook.close()

Upvotes: 1

Related Questions