Nothingtoseehere

Reputation: 109

Download PDFs from multiple JSON URLs using Python

I have been tasked with creating a method to download multiple PDFs from URLs included in JSON files. There is probably one URL per JSON file, with approximately 500k JSON files to process in any one batch.

Here's a sample of the JSON file:

{
  "from": null,
  "id": "sfm_c4kjatol7u8psvqfati0",
  "imb_code": "897714123456789",
  "mail_date": null,
  "mail_type": "usps_first_class",
  "object": "self_mailer",
  "press_proof": "https://lob-assets.com/sid-self_mailers/sfm_c4kjatol7u8psvqfati0.pdf?version=v1&expires=1635274615&signature=AZlb0MSzZPuCjtKFkXRr_OoHzDzEy23UqzmKFWs5bycKCEcIyfe2od58zHzfP1a-iW5d9azFYUT1PnosqKcvBg",
  "size": "11x9_bifold",
  "target_delivery_date": null,
  "to": {
    "address_city": "SAN FRANCISCO",
    "address_country": "UNITED STATES",
    "address_line1": "185 BERRY ST STE 6100",
    "address_line2": null,
    "address_state": "CA",
    "address_zip": "94107-1741",
    "company": "Name.COM",
    "name": "EMILE ILES"
  }
}

Each JSON file is converted to CSV, and the URL it contains is downloaded.
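
In other words, the flow I'm aiming for is roughly this (a sketch; I'm assuming in2csv flattens each file into a one-row CSV with a press_proof column holding the URL):

import csv
import subprocess
from urllib.request import urlretrieve

# in2csv (from csvkit) is a shell command, so it is run via subprocess
subprocess.run("in2csv data.json > data.csv", shell=True, check=True)

with open("data.csv", newline="") as f:
    for row in csv.DictReader(f):
        url = row["press_proof"]
        # name the PDF after the last path segment, dropping the query string
        urlretrieve(url, url.split("/")[-1].split("?")[0])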

Here's what I have been trying to use, but it is not working. What am I missing?

Import urllib.request, json, requests, os, csvkit


from itertools import islice
from pathlib import Path

path = Path("/Users/MyComputer/Desktop/self_mailers")
paths = [i.path for i in islice(os.scandir(path), 100)]
in2csv data.json > data.csv
with open('*.json', 'r') as f:
    urls_dict = json.load(f)

urls_dict = urls_dict[0]
itr = iter(urls_dict)

len(list(itr))
f.write(r.pdf)

Upvotes: 0

Views: 307

Answers (1)

Remmar00

Reputation: 366

Why are you converting your JSON to CSV? By the way, if you are unsure of where the URLs are in the JSON files, I would do this:

import os
import json
from rethreader import Rethreader
from urllib.parse import urlparse
from urllib.request import urlretrieve


def download_pdf(url):
    # use urlparse to extract the PDF file name from the URL path
    filename = urlparse(url).path.rsplit('/', 1)[-1]
    urlretrieve(url, filename)


# use multi-threading for faster downloads
downloader = Rethreader(download_pdf).start()


def verify_url(value):
    if not isinstance(value, str):
        # if the value is not a string, it cannot be a URL
        return False
    try:
        parsed_url = urlparse(value)
    except ValueError:
        # the value cannot be parsed as a URL
        return False
    if not (parsed_url.scheme and parsed_url.netloc and parsed_url.path):
        # not a usable URL: it is missing a scheme, host, or path
        return False
    return True


def parse_data(data):
    for value in data.values():
        if verify_url(value):
            downloader.add(value)


for file in os.listdir():
    with open(file) as fp:
        try:
            json_data = json.load(fp)
        except (json.JSONDecodeError, UnicodeDecodeError):
            # this file is not JSON; skip to the next one
            continue
        parse_data(json_data)
        
# quit the downloader after downloading the files
downloader.quit()
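
Note that Rethreader comes from the third-party rethreader package (pip install rethreader). If you prefer to stay in the standard library, the same fan-out can be sketched with concurrent.futures, reusing download_pdf and verify_url from above:

from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=8) as pool:
    for file in os.listdir():
        with open(file) as fp:
            try:
                json_data = json.load(fp)
            except (json.JSONDecodeError, UnicodeDecodeError):
                continue
        for value in json_data.values():
            if verify_url(value):
                # each download runs in one of the pool's worker threads
                pool.submit(download_pdf, value)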

If you know which keys the URLs might be under, I would do this:

# The other parts same as before
def parse_data(data):
    for key in ['possible_key', 'another_possible_key']:
        if key in data and verify_url(data[key]):
            downloader.add(data[key])
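
With the sample file in your question, the URL sits in the press_proof field, so the key list would simply be:

# "press_proof" holds the URL in the question's sample JSON
def parse_data(data):
    for key in ['press_proof']:
        if key in data and verify_url(data[key]):
            downloader.add(data[key])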

Upvotes: 1
