Reputation: 25
I am writing a parser to extract a list of ads from a JSON API:
    import requests

    # url is the API endpoint being queried (defined elsewhere)
    response = requests.get(url).json()
    items = response['data']
    iter1 = []
    for item in items:
        iter1.append({
            'name': item.get('name', 'NA'),
            'owner': item.get('owner', 'NA'),
            'date_published': item.get('date_published', 'NA'),
            'images': item.get('images', 'NA'),
            'short_url': item.get('short_url', 'NA')
        })
At the moment, I get the following output. I need to make it more compact.
    [
        {
            "name": "Announcement name",
            "owner": {
                "id": "58f84949700743"
            },
            "date_published": 1627666233,
            "images": [
                {
                    "id": "58fb7032ca5544fb5a2",
                    "num": 1,
                    "url": "https://cache3.com/images/orig/58/fb/58fb70f2132a554804fb5a2.jpg",
                    "width": 1936,
                    "height": 2581
                },
                {
                    "id": "58fb70f29e94ba0384507554",
                    "num": 2,
                    "url": "https://cache3.com/images/orig/58/fb/58fb70f29e94b384507554.jpg",
                    "width": 750,
                    "height": 1334
                },
                {
                    "id": "58fb70f2f8efdc109d76c2e5",
                    "num": 3,
                    "url": "https://cache3.com/images/orig/58/fb/58fb70f2fdc109d76c2e5.jpg",
                    "width": 750,
                    "height": 1334
                }
            ],
            "short_url": "https://short.com/p58gb7b9a4c80320f03"
        }
    ]
I would like to reduce it to this form:
    "name": "Announcement name",  # Name
    "id": "58f84949700743",  # Owner ID
    "date_published": 1627666233,  # Date
    "url": "https://cache3.com/images/orig/58/fb/58fb70f2132a554804fb5a2.jpg",  # Image URL
    "short_url": "https://short.com/p58gb7b9a4c80320f03"  # Announcement URL
How can I extract the id from the nested owner object and the url from each entry in the images list?
Upvotes: 0
Views: 47
Reputation: 626
    # Sample data from the question (named data so it does not shadow the built-in dict)
    data = [ { "name": "Announcement name", "owner": { "id": "58f84949700743" }, "date_published": 1627666233, "images": [ { "id": "58fb7032ca5544fb5a2", "num": 1, "url": "https://cache3.com/images/orig/58/fb/58fb70f2132a554804fb5a2.jpg", "width": 1936, "height": 2581 }, { "id": "58fb70f29e94ba0384507554", "num": 2, "url": "https://cache3.com/images/orig/58/fb/58fb70f29e94b384507554.jpg", "width": 750, "height": 1334 }, { "id": "58fb70f2f8efdc109d76c2e5", "num": 3, "url": "https://cache3.com/images/orig/58/fb/58fb70f2fdc109d76c2e5.jpg", "width": 750, "height": 1334 } ], "short_url": "https://short.com/p58gb7b9a4c80320f03" } ]

    result = {}
    result["name"] = data[0].get("name", 'NA')
    # Chain .get() with an empty-dict default to reach the nested owner id
    result["id"] = data[0].get('owner', {}).get('id', 'NA')
    result["date_published"] = data[0].get("date_published", 'NA')
    result["url"] = []
    result["short_url"] = data[0].get("short_url", 'NA')
    # Collect the url of every image entry that has one
    for img in data[0].get("images", []):
        if "url" in img:
            result["url"].append(img["url"])
    print(result)
Upvotes: 1
Reputation: 123423
You could do it by only extracting the information you want:
    items = response['data']
    iter1 = []
    for item in items:
        iter1.append({
            'name': item.get('name', 'NA'),
            'id': item.get('owner', {}).get('id', 'NA'),
            'date_published': item.get('date_published', 'NA'),
            'urls': [entry.get('url', 'NA') for entry in item.get('images', [])],
            'short_url': item.get('short_url', 'NA')
        })
Result:
    [{'name': 'Announcement name',
      'id': '58f84949700743',
      'date_published': 1627666233,
      'urls': ['https://cache3.com/images/orig/58/fb/58fb70f2132a554804fb5a2.jpg',
               'https://cache3.com/images/orig/58/fb/58fb70f29e94b384507554.jpg',
               'https://cache3.com/images/orig/58/fb/58fb70f2fdc109d76c2e5.jpg'],
      'short_url': 'https://short.com/p58gb7b9a4c80320f03'}]
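The target form in the question shows a single "url" value rather than a list, so it may be that only the first image is wanted. Below is a minimal sketch of that variation. The helper names `first_image_url` and `flatten_ad` are made up for illustration, and the sample item is a trimmed copy of the data in the question.

```python
def first_image_url(item, default="NA"):
    """Return the url of the first image entry that has one, else default."""
    for entry in item.get("images", []):
        url = entry.get("url")
        if url:
            return url
    return default


def flatten_ad(item):
    """Build the flat record from one ad (hypothetical helper)."""
    return {
        "name": item.get("name", "NA"),
        "id": item.get("owner", {}).get("id", "NA"),
        "date_published": item.get("date_published", "NA"),
        "url": first_image_url(item),
        "short_url": item.get("short_url", "NA"),
    }


# Trimmed copy of the sample item from the question
sample = {
    "name": "Announcement name",
    "owner": {"id": "58f84949700743"},
    "date_published": 1627666233,
    "images": [
        {"num": 1, "url": "https://cache3.com/images/orig/58/fb/58fb70f2132a554804fb5a2.jpg"},
        {"num": 2, "url": "https://cache3.com/images/orig/58/fb/58fb70f29e94b384507554.jpg"},
    ],
    "short_url": "https://short.com/p58gb7b9a4c80320f03",
}

flat = flatten_ad(sample)
print(flat["url"])
```

With the real response this would be applied per item, for example `iter1 = [flatten_ad(item) for item in items]`.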
Upvotes: 0
Reputation:
You could replace:
    'owner': item.get('owner', 'NA'),
with:
    'id': item.get('owner', {}).get('id', 'NA'),
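The empty-dict default matters here: with a string default such as 'NA', the second `.get()` would raise AttributeError whenever the key is missing. One case the chained form does not cover is an "owner" key that is present but null. Nothing in the question shows the API doing that, so treat it as an assumption; if it can happen, a small helper like the sketch below (the name `owner_id` is hypothetical) handles both cases.

```python
def owner_id(item, default="NA"):
    """Return item['owner']['id'] if owner is a dict with an id, else default."""
    owner = item.get("owner")
    if isinstance(owner, dict):
        return owner.get("id", default)
    # owner is missing, None, or some other non-dict value
    return default


print(owner_id({"owner": {"id": "58f84949700743"}}))  # 58f84949700743
print(owner_id({}))                                   # NA
print(owner_id({"owner": None}))                      # NA
```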
Upvotes: 0