Reputation: 3011
I am parsing data from an Elasticsearch index and have received it in JSON format as follows:
{
    "_shards": {
        "failed": 0,
        "skipped": 0,
        "successful": 5,
        "total": 5
    },
    "hits": {
        "hits": [
            {
                "_id": "wAv4u2cB9qH5eo0Slo9O",
                "_index": "homesecmum",
                "_score": 1.0870113,
                "_source": {
                    "image": "0000000028037c08_1544283640.314629.jpg"
                },
                "_type": "dataRecord"
            },
            {
                "_id": "wwv4u2cB9qH5eo0SmY8e",
                "_index": "homesecmum",
                "_score": 1.0870113,
                "_source": {
                    "image": "0000000028037c08_1544283642.963721.jpg"
                },
                "_type": "dataRecord"
            },
            {
                "_id": "wgv4u2cB9qH5eo0SmI8Z",
                "_index": "homesecmum",
                "_score": 1.074108,
                "_source": {
                    "image": "0000000028037c08_1544283640.629583.jpg"
                },
                "_type": "dataRecord"
            }
        ],
        "max_score": 1.0870113,
        "total": 5
    },
    "timed_out": false,
    "took": 11
}
I am trying to extract only the image field from the JSON data and store the values in an array. I tried the following:
for result in res['hits']['hits']:
    post = result['_source']['image']
    print(post)
and this:
respars = json.loads(res['hits']['hits'][0]['_source'])['image']
print(json.dumps(respars, indent=4, sort_keys = True))
Both of these throw an error:
TypeError: byte indices must be integers or slices, not str
I am sure similar problems have been raised here before, but I couldn't get past this error. How can I fix it?
Upvotes: 2
Views: 6584
Reputation: 347
To get every image value from the _source entries as a list, you can use a list comprehension (this assumes res is already a parsed dict, not raw bytes):
image_list = [source['_source']['image'] for source in res['hits']['hits']]
Output:
['0000000028037c08_1544283640.314629.jpg',
'0000000028037c08_1544283642.963721.jpg',
'0000000028037c08_1544283640.629583.jpg']
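The TypeError in the question ("byte indices must be integers or slices, not str") suggests that res is still a bytes object and not a dict, in which case indexing it with 'hits' fails before the comprehension can run. A minimal sketch of decoding first, assuming the response arrived as raw JSON bytes; the helper name extract_images and the shortened sample payload are illustrative and not part of the original post:

```python
import json

# Assumption: the response is raw JSON bytes (e.g. an HTTP body).
# This payload is a trimmed-down stand-in for the response in the question.
raw = (
    b'{"hits": {"hits": ['
    b'{"_source": {"image": "0000000028037c08_1544283640.314629.jpg"}},'
    b'{"_source": {"image": "0000000028037c08_1544283642.963721.jpg"}},'
    b'{"_source": {"image": "0000000028037c08_1544283640.629583.jpg"}}'
    b']}}'
)


def extract_images(res):
    """Return the _source.image value of every hit (hypothetical helper)."""
    # Decode only when the response has not been parsed yet;
    # json.loads accepts str, bytes and bytearray.
    if isinstance(res, (bytes, bytearray, str)):
        res = json.loads(res)
    return [hit["_source"]["image"] for hit in res["hits"]["hits"]]


image_list = extract_images(raw)
print(image_list)
```

If res is already a dict, the helper skips the decoding step, so the same call works in both cases.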
Upvotes: 0
Reputation: 5833
Instead of going through the pain of manually handling the response, you could use the Elasticsearch-DSL package from PyPI.
Upvotes: 3