Reputation: 2729
I am working on an app that gets image details from Instagram using Selenium with Python.
driver.execute_script(SCROLL_TOP)
driver.execute_script(SCROLL_BOTTOM)
As a result, all posted images and captions can be read from driver.page_source.
But when I try to get more information about an image (e.g., the number of likes or the date the image was published), I need to access
<script type="text/javascript">window._sharedData = {...}</script>
The '...' in the previous code is a JSON block. It contains the details of only the first 12 media items. Is there a way to get the details of all images from the window._sharedData JSON block?
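For reference, here is a minimal sketch of how that JSON block could be pulled out of driver.page_source. The function name extract_shared_data and the regular expression are illustrative assumptions; the pattern assumes the assignment sits in a single script tag exactly as shown above.

```python
import json
import re

# Assumption: the block looks like
#   <script ...>window._sharedData = {...};</script>
# The lazy match is extended until it is followed by the closing tag,
# so nested braces inside the JSON are included.
SHARED_DATA_RE = re.compile(
    r"window\._sharedData\s*=\s*(\{.*?\})\s*;?\s*</script>",
    re.DOTALL,
)


def extract_shared_data(page_source):
    """Return the parsed window._sharedData object, or None if absent."""
    match = SHARED_DATA_RE.search(page_source)
    if match is None:
        return None
    return json.loads(match.group(1))
```

With Selenium this would be called as extract_shared_data(driver.page_source); it still only yields whatever media the page embedded at load time.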
Thanks for your advice
Upvotes: 2
Views: 2503
Reputation: 2557
Take a look at my answer, which solves your problem in PHP. You can do the same with Python:
Load the JSON over HTTP from the URL https://www.instagram.com/nasa/?__a=1 (replace nasa with any public username).
Get the details of 12 media items from the JSON: user->media->nodes.
Get the additional media info from the JSON: user->media->page_info. It contains has_next_page (boolean) and end_cursor (integer). Use end_cursor to get the next 12 media items with the URL https://www.instagram.com/nasa/?__a=1&max_id=[VALUE-FROM-end_cursor].
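The steps above can be sketched in Python roughly as follows. This is only an illustration under the assumption that the endpoint returns the user->media->nodes and user->media->page_info structure described here; the helper names build_url, fetch_json and iter_media are made up for the example, and only the standard library is used.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://www.instagram.com/{username}/"


def build_url(username, max_id=None):
    """Build the ?__a=1 URL, adding max_id for every page after the first."""
    params = {"__a": "1"}
    if max_id is not None:
        params["max_id"] = max_id
    return BASE_URL.format(username=username) + "?" + urlencode(params)


def fetch_json(url):
    """Download a URL and parse the response body as JSON."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def iter_media(username, fetch=fetch_json, max_pages=None):
    """Yield media nodes page by page, following end_cursor.

    Assumes the layout user -> media -> nodes / page_info described above.
    """
    max_id = None
    pages = 0
    while max_pages is None or pages < max_pages:
        data = fetch(build_url(username, max_id))
        media = data["user"]["media"]
        yield from media["nodes"]
        pages += 1
        page_info = media["page_info"]
        if not page_info["has_next_page"]:
            break
        max_id = page_info["end_cursor"]
```

Calling list(iter_media("nasa", max_pages=3)) would request at most three pages of 12 items each. The fetch parameter exists so that the download step can be swapped out, for example to add request headers or to test the paging logic without network access.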
Upvotes: 5