Reputation: 135
In my example code below I have navigated to Obama's first Instagram post. I am trying to point to the portion of the page that is his post and the comments beside it.
driver.get("https://www.instagram.com/p/B-Sj7CggmHt/")
element = driver.find_element_by_css_selector("div._97aPb")
I want this to work for the page of any post by any Instagram user, but the selector for the post alongside its comments seems to change from page to page. How can I find the combined post image + comments block regardless of which post it is? I would appreciate any help, thank you.
I would also like to be able to point to the image and to the comments individually. I have gone through multiple user profiles and multiple posts, but both the XPaths and the CSS selectors seem to change. I would also appreciate pointers to any reading or resources where I can learn how to properly select different HTML elements.
Upvotes: 1
Views: 821
Reputation: 12004
You could try selecting based on the top-level structure. Looking more closely, there is always an article tag, and then the photo is in the 4th div in, right under the header.
You can do this with BeautifulSoup with something like this:
from bs4 import BeautifulSoup
# Parse the HTML of the page Selenium has already loaded
soup = BeautifulSoup(driver.page_source, 'html.parser')
article = soup.find('article')
divs_in_article = article.find_all('div')
divs_in_article[3] should have the data you are looking for. Note that find_all searches recursively by default, so if BeautifulSoup grabs divs under that first header tag, you may have to get creative and skip that tag first. I would test it myself but I don't have ChromeDriver running right now.
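One way to skip the header's divs is to look only at the direct children of the article. Here is a minimal sketch of that idea. The markup in SAMPLE_HTML is a hypothetical, simplified stand-in (it is not Instagram's real HTML), and split_post is a name I made up for illustration. With Selenium you would pass driver.page_source instead of the sample string, and the positions of the blocks on the real page may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup standing in for a post page (assumption).
SAMPLE_HTML = """
<main>
  <article>
    <header><div><div>username</div></div></header>
    <div class="media"><img src="photo.jpg" alt="post photo"></div>
    <div class="comments"><ul><li>Nice!</li><li>Great shot</li></ul></div>
  </article>
</main>
"""


def split_post(html):
    """Return (article, media_block, comments_block) parsed from html."""
    soup = BeautifulSoup(html, "html.parser")
    article = soup.find("article")
    if article is None:
        raise ValueError("no <article> tag found")
    # recursive=False restricts the search to direct children of <article>,
    # so divs nested inside <header> are never counted.
    top_level_divs = article.find_all("div", recursive=False)
    if len(top_level_divs) < 2:
        raise ValueError("expected at least two top-level divs in <article>")
    return article, top_level_divs[0], top_level_divs[1]


article, media, comments = split_post(SAMPLE_HTML)
print(media.img["src"])
print([li.get_text() for li in comments.find_all("li")])
```

The article itself is the combined post + comments block, and the two top-level divs give you the image and the comments separately. Selecting by position rather than by class name is the point here, since the class names are what seem to change.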
Alternatively, you could try:
images = soup.find_all('img')
to grab all image tags in the page. This may work too.
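Since find_all('img') on the whole page also returns images that are not the post itself, it may help to restrict the search to the article and drop anything inside the header. A rough sketch, again using hypothetical sample markup and a made-up helper name (post_image_urls), not Instagram's real structure:

```python
from bs4 import BeautifulSoup

# Hypothetical markup (assumption): a site logo, an avatar in the header,
# and the actual post photo.
SAMPLE_HTML = """
<nav><img src="logo.png" alt="logo"></nav>
<article>
  <header><img src="avatar.jpg" alt="profile picture"></header>
  <div><img src="photo.jpg" alt="post photo"></div>
</article>
"""


def post_image_urls(html):
    """Return src values of images inside <article> but outside <header>."""
    soup = BeautifulSoup(html, "html.parser")
    article = soup.find("article")
    if article is None:
        return []
    urls = []
    for img in article.find_all("img"):
        # find_parent walks up the tree; a match means the image
        # lives somewhere inside a <header> and is skipped.
        if img.find_parent("header") is not None:
            continue
        urls.append(img.get("src"))
    return urls


print(post_image_urls(SAMPLE_HTML))
```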
BeautifulSoup has a lot of handy methods for selecting elements based on structure. Take a look at the documentation sections on going back and forth, going sideways, going down and going up. You should be able to discern the structure using the developer tools in your browser and then come up with a way to select the collections you care about for comments.
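To give a feel for those navigation tools, here is a small sketch that starts at the image and walks to the comments. The markup is a hypothetical simplification written without whitespace between tags, so that .children yields only tags; on a real page you would likely also see whitespace text nodes.

```python
from bs4 import BeautifulSoup

# Hypothetical, whitespace-free markup (assumption, not Instagram's real HTML).
SAMPLE_HTML = (
    "<article>"
    "<header><div>username</div></header>"
    '<div class="media"><img src="photo.jpg"></div>'
    '<div class="comments"><span>Nice!</span><span>Great shot</span></div>'
    "</article>"
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
img = soup.find("img")

# Going up: the tag that directly contains the image.
media = img.parent

# Going sideways: the next <div> at the same level as the media block.
comments = media.find_next_sibling("div")

# Going down: iterate over the direct children of the comments block.
comment_texts = [child.get_text() for child in comments.children]

# Going up several levels at once: the enclosing <article>.
container = img.find_parent("article")

print(media["class"], comment_texts, container.name)
```

Anchoring on one element you can find reliably (here the image) and navigating relative to it tends to survive class-name changes better than hard-coded selectors, though it still breaks if the page layout itself changes.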
Upvotes: 1