Reputation: 43
I am trying to download the posts of any Instagram user with the Selenium module in Python. The code usually works, but it sometimes raises errors.
There are mainly two errors.
The first one comes from Instagram's HTML.
I don't know why, but the code randomly reports that it cannot find the class name in the website's source.
Extension of error 1: after the server changes the source code and I make the corresponding changes, I always reach a point where the original class name comes back, and once I update the program again it starts working.
When I check, I find that the source code has changed, but when I update my program to match, the source code changes again. I don't know whether this is user-specific or something else.
The second one is that Selenium sometimes reports that the class does not exist, even though the class is visible when I inspect the page in the browser.
I don't know the reason for either error. Here is my program:
#! ---MODULES---
#! -------TIME-------
import time
#! -------SCREEN SHOT-------
import pyscreenshot as ImageGrab
#! -------WEB SCRAPING-------
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
#! -------WEBDRIVER CHROME EXTENSIONS-------
from webdriver_manager.chrome import ChromeDriverManager
print("Here we go...")
#! ---VARIABLES---
target = input('Enter User Name Of User For Downloading Posts : ')
url = 'https://instagram.com/' + target
picture_0 = 'Picture_0.png'
#! ---WEB SCRAPING---
#! -------CHROME SCRAPING-------
#! -----------OPENING CHROME WEBDRIVER-----------
# Selenium 4 expects the driver path to be wrapped in a Service object
chrome = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
#! -----------DIRECTLY LANDING ON THE TARGET INSTAGRAM HANDLE-----------
chrome.get(url)
time.sleep(6)
#! ---------------LOGIN BUTTON---------------
log_but = chrome.find_element(By.CLASS_NAME, "_acap")
log_but.click()
time.sleep(4)
#! ---------------FINDING THE USERNAME INPUT FIELD---------------
usern = chrome.find_element(By.NAME, "username")
usern.send_keys('rkus6987')
#! ---------------FINDING THE PASSWORD INPUT FIELD---------------
passw = chrome.find_element(By.NAME, "password")
passw.send_keys('ayush 1234')
passw.send_keys(Keys.RETURN)
time.sleep(5)
#! ---------------NOT NOW BUTTON---------------
save = chrome.find_element(By.CLASS_NAME, "_acas")
save.click()
time.sleep(3)
#turn_on = chrome.find_element(By.CLASS_NAME, "_a9_1")
#turn_on.click()
time.sleep(3)
#! ---------------FIRST POST BUTTON---------------
# click() returns None, so keep the element and the click as separate steps
pic = chrome.find_element(By.CLASS_NAME, "_aagw")
pic.click()
time.sleep(2)
# caption = chrome.find_element(By.CLASS_NAME, "_aacl").text
# time.sleep(2)
# likes = chrome.find_element(By.CLASS_NAME, "_aade").text
# time.sleep(2)
#! -----------TAKING A SCREENSHOT WITH THE DESIGNATED DIMENSIONS-----------
im = ImageGrab.grab(bbox=(100, 400, 600, 800))
im.save('Picture_1.png')
#! -----------CLOSING CHROME-----------
# quit() ends the whole session; close() only closes the current window
chrome.quit()
These images show error 2
Image showing the class in the source code
I did not take a screenshot of the first error, and I don't know when it will occur again. As soon as I get the error again, I will update the question.
Here is the Instagram page: https://www.instagram.com/ashishchanchlani/
Upvotes: 0
Views: 92
Reputation: 305
The problem with your code is that Selenium can't find the correct HTML element to scrape. Instagram frequently changes its class names for some reason. It's not a good idea to rely on something that is sure to change.
Both of your problems can be solved if you use a relative XPath to find the element or the data you want to scrape.
For example, I used an XPath to select all the posts. If you look closely, I anchored it on the article HTML element because that is unlikely to change, and then added a further XPath step for the posts. You can also go one level deeper into the tree and grab the image element like this:
//article//a[@role="link"]//img
This selects the images of all posts currently loaded on that page.
Upvotes: 1