Retrieving text content from JavaScript URL

Question

I am modifying the play-scraper API to scrape Play Store app details. It uses BeautifulSoup to parse HTML pages [reference].

I am particularly interested in all the additional information available for an app as shown in the screenshot below. (The above screenshot is taken from this app.)

I am stuck at extracting the list of permissions that an app asks for (shown in the above figure) because the View details URL under Permissions is as follows.

View details

Clicking the View details URL shows a list of permissions (screenshot as follows) that I want to extract.

I am not familiar with JavaScript. Any help would be appreciated.

Md Golam Rahman Tushar · Accepted Answer

If I understand the question correctly, you are trying to scrape data from a modal. When the page first loads, the modal's data is not present in the HTML; it is fetched only after you click the View details button. That is why the parser does not see the modal's contents, in your case the permission information.

As for a solution, one option is to use Selenium with chromedriver: perform a click on the View details text, then fetch the modal data. Have a look at this link to get an idea.

Update: To get an idea about the solution using Selenium and chromedriver consider the following code:

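import time
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

options = Options()
options.add_argument("--headless")  # run Chrome without a visible window
driver = webdriver.Chrome(service=Service('local_path_to_chrome_driver'), options=options)  # Selenium 4 style

driver.get(url_of_the_play_store_app)
time.sleep(5)  # wait 5 seconds for the page to finish loading
driver.find_element(By.LINK_TEXT, "View details").click()  # click to open the permissions modal
time.sleep(5)  # wait another 5 seconds for the modal data to be fetched
soup = BeautifulSoup(driver.page_source, "lxml")
driver.quit()  # page_source is already captured, so the browser can be closed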

The soup variable now holds the updated page source, including the modal window data, which you can retrieve from soup.
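As a rough sketch of that last step, the snippet below pulls list items out of a modal once it is present in the page source. The markup in SAMPLE_PAGE_SOURCE, the div[role="dialog"] selector, and the helper name extract_modal_items are all assumptions made for illustration; the real Play Store markup is different and should be inspected in the browser's developer tools before choosing a selector.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for driver.page_source after the click.
# The real Play Store markup differs; inspect it before picking a selector.
SAMPLE_PAGE_SOURCE = """
<html><body>
  <div id="main">App description</div>
  <div role="dialog">
    <h2>Permissions</h2>
    <ul>
      <li>read the contents of your USB storage</li>
      <li>full network access</li>
    </ul>
  </div>
</body></html>
"""


def extract_modal_items(page_source, modal_selector='div[role="dialog"]'):
    """Return the stripped text of every <li> inside the first matching modal."""
    soup = BeautifulSoup(page_source, "html.parser")
    modal = soup.select_one(modal_selector)
    if modal is None:
        # The modal is not in the HTML, e.g. the click never happened
        # or the data had not been fetched yet.
        return []
    return [li.get_text(strip=True) for li in modal.find_all("li")]


permissions = extract_modal_items(SAMPLE_PAGE_SOURCE)
print(permissions)
```

An empty list from this helper is a useful signal that the page source was captured before the modal loaded, which is the situation described at the start of this answer.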
