coalcracker462

Reputation: 11

Web Scraping javascript in Python / R

I'm doing some personal data science projects and one of them is to see how often certain songs are played on the radio.

http://www.iheart.com/live/radio-1045-3401/

When I view the page source for the above URL, none of the values I'm interested in appear. I'm not sure why, but when I use Inspect Element in Chrome and hover over the "Now Playing" header, I can see the song and artist that are currently playing.

Example:

<a class="player-song" href="/artist/rem-3610/songs/-2450662/" title="Losing My Religion" data-reactid=".1hpdfx1l4ow.a.1.0.1.1">Losing My Religion</a>

My two questions are:

  1. Why isn't this showing up in page source, but I can see it under Inspect Element?
  2. How would I web scrape this info since it is not appearing in page source?

Upvotes: 0

Views: 333

Answers (1)

Akshat Mahajan

Reputation: 9846

  1. Most web pages with dynamic content have page elements that are generated and inserted by JavaScript, which the browser parses and executes for you. I suspect you already guessed this, based on the question title.

    What you see in the page source is the raw HTML as delivered by the server, before JavaScript runs and updates it. Inspect Element shows the live DOM after those updates, which is why the song and artist appear there.

  2. You want a headless browser: a browser without a graphical user interface. It will parse and execute the JavaScript for you and update the page HTML accordingly, so you can then scrape the rendered result.


Here is a full list of headless browsers. Note that you can do this task in any language.
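
As a rough illustration of the headless-browser approach, here is a minimal Python sketch. It rests on several assumptions that are not part of the answer above: the third-party `selenium` package (version 4 or later) with a local Chrome install, and the `player-song` class name taken from the snippet in the question, which the site may have changed since. The helper names `extract_songs` and `fetch_rendered_html` are made up for this example.

```python
from html.parser import HTMLParser


class NowPlayingParser(HTMLParser):
    """Collects the title attribute of every <a class="player-song"> element."""

    def __init__(self):
        super().__init__()
        self.songs = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # The class attribute may hold several space-separated class names.
        classes = (attr_map.get("class") or "").split()
        if tag == "a" and "player-song" in classes:
            self.songs.append(attr_map.get("title"))


def extract_songs(rendered_html):
    """Return the song titles found in HTML that has already been rendered."""
    parser = NowPlayingParser()
    parser.feed(rendered_html)
    return parser.songs


def fetch_rendered_html(url, wait_seconds=10):
    """Load url in headless Chrome and return the HTML after JavaScript has run.

    Assumption: selenium 4+ and Chrome are installed on this machine.
    """
    # Imported here so the parsing helpers above work without Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless")  # run Chrome without a window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until the script-generated element is present in the DOM.
        WebDriverWait(driver, wait_seconds).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "a.player-song"))
        )
        return driver.page_source
    finally:
        driver.quit()
```

Calling `extract_songs(fetch_rendered_html("http://www.iheart.com/live/radio-1045-3401/"))` would then return whatever titles are in the rendered page at that moment. Splitting fetching from parsing keeps the parsing step testable on a saved HTML snippet without launching a browser.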

Upvotes: 3
