Chris Michael

Reputation: 1617

How to access the iframe #document using puppeteer?

I'm trying to scrape the anime video pages on [jkanime], but I'm having trouble getting the mp4 video sources because they are inside an iframe's #document.

In Chrome DevTools I ran the following: $('#jkvideo_html5_api source').src

That shows me the src of the mp4, but I don't know how to run the query $('#jkvideo_html5_api source').src with puppeteer.

What I want to achieve now is to get the value of _navigationURL, then make a request to it and read the mp4 video source.

Any help will be appreciated!

[Image: DevTools source code section]

  const getAnimeVideo = async (id: string, chapter: number) => {
    const BASE_URL = `${url}${id}/${chapter}/`  // => https://jkanime.net/tokyo-ghoul/1/
    const browser = await puppeteer.launch()
    const page = await browser.newPage()
    await page.goto(BASE_URL);
    const elementHandle = await page.$('.player_conte')
    const frame = await elementHandle.contentFrame();
    const $ = cheerio.load(`${frame}`);
    console.log(frame)
  }

Part of the Output Obtained

....
OMWorld {
     _frameManager:
      FrameManager {
        _events: [Object],
        _eventsCount: 3,
        _maxListeners: undefined,
        _client: [CDPSession],
        _page: [Page],
        _networkManager: [NetworkManager],
        _timeoutSettings: [TimeoutSettings],
        _frames: [Map],
        _contextIdToContext: [Map],
        _isolatedWorlds: [Set],
        _mainFrame: [Frame] },
     _frame: [Circular],
     _timeoutSettings:
      TimeoutSettings { _defaultTimeout: null, _defaultNavigationTimeout: null },
     _documentPromise: null,
     _contextResolveCallback: null,
     _contextPromise: Promise { [ExecutionContext] },
     _waitTasks: Set {},
     _detached: false },
  _childFrames: Set {},
  _name: '',
  _navigationURL:
   'https://jkanime.net/um.php?e=Q0VxeUQ2MmZRRlNWeUdHKzdoWlJQOGFLNjFRUnljVkFTaEtFMElZUjFmTlRPQnhnUUtqbnRodjhEVHlGYnVleWJsdnNnRy9wNzVLd0MrMURuRVBKV0tQZjVuT0tIblc3cUNmZDNzdFVFaEE9OjrIf8cc_60GOGTTN7Th9Q_a' }

Output that I want to obtain

   {
     "src": [
       "https://storage.googleapis.com/markesito.appspot.com/tokgho/01.mp4"
     ]
   }

Problem solved: 11:34am

Upvotes: 1

Views: 3843

Answers (1)

Chris Michael

Reputation: 1617

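import puppeteer from 'puppeteer';

const getAnimeVideo = async (id: string, chapter: number) => {
  const BASE_URL = `${url}${id}/${chapter}/`; // => https://jkanime.net/tokyo-ghoul/1/
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(BASE_URL);
    // Get a handle to the iframe element, then switch into the frame it hosts.
    const elementHandle = await page.$('.player_conte');
    const frame = await elementHandle?.contentFrame();
    if (!frame) throw new Error('iframe .player_conte not found');
    // Inside the frame, collect the src attribute of every <source> in the video element.
    const video = await frame.$eval('#jkvideo_html5_api', el =>
      Array.from(el.getElementsByTagName('source')).map(e => e.getAttribute('src')));
    return video;
  } finally {
    await browser.close(); // always release the browser, even if a step above throws
  }
};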
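
For the other half of the question (reading the iframe URL that appears as _navigationURL in the log, and returning the { "src": [...] } shape), here is a minimal sketch rather than a definitive implementation. It assumes the frame has already navigated to an http(s) URL. The names FrameLike, IframeHandleLike, PageLike, toAbsoluteSources and getIframeVideo are hypothetical; the three interfaces are narrow stand-ins that mirror only the Puppeteer methods used here (page.$, elementHandle.contentFrame, frame.url, frame.$eval), so the block can be run without a browser. frame.url() is the public accessor, and it should normally report the same address as the private _navigationURL field.

```typescript
// Hypothetical narrow types mirroring the few Puppeteer methods used below.
interface FrameLike {
  url(): string;
  $eval<T>(selector: string, fn: (el: any) => T): Promise<T>;
}
interface IframeHandleLike {
  contentFrame(): Promise<FrameLike | null>;
}
interface PageLike {
  $(selector: string): Promise<IframeHandleLike | null>;
}

// getAttribute('src') returns the raw attribute value, which can be null or
// relative, so drop missing values and resolve the rest against the frame URL.
function toAbsoluteSources(rawSrcs: Array<string | null>, frameUrl: string): string[] {
  return rawSrcs
    .filter((src): src is string => typeof src === 'string' && src.length > 0)
    .map(src => new URL(src, frameUrl).href);
}

// Returns the iframe URL together with the video sources as { src: [...] }.
async function getIframeVideo(
  page: PageLike,
  iframeSelector: string,
  videoSelector: string,
): Promise<{ frameUrl: string; src: string[] }> {
  const handle = await page.$(iframeSelector);
  if (!handle) throw new Error(`No element matches ${iframeSelector}`);
  const frame = await handle.contentFrame();
  if (!frame) throw new Error(`${iframeSelector} does not host a frame`);
  const frameUrl = frame.url(); // public accessor; no need to read _navigationURL
  const rawSrcs = await frame.$eval(videoSelector, el =>
    Array.from(el.getElementsByTagName('source')).map(
      (s: any) => s.getAttribute('src') as string | null));
  return { frameUrl, src: toAbsoluteSources(rawSrcs, frameUrl) };
}
```

With a real Puppeteer page this would be called as getIframeVideo(page, '.player_conte', '#jkvideo_html5_api'); depending on the Puppeteer version, a cast to PageLike may be needed.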

Upvotes: 1
