Reputation: 608
I am trying to fetch some details from a YouTube channel, but my code does not give me the desired output.
I have tried the code below to fetch the YouTube video links:
import requests
import pandas as pd

username = "Google"
path = f"https://www.youtube.com/user/{username}/videos"
# Split the raw HTML on spaces so each href attribute ends up in its own token
alldata = requests.get(path).text.split(' ')
item = 'href="/watch?'
# Every video appears twice in the page source, so this list contains duplicates
vids = [line.replace('href="', 'https://www.youtube.com') for line in alldata if item in line]
print(pd.DataFrame(vids))
Current output: This code gives me approximately 30-40 video links, but the page lists more than 1000 videos.
Expected output: I want to extract all (1000+) video links available on this page.
What do I need to change in the code above to get the expected output?
Thanks for your time.
Upvotes: 0
Views: 1020
Reputation: 157
This happens because only about 40 videos are included in the initial page load. As you scroll down, JavaScript makes an AJAX request and more data is sent to the page.
This is done to reduce the page load time. If all 1000+ videos were sent in the initial load, the page would take far too long to show anything.
As you scroll down the page, you can see the loading circle appear at the bottom as more data is pulled in.
You want to use something like Selenium to replicate scrolling to the bottom of the page, and then capture the data once all of it has loaded.
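A rough sketch of that scroll-then-capture idea is below. It is not a tested solution for the current YouTube layout: the helper names (scroll_until_stable, extract_watch_links, collect_video_links) are made up for illustration, and it assumes that Chrome with a matching driver is installed, that the links still appear as href="/watch?v=..." in the rendered page, and that no consent or login page gets in the way.

```python
import re
import time


def scroll_until_stable(driver, pause=2.0, max_rounds=200):
    """Scroll to the bottom repeatedly until the page height stops growing."""
    height_script = "return document.documentElement.scrollHeight"
    last_height = driver.execute_script(height_script)
    for _ in range(max_rounds):
        driver.execute_script(
            "window.scrollTo(0, document.documentElement.scrollHeight);"
        )
        time.sleep(pause)  # give the AJAX request time to load more videos
        new_height = driver.execute_script(height_script)
        if new_height == last_height:
            break  # nothing new was loaded, so assume we reached the end
        last_height = new_height
    return last_height


def extract_watch_links(html):
    """Return unique absolute video URLs found in the page source, in order."""
    # Assumption: video links appear as href="/watch?v=<id>" in the markup.
    paths = re.findall(r'href="(/watch\?v=[\w-]+)', html)
    # dict.fromkeys drops duplicates while keeping the first-seen order
    return ["https://www.youtube.com" + p for p in dict.fromkeys(paths)]


def collect_video_links(channel_url):
    """Open the page in a real browser, scroll to the end, return the links."""
    # Imported here so the two helpers above work without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumes Chrome and its driver are available
    try:
        driver.get(channel_url)
        scroll_until_stable(driver)
        return extract_watch_links(driver.page_source)
    finally:
        driver.quit()
```

You would call it as collect_video_links("https://www.youtube.com/user/Google/videos") and pass the result to pd.DataFrame as before. The pause and max_rounds values are guesses; a slow connection may need a longer pause, otherwise the loop can stop before everything has loaded.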
Good luck.
Upvotes: 1