Reputation: 3
Assumption / What I want to achieve
I want to use the YouTube Data API v3 to get every video ID of a channel without omissions, and to find out whether the cause of the trouble lies in my code or in the video settings on the YouTube (API) side.
Problem
The following code gets video information from the YouTube Data API, but the number of IDs it returned did not match the number of videos actually posted.
from apiclient.discovery import build

id = "UCD-miitqNY3nyukJ4Fnf4_A" #sampleID
token_check = None
nextPageToken = None
id_info = []
while True:
    if token_check != None:
        nextPageToken = token_check
    Search_Video = youtube.search().list(
        part = "id",
        channelId = id,
        maxResults = 50,
        order = 'date',
        safeSearch = "none",
        pageToken = nextPageToken
    ).execute()
    for ID_check in Search_Video.get("items", []):
        if ID_check["id"]["kind"] == "youtube#video":
            id_info.append(ID_check["id"]["videoId"])
    try:
        token_check = Search_Video["nextPageToken"]
    except:
        print(len(id_info)) #check number of IDs
        break
I also used the YouTube Data API to get the channel's videoCount, and noticed that this value did not match the number of IDs obtained by the code above, which is why I posted this question.
According to the channels() API, this channel has 440 videos, but the code above gets only 412 videos (as of 10:30 a.m. JST).
Supplemental Information
・Python 3.9.0
・YouTube Data API v3
Upvotes: 0
Views: 180
Reputation: 6975
You have to acknowledge that the Search.list API endpoint does not behave precisely: you should not expect exact or complete result sets from it. Google does not document this behavior as such, but this forum has many posts from users who have run into it.
If you want to obtain all the IDs of videos uploaded by a given channel then you should employ the following two-step procedure:
Step 1: Obtain the ID of the Uploads Playlist of a Channel.
Invoke the Channels.list API endpoint, with its request parameter id set to the ID of the channel of interest (or, alternatively, with its request parameter mine set to true), to obtain that channel's uploads playlist ID, contentDetails.relatedPlaylists.uploads.
def get_channel_uploads_playlist_id(youtube, channel_id):
    response = youtube.channels().list(
        fields = 'items/contentDetails/relatedPlaylists/uploads',
        part = 'contentDetails',
        id = channel_id,
        maxResults = 1
    ).execute()
    items = response.get('items')
    if items:
        return items[0] \
            ['contentDetails'] \
            ['relatedPlaylists'] \
            .get('uploads')
    else:
        return None
Note that get_channel_uploads_playlist_id needs to be called only once to obtain the uploads playlist ID of a given channel; afterwards, reuse that ID as many times as needed.
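One way to follow this advice is to memoize the lookup per channel ID. The following is only a minimal sketch and not part of the procedure above: make_cached_lookup and its lookup parameter are hypothetical names, and lookup stands for any function that maps a channel ID to a playlist ID, such as get_channel_uploads_playlist_id with the youtube object already bound.

```python
from functools import lru_cache


def make_cached_lookup(lookup):
    """Wrap a channel_id -> playlist_id function so each ID is looked up once.

    `lookup` is a hypothetical stand-in, for example:
        lambda cid: get_channel_uploads_playlist_id(youtube, cid)
    """
    # lru_cache with maxsize=None keeps every result for the process lifetime,
    # so repeated calls with the same channel ID do not hit the API again.
    return lru_cache(maxsize=None)(lookup)
```

The cache lives only as long as the process; persisting the ID to disk or a database would be a separate design choice.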
Step 2: Retrieve All IDs of Videos of a Playlist.
Invoke the PlaylistItems.list API endpoint, with its request parameter playlistId set to the ID obtained from get_channel_uploads_playlist_id:
def get_playlist_video_ids(youtube, playlist_id):
    request = youtube.playlistItems().list(
        fields = 'nextPageToken,items/snippet/resourceId',
        playlistId = playlist_id,
        part = 'snippet',
        maxResults = 50
    )
    videos = []

    is_video = lambda item: \
        item['snippet']['resourceId']['kind'] == 'youtube#video'
    video_id = lambda item: \
        item['snippet']['resourceId']['videoId']

    while request:
        response = request.execute()

        items = response.get('items', [])
        assert len(items) <= 50

        videos.extend(map(video_id, filter(is_video, items)))

        request = youtube.playlistItems().list_next(
            request, response)

    return videos
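To show how the two steps fit together, here is a compact sketch that restates both of them in a single function so that it can be run by itself. The name fetch_upload_video_ids is hypothetical, and the youtube argument is assumed to be a client object already created with build.

```python
def fetch_upload_video_ids(youtube, channel_id):
    """Return the video IDs found in a channel's uploads playlist."""
    # Step 1: look up the uploads playlist ID of the channel.
    channels = youtube.channels().list(
        part='contentDetails',
        fields='items/contentDetails/relatedPlaylists/uploads',
        id=channel_id,
    ).execute()
    items = channels.get('items')
    if not items:
        return []  # no channel came back for this ID
    playlist_id = items[0]['contentDetails']['relatedPlaylists'].get('uploads')
    if not playlist_id:
        return []

    # Step 2: page through the playlist and collect the video IDs.
    video_ids = []
    request = youtube.playlistItems().list(
        part='snippet',
        fields='nextPageToken,items/snippet/resourceId',
        playlistId=playlist_id,
        maxResults=50,
    )
    while request:
        response = request.execute()
        for item in response.get('items', []):
            resource = item['snippet']['resourceId']
            if resource['kind'] == 'youtube#video':
                video_ids.append(resource['videoId'])
        # list_next() returns None when there is no further page.
        request = youtube.playlistItems().list_next(request, response)
    return video_ids
```

With a real client, len(fetch_upload_video_ids(youtube, "UCD-miitqNY3nyukJ4Fnf4_A")) is the number to compare against the 412 IDs that the Search.list loop in the question returned.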
Note that, when using Google's APIs Client Library for Python (as you do), paginating API result sets is simple: just use the list_next method of the Python API object corresponding to the respective paginated API endpoint (as shown above):
request = API_OBJECT.list(...)

while request:
    response = request.execute()
    ...
    request = API_OBJECT.list_next(
        request, response)
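The same loop can be wrapped once in a generator and reused for any paginated endpoint. This is a sketch under the assumption that the object passed in exposes list() and list_next() the way youtube.playlistItems() and youtube.search() do; the name iter_pages is hypothetical.

```python
def iter_pages(collection, **params):
    """Yield each response page of a paginated list endpoint.

    `collection` is any API object with list() and list_next(),
    for example youtube.playlistItems().
    """
    request = collection.list(**params)
    while request is not None:
        response = request.execute()
        yield response
        # list_next() returns None once there are no further pages.
        request = collection.list_next(request, response)
```

A caller would then iterate with something like: for page in iter_pages(youtube.playlistItems(), part='snippet', playlistId=playlist_id, maxResults=50), and read page.get('items', []) inside the loop.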
Also note that I used the fields request parameter twice above. This is good practice: ask the API only for the info that is of actual use.
One important note: the PlaylistItems.list endpoint does not return items that correspond to private videos of a channel when invoked with an API key. This is the case when your youtube object was constructed by calling the function apiclient.discovery.build with the parameter developerKey.
PlaylistItems.list returns items corresponding to private videos only to the channel owner. This is the case when the youtube object is constructed by calling the function apiclient.discovery.build with the parameter credentials, and those credentials refer to the channel that owns the respective playlist.
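The difference between the two ways of constructing the youtube object can be sketched as a small helper that chooses the keyword arguments for build. The helper name youtube_build_kwargs is hypothetical, and obtaining the OAuth credentials object itself is outside the scope of this sketch.

```python
def youtube_build_kwargs(api_key=None, credentials=None):
    """Choose the keyword arguments for build('youtube', 'v3', **kwargs).

    Exactly one of api_key / credentials must be given.
    """
    if (api_key is None) == (credentials is None):
        raise ValueError('pass exactly one of api_key or credentials')
    if credentials is not None:
        # Credentials of the channel owner: items for private videos
        # of that channel's playlists are returned as well.
        return {'credentials': credentials}
    # Plain API key: items for private videos are not returned.
    return {'developerKey': api_key}
```

It would be used as build('youtube', 'v3', **youtube_build_kwargs(api_key='YOUR_API_KEY')), where the key string is a placeholder.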
An additional important note: according to Google staff, there is by design an upper limit of 20,000 on the number of items returned by the PlaylistItems.list endpoint when queried for a given channel's uploads playlist. This is unfortunate, but a fact.
Upvotes: 1