Reputation:
Today I have an (I think) simple question.
I want to exclude YouTube live streams from my YouTube Data API Search.list
result sets.
How can I do that? I can't find a parameter for this in the API docs.
This is what I tried:
https://www.googleapis.com/youtube/v3/search?channelId=UCZMsvbAhhRblVGXmEXW8TSA&part=snippet,id&order=viewCount&maxResults=1&regionCode=DE&eventType=completed&type=video&publishedAfter=2021-02-05T00:00:00Z&key={KEY}
But that still returns live streams, and I want to remove them from the search. The live streams always have LIVE in the video title, maybe that helps. I also tried using the q parameter, but then I always get 0 search results.
Upvotes: 1
Views: 1708
Reputation: 180
I came across the same issue and couldn't find a parameter to filter out live broadcasts.
I worked around it using just the duration property in the response. A live broadcast seems to have the constant value P0D instead of an actual duration, so I use that to filter out the videos I don't want.
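A minimal sketch of this duration check, under a few assumptions: the items are shaped like Videos.list resources requested with part=contentDetails (Search.list results themselves carry no duration), and P0D is the placeholder observed above rather than a documented guarantee. The helper name drop_zero_duration and the sample IDs are hypothetical.

```python
def drop_zero_duration(items):
    """Keep only items whose contentDetails.duration is not 'P0D'.

    Assumption: 'P0D' marks a live broadcast, as observed in practice.
    Items without a contentDetails.duration value are kept.
    """
    return [
        item for item in items
        if item.get('contentDetails', {}).get('duration') != 'P0D'
    ]


# Hand-made sample data (hypothetical IDs), shaped like Videos.list items.
sample = [
    {'id': 'vid_regular', 'contentDetails': {'duration': 'PT4M13S'}},
    {'id': 'vid_live', 'contentDetails': {'duration': 'P0D'}},
]
kept_ids = [item['id'] for item in drop_zero_duration(sample)]
print(kept_ids)
```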
Upvotes: 0
Reputation: 6985
The short answer: you'll have to manually filter out the live streams from each result set obtained from the Search.list API endpoint.
The longer answer follows below:
A given video is a live stream if and only if it has the property liveStreamingDetails attached.
I'd warmly recommend going through a very recent answer of mine that details a Python solution to precisely this issue, the function get_non_livestream_videos:
def get_non_livestream_videos(youtube, video_ids):
    assert len(video_ids) <= 50
    if not video_ids: return []

    response = youtube.videos().list(
        fields = 'items(id,liveStreamingDetails)',
        part = 'id,liveStreamingDetails',
        maxResults = len(video_ids),
        id = ','.join(video_ids),
    ).execute()

    items = response.get('items', [])
    assert len(items) <= len(video_ids)

    not_live = lambda video: \
        not video.get('liveStreamingDetails')
    video_id = lambda video: video['id']

    return map(video_id, filter(not_live, items))
Given a list of video IDs video_ids, this function calls the Videos.list API endpoint to determine whether the property liveStreamingDetails exists for each of the respective videos. Any video that has this property is filtered out of the resulting list of video IDs.
Note that above I used the fields request parameter to get from the API only the info that's actually needed.
Also note that a precondition of using get_non_livestream_videos is that its list argument video_ids contains at most 50 elements.
This is not an actual restriction when using this function, because it's supposed to be used on a list of video IDs obtained from a paged API result set. (Search.list returns paginated result sets of at most 50 items.)
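If the IDs come from somewhere other than a single result page, the 50-element precondition can be met by slicing the list first. Below is a rough sketch of that idea. The helper filter_in_batches is hypothetical, and fake_filter is only an offline stand-in for get_non_livestream_videos so that the example runs without an API key.

```python
def filter_in_batches(video_ids, batch_filter, batch_size=50):
    """Apply batch_filter to consecutive slices of at most batch_size IDs."""
    kept = []
    for start in range(0, len(video_ids), batch_size):
        batch = video_ids[start:start + batch_size]
        kept.extend(batch_filter(batch))
    return kept


def fake_filter(batch):
    # Stand-in for get_non_livestream_videos: checks the same
    # precondition and drops the (made-up) IDs starting with 'live_'.
    assert len(batch) <= 50
    return [vid for vid in batch if not vid.startswith('live_')]


ids = ['vid_%d' % i for i in range(120)] + ['live_1', 'live_2']
result = filter_in_batches(ids, fake_filter)
print(len(result))
```

With a real client object, the stand-in would presumably be replaced by something like lambda batch: get_non_livestream_videos(youtube, batch).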
The solution indicated above excludes from a list of video IDs all the videos that are upcoming, live, or completed live broadcasts.
Notice that videos corresponding to completed live broadcasts are excluded too.
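If the distinction between these three cases matters, the timestamps inside liveStreamingDetails can probably be used to tell them apart. The sketch below assumes that actualEndTime is present only once a broadcast is over and that actualStartTime is present only once it has started. The function name broadcast_state and the sample data are hypothetical.

```python
def broadcast_state(video):
    """Classify a Videos.list item by its liveStreamingDetails timestamps.

    Assumption: actualEndTime appears only after the broadcast has ended,
    and actualStartTime only after it has started.
    """
    details = video.get('liveStreamingDetails')
    if not details:
        return 'none'        # ordinary upload, never a broadcast
    if 'actualEndTime' in details:
        return 'completed'
    if 'actualStartTime' in details:
        return 'live'
    return 'upcoming'


# Hand-made items (hypothetical IDs and times).
samples = [
    {'id': 'a'},
    {'id': 'b', 'liveStreamingDetails': {
        'scheduledStartTime': '2021-02-06T10:00:00Z'}},
    {'id': 'c', 'liveStreamingDetails': {
        'actualStartTime': '2021-02-06T10:00:00Z'}},
    {'id': 'd', 'liveStreamingDetails': {
        'actualStartTime': '2021-02-06T10:00:00Z',
        'actualEndTime': '2021-02-06T11:00:00Z'}},
]
states = [broadcast_state(video) for video in samples]
print(states)
```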
Now, if you do not want completed live broadcasts to be excluded from your result set (that is, you only need to exclude the videos corresponding to upcoming or currently live broadcasts), then there's a simpler solution to your inquiry:
The Search resource objects returned by the Search.list endpoint provide the following property:
snippet.liveBroadcastContent (string)
An indication of whether a video or channel resource has live broadcast content. Valid property values are upcoming, live, and none.
For a video resource, a value of upcoming indicates that the video is a live broadcast that has not yet started, while a value of live indicates that the video is an active live broadcast. For a channel resource, a value of upcoming indicates that the channel has a scheduled broadcast that has not yet started, while a value of live indicates that the channel has an active live broadcast.
Consequently, you can manually filter out from Search.list results the videos whose snippet.liveBroadcastContent has a value other than none, as simply as shown below:
not_live = lambda item: \
    item['snippet']['liveBroadcastContent'] == 'none'

request = youtube.search().list(
    fields = 'nextPageToken,items(id,snippet)',
    publishedAfter = '2021-02-05T00:00:00Z',
    channelId = 'UCZMsvbAhhRblVGXmEXW8TSA',
    part = 'id,snippet',
    order = 'viewCount',
    regionCode = 'DE',
    maxResults = 50,
    type = 'video'
)
videos = []

while request:
    response = request.execute()

    items = response.get('items', [])

    videos.extend(filter(not_live, items))

    request = youtube.search().list_next(
        request, response)
An important note: the solution above is simple, but limited by the following specification of the channelId request parameter of Search.list:
channelId (string)
The channelId parameter indicates that the API response should only contain resources created by the channel.
Note: Search results are constrained to a maximum of 500 videos if your request specifies a value for the channelId parameter and sets the type parameter value to video, but it does not also set one of the forContentOwner, forDeveloper, or forMine filters.
Consequently, the pagination loop above, over the result sets provided by Search.list, would lead to a videos list of at most 500 elements.
Upvotes: 3