user14766510

YouTube Data API: Exclude livestreams from Search.list result set

I have what I think is a simple question. I want to exclude YouTube live streams from my YouTube Data API Search.list result sets. How can I do that? I can't find a parameter for this in the API docs.

This is what I tried:

https://www.googleapis.com/youtube/v3/search?channelId=UCZMsvbAhhRblVGXmEXW8TSA&part=snippet,id&order=viewCount&maxResults=1&regionCode=DE&eventType=completed&type=video&publishedAfter=2021-02-05T00:00:00Z&key={KEY}

But that includes live streams, which I want to remove from the search results. The live streams always have LIVE in the video title; maybe that helps. I also tried using q, but I always get 0 search results.

Upvotes: 1

Views: 1708

Answers (2)

Luyang Du

Reputation: 180

I came across the same issue and couldn't find a parameter to filter out live broadcasts.

I worked around it using the duration property in the response. It seems that a live broadcast has the constant duration value P0D instead of an actual duration, so I use that to filter out the videos I don't want.
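
The filtering described above could be sketched roughly as follows. This is a minimal illustration, not tested against the live API: it assumes the items are video resources that include contentDetails.duration (for example from a Videos.list call with part=contentDetails, since Search.list results carry no duration), and the helper name drop_zero_duration is made up for this example.

```python
def drop_zero_duration(items):
    """Return only the video resources whose duration is not 'P0D'.

    Assumption: each item is a video resource dict that may carry
    contentDetails.duration. Items without a duration are kept.
    """
    return [
        item for item in items
        if item.get('contentDetails', {}).get('duration') != 'P0D'
    ]


# Hand-written sample data, not real API output.
sample = [
    {'id': 'a', 'contentDetails': {'duration': 'PT4M13S'}},
    {'id': 'b', 'contentDetails': {'duration': 'P0D'}},
]
kept_ids = [item['id'] for item in drop_zero_duration(sample)]
```

Here kept_ids contains only the ID of the first sample item, because the second one reports P0D.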

Upvotes: 0

stvar

Reputation: 6985

The short answer is the following: you'll have to manually filter out the live-stream videos from each result set obtained from the Search.list API endpoint.

The longer answer follows below:

A given video is a live stream if and only if its resource carries the property liveStreamingDetails.

I'd warmly recommend going through a recent answer of mine that details a Python solution to precisely this issue, the function get_non_livestream_videos:

def get_non_livestream_videos(youtube, video_ids):
    # Precondition: at most 50 video IDs per call.
    assert len(video_ids) <= 50
    if not video_ids:
        return []

    # Ask only for the ID and liveStreamingDetails of each video.
    response = youtube.videos().list(
        fields='items(id,liveStreamingDetails)',
        part='id,liveStreamingDetails',
        maxResults=len(video_ids),
        id=','.join(video_ids),
    ).execute()

    items = response.get('items', [])
    assert len(items) <= len(video_ids)

    # Keep the IDs of the videos that lack liveStreamingDetails.
    # A list is returned (map() would yield an iterator in Python 3).
    return [
        video['id'] for video in items
        if not video.get('liveStreamingDetails')
    ]

If you have a list of video IDs, video_ids, then this function calls the Videos.list API endpoint to determine whether the property liveStreamingDetails exists for each of the respective videos. Any video that has such a property gets filtered out from the resulting list of video IDs.

Note that above I used the fields request parameter to get from the API only the info that's actually needed.

Also note that a precondition of using get_non_livestream_videos is that its list argument video_ids has at most 50 elements.

This is not a real restriction in practice, because the function is meant to be used on a list of video IDs obtained from a paged API result set. (Search.list returns paginated result sets of at most 50 items.)
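
If the IDs do not come straight from a single result page, they can be split into batches of at most 50 before each call. A minimal sketch, where chunked is a hypothetical helper that is not part of the original answer:

```python
def chunked(video_ids, size=50):
    """Yield consecutive slices of video_ids with at most `size` elements."""
    for start in range(0, len(video_ids), size):
        yield video_ids[start:start + size]


# Placeholder IDs for illustration only: 120 IDs give three batches.
ids = ['id%d' % n for n in range(120)]
batch_sizes = [len(batch) for batch in chunked(ids)]
```

Each batch could then be passed on its own to get_non_livestream_videos(youtube, batch), which keeps the 50-element precondition satisfied.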


The solution indicated above excludes from a list of video IDs all the videos that are upcoming, live, or completed live broadcasts.

Notice that videos corresponding to completed live broadcasts are excluded too.
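
If the same Videos.list data should also tell completed broadcasts apart from upcoming or active ones, the timestamps inside liveStreamingDetails could be inspected. The following is only a rough sketch: it assumes that actualStartTime appears once a broadcast has started and actualEndTime once it has ended, and broadcast_state is a hypothetical helper name.

```python
def broadcast_state(video):
    """Classify a video resource as 'none', 'upcoming', 'live' or 'completed'.

    Assumption: the presence of actualStartTime / actualEndTime in
    liveStreamingDetails reflects whether the broadcast started / ended.
    """
    details = video.get('liveStreamingDetails')
    if not details:
        return 'none'        # regular upload, never a broadcast
    if 'actualEndTime' in details:
        return 'completed'   # broadcast has finished
    if 'actualStartTime' in details:
        return 'live'        # started but not yet ended
    return 'upcoming'        # scheduled only


# Hand-written sample resources, not real API output.
samples = [
    {'id': 'a'},
    {'id': 'b', 'liveStreamingDetails': {'scheduledStartTime': '2021-02-06T10:00:00Z'}},
    {'id': 'c', 'liveStreamingDetails': {'actualStartTime': '2021-02-06T10:00:00Z'}},
    {'id': 'd', 'liveStreamingDetails': {'actualStartTime': '2021-02-06T10:00:00Z',
                                         'actualEndTime': '2021-02-06T11:00:00Z'}},
]
states = [broadcast_state(video) for video in samples]
```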

Now, if you don't want this kind of video to be excluded from your result set (that is, you need to exclude only the videos corresponding to upcoming or currently live broadcasts), then there's a simpler solution to your inquiry:

The Search resource objects returned by the Search.list endpoint provide the following property:

snippet.liveBroadcastContent (string)

An indication of whether a video or channel resource has live broadcast content. Valid property values are upcoming, live, and none.

For a video resource, a value of upcoming indicates that the video is a live broadcast that has not yet started, while a value of live indicates that the video is an active live broadcast. For a channel resource, a value of upcoming indicates that the channel has a scheduled broadcast that has not yet started, while a value of live indicates that the channel has an active live broadcast.

Consequently, you can manually filter out of the Search.list results the videos whose snippet.liveBroadcastContent value is anything other than none, as simply as shown below:

# `youtube` is assumed to be an API client created beforehand, e.g. via
# googleapiclient.discovery.build('youtube', 'v3', ...).

def not_live(item):
    # True for videos that are neither upcoming nor currently live.
    return item['snippet']['liveBroadcastContent'] == 'none'

request = youtube.search().list(
    fields='nextPageToken,items(id,snippet)',
    publishedAfter='2021-02-05T00:00:00Z',
    channelId='UCZMsvbAhhRblVGXmEXW8TSA',
    part='id,snippet',
    order='viewCount',
    regionCode='DE',
    maxResults=50,
    type='video'
)
videos = []

# Walk through all result pages; list_next returns None after the last page.
while request:
    response = request.execute()
    items = response.get('items', [])
    videos.extend(filter(not_live, items))
    request = youtube.search().list_next(
        request, response)

An important note: the solution above is simple, but limited by the following specification of the channelId request parameter of Search.list:

channelId (string)

The channelId parameter indicates that the API response should only contain resources created by the channel.

Note: Search results are constrained to a maximum of 500 videos if your request specifies a value for the channelId parameter and sets the type parameter value to video, but it does not also set one of the forContentOwner, forDeveloper, or forMine filters.

Consequently, the pagination loop above, over the result sets provided by Search.list, would lead to a videos list of at most 500 elements.

Upvotes: 3
