bobygerm

Reputation: 483

YouTube Data API v3 search query with Freebase topic ID and date

I am trying to use the YouTube Data API to search for videos by Freebase topic. A simple search with one topic works, but a search with two parameters, a topic ID and a "publishedAfter" date, does not.

For example, to search for videos about Richard Branson (Freebase ID = /m/0n839), I used the API explorer at https://developers.google.com/youtube/v3/docs/search/list#try-it with the parameters:

part : snippet

topicId : /m/0n839

This shows a result of 2165 videos.

But when you search for videos about ID = /m/0n839 with

part : snippet

topicId : /m/0n839

publishedAfter : 2013-05-21T21:47:38Z

the response reports 147 videos but contains only 3. In the response, you can see that the 3 video IDs are J6PY5vxLU8Y, SCkFwpW3kiE and CBvDp0i8Iok.

{
  "kind": "youtube#searchListResponse",
  "etag": "\"g-RLCMLrfPIk8n3AxYYPPliWWoo/_-rYO0M0nvpPO6QN7DPFGEYa9ho\"",
  "pageInfo": {
    "totalResults": 147,
    "resultsPerPage": 5
  },
  "nextPageToken": "CAUQAA",
  "items": [
    {
      "kind": "youtube#searchResult",
      "etag": "\"g-RLCMLrfPIk8n3AxYYPPliWWoo/3nxQ-ejnv0qekcbyq09HD2RQt_w\"",
      "id": {
        "kind": "youtube#video",
        "videoId": "J6PY5vxLU8Y"
      },
      "snippet": {
        "publishedAt": "2013-05-22T09:58:34.000Z",
        "channelId": "UCimPiDCqxvfqwVJltL4YzHg",
        "title": "Bono, Richard Branson, and Olivia Wilde Joined Matt Damon's Strike!",
        "description": "Support the strike: http://strikewithme.org/ Millions of celebrities have joined Matt Damon's \"Toilet Strike\" protesting the lack of access to safe water and...",
        "thumbnails": {
          "default": {
            "url": "https://i.ytimg.com/vi/J6PY5vxLU8Y/default.jpg"
          },
          "medium": {
            "url": "https://i.ytimg.com/vi/J6PY5vxLU8Y/mqdefault.jpg"
          },
          "high": {
            "url": "https://i.ytimg.com/vi/J6PY5vxLU8Y/hqdefault.jpg"
          }
        },
        "channelTitle": "water"
      }
    },
    {
      "kind": "youtube#searchResult",
      "etag": "\"g-RLCMLrfPIk8n3AxYYPPliWWoo/gNslTbFkShGLlUBcXgHw7q9tuJc\"",
      "id": {
        "kind": "youtube#video",
        "videoId": "SCkFwpW3kiE"
      },
      "snippet": {
        "publishedAt": "2013-05-23T16:27:31.000Z",
        "channelId": "UCeF4WiRKOA4XzphWYzR9aVw",
        "title": "Sir Richard Branson dresses as an air stewardess after losing bet",
        "description": "(Reuters) - A man was arrested at the Cannes film festival on Friday after firing a starting pistol during a live TV broadcast on the palm-lined waterfront, ...",
        "thumbnails": {
          "default": {
            "url": "https://i.ytimg.com/vi/SCkFwpW3kiE/default.jpg"
          },
          "medium": {
            "url": "https://i.ytimg.com/vi/SCkFwpW3kiE/mqdefault.jpg"
          },
          "high": {
            "url": "https://i.ytimg.com/vi/SCkFwpW3kiE/hqdefault.jpg"
          }
        },
        "channelTitle": "TheDailyPolitics247"
      }
    },
    {
      "kind": "youtube#searchResult",
      "etag": "\"g-RLCMLrfPIk8n3AxYYPPliWWoo/-_OEOHhzgCBTLr7x5UoDk2kHlJM\"",
      "id": {
        "kind": "youtube#video",
        "videoId": "CBvDp0i8Iok"
      },
      "snippet": {
        "publishedAt": "2013-05-25T22:38:00.000Z",
        "channelId": "UC2j75rAKcfjBAhW7WmNY4Qg",
        "title": "Virgin Galactic Spaceship Passes Big Test (Y)",
        "description": "A spaceship bankrolled by British tycoon Sir Richard Branson made its first engine-powered flight Monday. The test flight moves Virgin Galactic toward its go...",
        "thumbnails": {
          "default": {
            "url": "https://i.ytimg.com/vi/CBvDp0i8Iok/default.jpg"
          },
          "medium": {
            "url": "https://i.ytimg.com/vi/CBvDp0i8Iok/mqdefault.jpg"
          },
          "high": {
            "url": "https://i.ytimg.com/vi/CBvDp0i8Iok/hqdefault.jpg"
          }
        },
        "channelTitle": "NewActionNews"
      }
    }
  ]
}

If you request the next page using the nextPageToken value "CAUQAA", the response contains the same video IDs (J6PY5vxLU8Y, SCkFwpW3kiE, CBvDp0i8Iok):

{
  "kind": "youtube#searchListResponse",
  "etag": "\"g-RLCMLrfPIk8n3AxYYPPliWWoo/xMtmG2pQsuo_TFF8AtaaPea-cNc\"",
  "pageInfo": {
    "totalResults": 144,
    "resultsPerPage": 5
  },
  "nextPageToken": "CAoQAA",
  "prevPageToken": "CAUQAQ",
  "items": [
    {
      "kind": "youtube#searchResult",
      "etag": "\"g-RLCMLrfPIk8n3AxYYPPliWWoo/3nxQ-ejnv0qekcbyq09HD2RQt_w\"",
      "id": {
        "kind": "youtube#video",
        "videoId": "J6PY5vxLU8Y"
      },
      "snippet": {
        "publishedAt": "2013-05-22T09:58:34.000Z",
        "channelId": "UCimPiDCqxvfqwVJltL4YzHg",
        "title": "Bono, Richard Branson, and Olivia Wilde Joined Matt Damon's Strike!",
        "description": "Support the strike: http://strikewithme.org/ Millions of celebrities have joined Matt Damon's \"Toilet Strike\" protesting the lack of access to safe water and...",
        "thumbnails": {
          "default": {
            "url": "https://i.ytimg.com/vi/J6PY5vxLU8Y/default.jpg"
          },
          "medium": {
            "url": "https://i.ytimg.com/vi/J6PY5vxLU8Y/mqdefault.jpg"
          },
          "high": {
            "url": "https://i.ytimg.com/vi/J6PY5vxLU8Y/hqdefault.jpg"
          }
        },
        "channelTitle": "water"
      }
    },
    {
      "kind": "youtube#searchResult",
      "etag": "\"g-RLCMLrfPIk8n3AxYYPPliWWoo/cEIRgKqwt1aa9hcWMNtGTiCJImc\"",
      "id": {
        "kind": "youtube#video",
        "videoId": "h7hJ3FDGWY8"
      },
      "snippet": {
        "publishedAt": "2013-05-22T10:01:25.000Z",
        "channelId": "UCqcE1T9zcUQyX3hHH4EM7sQ",
        "title": "Sir Richard Branson in Dubai",
        "description": "The man behind the Virgin brand stopped by Kris Fade's show last week - broadcasting from the Burj Khalifa, the world's tallest building.",
        "thumbnails": {
          "default": {
            "url": "https://i.ytimg.com/vi/h7hJ3FDGWY8/default.jpg"
          },
          "medium": {
            "url": "https://i.ytimg.com/vi/h7hJ3FDGWY8/mqdefault.jpg"
          },
          "high": {
            "url": "https://i.ytimg.com/vi/h7hJ3FDGWY8/hqdefault.jpg"
          }
        },
        "channelTitle": "Kimberleyleonard"
      }
    }
  ]
}

Did I do something wrong?

Upvotes: 1

Views: 3111

Answers (2)

Ibrahim Ulukaya

Reputation: 12877

I'm pretty sure the discrepancy between the number of results reported and the number actually shown is due to how the date restriction (start_time: 1369172858, the representation of "publishedAfter : 2013-05-21T21:47:38Z") is implemented. By the way, I now see 10 matches, including the three mentioned.

First, results are retrieved using the narrowest restrict range that includes the requested date range. That's how you can get ~140 matches. The videos that survive retrieval are then filtered, rejecting those outside the actual requested range. It's quite plausible that 130 videos are dropped at that stage, leaving the 10 that actually satisfy the request.

The count of matches indicates the number retrieved; for a date range, this will typically be an overestimate, possibly a severe one. We generally don't guarantee that everything reported as "matched" actually matches, since various kinds of filtering happen after retrieval.
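As a quick check of the start_time value quoted above, the publishedAfter timestamp can be converted to Unix seconds with the standard library. The helper name rfc3339_to_epoch is only illustrative and is not part of any YouTube client.

```python
from datetime import datetime, timezone


def rfc3339_to_epoch(timestamp: str) -> int:
    """Convert a UTC timestamp such as '2013-05-21T21:47:38Z' to Unix seconds."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    # The trailing 'Z' means UTC, so attach that timezone before converting.
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


print(rfc3339_to_epoch("2013-05-21T21:47:38Z"))  # 1369172858
```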
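To make the two stages concrete, here is a toy model in Python. It is only a sketch: the month-sized bucket and the sample videos are invented for illustration, and the real index almost certainly uses different ranges. Only the retrieve-then-filter shape comes from the explanation above.

```python
from datetime import datetime, timezone


def retrieve_then_filter(videos, published_after):
    """Return (reported_count, matching_videos) for a coarse-then-exact search."""
    # Stage 1 (assumed granularity): retrieve everything from the start of the
    # month that contains the requested date. This is what gets counted.
    bucket_start = published_after.replace(
        day=1, hour=0, minute=0, second=0, microsecond=0
    )
    retrieved = [v for v in videos if v["publishedAt"] >= bucket_start]
    # Stage 2: drop retrieved videos that fall outside the exact requested range.
    matching = [v for v in retrieved if v["publishedAt"] >= published_after]
    return len(retrieved), matching


def utc(*args):
    return datetime(*args, tzinfo=timezone.utc)


# Hypothetical sample data, not real API output.
videos = [
    {"videoId": "a", "publishedAt": utc(2013, 4, 30, 12, 0, 0)},
    {"videoId": "b", "publishedAt": utc(2013, 5, 2, 8, 0, 0)},
    {"videoId": "c", "publishedAt": utc(2013, 5, 10, 8, 0, 0)},
    {"videoId": "d", "publishedAt": utc(2013, 5, 22, 9, 58, 34)},
    {"videoId": "e", "publishedAt": utc(2013, 5, 25, 22, 38, 0)},
]
reported, matching = retrieve_then_filter(videos, utc(2013, 5, 21, 21, 47, 38))
print(reported, [v["videoId"] for v in matching])  # 4 ['d', 'e']
```

In this model the reported count (4) is larger than the number of videos that survive the exact filter (2), which is the same kind of gap as 147 reported versus a handful returned.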

Upvotes: 0

jlmcdonald

Reputation: 13667

The "totalResults" attribute has, in previous versions of the API, always been just an estimate that the search algorithms provide before actually retrieving any results, so it's likely safe to assume that is the case for v3 as well. However, it is a little odd that the estimate could be so far off; for your query, there really are just a few results (5 or 6, I think, as several have been uploaded since you made this initial post).

I've played around with various parameters for a bit, and it looks as though the factor with the most influence on the accuracy of the totalResults approximation is the 'q' parameter: the more specific the value you provide there, the more accurate totalResults becomes.

In your query the q parameter is empty, and the totalResults approximation is incredibly far off (in fact, if you change the publishedAfter parameter to May 1st instead of May 21st, it actually approximates FEWER total results, even though it's an earlier date!). If you do a query like this, however:

https://www.googleapis.com/youtube/v3/search?part=id&maxResults=50&publishedAfter=2010-05-01T21%3A47%3A38Z&topicId=%2Fm%2F0n839&key={YOUR_KEY}&q=Bran

Then you get a totalResults approximation that exactly matches the number of real results.
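For reference, a URL like the one above can be assembled with Python's standard library rather than escaped by hand. This is a minimal sketch: build_search_url is a made-up helper name, and "YOUR_KEY" is a placeholder for your own API key.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def build_search_url(api_key, topic_id, published_after, q=None, max_results=50):
    """Build a search.list URL for a topic, a start date and an optional q term."""
    params = {
        "part": "id",
        "maxResults": max_results,
        "publishedAfter": published_after,
        "topicId": topic_id,
        "key": api_key,
    }
    if q:
        # Adding q is what made totalResults line up with the real results here.
        params["q"] = q
    # urlencode percent-escapes ':' and '/' in the date and the topic ID.
    return SEARCH_ENDPOINT + "?" + urlencode(params)


url = build_search_url("YOUR_KEY", "/m/0n839", "2010-05-01T21:47:38Z", q="Bran")
print(url)
```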

Of course this can be frustrating, because when incorporating Freebase IDs you often don't WANT to use the q parameter. The whole point of Freebase IDs is to use semantically relevant info that's based on more than just string matching! But it's clear that this is telling us something about the internal YouTube search algorithms and what they rely on. I'd venture that as Freebase integration becomes more mature, the search algorithms will be better able to adapt and you'll start seeing better totalResults approximations again.

As a workaround, you can use what you've noticed with the 'nextPageToken' to get a better count. In your query, set maxResults to 50, and when you get actual results, have your routine count them. If there are fewer than 50, you've got them all. If there are exactly 50 on that page, you might want to pre-fetch the next set of results: if they are new results, you're good to go, while if they are the same results you already have, then you had exactly 50 responses. The one problem is that this prevents you from displaying an accurate count of total results up front in your app (i.e. if you've got pagination going somewhere), so it's not perfect, but what workaround is?
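A rough sketch of that counting routine in Python follows. It is a variant of the steps above, not exactly the same: instead of comparing the page size to 50, it de-duplicates video IDs and stops as soon as a page adds nothing new or has no nextPageToken. fetch_page is a hypothetical callable you supply; it takes a page token (None for the first page) and returns the parsed JSON response.

```python
def collect_video_ids(fetch_page, max_pages=20):
    """Page through search results and return the distinct video IDs seen."""
    ordered_ids = []
    seen = set()
    token = None
    for _ in range(max_pages):  # hard cap so a repeating token cannot loop forever
        response = fetch_page(token)
        new_ids = []
        for item in response.get("items", []):
            video_id = item.get("id", {}).get("videoId")
            if video_id and video_id not in seen:
                seen.add(video_id)
                new_ids.append(video_id)
        ordered_ids.extend(new_ids)
        token = response.get("nextPageToken")
        # A page with only repeats, or no further token, means we have them all.
        if not new_ids or not token:
            break
    return ordered_ids


# Usage with canned pages standing in for real API responses.
fake_pages = {
    None: {"nextPageToken": "p2", "items": [{"id": {"videoId": "A"}}, {"id": {"videoId": "B"}}]},
    "p2": {"nextPageToken": "p3", "items": [{"id": {"videoId": "A"}}, {"id": {"videoId": "C"}}]},
    "p3": {"nextPageToken": "p4", "items": [{"id": {"videoId": "A"}}, {"id": {"videoId": "C"}}]},
}
ids = collect_video_ids(lambda token: fake_pages[token])
print(len(ids), ids)  # 3 ['A', 'B', 'C']
```

The length of the returned list is the count to trust, rather than totalResults.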

Upvotes: 3
