by abhishiek

Reputation: 667

Downloading embedded video using python

I am trying to download a video that is embedded in a page using Python. Below is a sample link:

https://matterhorn.dce.harvard.edu/engage/player/watch.html?id=f7ff1893-fbf7-4909-b44e-12e61a98a677

When I navigate to that page, it takes some time to load, and I also have to press play.

Any help would be much appreciated.

Upvotes: 1

Views: 4694

Answers (1)

Dekel

Reputation: 62676

If you view the generated source of the page (after the DOM has loaded and the JavaScript code has run), you will see that the URL points to an HTML page, not directly to a video. The page's JavaScript generates this HTML:

<div id="playerContainer_videoContainer_container" role="main" style="position: relative; display: block; margin-left: auto; margin-right: auto; width: 1902px; height: 1070px; top: 0px;">
<div id="overlayContainer" role="main" style="position: absolute; left: 0px; right: 0px; top: 0px; bottom: 0px; overflow: hidden; z-index: 10;"></div>
<img id="playerContainer_videoContainer_bkg" src="config/profiles/resources/slide_professor_paella.jpg" alt="" width="100%" height="100%" style="position: relative; top: 0px; left: 0px; right: 0px; bottom: 0px; z-index: 0;">
<video id="playerContainer_videoContainer_1" preload="auto" style="top: 18.4722%; left: 0.390625%; width: 65%; height: 65%; position: absolute; z-index: 1;" poster="https://da4w749qm6awt.cloudfront.net/engage-player/f7ff1893-fbf7-4909-b44e-12e61a98a677/attachment-5/presenter_delivery.jpg">
    <source src="https://da4w749qm6awt.cloudfront.net/engage-player/f7ff1893-fbf7-4909-b44e-12e61a98a677/24320288-b79e-49e5-93b6-96b4c208f8cb/presenter_delivery.mp4" type="video/mp4">
</video>
<video id="playerContainer_videoContainer_2" preload="auto" style="top: 33.4722%; left: 66.0156%; width: 33.75%; height: 33.75%; position: absolute; z-index: 1;" poster="https://da4w749qm6awt.cloudfront.net/engage-player/f7ff1893-fbf7-4909-b44e-12e61a98a677/attachment-8/presentation_delivery.jpg">
    <source src="https://da4w749qm6awt.cloudfront.net/engage-player/f7ff1893-fbf7-4909-b44e-12e61a98a677/93271e20-3f4b-4650-a7e3-95aac41fd3e5/presentation_delivery.mp4" type="video/mp4">
</video>

So the file you actually want to download is:

https://da4w749qm6awt.cloudfront.net/engage-player/f7ff1893-fbf7-4909-b44e-12e61a98a677/24320288-b79e-49e5-93b6-96b4c208f8cb/presenter_delivery.mp4
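Once you have a direct .mp4 URL like the one above, downloading it is an ordinary HTTP GET. Here is a minimal sketch using only the standard library. The function names `filename_from_url` and `download` and the `opener` parameter are my own choices for illustration, and it assumes the server hands out the file without cookies or special headers, which I have not verified:

```python
import posixpath
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def filename_from_url(url):
    """Return the last path segment of the URL, ignoring any query string."""
    return posixpath.basename(urlparse(url).path)


def download(url, dest_dir=".", opener=urllib.request.urlopen):
    """Stream the resource at `url` into `dest_dir` and return the saved path.

    `opener` is a hypothetical hook (defaulting to urlopen) so that the
    function can be exercised without touching the network.
    """
    dest = Path(dest_dir) / filename_from_url(url)
    # Copy in chunks so the whole video is never held in memory at once.
    with opener(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


# Usage (performs a real network request, so it is left commented out):
# download("https://da4w749qm6awt.cloudfront.net/engage-player/"
#          "f7ff1893-fbf7-4909-b44e-12e61a98a677/"
#          "24320288-b79e-49e5-93b6-96b4c208f8cb/presenter_delivery.mp4")
```

If the server rejects the request, you may need to send a browser-like User-Agent or a Referer header by passing a `urllib.request.Request` object instead of a plain URL string.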

If you check the Network tab (in your browser's developer tools), you will notice an AJAX request to this URL:

https://matterhorn.dce.harvard.edu/search/episode.json?id=f7ff1893-fbf7-4909-b44e-12e61a98a677&_=1477764682940

(you can see that the id here is the same one as the id in your original URL).
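Since the id is shared, the JSON URL can be derived from the watch URL. A small sketch follows; `episode_json_url` is a name I made up. It drops the `_=1477764682940` parameter, which looks like a cache-busting timestamp, on the assumption that the endpoint works without it:

```python
from urllib.parse import parse_qs, urlencode, urlparse


def episode_json_url(watch_url):
    """Build the search/episode.json URL for the id found in a watch.html URL."""
    parsed = urlparse(watch_url)
    ids = parse_qs(parsed.query).get("id")
    if not ids:
        raise ValueError("watch URL has no 'id' query parameter")
    query = urlencode({"id": ids[0]})
    # Same scheme and host as the player page, different path.
    return f"{parsed.scheme}://{parsed.netloc}/search/episode.json?{query}"


watch = ("https://matterhorn.dce.harvard.edu/engage/player/watch.html"
         "?id=f7ff1893-fbf7-4909-b44e-12e61a98a677")
print(episode_json_url(watch))
# https://matterhorn.dce.harvard.edu/search/episode.json?id=f7ff1893-fbf7-4909-b44e-12e61a98a677
```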

The response from this request is a JSON string:

{"search-results":{"searchTime":"1","total":"1","limit":"1","offset":"0","query":"(id:f7ff1893\\-fbf7\\-4909\\-b44e\\-12e61a98a677) AND oc_organization:mh_default_org AND (o

This is only part of the response, since the full response is too big to include here.

Part of the response is:

search-results.result.mediapackage.media.track

This is a list of 6 items, each of which has a URL property that you can use to get the relevant video links.
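As a rough sketch of that last step, the function below walks the path above in the parsed JSON and collects each track's URL. The key name `url` is my assumption about how the property is spelled in the response, and the sample data is made up purely for illustration:

```python
import json


def track_urls(episode, url_key="url"):
    """Collect the URL of every track in a parsed episode.json response.

    `url_key` is an assumed property name; adjust it if the real
    response spells it differently.
    """
    node = episode
    for key in ("search-results", "result", "mediapackage", "media", "track"):
        node = node[key]
    # Be defensive: treat a single track object like a one-item list.
    tracks = node if isinstance(node, list) else [node]
    return [t[url_key] for t in tracks if url_key in t]


# Made-up sample with the same nesting as the path above.
sample = json.loads("""
{"search-results": {"result": {"mediapackage": {"media": {"track": [
    {"url": "https://example.com/a.mp4"},
    {"url": "https://example.com/b.mp4"}
]}}}}}
""")
print(track_urls(sample))
# ['https://example.com/a.mp4', 'https://example.com/b.mp4']
```

With the real response you would parse the body of the episode.json request with `json.loads` and pass the result to `track_urls`. Not every track is necessarily an .mp4 file, so you may want to filter the list by extension.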

I think this information gives you a good place to start :)

Upvotes: 3
