Reputation: 52
I have been playing around with the requests library in Python 3 for quite some time now and have decided to create a test program. For this program, I'm using the website https://ytmp3.cc/ as an example, but it turns out that a lot seems to be going on client-side.
Some keys and other values are being generated. I have been using Firefox's built-in network monitor to figure out which requests produce them, but without luck.
As far as I know, the requests library can't keep a "page" open and modify the DOM and content by making more requests.
Could anyone take a look and give a qualified guess on how the special keys are generated, and how I could get them for my own requests?
For example, when loading the webpage, the first request made is for the root, and the response contains the webpage HTML. What I noticed is that at the bottom there's a URL containing some key and number.
<script id="cs" src="js/converter-1.0.js?o=7_1a-a&=_1519520467"></script>
id 7_1a-a
number _1519520467
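To make that concrete, here is a minimal sketch of pulling both values out of the returned HTML with a regular expression. The function name and the pattern are hypothetical; the pattern assumes the script tag always looks like the single example above, which may not hold.

```python
import re

# Assumed shape of the tag, based only on the one example seen in the HTML.
SCRIPT_SRC = re.compile(r'src="js/converter-[\d.]+\.js\?o=([^&"]+)&=([^"]+)"')


def extract_key_and_number(html):
    """Return (key, number) from the converter script tag, or None if absent."""
    match = SCRIPT_SRC.search(html)
    if match is None:
        return None
    return match.group(1), match.group(2)


sample = '<script id="cs" src="js/converter-1.0.js?o=7_1a-a&=_1519520467"></script>'
print(extract_key_and_number(sample))
```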
This is used for making the next request, but then many further requests are made, and some other keys show up as well. I can't find where these come from, since they are not returned by any response.
I know that when a YouTube link is inserted, a request is made to a URL like the one below.
https://d.ymcdn.cc/check.php?callback=jQuery33107639361236859977_1519520481166&v=eVD9j36Ke94&f=mp3&k=7_1a-a&_=1519520481168
This returns the following:
jQuery33107639361236859977_1519520481166({"sid":"21","hash":"2a6b2475b059101480f7f16f2dde67ac","title":"M\u00d8 - Kamikaze (Official Video)","ce":1,"error":""})
From this I can construct the download URL, using the hash from above:
https://yyd.ymcdn.cc/ + 2a6b2475b059101480f7f16f2dde67ac (hash) + /eVD9j36Ke94 (YouTube video id)
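In Python, that concatenation could be sketched as follows. The function name is hypothetical, and the URL layout is taken only from the single example above.

```python
DOWNLOAD_HOST = "https://yyd.ymcdn.cc"


def build_download_url(file_hash, video_id):
    """Join the host, the hash from check.php and the YouTube video id."""
    return "{}/{}/{}".format(DOWNLOAD_HOST, file_hash, video_id)


print(build_download_url("2a6b2475b059101480f7f16f2dde67ac", "eVD9j36Ke94"))
```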
But how do I get
jQuery33107639361236859977_1519520481166&v=eVD9j36Ke94
and 1519520481168, which I need to create the request?
Upvotes: 0
Views: 500
Reputation: 298136
You can probably save yourself and the operator of that website a lot of headache by just using youtube-dl, specifically with the --extract-audio --audio-format mp3 options. It's probably what that website itself uses.
youtube-dl is written in Python and can easily be used programmatically.
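As a rough sketch of the programmatic route, the options below mirror --extract-audio --audio-format mp3 using youtube-dl's embedding interface. The helper names, the output template and the 192 quality setting are my own choices, and the conversion step assumes ffmpeg is installed.

```python
def build_options(output_template="%(title)s.%(ext)s"):
    """Options roughly equivalent to --extract-audio --audio-format mp3."""
    return {
        "format": "bestaudio/best",
        "outtmpl": output_template,
        "postprocessors": [{
            "key": "FFmpegExtractAudio",   # needs ffmpeg on the PATH
            "preferredcodec": "mp3",
            "preferredquality": "192",     # arbitrary choice for this sketch
        }],
    }


def download_mp3(url):
    """Download the audio of one video and convert it to mp3."""
    import youtube_dl  # third-party: pip install youtube-dl

    with youtube_dl.YoutubeDL(build_options()) as ydl:
        ydl.download([url])
```

Calling download_mp3 with a full YouTube watch URL should leave an mp3 file in the current directory.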
If you insist on sending requests to that website for whatever reason, here's how I'd do it:
callback=jQuery33107639361236859977_1519520481166 specifies the name of the callback for the JSONP request. Any name you provide will be echoed back. For example, passing callback=foo will result in the following response:
foo({...})
You can omit it entirely and the server will serve just a JSON response in this case, which is nice.
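If you do end up with a JSONP response, unwrapping it is straightforward. Here is a minimal sketch; the function name and the regular expression are my own, and it falls back to plain JSON when no callback wrapper is present.

```python
import json
import re

# callbackName( ...payload... ) with an optional trailing semicolon.
JSONP = re.compile(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', re.DOTALL)


def parse_jsonp(text):
    """Strip a JSONP callback wrapper, if any, and decode the JSON inside."""
    match = JSONP.match(text)
    payload = match.group(1) if match else text
    return json.loads(payload)


sample = (r'jQuery33107639361236859977_1519520481166({"sid":"21",'
          r'"hash":"2a6b2475b059101480f7f16f2dde67ac",'
          r'"title":"M\u00d8 - Kamikaze (Official Video)","ce":1,"error":""})')
print(parse_jsonp(sample)["hash"])
```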
_=1519520481168 is just there to prevent the response from being cached. Its value is arbitrary, just like the above parameter (jQuery builds both from a random number and the current timestamp). The website checks for its existence, however, so you have to at least pass something in.
The website, like many, checks for a valid Referer header.
Here's a minimal cURL command line to make a request to that website:
curl 'https://d.ymcdn.cc/check.php?v=eVD9j36Ke94&f=mp3&k=aZa4__&_=1' -H 'Referer: https://ytmp3.cc/'
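The same request could be sketched in Python with only the standard library. The function names are hypothetical, the parameters are copied from the cURL command above, and I have not verified that the service still responds this way.

```python
import json
import urllib.parse
import urllib.request

CHECK_URL = "https://d.ymcdn.cc/check.php"


def build_check_request(video_id, key, cache_buster="1"):
    """Build the check.php request without a callback, so plain JSON comes back."""
    params = {"v": video_id, "f": "mp3", "k": key, "_": cache_buster}
    url = CHECK_URL + "?" + urllib.parse.urlencode(params)
    # The site rejects requests that lack a Referer header.
    return urllib.request.Request(url, headers={"Referer": "https://ytmp3.cc/"})


def check(video_id, key):
    """Send the request and decode the JSON body (performs network I/O)."""
    with urllib.request.urlopen(build_check_request(video_id, key)) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_check_request("eVD9j36Ke94", "aZa4__")
print(request.full_url)
```

With the requests library, the equivalent is requests.get(CHECK_URL, params=params, headers={"Referer": "https://ytmp3.cc/"}) followed by .json() on the response.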
Upvotes: 1