Reputation: 155
YouTube uses a ciphered signature for some videos, which is the case when use_cipher_signature = true
in the dictionary returned by
http://www.youtube.com/get_video_info?&video_id=Video_Id
Example id: _JQH3G0cCtY
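A minimal sketch of that check, assuming the get_video_info response is a URL-encoded query string and that the flag is spelled as written above. The sample response strings and the function name usesCipherSignature are made up for illustration.

```javascript
// Hypothetical helper: returns true when the parsed response carries
// use_cipher_signature=true (compared case-insensitively).
function usesCipherSignature(responseText) {
  const info = new URLSearchParams(responseText);
  const flag = info.get("use_cipher_signature");
  return flag !== null && flag.toLowerCase() === "true";
}

// Made-up sample responses, not real YouTube output.
console.log(usesCipherSignature("video_id=_JQH3G0cCtY&use_cipher_signature=True")); // true
console.log(usesCipherSignature("video_id=_JQH3G0cCtY")); // false
```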
The ciphered signature is really just a scrambled signature, which anybody can work out given a few sets of working signatures. But YouTube keeps changing the scrambling algorithm. I have seen a few YouTube video downloaders that keep working smoothly without being affected by these changes. I think they inspect the player and extract the decoding script from the player file, which is what keeps them going.
I need some help with the actual technique to use in this case. I am aware of 'youtube-dl', a Python program for downloading videos. I am not good at Python, but I think it uses the same approach.
There is also a user-script JS file available here: http://userscripts.org/scripts/show/25105 , which does the same thing.
Any help with a sane approach to decoding the cipher code in PHP or JS will be appreciated.
Upvotes: 12
Views: 30352
Reputation: 157
Well, I wish I had found this post earlier, because I just spent several days trying to understand why my old base.js stopped working. Mine is build b128dda0:
https://www.youtube.com/s/player/b128dda0/player_ias.vflset/en_GB/base.js
It broke after they released a new one:
https://www.youtube.com/s/player/8c7583ff/player_ias.vflset/en_GB/base.js
and changed the challenge calculation algorithms (they call it a cipher, but it's actually a challenge). For newbies who stumble upon this page, let me explain how this works. When you go to a YouTube page, they send you an HTML page that contains this challenge in one of two forms; sometimes one is used, sometimes the other, and I don't know how they decide which to use.
After creating an object out of the text with JSON.parse, the challenge is located at:
streamingData -> formats[XXX] array -> url
streamingData -> formats[XXX] array -> signatureCipher
You take that url, turn its search part into a URLSearchParams object and call:
URLSearchParams.prototype.get.call(URLSearchParams_Instance, "n")
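A small sketch of that lookup, assuming the player response has already been parsed with JSON.parse. The sample object is fabricated, and treating signatureCipher as a query string whose url field holds the stream URL is an assumption about its layout, not something stated above.

```javascript
// Returns the "n" challenge for one entry of streamingData.formats, or null.
function getChallenge(format) {
  let streamUrl = format.url;
  if (!streamUrl && format.signatureCipher) {
    // Assumption: signatureCipher is itself a query string with a "url" field.
    streamUrl = new URLSearchParams(format.signatureCipher).get("url");
  }
  if (!streamUrl) return null;
  return new URL(streamUrl).searchParams.get("n");
}

// Fabricated player response, for illustration only.
const playerResponse = {
  streamingData: {
    formats: [
      { url: "https://example.googlevideo.com/videoplayback?itag=18&n=abc123" },
      { signatureCipher: "s=XYZ&sp=sig&url=" +
          encodeURIComponent("https://example.googlevideo.com/videoplayback?itag=22&n=def456") },
    ],
  },
};

console.log(playerResponse.streamingData.formats.map(getChallenge));
// [ 'abc123', 'def456' ]
```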
"n" holds the challenge. However, if you are NOT on a YouTube page (for example, you are using "youtube.com/embed/" code), or you navigate around youtube.com , in which case YouTube doesn't actually open new pages but rather ajaxes new content into the existing page, then you get your challenge from:
https://www.youtube.com/youtubei/v1/player?key=
But why is this challenge even necessary? Well, you use html from youtube.com , but their video streaming data is located on googlevideo.com , so it works like this:
- youtube.com sends you the challenge, either through the HTML or through a POST to youtubei/v1/player
- the base.js script takes this challenge and solves it using some math, which the googies constantly change. That is really annoying, because I made some changes to base.js and don't feel like transferring all of them to a new script every time they update it and change the challenge-solving algorithms along the way
- the solved challenge is sent as a string to googlevideo.com as a URL search param:
https://XXX.googlevideo.com/videoplayback?n=Your_Solved_Challenge
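To make the last step concrete, here is a sketch of swapping the solved value into the stream URL. solveChallenge is a stand-in that just reverses the string; the real solver is whatever base.js ships at the moment. The host name and function names are placeholders.

```javascript
// Stand-in solver (NOT YouTube's algorithm): simply reverses the challenge.
function solveChallenge(n) {
  return n.split("").reverse().join("");
}

// Replaces the "n" search param with its solved form, leaving the rest alone.
function withSolvedChallenge(streamUrl, solve) {
  const u = new URL(streamUrl);
  const n = u.searchParams.get("n");
  if (n !== null) u.searchParams.set("n", solve(n));
  return u.toString();
}

console.log(withSolvedChallenge(
  "https://example.googlevideo.com/videoplayback?itag=18&n=abc123", solveChallenge));
// https://example.googlevideo.com/videoplayback?itag=18&n=321cba
```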
If you get it wrong, you get a big fat 403 from the server. It took me some time to find the two functions (yes, damn it, there are TWO challenge functions!). The problem is that Google chooses function and object names randomly when they release their code, but I think these two functions retain their names: they are called "vsa" and "Wla" respectively. If you don't solve "vsa", you get a 403 from the server. But! BUT! If you fail to solve the second function, you still get the video stream and can watch it, but the googies are insidious: they will send you the stream at 1/10 of normal speed, meaning video seeking, even watching, gets REAL slow.
So rather than moving all MY changes to the new script, I moved their updated challenge functions to my old. Problem solved, right? I open a video for testing, it's still slow like hell. Trying with the new script - works flawlessly. What am I missing? So I check the body of the XMLHttpRequest that is sent to "youtube.com/youtubei/v1/player" . I get both bodies - the one sent by the old script and the one sent by the new one, beautify them, unescape them and compare for differences. Nothing there! The only differences are:
visitorData -> string from html page, randomly chosen. irrelevant
appInstallData -> string from html page, randomly chosen. irrelevant
deviceExperimentId -> string from html page, randomly chosen. irrelevant
clientScreenNonce -> ???
cpn -> ???
adSignalsInfo array -> you like ads?
signatureTimestamp -> well, just a time signature, right? totally irrelevant, I am sure
So I think, maybe there's a difference in the headers. I compare the headers: the new script sends "INNERTUBE_CONTEXT_CLIENT_VERSION", which my own script doesn't. That's because if you send unsafe headers, modern browsers send an OPTIONS request first, which doesn't suit me: I am using a local HTML page to ajax YouTube HTML into a local iframe using XMLHttpRequest, to avoid the cross-origin iframe problem where frames from different origins cannot access each other. Now you may ask, how the hell did you manage to get videos working in a local iframe, considering it sends "file://" for Origin? Well, I told my browser proxy to replace "file://" with the YouTube host (https://www.youtube.com) when "youtubei/v1/player" is called, lol, and used String.prototype.replace to add a new BASE to the head of the responseText returned by XMLHttpRequest:
<base href='https://www.youtube.com/'>
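The String.prototype.replace step could look roughly like this. The regular expression and the function name injectBase are my own choices for the sketch; it assumes responseText contains an opening head tag.

```javascript
// Inserts a <base> element right after the opening <head ...> tag.
function injectBase(responseText) {
  const base = "<base href='https://www.youtube.com/'>";
  // (\s[^>]*)? allows attributes on <head> without matching <header>.
  return responseText.replace(/<head(\s[^>]*)?>/i, (openTag) => openTag + base);
}

console.log(injectBase("<html><head><title>t</title></head><body></body></html>"));
// <html><head><base href='https://www.youtube.com/'><title>t</title></head><body></body></html>
```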
Plus, I had to set location.protocol to "https:" explicitly in base.js, and I redirected calls to:
captions.js
heartbeat.js
annotations_module.js
to my own scripts, because those scripts depend on base.js, so when googies change their base.js function/object names during updates, the originals wouldn't work (they would try to call functions from base.js by their new names), plus I made some changes to them anyway.
Well, actually, Chrome-based browsers (don't know about Firefox) can be tricked into accepting a fake response to the OPTIONS request. Tell your local proxy to look out for "Access-Control-Request-Headers"; if you don't want the request to reach the server, send a 200 back immediately without passing the request on, plus a new "Access-Control-Allow-Headers" header containing all the headers Chrome/Edge etc. asked for. But that's annoying, so I disabled all unsafe headers inside base.js.
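A rough sketch of that proxy trick, written as a pure function so it does not depend on any particular proxy software. The function name, the shape of the returned object and the example header names are invented for the sketch. A real proxy would probably also have to send the other CORS headers the browser expects (for instance Access-Control-Allow-Origin), which goes beyond what is described above.

```javascript
// Given the request method and lower-cased request headers seen by a local
// proxy, decide whether to answer the preflight locally.
// Returns null when the request should simply be passed through.
function fakePreflight(method, headers) {
  const asked = headers["access-control-request-headers"];
  if (method !== "OPTIONS" || asked === undefined) return null;
  return {
    status: 200,
    headers: {
      // Echo back every header the browser asked permission for.
      "Access-Control-Allow-Headers": asked,
    },
  };
}

const reply = fakePreflight("OPTIONS", {
  "access-control-request-headers": "x-example-header,content-type",
});
console.log(reply.status, reply.headers["Access-Control-Allow-Headers"]);
// 200 x-example-header,content-type
```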
So I disable all unsafe headers in the new script as well to see what happens. Well, guess what, it still worked, I was getting googlevideo.com data at full speed! So it's not the headers. So what am I missing?! So I take a closer look at the challenge. What the...? "youtubei/v1/player" is sending me a challenge with 19 characters (used by my old "Wla" function), while the new script is getting a challenge string with 18 characters. How does the YouTube server distinguish them? I am at a loss. The headers sent are the same, the body sent is pretty much the same with insignificant differences, what is going on? So I delete the body properties and objects that differ, one by one. "visitorData"? Nada. "appInstallData"? Nada. "deviceExperimentId"? Nada. "adSignalsInfo"? Nada. "clientScreenNonce"? Nada. "cpn"? Nada. What the hell is going on? There's nothing left but the ridiculously irrelevant "signatureTimestamp" property! It can't be it! I delete it, request the challenge... and... that was it. This "signatureTimestamp" is not a time stamp at all. It's the actual build/version number of the YouTube player. My old script sent "19513", the new one "19515". If the property is missing, YouTube assumes you have the latest build. So I found the function which holds the build number and replaced the number with the new one. Now I get the new challenge, which works with the new challenge functions.
The problem is that if the challenge is located inside the HTML (as the "n" search parameter, see above), it will always be the challenge for the newest challenge-solving functions. Unlike the "youtubei/v1/player" POST request, the HTML doesn't distinguish between different build versions, so the old algorithms were causing problems.
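One way to avoid hard-coding the build number is to read it out of the player script text. This sketch assumes the number appears in base.js as signatureTimestamp followed by a colon and digits; that pattern, the sample snippet and the function name are assumptions for illustration only, and where the value goes in the request body is not shown here.

```javascript
// Looks for `signatureTimestamp: <digits>` in the player source text.
// Returns the number, or null when the (assumed) pattern is not present.
function findSignatureTimestamp(playerSource) {
  const m = /signatureTimestamp\s*:\s*(\d+)/.exec(playerSource);
  return m ? Number(m[1]) : null;
}

// Fabricated one-line stand-in for a piece of base.js.
console.log(findSignatureTimestamp("var x={signatureTimestamp:19515,foo:1};")); // 19515
```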
Upvotes: 1
Reputation: 35402
I've translated Akhilesh's answer to Swift 3 for the iOS people:
func decryptSignature(signature:String)->String {
    return bz(signature)
}
// Applies the fixed sequence of swaps, reversals and slices from the player file.
func bz(_ a:String)->String {
    var arrayA = Array(a.characters)
    arrayA = cz(arrayA, 61)
    arrayA = cz(arrayA, 5)
    arrayA = arrayA.reversed()
    arrayA = Array(arrayA[2..<arrayA.count])
    arrayA = cz(arrayA, 69)
    arrayA = Array(arrayA[2..<arrayA.count])
    arrayA = arrayA.reversed()
    return String(arrayA)
}
// Swaps element 0 with element b, wrapping b to the array length
// so that an out-of-range b cannot crash with an index error.
func cz(_ a:Array<Character>, _ b:Int)->Array<Character> {
    var arrayA = a
    let i = b % a.count
    let c = a[0]
    arrayA[0] = a[i]
    arrayA[i] = c
    return arrayA
}
But I think that this algorithm isn't enough: it decrypts the signature following one specific rule. In fact, according to this Perl script (youtubedown from Jamie Zawinski), the algorithm changes all the time, and the script has collected a list of rules and algorithms over many days! So far, only three commands are used in the ciphers, so we can represent them compactly:
# - r = reverse the string;
# - sN = slice from character N to the end;
# - wN = swap 0th and Nth character.
I think the best way is to implement something like:
import Foundation  // needed for components(separatedBy:)

func decryptChiper(_ commands:String, signature:String)->String {
    var a = Array(signature.characters)
    let cmdArray = commands.components(separatedBy: " ")
    for cmd in cmdArray {
        if cmd.isEmpty { continue }  // tolerate repeated spaces
        // Numeric argument of the command, if any ("w61" -> 61, "r" -> nil)
        var value:Int?
        if cmd.characters.count>1 {
            let secondChar = cmd.index(cmd.startIndex, offsetBy: 1)
            value = Int(cmd.substring(from:secondChar))
        }
        switch cmd[cmd.startIndex] {
        case "r": a = a.reversed()
        case "s":
            if let sliceFrom = value {
                a = Array(a[sliceFrom..<a.count])
            }
        case "w":
            if let swapValue = value {
                a = swap(a,swapValue)
            }
        default:break
        }
    }
    return String(a)
}
// Swaps element 0 with element b, wrapping b to the array length
// so that an out-of-range b cannot crash with an index error.
func swap(_ a:Array<Character>, _ b:Int)->Array<Character> {
    var arrayA = a
    let i = b % a.count
    let c = a[0]
    arrayA[0] = a[i]
    arrayA[i] = c
    return arrayA
}
Usage:
As an example, following Akhilesh's answer:
let signature = "D3D3434498D70C3080D9B084E48350F6519A9E9A71094.25F300BB180DDDD918EE0EBEDD174EE5D874EFEFF"
let decryptedSign = decryptChiper("w61 w5 r s2 w69 s2 r", signature: signature )
print(decryptedSign)
Output:
33D3494498D70C3E80D9B084E48350F6519A9E9A71094.25F300BB180DDDDD18EE0EBEDD174EE5D874E
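Since the question asked for JS, the same command interpreter can be sketched in JavaScript. This is a plain port of the r / sN / wN idea, not code taken from any player file. The function name is my own, and the swap index is taken modulo the length on both sides so an out-of-range N cannot index past the end.

```javascript
// Applies a space-separated command string ("r", "sN", "wN") to a signature.
function decryptCipher(commands, signature) {
  let a = signature.split("");
  for (const cmd of commands.split(" ")) {
    const op = cmd[0];
    const n = parseInt(cmd.slice(1), 10); // NaN for "r" or an empty command
    if (op === "r") {
      a.reverse();                        // r  = reverse the string
    } else if (op === "s" && !Number.isNaN(n)) {
      a = a.slice(n);                     // sN = keep characters N..end
    } else if (op === "w" && !Number.isNaN(n) && a.length > 0) {
      const i = n % a.length;             // wN = swap 0th and Nth character
      [a[0], a[i]] = [a[i], a[0]];
    }
  }
  return a.join("");
}

const signature = "D3D3434498D70C3080D9B084E48350F6519A9E9A71094.25F300BB180DDDD918EE0EBEDD174EE5D874EFEFF";
console.log(decryptCipher("w61 w5 r s2 w69 s2 r", signature));
// 33D3494498D70C3E80D9B084E48350F6519A9E9A71094.25F300BB180DDDDD18EE0EBEDD174EE5D874E
```

Keeping the rules as a command string means that only the string has to be updated when the player file changes, not the code.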
Upvotes: 1
Reputation: 41
The old s.ytimg.com/yts/jsbin/html5player-vfltdb6U3.js
is now a 404, and I think the new URL looks a bit more like hxxps://s.ytimg.com/yts/jsbin/player-en_US-vfl_cdzrt/base.js
If you search the JavaScript, you will find a bit of code that looks like this:
function(a,b,c)
{
    a=new Mr(a);
    a.set("alr","yes");
    a.set("keepalive","yes");
    a.set("ratebypass","yes");
    a.set("mime",(0,window.encodeURIComponent)(b.mimeType.split(";")[0]));
    c&&a.set("signature",xr(c));
    return a
},Jt=function(a,b){var c=Yr(b,"id"),c=c.replace(":",";");..............
}
The xr function that the above code calls looks like this:
xr=function(a)
{
    a=a.split("");
    wr.rF(a,54);
    wr.fs(a,75);
    wr.N0(a,1);
    wr.rF(a,52);
    wr.N0(a,3);
    wr.fs(a,31);
    wr.rF(a,16);
    wr.fs(a,38);
    return a.join("")
}
After that I start to get a bit lost with the JavaScript and could do with a bit of help myself, but talking about this on The Code Project gets you in trouble.
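The three methods on wr are presumably the usual reverse, drop-N and swap helpers, but the excerpt does not show which is which. One way to find out without reading the minified bodies is to run each helper on a known array and look at what comes out. The sketch below does that and prints the result in the compact r / sN / wN notation. The wr object in it is a stand-in written for this example: its bodies and the name-to-operation mapping are assumptions, not copied from base.js.

```javascript
// Stand-in helper object; the real bodies live in base.js and may differ.
const wr = {
  rF: function (a, b) { const c = a[0]; a[0] = a[b % a.length]; a[b % a.length] = c; },
  fs: function (a) { a.reverse(); },
  N0: function (a, b) { a.splice(0, b); },
};

// Probe a helper with a known array and report "r", "s" or "w" ("?" if unknown).
function classify(fn) {
  const work = ["a", "b", "c", "d", "e", "f"];
  const ret = fn(work, 2);
  // Some helpers mutate in place, others return a new array; handle both.
  const out = (Array.isArray(ret) ? ret : work).join("");
  if (out === "fedcba") return "r"; // reversed
  if (out === "cdef") return "s";   // first 2 characters dropped
  if (out === "cbadef") return "w"; // characters 0 and 2 swapped
  return "?";
}

// The call sequence from the body of xr, as (method name, argument) pairs.
const steps = [["rF", 54], ["fs", 75], ["N0", 1], ["rF", 52],
               ["N0", 3], ["fs", 31], ["rF", 16], ["fs", 38]];
const commands = steps
  .map(([name, arg]) => {
    const op = classify(wr[name]);
    return op === "r" ? op : op + arg; // reverse ignores its argument
  })
  .join(" ");
console.log(commands); // with the stand-in wr above: w54 r s1 w52 s3 r w16 r
```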
Upvotes: 2
Reputation: 1104
YouTube keeps changing the URL structure and cipher code. Presently, the best approach to decoding the cipher signature is explained below:
Ciphered signatures on YouTube are just 'scrambled' signatures that you have to rearrange according to the algorithm present in the player file (HTML5 player or Flash player).
For example http://www.youtube.com/watch?v=UxxajLWwzqY
is presently using the following HTML5 player file: //s.ytimg.com/yts/jsbin/html5player-vfltdb6U3.js
In this file you can easily find the signature decipher code by searching for 'sig'. In this case the algorithm is:
function bz(a) {
    a = a.split("");
    a = cz(a, 61);
    a = cz(a, 5);
    a = a.reverse();
    a = a.slice(2);
    a = cz(a, 69);
    a = a.slice(2);
    a = a.reverse();
    return a.join("")
}
function cz(a, b) {
    var c = a[0];
    a[0] = a[b % a.length];
    a[b] = c;
    return a
};
Above is the deciphering code.
But be aware that it keeps changing whenever they change the player file, so you have to keep tabs on the player file being used.
Also, to download videos with a cipher signature you have to take care to send the same cookies, use the same user-agent header, send the request from the same IP address, and send the request shortly after extraction. All of these are or were required at some point.
If interested in cipher decryption algo, please visit CipherAPI
Another cool API: TYstream API
Upvotes: 28