hdezela

Reputation: 546

Google Cloud Storage Signed URL for media

I've got a video site set up that serves m3u8 and associated ts files for users. I don't want the media files freely available, so what I've done is: when the user is on the site, a session is created in MySQL with their IP and a token; when they request any file on the media subdomain (mp4.domain.com), the Nginx auth module queries localhost:8080 with the URL and the token (attached as a request cookie set through JavaScript), which queries the database and allows or denies access to the file depending on the session information.
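For context, the localhost:8080 check is conceptually something like this (a minimal sketch; the table name, columns and credentials are placeholders, not my actual schema):

    <?php
    // Minimal sketch of the auth endpoint Nginx queries before serving a file.
    // The `sessions` table and its columns are hypothetical, and behind Nginx the
    // client IP may arrive in an X-Real-IP header rather than REMOTE_ADDR.
    $token = isset($_COOKIE['token']) ? $_COOKIE['token'] : '';
    $ip    = isset($_SERVER['HTTP_X_REAL_IP']) ? $_SERVER['HTTP_X_REAL_IP'] : $_SERVER['REMOTE_ADDR'];

    $db   = new mysqli('localhost', 'dbuser', 'dbpass', 'videosite');
    $stmt = $db->prepare('SELECT 1 FROM sessions WHERE token = ? AND ip = ? AND expires > NOW()');
    $stmt->bind_param('ss', $token, $ip);
    $stmt->execute();
    $stmt->store_result();

    // The auth check only needs an HTTP status code: 200 lets Nginx serve the
    // file, anything else makes it deny the request.
    http_response_code($stmt->num_rows === 1 ? 200 : 403);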

Now this works fine; overhead is between 8 and 20 ms depending on server load, and there's no need to mangle URLs in PHP when generating the m3u8 link. OSMF just gets the m3u8 and asks for the files, JavaScript adds the token cookie, and Bob's your uncle.

Now we're migrating to Google Cloud Storage, and the issue I'm faced with is that I can't really control any of that. A signed URL to the m3u8 is easy enough, but each and every m3u8 would have to be dynamically generated with signed URLs for every ts file, for every resolution, thumbnail and audio AAC (to give you an idea, a random video I chose has 1,043 files in total). That's 1,043 different signed URLs to generate, and at roughly 6 ms each that's over 6 seconds of total generation time, which is horrid.

Is there an alternative way to manage this? I'm not too keen on the Cloud Storage API, but I can't seem to find anything else. ACLs seem to be useless for this, and the only other thing I can think of is rotating file locations on a (daily?) basis in order to obfuscate them. Has anyone had a similar situation, or an idea of where I can start playing around to resolve this?

Upvotes: 2

Views: 3484

Answers (1)

hdezela

Reputation: 546

After more research, I've come to the following conclusion:

My first calculations were done using Google's gsutil tool, which seems to introduce a lot of overhead when calculating a signed URL hash. For example:

gsutil code:

gsutil signurl -d 60m /path/to/google.p12 gs://bucket/file

Execution time: 0.73812007904053 seconds

However, using native PHP functions to create a signed URL is much faster:

PHP code:

function storageURL($bucket, $archivo) {
    // Sign the GCS canonical string "GET\n\n\n<expires>\n/<bucket>/<object>"
    // with the service account's private key and append it to the URL.
    $expires = time() + 60;
    $to_sign = "GET\n\n\n" . $expires . "\n/" . $bucket . '/' . $archivo;

    // Load the PEM private key (converted from Google's .p12 file)
    $priv_key = file_get_contents('/path/to/google.pem');
    $pkeyid   = openssl_get_privatekey($priv_key);

    if (!openssl_sign($to_sign, $signature, $pkeyid, 'sha256')) {
        $signature = 'sinfirma'; // signing failed
    } else {
        $signature = urlencode(base64_encode($signature));
    }

    return 'https://' . $bucket . '.storage.googleapis.com/' . $archivo
        . '[email protected]&Expires=' . $expires . '&Signature=' . $signature;
}

Execution time: 0.0007929801940918 seconds

That changes everything: running 2,000 iterations of the PHP code still only gives me an execution time of 1.0643119812012 seconds, plus an additional 0.0325711573357 seconds to create all the m3u8 files, plus 0.0039050579071045 seconds for the 6 extra iterations that sign the URLs for the m3u8s themselves. That gives a total execution time of 1.100788196444004 seconds, with the largest part depending on the length of the video.
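For reference, timings like these can be reproduced with a simple microtime() loop along these lines (a rough sketch; the bucket and segment names are placeholders):

    <?php
    // Rough benchmark sketch: time 2,000 calls to storageURL() with microtime().
    // 'mybucket' and the segment naming are placeholders, not the real layout.
    $start = microtime(true);

    for ($i = 0; $i < 2000; $i++) {
        storageURL('mybucket', 'uniqid/480p/segment' . $i . '.ts');
    }

    $elapsed = microtime(true) - $start;
    echo "2000 signed URLs in " . $elapsed . " seconds\n";
    echo "Average per URL: " . ($elapsed / 2000) . " seconds\n";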

This actually seems fine, as users are used to longer "loading" or "buffering" times for longer videos, so the additional ~0.5 to ~1.5 seconds on longer videos will not really affect usability that much.

As an aside, there are currently 689 videos on the server, with a total of 864,138 related .ts and .aac files, each video having 6 m3u8s (180, 360, 480, 720, 1080, AAC) plus an additional m3u8 for the master playlist. So generating hourly URLs for all videos would require 868,961 iterations of the PHP code (689 master m3u8s + 4,134 quality m3u8s + 864,138 assets), a total runtime of 467.15262699127 seconds (~7.8 minutes), which is manageable but arguably moot given how fast each URL can be generated on the fly.

This is all on a Google Compute Engine n1-highmem-2 instance, which is not that powerful, so switching to a more powerful machine would make all of this even faster.

But all of this brings another dimension into the fold, since Google (like everyone else) charges per PUT operation on each bucket, so a cost calculation is in order. Looking at our stats for the last month, I see a total of 447,103 video plays (hey, it's a small site). Under the proposed scheme, each play would have generated 7 PUT operations (6 bitrate m3u8s + 1 master m3u8), i.e. 3,129,721 additional PUTs that month; at $0.01 per 10,000 operations (3,129,721 / 10,000 × $0.01), that's about $3.13 of additional cost. Small, but it could become an issue if the site becomes more popular. The other solution (hourly signed URLs for everyone) would generate (689 master m3u8s + 4,134 quality m3u8s) × 24 generations per day × 30 days per month = 3,472,560 additional PUTs, which is roughly the same, so I'm at or near the break-even point (cost-wise) between the two schemes. I have to run more numbers against previous months' data to get a better idea, since one scheme (URL per hit) scales with the number of users and the other (global URL generation) scales with the number of videos, and they scale in totally different ways.
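As a quick sanity check on those numbers, here's the back-of-the-envelope comparison in code (the per-10,000-PUT price and last month's figures are just plugged in as constants from above):

    <?php
    // Back-of-the-envelope PUT cost comparison, using the figures quoted above.
    $putPricePer10k = 0.01;     // $ per 10,000 PUT operations
    $plays          = 447103;   // video plays last month
    $videos         = 689;      // videos on the server
    $qualityM3u8s   = 6;        // 180/360/480/720/1080/AAC playlists per video

    // Scheme 1: regenerate 7 m3u8s (6 quality + 1 master) per play
    $putsPerHit  = $plays * ($qualityM3u8s + 1);
    $costPerHit  = $putsPerHit / 10000 * $putPricePer10k;

    // Scheme 2: regenerate every m3u8 hourly for a 30-day month
    $putsHourly  = ($videos + $videos * $qualityM3u8s) * 24 * 30;
    $costHourly  = $putsHourly / 10000 * $putPricePer10k;

    printf("Per-hit scheme: %d PUTs, ~$%.2f/month\n", $putsPerHit, $costPerHit);
    printf("Hourly scheme:  %d PUTs, ~$%.2f/month\n", $putsHourly, $costHourly);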

In essence, using native code, the issue seems to be purely monetary, with a small coding component (rewriting the video-play code vs. introducing hourly URL generation). Both need to be looked at and compared before making a final decision.

That said, a new ACL in the Cloud Storage API (say, a media-part-file ACL with the m3u8 as payload) that could be tied to an m3u8 would make all of this much smoother. Is there somewhere I could propose this to the Google Cloud Storage team?

-- 30/10 Edit: Final Solution --

This is the final solution I've come up with, and it seems to be working fine so far.

Setup:

Nginx on a Google Compute Engine instance - m3u8.domain.com

  • The video converter does the following:
    1. ffmpeg converts the source file into 180, 360, 480, 720, 1080 and AAC sub-files
    2. ffmpeg segments the files into 11-second chunks (fewer files, and iOS still accepts them)
    3. PHP copies all media files to the GS bucket
    4. PHP parses the generated m3u8 files and creates a dynamic m3u8 file
    5. PHP copies the size.m3u8 files and the master.m3u8 file to the proper directory on the attached HDD

  • A new server block in nginx.conf parses .m3u8 files as PHP (a rough sketch of this handler follows the list):
    1. The OSMF player requests the master m3u8; JS adds the session token
    2. PHP checks the session token + IP to validate the user
    3. If validated, it echoes the current video's m3u8
    4. If not validated, it echoes an m3u8 saying you are not allowed to see this video
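The sketch below shows the idea behind that handler for a per-quality playlist; validateSession(), the on-disk paths and the bucket name are simplified placeholders, and storageURL() is the signing function shown earlier:

    <?php
    // Rough sketch of the .m3u8-as-PHP handler: validate the session, then emit
    // the stored playlist with each segment line replaced by a signed GCS URL.
    require 'signing.php';    // defines storageURL($bucket, $archivo) from above
    require 'session.php';    // defines validateSession($token, $ip) - placeholder

    // e.g. request path: /uniqid/480p/stream.m3u8
    $path    = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
    $parts   = explode('/', trim($path, '/'));
    $uniqid  = $parts[0];
    $quality = $parts[1];

    header('Content-Type: application/vnd.apple.mpegurl');

    $token = isset($_COOKIE['token']) ? $_COOKIE['token'] : '';
    if (!validateSession($token, $_SERVER['REMOTE_ADDR'])) {
        // Emit a playlist pointing at a "you are not allowed" clip instead
        readfile('/path/to/mp4/denied.m3u8');
        exit;
    }

    // Read the stored playlist and sign every segment reference on the fly
    foreach (file('/path/to/mp4/' . $uniqid . '/' . $quality . '/stream.m3u8') as $line) {
        $line = rtrim($line, "\r\n");
        if ($line !== '' && $line[0] !== '#') {
            // Media line: swap the bare segment name for a signed GCS URL
            echo storageURL('mp4domain', $uniqid . '/' . $quality . '/' . $line), "\n";
        } else {
            echo $line, "\n";   // tag or comment line: pass through unchanged
        }
    }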

For a 2:44:08 video file, the whole process takes between 0.7 and 0.9 seconds, which is almost invisible to users. For shorter videos it is far smaller still.

Cloud Storage Bucket (mp4domain) - mp4.domain.com

The bucket has a default ACL applied that makes all files private but accessible to the Google ID used to generate the signed URLs.

So, a single video has the following files:

SERVER/nginx/mp4/uniqid/uniqid.m3u8
SERVER/nginx/mp4/uniqid/180p/stream.m3u8
SERVER/nginx/mp4/uniqid/360p/stream.m3u8
SERVER/nginx/mp4/uniqid/480p/stream.m3u8
SERVER/nginx/mp4/uniqid/720p/stream.m3u8
SERVER/nginx/mp4/uniqid/1080p/stream.m3u8
SERVER/nginx/mp4/uniqid/audio/stream.m3u8

GS/bucketmp4/uniqid/180p/segment##.ts
GS/bucketmp4/uniqid/360p/segment##.ts
GS/bucketmp4/uniqid/480p/segment##.ts
GS/bucketmp4/uniqid/720p/segment##.ts
GS/bucketmp4/uniqid/1080p/segment##.ts
GS/bucketmp4/uniqid/audio/segment##.aac


That way, writes to GS are only done once and, since all clients think they're receiving plain m3u8 files, no client-side hacks are needed.

Hopefully this can help someone with similar issues.

Upvotes: 2
