Reputation: 17827
While looking into URL-safe base64 encoding, I've found it to be a very non-standard thing. Despite the copious number of built-in functions that PHP has, there isn't one for URL-safe base64 encoding. On the manual page for base64_encode(), most of the comments suggest using that function, wrapped with strtr():
function base64_url_encode($input)
{
    return strtr(base64_encode($input), '+/=', '-_,');
}
The only Perl module I could find in this area is MIME::Base64::URLSafe (source), which performs the following replacement internally:
sub encode ($) {
    my $data = encode_base64($_[0], '');
    $data =~ tr|+/=|\-_|d;
    return $data;
}
Unlike the PHP function above, this Perl version drops the '=' (equals) character entirely, rather than replacing it with ',' (comma) as the PHP version does. The equals sign is a padding character, so the Perl module restores it as needed upon decoding, but this difference makes the two implementations incompatible.
Finally, the Python function urlsafe_b64encode(s) keeps the '=' padding around, prompting someone to put up this function to remove the padding which shows prominently in Google results for 'python base64 url safe':
from base64 import urlsafe_b64encode, urlsafe_b64decode

def uri_b64encode(s):
    return urlsafe_b64encode(s).strip('=')

def uri_b64decode(s):
    return urlsafe_b64decode(s + '=' * (4 - len(s) % 4))
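For Python 3, where urlsafe_b64encode() returns bytes rather than str, the same idea might look like the sketch below. The function names mirror the ones above. The padding arithmetic (-len(text) % 4) is my own adjustment, not part of the quoted snippet, so that no padding is added when the length is already a multiple of four.

```python
from base64 import urlsafe_b64encode, urlsafe_b64decode


def uri_b64encode(data: bytes) -> str:
    # Encode with the '-' and '_' alphabet, then drop the trailing '=' padding.
    return urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def uri_b64decode(text: str) -> bytes:
    # Restore the stripped padding: 0, 1, or 2 '=' characters.
    padding = "=" * (-len(text) % 4)
    return urlsafe_b64decode(text + padding)
```

With these definitions, uri_b64decode(uri_b64encode(data)) should return the original bytes for any input.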
The desire here is to have a string that can be included in a URL without further encoding, hence the ditching or translation of the characters '+', '/', and '='. Since there isn't a defined standard, what is the right way?
Upvotes: 6
Views: 4855
Reputation: 75456
I don't think there is a right or wrong way, but the most popular encoding is
'+/=' => '-_.'
This is widely used by Google and Yahoo (they call it Y64). Most of the URL-safe encoders I have used in Java and Ruby support this character set.
Upvotes: 9
Reputation: 25931
There does appear to be a standard: RFC 3548, Section 4, "Base 64 Encoding with URL and Filename Safe Alphabet":
This encoding is technically identical to the previous one, except for the 62:nd and 63:rd alphabet character, as indicated in table 2.
+ and / should be replaced by - (minus) and _ (understrike) respectively. Any incompatible libraries should be wrapped so they conform to RFC 3548.
Note that this requires that you URL-encode the (pad) = characters, but I prefer that over URL-encoding the + and / characters from the standard base64 alphabet.
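As a rough illustration in Python (the helper names rfc_b64_for_url and rfc_b64_from_url are hypothetical, chosen only for this sketch), the URL-safe alphabet leaves just the '=' padding to be percent-encoded:

```python
from base64 import urlsafe_b64encode, urlsafe_b64decode
from urllib.parse import quote, unquote


def rfc_b64_for_url(data: bytes) -> str:
    # '-' and '_' pass through quote() untouched; only '=' becomes '%3D'.
    token = urlsafe_b64encode(data).decode("ascii")
    return quote(token, safe="")


def rfc_b64_from_url(text: str) -> bytes:
    # Undo the percent-encoding first, then decode the URL-safe alphabet.
    return urlsafe_b64decode(unquote(text))
```

The token itself stays in the RFC's alphabet, so any conforming decoder can read it once the percent-encoding is undone.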
Upvotes: 11
Reputation: 140050
If you're asking about the correct way, I'd go with proper URL-encoding as opposed to arbitrary replacement of characters. First base64-encode your data, then further encode special characters like "=" with proper URL-encoding (i.e. %<code>).
Upvotes: 1
Reputation: 10582
I'd suggest running the output of base64_encode through urlencode. For example:
function base64_encode_url( $str )
{
return urlencode( base64_encode( $str ) );
}
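For comparison, a rough Python analogue of the same approach is sketched below. It assumes urllib.parse.quote() with safe="" as a stand-in for PHP's urlencode(); the two agree on the characters base64 can produce, though not on every input (spaces, for instance).

```python
from base64 import b64encode, b64decode
from urllib.parse import quote, unquote


def base64_encode_url(data: bytes) -> str:
    # Standard base64, then percent-encode '+', '/' and '='.
    return quote(b64encode(data).decode("ascii"), safe="")


def base64_decode_url(text: str) -> bytes:
    # Reverse the two steps in the opposite order.
    return b64decode(unquote(text))
```

Each '+', '/', or '=' grows from one character to three, so the result can be noticeably longer than a token built from the '-' and '_' alphabet.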
Upvotes: 2
Reputation: 35497
Why don't you try wrapping it in a urlencode()? Documentation here.
Upvotes: 0