Reputation: 20103
I have a private blob in Azure Blob Storage that was written by Terraform through an account with read and write access to it. I am trying to fetch it from Python (without the Azure SDK) and have been unable to.
My request is as follows:
import datetime
import requests
key = ...
secret = ...
now = datetime.datetime.utcnow().strftime('%a, %d %b %Y %H:%M:%S GMT')
# the required settings, as per https://learn.microsoft.com/en-us/rest/api/storageservices/get-blob
headers = {'Authorization': 'SharedKey {}:{}'.format(key, secret),
           'Date': now,
           'x-ms-version': '2018-03-28'
           }
storage_account = ...
container = ...
url = 'https://{}.blob.core.windows.net/{}/terraform.tfstate'.format(storage_account, container)
response = requests.get(url, headers=headers)
print(response.status_code)
print(response.text)
This yields
400
<?xml version="1.0" encoding="utf-8"?><Error>
<Code>OutOfRangeInput</Code><Message>One of the request inputs is out of range.
RequestId:...
Time:...</Message></Error>
I have validated that this file exists (Storage explorer) and that, when I access it via the console, I get the same URL as the one above, but with extra GET parameters.
For those wondering why I decided not to use the Azure SDK for Python: I only need to get a blob, and pip install azure[blob] would add 88 dependencies to the project (IMO an unacceptably high number for such a simple task).
Upvotes: 0
Views: 1610
Reputation: 20103
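So, the reason is that the signature mentioned in the documentation is not the account key itself. It is an HMAC-SHA256 digest, keyed with the base64-decoded account key, over a string constructed from the request. The construction is described here in detail.
A Python 3 version of the whole thing: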
import base64
import hmac
import hashlib
import datetime
import requests
def _sign_string(key, string_to_sign):
    # The account key is base64-encoded; decode it before using it as the HMAC key.
    key = base64.b64decode(key.encode('utf-8'))
    string_to_sign = string_to_sign.encode('utf-8')
    signed_hmac_sha256 = hmac.new(key, string_to_sign, hashlib.sha256)
    digest = signed_hmac_sha256.digest()
    encoded_digest = base64.b64encode(digest).decode('utf-8')
    return encoded_digest

def get_blob(storage_account, token, file_path):
    # token is the storage account access key (base64-encoded).
    now = datetime.datetime.utcnow().strftime('%a, %d %b %Y %H:%M:%S GMT')
    url = 'https://{account}.blob.core.windows.net/{path}'.format(account=storage_account, path=file_path)
    version = '2018-03-28'
    headers = {'x-ms-version': version,
               'x-ms-date': now}
    # String to sign: the verb, 12 newlines (every standard header left empty),
    # the x-ms-* headers in sorted order, then the canonicalized resource.
    content = 'GET{spaces}x-ms-date:{now}\nx-ms-version:{version}\n/{account}/{path}'.format(
        spaces='\n'*12,
        now=now,
        version=version,
        account=storage_account,
        path=file_path
    )
    headers['Authorization'] = 'SharedKey ' + storage_account + ':' + _sign_string(token, content)
    response = requests.get(url, headers=headers)
    # Raise on any HTTP error status instead of relying on assert,
    # which is skipped when Python runs with -O.
    response.raise_for_status()
    return response.text
where file_path is of the form {container}/{path-in-container}.
Using this snippet was still preferable to adding 88 dependencies to the project.
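For anyone puzzled by the '\n'*12 in get_blob, here is a rough sketch that spells out what those newlines stand for. It runs offline (no request is sent), so it can be used to inspect the string to sign. The names STANDARD_HEADERS, build_string_to_sign and sign are made up for this illustration, and the header order is my reading of the Shared Key documentation for service versions 2015-02-21 and later, so treat it as an assumption and check it against the docs for the version you use.

```python
import base64
import hashlib
import hmac

# Standard header slots that follow the verb in the string to sign.
# Assumption: order taken from the Shared Key docs (version 2015-02-21+).
STANDARD_HEADERS = [
    'Content-Encoding', 'Content-Language', 'Content-Length', 'Content-MD5',
    'Content-Type', 'Date', 'If-Modified-Since', 'If-Match',
    'If-None-Match', 'If-Unmodified-Since', 'Range',
]

def build_string_to_sign(account, file_path, x_ms_date, version, verb='GET'):
    # Hypothetical helper: every standard header is left empty, as in get_blob.
    lines = [verb] + ['' for _ in STANDARD_HEADERS]
    # x-ms-* headers: lowercase names, sorted, one per line.
    lines.append('x-ms-date:' + x_ms_date)
    lines.append('x-ms-version:' + version)
    # Canonicalized resource: /{account}/{container}/{blob}, no query parameters.
    lines.append('/' + account + '/' + file_path)
    return '\n'.join(lines)

def sign(account_key_b64, string_to_sign):
    # HMAC-SHA256 keyed with the base64-decoded account key, then base64-encoded.
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode('utf-8'), hashlib.sha256).digest()
    return base64.b64encode(digest).decode('utf-8')

# Dummy values only; a real account key comes from the storage account settings.
dummy_key = base64.b64encode(b'not-a-real-key').decode('utf-8')
s = build_string_to_sign('myaccount', 'mycontainer/terraform.tfstate',
                         'Mon, 01 Jan 2024 00:00:00 GMT', '2018-03-28')
print(repr(s))
print('SharedKey myaccount:' + sign(dummy_key, s))
```

Joining the verb, eleven empty slots and the first x-ms-* line with '\n' gives exactly the twelve newlines used in get_blob. If you ever send one of the standard headers (Range, for example), its value would have to go into the matching slot.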
Upvotes: 1