Reputation: 302
My HTTP-triggered Azure Function has a workflow that consists of 3 steps:
It receives an API call with some parameters
It reads the data from the Azure Blob with this function:
import io

import pandas as pd
from azure.storage.blob import BlobServiceClient

def read_dataframe_from_blob(account_name, account_key, container_name, blob_name):
    # Create a connection string to the Azure Blob storage account
    connect_str = f"DefaultEndpointsProtocol=https;AccountName={account_name};AccountKey={account_key};EndpointSuffix=core.windows.net"
    # Create a BlobServiceClient object using the connection string
    blob_service_client = BlobServiceClient.from_connection_string(connect_str)
    # Get a reference to the Parquet blob
    blob_client = blob_service_client.get_blob_client(container=container_name, blob=blob_name)
    # Download the blob data as a stream
    blob_data = blob_client.download_blob()
    # Read the Parquet data from the stream into a pandas DataFrame
    df = pd.read_parquet(io.BytesIO(blob_data.readall()))
    return df
I previously created a very similar workflow, and the Function Log Stream was clean: it included only the messages I emitted through logging. However, when I read the data from the blob, the logs in the Azure Function Log Stream (and locally, of course) start with:
2023-06-05T07:35:42Z [Information] Request URL: 'https://myaccount.blob.core.windows.net/mycontainer/my.parquet'
Request method: 'GET'
Request headers:
'x-ms-range': 'REDACTED'
'x-ms-version': 'REDACTED'
'Accept': 'application/xml'
'User-Agent': 'azsdk-python-storage-blob/12.16.0 Python/3.10.11 (Linux-5.10.164.1-1.cm1-x86_64-with-glibc2.31)'
'x-ms-date': 'REDACTED'
'x-ms-client-request-id': '932afd88-0373-11ee-8724-1270efe16c2d'
'Authorization': 'REDACTED'
No body was attached to the request
2023-06-05T07:35:42Z [Information] Response status: 206
Response headers:
'Content-Length': '33554432'
'Content-Type': 'application/octet-stream'
'Content-Range': 'REDACTED'
'Last-Modified': 'Thu, 01 Jun 2023 08:00:30 GMT'
'Accept-Ranges': 'REDACTED'
'ETag': '"0x8DB627644CFEA3E"'
'Server': 'Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0'
'x-ms-request-id': '08843836-f01e-0019-6780-974298000000'
'x-ms-client-request-id': '932afd88-0373-11ee-8724-1270efe16c2d'
'x-ms-version': 'REDACTED'
'x-ms-creation-time': 'REDACTED'
'x-ms-blob-content-md5': 'REDACTED'
'x-ms-lease-status': 'REDACTED'
'x-ms-lease-state': 'REDACTED'
'x-ms-blob-type': 'REDACTED'
'Content-Disposition': 'REDACTED'
'x-ms-server-encrypted': 'REDACTED'
'Date': 'Mon, 05 Jun 2023 07:35:42 GMT'
...repeated multiple times. Then I get the info from my logs.
What is the reason for such behaviour? Is there any smooth way to optimize the code or avoid these bloated logs?
Edit: I've found a similar discussion here, but I'm not sure how to replicate it for a Python app.
Edit2: It's not a solution, but I've found a GitHub bug report here.
Still - would appreciate any workarounds.
Upvotes: 1
Views: 364
Reputation: 11
import logging

# Set the desired log level (e.g., INFO, DEBUG, ERROR)
logging.basicConfig(level=logging.INFO)

def main(req):
    # Your code to access data from Azure Blob
    # Example logging statements
    logging.info("Accessing data from Azure Blob")
    logging.debug("Debug message")
    logging.error("Error message")
    # Rest of your function code
    return "Function executed successfully"
In this code, logging.basicConfig() sets up the basic configuration for logging, including the desired log level. You can adjust the log level to control the verbosity of the logs (e.g., logging.INFO, logging.DEBUG, logging.ERROR).
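The same level mechanism can also be applied to a single named logger instead of the root logger, which may be closer to what the question needs. The sketch below assumes that the request/response lines come from the Azure SDK logger named azure.core.pipeline.policies.http_logging_policy (a name taken from azure-core's module layout, not from the question) and that they are emitted at INFO level; the helper name quiet_azure_http_logging is hypothetical.

```python
import logging

# Assumption: the Azure SDK's HTTP request/response lines are emitted by this
# logger at INFO level; the name follows azure-core's module layout.
AZURE_HTTP_LOGGER = "azure.core.pipeline.policies.http_logging_policy"


def quiet_azure_http_logging(level=logging.WARNING):
    """Raise the threshold of the SDK's HTTP logger and return it (hypothetical helper)."""
    sdk_logger = logging.getLogger(AZURE_HTTP_LOGGER)
    sdk_logger.setLevel(level)
    return sdk_logger


# Call once at module import time, before the first blob request is made.
sdk_logger = quiet_azure_http_logging()
```

With this in place, records below WARNING from that one logger are dropped, while your own logging.info calls on the root logger are unaffected. If the noise comes from a different logger name in your SDK version, the same call with that name should work.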
Upvotes: -1