Reputation: 537
I am developing an Azure Functions blob trigger that reads a column from a CSV file in a blob and passes each column value to an API request to get a JSON response. I want to write each response to Azure Blob Storage as a new .json file.
Here is my code:
import logging
import azure.functions as func
from azure.storage.blob import BlobServiceClient,BlobClient
import pandas as pd
import requests
import json
import os, io

def main(inputBlob: func.InputStream, outputBlob: func.Out[str]):
    logging.info(f"Blob trigger executed!")
    logging.info(f"Blob Name: {inputBlob.name} ({inputBlob.length}) bytes")
    logging.info(f"Full Blob URI: {inputBlob.uri}")
    # connections
    connection_string = "DefaultEndpointsProtocol=https;AccountName=nhtsa;AccountKey=*********==;EndpointSuffix=core.windows.net"
    containerName = "myblobcontainer"
    blobName = "input/NHTSA_Inbound.csv"
    out_blob = "output"
    blob = BlobClient.from_connection_string(conn_str=connection_string, container_name=containerName, blob_name=blobName)
    # reading csv column values and pass into request to get each value response
    blobStream = blob.download_blob().content_as_bytes()
    logging.info(blobStream)
    df = pd.read_csv(io.BytesIO(blobStream), sep=',', dtype=str)
    vin_df = pd.DataFrame(df['vin'])
    vin_tup = list(vin_df.to_records(index=False))
    for i,t in enumerate(vin_tup):
        nhtsa_url = 'https://vpic.nhtsa.dot.gov/api/vehicles/DecodeVinValuesExtended/%s?format=json&modelyear=2011'%t[0]
        nhtsa_req_content = requests.get(nhtsa_url)
        nhtsa_data = nhtsa_req_content.json()
        with open(os.path.join(outputBlob,'Veh_%s.json'%t[0]),"w+") as output_file:
            json.dump(list(list(nhtsa_data.values())[0]),output_file)
Everything works fine except the file-write section. I am debugging the function in VS Code, and the error is thrown at this location:
    with open(os.path.join(outputBlob,'Veh_%s.json'%t[0]),"w+") as output_file:
        json.dump(list(list(nhtsa_data.values())[0]),output_file)
Error:
Executed 'Functions.BlobTrigger1' (Failed, Id=8c1dd2bb-7fb9-4670-9f44-4fb290530325, Duration=34741ms)
[2021-05-15T15:51:01.978Z] System.Private.CoreLib: Exception while executing function: Functions.BlobTrigger1. System.Private.CoreLib: Result: Failure
Exception: TypeError: expected str, bytes or os.PathLike object, not Out
Stack: File "C:\Program Files\Microsoft\Azure Functions Core Tools\workers\python\3.7/WINDOWS/X64\azure_functions_worker\dispatcher.py", line 372, in _handle__invocation_request
self.__run_sync_func, invocation_id, fi.func, args)
File "C:\Program Files\Python37\lib\concurrent\futures\thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "C:\Program Files\Microsoft\Azure Functions Core Tools\workers\python\3.7/WINDOWS/X64\azure_functions_worker\dispatcher.py", line 548, in __run_sync_func
return func(**params)
File "C:\Users\yyy1u39\Desktop\Campaign Group Dashboard\Portfolio\Azure Data Factory\BlobTrigger1\__init__.py", line 35, in main
with open(os.path.join(outputBlob,'Veh_%s.json'%t[0]),"w+") as output_file:
File "C:\Program Files\Python37\lib\ntpath.py", line 76, in join
path = os.fspath(path)
Upvotes: 1
Views: 1367
Reputation: 12153
I recommend using the Azure Storage Blob SDK to upload the result of each request, rather than the Azure Functions blob output binding: an output binding writes a single blob per invocation, whereas you need one file per VIN.
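For context, the TypeError in the question comes from passing the binding object itself to os.path.join, which only accepts str, bytes, or path-like values. The sketch below reproduces that with a hypothetical stand-in class (FakeOut is not part of any library; it only mimics the set()/get() methods of azure.functions.Out), so it runs without the Functions runtime:

```python
import os


class FakeOut:
    """Hypothetical stand-in for azure.functions.Out: it holds one value."""

    def __init__(self):
        self._value = None

    def set(self, value):
        self._value = value

    def get(self):
        return self._value


output_blob = FakeOut()

# Treating the binding as a directory path fails, as in the question's traceback.
try:
    os.path.join(output_blob, "Veh_123.json")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# A binding object is meant to receive one value through set(), not to be opened as a file.
output_blob.set('{"vin": "123"}')
```

Since the binding holds a single value, it does not fit a loop that produces one file per row, which is why the SDK is the better tool here.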
Just try the code below:
import requests
from azure.storage.blob import ContainerClient

# Reading the CSV is skipped here.
# Call a demo API and save each result as its own blob.
requestURLs = ['https://reqres.in/api/users?page=1', 'https://reqres.in/api/users?page=2']
storageConnStr = ''  # your storage account connection string
containerName = ''   # your container name
container_client = ContainerClient.from_connection_string(storageConnStr, containerName)

for count, req in enumerate(requestURLs):
    result = requests.get(req).text
    # Blob names become output/req0.json, output/req1.json, ...
    container_client.get_blob_client('output/req' + str(count) + '.json').upload_blob(result)
Result: each response is saved as its own blob in the container (output/req0.json and output/req1.json).
Let me know if you have any more questions.
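Applied to the VIN case from the question, the same idea could look roughly like the sketch below. It is only an outline under some assumptions: the helper names (read_vins, blob_name_for, upload_responses, run) are made up for illustration, the CSV is parsed with the standard csv module instead of pandas, the raw response text is uploaded rather than the first value of the parsed JSON, and overwrite=True is passed so that re-running does not fail on blobs that already exist.

```python
import csv
import io


def read_vins(csv_bytes, column="vin"):
    """Return the non-empty values of one CSV column as strings."""
    reader = csv.DictReader(io.StringIO(csv_bytes.decode("utf-8")))
    return [row[column].strip() for row in reader if row.get(column)]


def blob_name_for(vin, folder="output"):
    """Build the blob name used for one VIN, e.g. output/Veh_<vin>.json."""
    return f"{folder}/Veh_{vin}.json"


def upload_responses(vins, fetch, container_client):
    """Upload fetch(vin) as one blob per VIN and return the blob names.

    container_client is expected to behave like
    azure.storage.blob.ContainerClient (only upload_blob is used).
    """
    uploaded = []
    for vin in vins:
        name = blob_name_for(vin)
        container_client.upload_blob(name=name, data=fetch(vin), overwrite=True)
        uploaded.append(name)
    return uploaded


def run(csv_bytes, conn_str, container_name):
    """Wire the helpers to the real API and storage account (not called here)."""
    import requests
    from azure.storage.blob import ContainerClient

    def fetch(vin):
        url = (
            "https://vpic.nhtsa.dot.gov/api/vehicles/"
            f"DecodeVinValuesExtended/{vin}?format=json&modelyear=2011"
        )
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        return response.text

    client = ContainerClient.from_connection_string(conn_str, container_name)
    return upload_responses(read_vins(csv_bytes), fetch, client)
```

Inside the blob-triggered function, run could presumably be called with inputBlob.read() as csv_bytes, so the CSV does not need to be downloaded a second time. The outputBlob parameter and its binding in function.json would then no longer be needed.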
Upvotes: 3
Reputation: 79
You can use the Azure Storage Blob SDK for Python: https://learn.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-python
Upvotes: 0