Hello, this tutorial describes pretty much exactly what I need to do and explains it pretty well, but it just doesn't work for me. I searched for similar cases and even found others having trouble with it, but their solutions did not help me.
I changed the code from the article a little so that it always extracts the metadata from the same static object; that way I can just hit Test in Lambda and don't have to upload a new file to S3 every time to trigger the function. I also added some output to show that I applied the solutions of others, such as adding "shell=True" and making sure that the file really is executable.
import logging
import subprocess
import boto3
import botocore.session

SIGNED_URL_EXPIRATION = 300  # The number of seconds that the Signed URL is valid
DYNAMODB_TABLE_NAME = "metadata_test_db"
DYNAMO = boto3.resource("dynamodb")
TABLE = DYNAMO.Table(DYNAMODB_TABLE_NAME)
logger = logging.getLogger('boto3')
logger.setLevel(logging.INFO)


def lambda_handler(event, context):
    """
    :param event:
    :param context:
    """
    # Loop through records provided by S3 Event trigger
    for s3_record in event['Records']:
        logger.info("Working on new s3_record...")
        # Extract the Key and Bucket names for the asset uploaded to S3
        key = s3_record['s3']['object']['key']
        bucket = s3_record['s3']['bucket']['name']
        logger.info("Bucket: {} \t Key: {}".format(bucket, key))
        # Generate a signed URL for the uploaded asset
        signed_url = get_signed_url(SIGNED_URL_EXPIRATION, bucket, key)
        logger.info("Signed URL: {}".format(signed_url))
        # Launch MediaInfo
        # Pass the signed URL of the uploaded asset to MediaInfo as an input
        # MediaInfo will extract the technical metadata from the asset
        # The extracted metadata will be outputted in XML format and
        # stored in the variable xml_output
        out2 = subprocess.check_output(["ls", "-l", "mediainfo"])
        logger.info(out2)
        xml_output = subprocess.check_output(["./mediainfo", "--full", "--output=XML", "https://public-s3-file"], shell=True)
        logger.info("Output: {}".format(xml_output))
        #save_record(key, xml_output)


def save_record(key, xml_output):
    """
    Save record to DynamoDB
    :param key: S3 Key Name
    :param xml_output: Technical Metadata in XML Format
    :return:
    """
    logger.info("Saving record to DynamoDB...")
    TABLE.put_item(
        Item={
            'sryKey': key,
            'technicalMetadata': xml_output
        }
    )
    logger.info("Saved record to DynamoDB")


def get_signed_url(expires_in, bucket, obj):
    """
    Generate a signed URL
    :param expires_in: URL Expiration time in seconds
    :param bucket:
    :param obj: S3 Key name
    :return: Signed URL
    """
    s3_cli = boto3.client("s3")
    presigned_url = s3_cli.generate_presigned_url('get_object', Params={'Bucket': bucket, 'Key': obj},
                                                  ExpiresIn=expires_in)
    return presigned_url
The CloudWatch error log from Lambda shows the following:
13:27:13
START RequestId: cad3123b-f514-11e6-b8b1-45fa69956450 Version: $LATEST
13:27:13
[INFO] 2017-02-17T13:27:13.562Z cad3123b-f514-11e6-b8b1-45fa69956450 Working on new s3_record...
13:27:13
[INFO] 2017-02-17T13:27:13.562Z cad3123b-f514-11e6-b8b1-45fa69956450 Bucket: sourcebucket Key: HappyFace.jpg
13:27:13
[INFO] 2017-02-17T13:27:13.665Z cad3123b-f514-11e6-b8b1-45fa69956450 Signed URL: https://s3.us-east-2.amazonaws.com/sourcebucket/HappyFace.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Expires=300&X-Amz-Date=20170217T132713Z&X-Amz-SignedHeaders=host&X-Amz-Security-Token=FQoDYXdzEEMaDARR11pSPLcxi%2BkSZyL2AU2NjzOk37%2F2Ruwc33ZY5uN%2Ffg9O1c1awRcJf0qej4b29woEf%2BDhHsehkTr4WaKq19MxLL%2BdqmQBXArWXCYaGIcxdy
13:27:13
[INFO] 2017-02-17T13:27:13.683Z cad3123b-f514-11e6-b8b1-45fa69956450 -rwxrwxrwx 1 slicer 497 9269401 Feb 17 09:51 mediainfo
13:27:13
Command '['./mediainfo', '--full', '--output=XML', 'https://public-s3-file']' returned non-zero exit status 255: CalledProcessError
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 38, in lambda_handler
xml_output = subprocess.check_output(["./mediainfo", "--full", "--output=XML", "https://public-s3-file"], shell=True)
File "/usr/lib64/python2.7/subprocess.py", line 574, in check_output
raise CalledProcessError(retcode, cmd, output=output)
CalledProcessError: Command '['./mediainfo', '--full', '--output=XML', 'https://public-s3-file']' returned non-zero exit status 255
13:27:13
END RequestId: cad3123b-f514-11e6-b8b1-45fa69956450
13:27:13
REPORT RequestId: cad3123b-f514-11e6-b8b1-45fa69956450 Duration: 127.38 ms Billed Duration: 200 ms Memory Size: 512 MB Max Memory Used: 32 MB
Any hint is much appreciated.
Upvotes: 3
Views: 2720
So for me the solution was to add stderr=subprocess.STDOUT, as also suggested in a comment on my question, and to set shell=False. It did not work with shell=True. That hint comes from a comment on an answer to a similar question, where it "magically" works with this setup.
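A likely reason for the difference: on POSIX, when check_output gets a list together with shell=True, only the first item is run as the shell command and the remaining items are passed to the shell itself, so ./mediainfo would have been started without any arguments. Here is a minimal sketch of the working setup. The helper name run_command is made up for illustration, and the Python interpreter stands in for ./mediainfo so the snippet runs anywhere.

```python
import subprocess
import sys


def run_command(args):
    """Run a command given as a list, without a shell.

    Returns (exit_code, output), where output contains stdout and
    stderr combined, so error messages from the tool are not lost.
    """
    try:
        output = subprocess.check_output(
            args,
            stderr=subprocess.STDOUT,  # merge stderr into the captured output
            shell=False,               # every list item reaches the program as an argument
        )
        return 0, output
    except subprocess.CalledProcessError as err:
        # err.output holds whatever the command printed before failing
        return err.returncode, err.output


if __name__ == "__main__":
    # Stand-in for: ["./mediainfo", "--full", "--output=XML", signed_url]
    code, out = run_command(
        [sys.executable, "-c", "import sys; sys.stderr.write('oops'); sys.exit(3)"]
    )
    print(code, out)
```

In the Lambda handler, logging the returned output on a non-zero exit code should show what mediainfo itself complained about, instead of only the CalledProcessError.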
Upvotes: 2