Reputation: 855
I am trying to load a CSV file into a DynamoDB table with the Python program below, but I get a "list index out of range" error.
The input CSV file is laid out as follows:
1st line: attribute names
2nd line: data type of each attribute
3rd line onwards: the actual data
CSV file content:
customer_id,key_id,dashboard_name,tsm,security_block,core_block,type,subscription,account_id,region,sed,jumpbox,dc,av,gl,backup,cpm,zb
int,int,string,string,string,string,string,string,string,string,string,string,string,string,string,string,string,string
1,1,Act,yes,no,no,az,xxxxx-xxx-xxxx-xxxx-xxxx,null,eu-west-1,yes,yes,yes,no,yes,no,notapplicable,yes
1,2,Act,no,no,yes,az,xxxxx-xxx-xxxx-xxxx-xxxx,null,eu-west-1,no,yes,no,yes,no,yes,notapplicable,no
2,1,Cap,no,no,yes,aws,notapplicable,xxxxxxxx,us-west-2,yes,no,no,no,yes,no,yes,yes
2,2,Cap,yes,no,no,aws,notapplicable,xxxxxxxx,us-west-2,yes,no,no,no,yes,no,no,yes
2,3,Cap,no,yes,no,aws,notapplicable,xxxxxxxx,us-west-2,no,yes,no,yes,no,yes,yes,no
2,4,Cap,yes,no,no,aws,notapplicable,xxxxxxxx,us-west-1,yes,no,no,no,yes,no,no,yes
2,5,Cap,no,no,yes,aws,notapplicable,xxxxxxxx,us-east-1,no,yes,no,yes,no,yes,yes,yes
What I tried:
# Python script to insert CSV records into a DynamoDB table.
from __future__ import print_function  # Python 2/3 compatibility
from __future__ import division  # Python 2/3 compatibility for integer division
import argparse
import boto3
from csv import reader
import time

# command line arguments
parser = argparse.ArgumentParser(
    description='Write CSV records to dynamo db table. CSV Header must map to dynamo table field names.')
parser.add_argument('csvFile', help='Path to csv file location')
parser.add_argument('table', help='Dynamo db table name')
parser.add_argument('writeRate', default=5, type=int, nargs='?',
                    help='Number of records to write in table per second (default:5)')
parser.add_argument('delimiter', default=',', nargs='?', help='Delimiter for csv records (default=,)')
parser.add_argument('region', default='us-west-2', nargs='?', help='Dynamo db region name (default=us-west-2)')
args = parser.parse_args()
print(args)

# dynamodb and table initialization
endpointUrl = "https://dynamodb.us-west-2.amazonaws.com"
dynamodb = boto3.resource('dynamodb', region_name=args.region, endpoint_url=endpointUrl)
table = dynamodb.Table(args.table)

# write records to dynamo db
with open(args.csvFile) as csv_file:
    tokens = reader(csv_file, delimiter=args.delimiter)
    # read first line in file which contains dynamo db field names
    header = next(tokens)
    # read second line in file which contains dynamo db field data types
    headerFormat = next(tokens)
    # rest of file contains new records
    for token in tokens:
        print(token)
        item = {}
        for i, val in enumerate(token):
            print(val)
            if val:
                key = header[i]
                if headerFormat[i] == 'int':
                    val = int(val)
                if headerFormat[i] == 'stringset':
                    tempVal = val.split('|')
                    val = set()
                    for tok in enumerate(tempVal):
                        print(tok)
                        val.add(str(tok[1]))
                print(val)
                item[key] = val
        print(item)
        table.put_item(Item=item)
        time.sleep(1 / args.writeRate)  # to accommodate max write provisioned capacity for table
Error I am getting:
Traceback (most recent call last):
  File "C:\csv\dbinsert.py", line 39, in <module>
    key = header[i]
IndexError: list index out of range
I am passing the file name and table name as parameters. The first two columns are numbers in the DynamoDB table; does that mean the values 1,1 in the CSV are being treated as strings? I am not sure where I am going wrong.
Can anyone suggest a fix, please?
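For anyone debugging a similar traceback: in a loop like the one above, header[i] can only raise IndexError when a data row contains more fields than the header row, so a first step is to find which rows differ in width. The following is a minimal sketch of such a check, not part of the original script; the function name find_mismatched_rows and the sample data are made up for illustration.

```python
import csv
import io


def find_mismatched_rows(csv_text, delimiter=","):
    """Return (row_number, field_count) for rows whose width differs from the header.

    Hypothetical helper for illustration; row numbers count parsed CSV rows,
    which matches physical lines as long as no quoted field spans several lines.
    """
    rows = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(rows)
    mismatches = []
    # Row 1 is the header, so the remaining rows start at 2.
    for row_number, row in enumerate(rows, start=2):
        if len(row) != len(header):
            mismatches.append((row_number, len(row)))
    return mismatches


# Made-up sample: the last row has one field more than the header.
sample = "a,b,c\nint,int,string\n1,2,x\n1,2,x,extra\n"
print(find_mismatched_rows(sample))
```

If this reports nothing for the real file, the row widths are consistent and the problem is more likely in how the file is opened or decoded.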
Upvotes: 0
Views: 810
Reputation: 855
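I fixed the issue with the suggestion from @jarmod and by referring to the question "u'\ufeff' in Python string". The file apparently began with a UTF-8 byte order mark (BOM), which the utf-8-sig encoding strips when reading.
This worked:
with open(args.csvFile, mode='r', encoding='utf-8-sig') as csv_file: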
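To show what utf-8-sig changes, here is a small self-contained sketch that decodes the same bytes twice. It assumes the file starts with a UTF-8 byte order mark, as files saved by some Windows tools do; the sample bytes and the helper name read_header are made up for illustration.

```python
import csv
import io

# Made-up sample: a UTF-8 BOM (EF BB BF) followed by the CSV text.
raw = b"\xef\xbb\xbfcustomer_id,key_id\nint,int\n1,1\n"


def read_header(data, encoding):
    """Decode the bytes with the given encoding and return the first CSV row."""
    text = io.TextIOWrapper(io.BytesIO(data), encoding=encoding, newline="")
    return next(csv.reader(text))


header_plain = read_header(raw, "utf-8")      # BOM stays attached to the first name
header_sig = read_header(raw, "utf-8-sig")    # BOM is stripped while decoding

print(repr(header_plain[0]))
print(repr(header_sig[0]))
```

With plain utf-8 the first attribute name carries an invisible '\ufeff' prefix, so it no longer matches the table's key name; with utf-8-sig it is read as 'customer_id'.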
Upvotes: 1