Sebastian

Reputation: 434

Parse Excel documents on AWS Lambda

For a project, I convert Excel documents to JSON within a Java application using Apache POI. In the future, this task should be done on AWS Lambda, because it can currently take very long (up to 20 seconds) and has high memory consumption.

Requirements:

With AWS Lambda I can now use Java, Python, or Node.js. My question is: is my Apache POI approach the way to go, or are there more suitable frameworks? For example, SheetJS seems to be a good candidate. I was not able to find an up-to-date performance comparison of such frameworks.

Upvotes: 2

Views: 5139

Answers (1)

Yash Mochi

Reputation: 967

Give the pyexcel_xlsx library in Python a try. I have used it to convert xlsx to JSON. It is simple to use and, in my experience, fast compared to other Python libraries.

Sample code:

from pyexcel_xlsx import get_data

# get_data returns a dict mapping each sheet name to its rows (a list of lists)
data = get_data("RefinedProduct.xlsx")
sheet_name = "Table 6b"

# Print every cell together with its row and column index
for i, row in enumerate(data[sheet_name]):
    for j, value in enumerate(row):
        print("Row: " + str(i) + ", Column: " + str(j) + ", Value: " + str(value))
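
Since the question is about running this on AWS Lambda and producing JSON, here is a rough sketch of how the same call might be wrapped in a Lambda handler. This is only an illustration under assumptions: the event shape (`bucket` and `key`), the `/tmp/input.xlsx` path, and the helper name `sheets_to_json` are made up for the example, and it assumes both `boto3` and `pyexcel_xlsx` are packaged with the function. I have not benchmarked it.

```python
import json
from datetime import date, datetime, time


def sheets_to_json(sheets):
    """Serialize a {sheet name: list of rows} mapping to a JSON string."""
    def encode(value):
        # Date/time cells may come back as datetime objects,
        # which the json module cannot encode on its own.
        if isinstance(value, (datetime, date, time)):
            return value.isoformat()
        raise TypeError("Cannot serialize " + type(value).__name__)
    return json.dumps(sheets, default=encode)


def handler(event, context):
    # Imported lazily so sheets_to_json can be tested without these packages.
    import boto3
    from pyexcel_xlsx import get_data

    # Hypothetical event shape: {"bucket": "...", "key": "..."}
    local_path = "/tmp/input.xlsx"  # /tmp is the writable directory in Lambda
    boto3.client("s3").download_file(event["bucket"], event["key"], local_path)
    return {"statusCode": 200, "body": sheets_to_json(get_data(local_path))}
```

Keeping the serialization in its own function means it can be checked locally with a plain dict, without S3 or an actual workbook.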

Upvotes: 1
