Reputation: 3502
I'm currently using my app service to process files. I'm uploading Excel files and processing them using PhpSpreadsheet.
Locally, they're going through in less than 90 seconds (laptop, SSD, i7, 16GB RAM). On my app service (small Linux plan), the same file and script can take upwards of 10 minutes, which results in a 504 Gateway Timeout error. I then have to manually confirm the data made it into the database.
What would be an alternative to this way of processing files?
Edit #1
By processing, I mean uploading an Excel file, iterating through the rows with PhpSpreadsheet, extracting data, and updating a database. My current file is 102 KB, has 4,300 rows and 23 columns, and takes around 6 minutes. Locally, it takes only a few seconds.
Example:
use PhpOffice\PhpSpreadsheet\IOFactory;

if ($this->request->is('post'))
{
    // Move the uploaded file into the app's temp directory
    $file = $this->request->getData()['my_file'];
    $path = TMP . time() . "_" . $file->getClientFilename();
    $file->moveTo($path);

    // Load the workbook and dump the active sheet into an array
    $spreadsheet = IOFactory::load($path);
    $sheetData = $spreadsheet->getActiveSheet()->toArray(null, false, true, true);

    foreach ($sheetData as $sheet_row)
    { ... }
}
Upvotes: 1
Views: 124
Reputation: 801
Comments offer some good thoughts on basic options, summarized briefly: optimize the processing code, or scale up to a faster App Service plan. Either may alleviate your problem in the short term. However, let's say your algorithm is already well-optimized and your server is as fast as you're willing to pay for, both of which may be true for you and are often true in real-world cases. Let's also assume that your file size is somewhat unpredictable, so even when well-optimized and fast, you may need to support outlier cases of larger files that are slow to process.
It sounds like your workflow is (synchronously):
1. Client uploads the Excel file
2. Server parses the file and writes the rows to the database
3. Server responds once everything is done
I would agree with the comments suggesting taking Step 2 out and putting it in background-job processing. This would make your workflow look like:
1. Client uploads the Excel file
2. Server stores the file, queues a background job, and returns a job ID right away
3. Client polls every n seconds to check whether the job is done
There are a LOT of different ways to do this (the background processing link above details a few Azure-esque ways). A sketch of the upload side follows below.
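To make that concrete, here is a rough sketch of what the upload side could look like in your CakePHP controller, assuming the microsoft/azure-storage-blob and microsoft/azure-storage-queue packages and container/queue names I've made up ("uploads", "excel-jobs"). Treat it as an illustration of the pattern, not a drop-in implementation:

use MicrosoftAzure\Storage\Blob\BlobRestProxy;
use MicrosoftAzure\Storage\Queue\QueueRestProxy;

if ($this->request->is('post'))
{
    $file = $this->request->getData()['my_file'];
    $jobId = uniqid('excel_', true);
    $blobName = $jobId . '_' . $file->getClientFilename();

    // 1. Persist the raw upload to Blob Storage instead of processing it inline
    $blobClient = BlobRestProxy::createBlobService(env('AZURE_STORAGE_CONNECTION_STRING'));
    $blobClient->createBlockBlob('uploads', $blobName, $file->getStream()->getContents());

    // 2. Queue a message saying where the file is; a worker picks it up later.
    //    Base64 because Azure Functions queue triggers expect base64-encoded bodies by default.
    $queueClient = QueueRestProxy::createQueueService(env('AZURE_STORAGE_CONNECTION_STRING'));
    $queueClient->createMessage('excel-jobs', base64_encode(json_encode([
        'jobId' => $jobId,
        'blob'  => $blobName,
    ])));

    // 3. Return immediately; the client polls a status endpoint with this id
    return $this->response->withType('application/json')
        ->withStringBody(json_encode(['jobId' => $jobId, 'status' => 'queued']));
}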
One Azure-centric way I can think of is to use a function app triggered by a queue. In this case, your server could accept the Excel file, persist it somewhere (or your client could upload it directly to Azure Blob Storage), and then add an entry to your queue saying where the file to be processed is. With a queue trigger, that message automatically invokes your function app code, which can pull the oldest message from the queue, find the file to be processed, process it in as much time as it takes, and then upload the "processed" result back to blob storage for client consumption.
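PHP is not a first-class Azure Functions language, so as a rough stand-in here is a standalone worker script (it could run as a WebJob, a cron job, or a long-running container) doing what the queue-triggered function body would do: pull the oldest message, fetch the blob, process it, and clean up. The storage SDK usage and the names ("uploads", "excel-jobs") are my assumptions, not something from the question:

use MicrosoftAzure\Storage\Blob\BlobRestProxy;
use MicrosoftAzure\Storage\Queue\QueueRestProxy;
use PhpOffice\PhpSpreadsheet\IOFactory;

$conn = getenv('AZURE_STORAGE_CONNECTION_STRING');
$queueClient = QueueRestProxy::createQueueService($conn);
$blobClient  = BlobRestProxy::createBlobService($conn);

while (true) {
    // Pull the oldest message, if any (retrieving it hides it from other workers for a visibility timeout)
    $messages = $queueClient->listMessages('excel-jobs')->getQueueMessages();
    if (empty($messages)) { sleep(10); continue; }

    $message = $messages[0];
    $job = json_decode(base64_decode($message->getMessageText()), true);

    // Download the uploaded workbook to a local temp file
    $local = sys_get_temp_dir() . '/' . $job['blob'];
    file_put_contents($local, $blobClient->getBlob('uploads', $job['blob'])->getContentStream());

    // Process it with PhpSpreadsheet, taking as long as it takes
    $sheetData = IOFactory::load($local)->getActiveSheet()->toArray(null, false, true, true);
    foreach ($sheetData as $sheet_row) {
        // ... extract data and update the database, as in the original controller code
    }

    // Mark the job done by deleting the queue message (and optionally recording a status for the client to poll)
    $queueClient->deleteMessage('excel-jobs', $message->getMessageId(), $message->getPopReceipt());
}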
I know less about PHP background-job tooling, hence the Azure-centric flavour of the suggestions above, but there are definitely dozens of other ways to implement the same basic workflow. The benefit of this approach is that you can build it in basically any language you want; the core architecture stays the same.
Upvotes: 1