Mozak

Reputation: 2798

Reading contents of csv file in node.js

I am trying to implement a module in Node.js (I have just started working with Node.js) with the following requirements:

  1. Upload .csv file.
  2. Read content of the csv file.

The framework currently used for the RESTful API is "express": "~4.2.0", with multer for file upload.

Now I have configured multer like below in my app.js

app.use(multer({
  onFileUploadData : function(file, data){
    console.log('onFileUploadData Called with data - '+ data);
  }
}));

In my route file, I have a post endpoint like below

app.post('/sample.csv',lead.processCSV);

This route is called from the following AJAX call:

$.ajax({
            xhrFields: {withCredentials: true},
            url: '/sample.csv',
            type: 'POST',
            success: function (data) {
                $scope.handleResponse(data);
            },
            error: function (error, xhr) {
                angular.element('#csvUploadBusyIcon').hide();
                alert('Oops! Upload failed');
            },
            data: formData,
            cache: false,
            contentType: false,
            processData: false
        });

Now I want to get the content of the CSV file, i.e. my lead.processCSV method should run only once all the content has been loaded.

Also, do I need another module for CSV files, or is multer sufficient in my case?

Any suggestion or guidance in the right direction would be helpful. Thanks in advance.

Upvotes: 12

Views: 21437

Answers (2)

Ramon Menezes

Reputation: 1

I had a similar requirement to process a CSV file, and I tried to implement your solution: it works, but only as long as I used it with console.log. I tried to store the 'record' variable in an array called 'results', but I just got an empty array [], and only after this empty array was printed did I receive the console.log output showing the parsed CSV data.

So it seems to be a matter of timing: the processing of the CSV file takes a while. So I compacted your code, wrapped it in a Promise, and awaited it. Once the promise had resolved, my array was ready to be used.
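To make the timing issue concrete, here is a minimal sketch that uses only Node's built-in stream module in place of csv-parse, so it runs without any dependency. That substitution is an assumption made for illustration, and the names makeRowStream and collectRows as well as the sample rows are hypothetical.

```javascript
const { Readable } = require('stream');

// Stand-in for the parser output (assumption): an object-mode stream
// that emits one already-split row per 'data' event.
function makeRowStream() {
  return Readable.from([
    ['Loan', 'income', '1500', 'Others'],
    ['Ice cream', 'outcome', '3', 'Food'],
  ]);
}

// Hypothetical helper: resolves with all rows once the stream emits 'end'.
function collectRows(stream) {
  return new Promise((resolve, reject) => {
    const rows = [];
    stream
      .on('data', row => rows.push(row))
      .on('error', reject)
      .on('end', () => resolve(rows));
  });
}

// Without waiting: the handler is registered, but no row has arrived yet.
const tooEarly = [];
makeRowStream().on('data', row => tooEarly.push(row));
console.log('before end:', tooEarly.length); // 0

// With the promise: the array is complete when the promise resolves.
collectRows(makeRowStream()).then(rows => {
  console.log('after end:', rows.length); // 2
});
```

The first log runs before the stream has delivered anything, which is the empty array I was seeing. The second one runs only after 'end', so the rows are all there.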

  1. Note: I'm a beginner, so this may contain errors. So far, it is working fine for me.
  2. Note: The content of my CSV test file is:
title, type, value, category
Loan, income, 1500, Others
Website Hosting, outcome, 50, Others
Ice cream, outcome, 3, Food
  3. Note: There are some differences from your case: I'm receiving a single file on the route '/import'. I'm using the Insomnia Designer app to send a multipart form body with one file named importFile.
  4. Note: I imported the same libraries that you used, and I also used the concept of middlewares.
  5. Note: In this case I was expecting just one file, so I used multer({dest: './upload'}).single('importFile'). .any() could also be used.
  6. Note: I'm using TypeScript, so for JS it is just a matter of removing the ": type" part after some variable declarations. For instance:
const results: object[] = [];
becomes:
const results = [];
  7. Note: I left both options in the code: option 1 works only with arrays, and option 2 uses objects.

Let's go to the code:

import { Router, Request, Response } from 'express';
import csv from 'csv-parse';
import multer from 'multer';
import fs from 'fs';

// used on option 2 due typescript
interface CSVTransactionDTO {
  title: string;
  value: number;
  type: 'income' | 'outcome';
  category: string;
}

app.post(
  '/import', // route name
  multer({ dest: './upload' }).single('importFile'), // middleware that stores the single uploaded CSV file
  async (request: Request, response: Response) => { // last middleware: parses the CSV
    const filePath = request.file.path;

    let rowCounter = 0;
    const results: string[][] = []; // option 1
    const newTransactions: CSVTransactionDTO[] = []; // option 2

    function parseCSVPromise(): Promise<void> {
      return new Promise((resolve, reject) => {
        const ConfigCSV = {
          // delimiter: ';', // set this for delimiters other than the default ','
          from_line: 2, // data starts here (skips the header line)
          trim: true, // ignore white space immediately around the delimiter (comma)
        };

        fs.createReadStream(filePath)
          .pipe(csv(ConfigCSV))
          .on('data', row => {
            rowCounter += 1; // counts how many rows were processed
            // console.log(row); // just a test
            results.push(row); // option 1: the simplest way is to push the complete row

            const [title, type, value, category] = row; // option 2: process it as an object
            newTransactions.push({ title, type, value, category }); // note: value is still a string here
          })
          .on('error', error => {
            // rejecting is enough; throwing inside an event handler would crash the process
            reject(error);
          })
          .on('end', () => {
            resolve(); // settles the promise when csv-parse emits 'end'
          });
      });
    }

    await parseCSVPromise(); // wait until parsing has finished
    console.log('option1', results); // option 1
    console.log('option2', newTransactions); // option 2
    return response.json({ rowCounter, results }); // for testing only: this ends the route here

    // continue processing results and send them to the database...
    // await fs.promises.unlink(filePath); // optionally delete the file once it is parsed/processed
  },
);

option1 response:

 [
  [ 'Loan', 'income', '1500', 'Others' ],
  [ 'Website Hosting', 'outcome', '50', 'Others' ],
  [ 'Ice cream', 'outcome', '3', 'Food' ]
 ]
  

option2 response:

  [
    { title: 'Loan',            type: 'income',  value: '1500', category: 'Others' },
    { title: 'Website Hosting', type: 'outcome', value:   '50', category: 'Others' },
    { title: 'Ice cream',       type: 'outcome', value:    '3', category: 'Food' }
  ]

Upvotes: 0

Marcel Batista

Reputation: 752

There is an awesome Node project which helped me a lot; you should check it out. What we're going to use is their csv-parse module. It takes a stream as input and reads it record by record without blocking the event loop, so while you are processing a file your server won't be stuck and other requests can still be handled normally.
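To give a feeling for what "reading a stream piece by piece" means, here is a rough sketch that uses only Node's built-in readline module. It is not csv-parse and not how csv-parse works internally: it naively splits on commas and would break on quoted fields. The names readRows and streamFromText are made up for this illustration.

```javascript
const readline = require('readline');
const { Readable } = require('stream');

// Naive sketch: reads the input stream one line at a time and splits each
// line on commas. Quoted fields containing commas are NOT handled here;
// csv-parse takes care of those cases.
async function readRows(input) {
  const rl = readline.createInterface({ input: input, crlfDelay: Infinity });
  const rows = [];
  for await (const line of rl) {
    if (line.trim() === '') continue; // skip blank lines
    rows.push(line.split(',').map(field => field.trim()));
  }
  return rows;
}

// Hypothetical helper: turns a string into a readable stream so the sketch
// can run without a file on disk.
function streamFromText(text) {
  const stream = new Readable({ read() {} });
  stream.push(text);
  stream.push(null); // signals the end of the data
  return stream;
}

readRows(streamFromText('title, type\nLoan, income\n')).then(rows => {
  console.log(rows);
});
```

Because each line is handled as it arrives, Node is free to serve other requests in between, which is the same benefit you get from piping a file stream into csv-parse.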

Since you said you are just starting with Node.js, you should do a quick search and understand how middlewares work in the request handling process. As a simplification, a middleware is a function(req, res, next). With req you get the request data. With res you can send the response, and by calling next you pass your req and res objects on to the next middleware. This way you can process a request in parts, and the last middleware in the flow sends the response to the client (res.send(200), for example).
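If it helps, the idea can be sketched without Express at all. The runner below is a hypothetical stand-in that only mimics how a chain of (req, res, next) functions is walked; it is not Express's actual implementation, and runMiddlewares, fakeUpload, respond and the file path are all made up for this illustration.

```javascript
// Hypothetical mini-runner, NOT Express itself: it calls each middleware
// in order and moves on only when that middleware calls next().
function runMiddlewares(middlewares, req, res) {
  let index = 0;
  function next() {
    const middleware = middlewares[index++];
    if (middleware) {
      middleware(req, res, next);
    }
  }
  next();
}

// First middleware: pretends to store an uploaded file, then hands over.
function fakeUpload(req, res, next) {
  req.file = { path: './uploads/sample.csv' }; // made-up path
  next();
}

// Last middleware: uses what the previous one attached and "responds".
// It does not call next(), so the chain stops here.
function respond(req, res, next) {
  res.body = 'received ' + req.file.path;
}

const req = {};
const res = {};
runMiddlewares([fakeUpload, respond], req, res);
console.log(res.body); // received ./uploads/sample.csv
```

The point is that the second function can rely on req.file only because the first one ran before it and called next().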

The Multer({...}) call returns a middleware function. When a request reaches this middleware, multer will try to download any files the user sent in the POST request. When you say app.use(Multer({...})), you are asking multer to try to download files from ANY POST request that contains files. This is a security risk if not all of your routes expect files to be uploaded.

Ok, that being said, here's a sample code I wrote to handle your use case:

//Important Security advice: 
//don't add multer as a middleware to all requests. 
//If you do this, people will be able to upload files
//in ALL YOUR 'post' handlers!!! 

var Multer = require('multer');
var Parse = require('csv-parse');
var fs = require('fs')

function parseCSVFile(sourceFilePath, columns, onNewRecord, handleError, done){
    var source = fs.createReadStream(sourceFilePath);

    var linesRead = 0;

    var parser = Parse({
        delimiter: ',', 
        columns:columns
    });

    parser.on("readable", function(){
        var record;
        while (record = parser.read()) {
            linesRead++;
            onNewRecord(record);
        }
    });

    parser.on("error", function(error){
        handleError(error)
    });

    parser.on("end", function(){
        done(linesRead);
    });

    source.pipe(parser);
}

//We will call this once Multer's middleware processed the request
//and stored file in req.files.fileFormFieldName

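function parseFile(req, res, next){
    var filePath = req.files.file.path;
    console.log(filePath);
    function onNewRecord(record){
        console.log(record);
    }

    function onError(error){
        console.log(error);
        //answer the client here too, otherwise the request would hang on a parse error
        res.status(500).send('Could not parse the CSV file');
    }

    function done(linesRead){
        //res.send(200, linesRead) is the older, deprecated form of this call
        res.status(200).json(linesRead);
    }

    var columns = true; //true: use the first line of the file as column names
    parseCSVFile(filePath, columns, onNewRecord, onError, done);

}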

//this is the route handler with two middlewares. 
//First:  Multer middleware to download file. At some point,
//this middleware calls next() so process continues on to next middleware
//Second: use the file as you need

app.post('/upload', [Multer({dest:'./uploads'}), parseFile]);

I hope this helped. Make sure to understand how route middlewares work in Node: they are key to good quality code.

Marcel

Upvotes: 25
