ysd

Reputation: 301

Parsing CSV with common data preceding other data with NodeJS

I'm trying to read CSV files with Node.js, using code like the following:

  fs.createReadStream(file)
    .pipe(csv.parse({from_line: 6, columns: true, bom: true}, (err, data) => {
      data.forEach((row, i) => {

Because I use the from_line option, parsing starts at line 6, which contains the column header. The problem is that line #3 holds the date, which I also need to use together with the other data.

What is the best way to resolve this?

The data file looks like this:

Genre: ABC
Date: 2020-01-01, 2020-12-31
Number of Data: 300


No., Code, Name, sales, delivery, return, stock
1, ......
2, ......

Additional question

I have inserted iconv.decodeStream into the second part of the function. How can I apply the same decoder when reading the header?

  fs.createReadStream(file)
    .pipe(iconv.decodeStream("utf-8"))
    .pipe(csv.parse({from_line: 6, columns: true, bom: true}, (err, data) => {
      data.forEach((row, i) => {

Upvotes: 0

Views: 993

Answers (1)

Terry Lennox

Reputation: 30685

I'd suggest reading the header data first; you can then access it in your processing callback(s), as in the example below:

app.js

// Import the package main module

const csv = require('csv')
const fs = require("fs");
const { promisify } = require('util');
const parse = promisify(csv.parse);
const iconv = require('iconv-lite');

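async function readHeaderData(file, iconv) {
    // Read only the first 1 KB; the three header lines are assumed to fit in it.
    const buffer = Buffer.alloc(1024);
    const fd = fs.openSync(file, 'r');
    const bytesRead = fs.readSync(fd, buffer, 0, buffer.length, 0);
    fs.closeSync(fd);
    // iconv.decode is synchronous; decode only the bytes that were actually read.
    const text = iconv.decode(buffer.subarray(0, bytesRead), "utf-8");
    // Parse the first 3 lines as "key: value" pairs.
    const options = { to_line: 3, delimiter: ':', columns: false, bom: true, trim: true };
    const rows = await parse(text, options);
    // Convert the array of [key, value] pairs to an object
    return Object.fromEntries(rows);
}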

async function readFile(file, iconv) {
    const header = await readHeaderData(file, iconv);
    console.log("readFile: File header:", header);

    fs.createReadStream(file)
    .pipe(iconv.decodeStream("utf-8"))
    .pipe(csv.parse({ from_line: 6, columns: true, bom: true, trim: true }, (err, data) => {
        if (err) {
            console.error("readFile: Failed to parse rows:", err);
            return;
        }
        // We now have access to the header data along with the row data in the callback.
        data.forEach((row, i) => console.log( { line: i, header, row } ))
    }));
}

readFile('example.csv', iconv).catch((err) => console.error(err));

This will give an output like:

readFile: File header: {
  Genre: 'ABC',
  Date: '2020-01-01, 2020-12-31',
  'Number of Data': '300'
}

and, for each data row:

{
  line: 0,
  header: {
    Genre: 'ABC',
    Date: '2020-01-01, 2020-12-31',
    'Number of Data': '300'
  },
  row: {
    'No.': '1',
    Code: 'Code1',
    Name: 'Name1',
    sales: 'sales1',
    delivery: 'delivery1',
    return: 'return1',
    stock: 'stock1'
  }
}
{
  line: 1,
  header: {
    Genre: 'ABC',
    Date: '2020-01-01, 2020-12-31',
    'Number of Data': '300'
  },
  row: {
    'No.': '2',
    Code: 'Code2',
    Name: 'Name2',
    sales: 'sales2',
    delivery: 'delivery2',
    return: 'return2',
    stock: 'stock2'
  }
}
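
If the date range from the header is needed as separate fields on each row, rather than as the whole header object, it can be split once and merged into the rows. The following is only a minimal sketch: attachDateRange, dateFrom and dateTo are hypothetical names that are not part of the code above, and it assumes the Date value always has the form "start, end".

```javascript
// Hypothetical helper: split the header's "Date" value into start/end fields
// and copy them onto every parsed row.
function attachDateRange(header, rows) {
    // Assumption: the value looks like "2020-01-01, 2020-12-31".
    const [dateFrom = null, dateTo = null] = String(header.Date ?? '')
        .split(',')
        .map((part) => part.trim())
        .filter((part) => part.length > 0);
    // Return new row objects so the parsed rows are not mutated.
    return rows.map((row) => ({ ...row, dateFrom, dateTo }));
}

// Sample values shaped like the output shown above.
const sampleHeader = { Genre: 'ABC', Date: '2020-01-01, 2020-12-31', 'Number of Data': '300' };
const sampleRows = [{ 'No.': '1', Code: 'Code1' }, { 'No.': '2', Code: 'Code2' }];
const mergedRows = attachDateRange(sampleHeader, sampleRows);
console.log(mergedRows);
```

Inside the csv.parse callback, the same helper could be called as attachDateRange(header, data). If the Date entry is missing, both fields fall back to null.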

example.csv

Genre: ABC
Date: 2020-01-01, 2020-12-31
Number of Data: 300


No., Code, Name, sales, delivery, return, stock
1, Code1, Name1, sales1, delivery1, return1, stock1
2, Code2, Name2, sales2, delivery2, return2, stock2
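
One caveat with delimiter ':' is that a header value which itself contains a colon (a time of day, for example) is split into more than two fields. A possible alternative, sketched here under that assumption, is to split each of the first lines at its first colon only. parsePreamble is a hypothetical helper, and the sample text with times is invented for illustration; it expects text that has already been decoded (for example with iconv.decode).

```javascript
// Hypothetical helper: parse the first `lineCount` "key: value" lines of an
// already-decoded text, splitting each line at its first colon only.
function parsePreamble(text, lineCount = 3) {
    const entries = text
        .replace(/^\uFEFF/, '')               // drop a leading BOM, if present
        .split(/\r?\n/)                       // handle both LF and CRLF files
        .slice(0, lineCount)
        .filter((line) => line.includes(':')) // skip lines without a key
        .map((line) => {
            const idx = line.indexOf(':');
            return [line.slice(0, idx).trim(), line.slice(idx + 1).trim()];
        });
    return Object.fromEntries(entries);
}

// Invented sample: same layout as example.csv, but with times in the Date line.
const sampleText = [
    'Genre: ABC',
    'Date: 2020-01-01 09:30, 2020-12-31 18:00',
    'Number of Data: 300',
    '',
    'No., Code, Name',
].join('\n');
const preamble = parsePreamble(sampleText);
console.log(preamble);
```

The result has the same shape as the header object returned by readHeaderData, so it could be passed to the row callback in the same way.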

Upvotes: 1
