Anil Kumar Moharana

Reputation: 374

Parse text content into a particular JSON format

I have a project requirement. I am getting the data in text format, as shown below.

SL NO  POLICY NO  AMOUNT  NAME            CGST TAX
02     33051090   195.0   D BL ESSENTIAL  9.00%
03     33051091   195.1   D HRFL COD      9.00%

But I need to process the text content and build JSON from it, like this:

[{
"SL NO":"02",
"POLICY NO":"33051090",
"AMOUNT":"195.0",
"NAME":"D BL ESSENTIAL",
"CGST TAX":"9.00%"
},
{
"SL NO":"03",
"POLICY NO":"33051091",
"AMOUNT":"195.1",
"NAME":"D HRFL COD",
"CGST TAX":"9.00%"
}]

I am unable to think of any logic to differentiate the values and map them to JSON properties, as there is a lot of whitespace in between.

There is no unique separator between the fields I am getting, so it is not like CSV data.

Upvotes: 2

Views: 242

Answers (2)

Barmar

Reputation: 781300

Since all fields except the name are numeric, you can match them with a regular expression. The name is everything between the amount and tax percentage.

let re = /^(\d+)\s+(\d+)\s+([\d.]+)\s+(.*?)\s+([\d.]+%)$/;
let data = `SL NO POLICY NO AMOUNT NAME CGST TAX
02   33051090  195.0  D BL ESSENTIAL 9.00%
03  33051091  195.1    D HRFL COD  9.00%`;
let obj = [];
data.split('\n').forEach(line => {
  let match = line.match(re);
  if (match) {
    obj.push({
      "SL NO": match[1],
      "POLICY NO": match[2],
      "AMOUNT": match[3],
      "NAME": match[4],
      "CGST TAX": match[5]
    });
  }
});
console.log(obj);

Or, instead of depending on the other fields being numeric, you could assume that none of them contains embedded whitespace.

let re = /^(\S+)\s+(\S+)\s+(\S+)\s+(.*?)\s+(\S+)$/;
let data = `SL NO POLICY NO AMOUNT NAME CGST TAX
02   33051090  195.0  D BL ESSENTIAL 9.00%
03  33051091  195.1    D HRFL COD  9.00%`;
let obj = [];
data.split('\n').slice(1).forEach(line => {
  let match = line.match(re);
  if (match) {
    obj.push({
      "SL NO": match[1],
      "POLICY NO": match[2],
      "AMOUNT": match[3],
      "NAME": match[4],
      "CGST TAX": match[5]
    });
  }
});
console.log(obj);

The .slice(1) skips the header line, which this looser pattern would otherwise match as if it were a data row.
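Since the question asks for JSON rather than a logged array, here is a minimal sketch of how the second pattern could be wrapped in a reusable function. The names parseRows, KEYS, ROW_RE and the "garbage" test line are made up for illustration; the sketch also assumes that lines may end in \r\n and that non-matching lines should be reported rather than silently dropped.

```javascript
// Hypothetical helper around the whitespace-based pattern; all names are illustrative.
const KEYS = ["SL NO", "POLICY NO", "AMOUNT", "NAME", "CGST TAX"];
const ROW_RE = /^(\S+)\s+(\S+)\s+(\S+)\s+(.*?)\s+(\S+)$/;

function parseRows(text) {
  const rows = [];
  const skipped = [];
  // Split on \n or \r\n and drop the header line.
  text.split(/\r?\n/).slice(1).forEach(line => {
    const trimmed = line.trim();
    const match = trimmed.match(ROW_RE);
    if (!match) {
      // Remember non-empty lines that did not fit the pattern.
      if (trimmed !== "") skipped.push(trimmed);
      return;
    }
    const record = {};
    KEYS.forEach((key, i) => { record[key] = match[i + 1]; });
    rows.push(record);
  });
  return { rows, skipped };
}

const input = `SL NO POLICY NO AMOUNT NAME CGST TAX
02   33051090  195.0  D BL ESSENTIAL 9.00%
03  33051091  195.1    D HRFL COD  9.00%
garbage`;

const { rows, skipped } = parseRows(input);
console.log(JSON.stringify(rows, null, 2)); // the JSON text the question asks for
console.log("skipped lines:", skipped);
```

JSON.stringify with an indent of 2 produces the pretty-printed array; checking skipped is a cheap way to notice when the input format has drifted.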

Upvotes: 2

Jamiec

Reputation: 136124

You could solve this with regex, something like (\d+)\s+(\d+)\s+([\d\.]+)\s+([\w\s]+)\s+([\d\.]+\%)

var re = /^(\d+)\s+(\d+)\s+([\d\.]+)\s+([\w\s]+)\s+([\d\.]+\%)$/;
var data = `SL NO POLICY NO AMOUNT NAME CGST TAX
02   33051090  195.0  D BL ESSENTIAL 9.00%
03  33051091  195.1    D HRFL COD  9.00%`;
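var result = data.split("\n").slice(1).map(item => {
    var match = item.match(re);
    // A line that does not fit the pattern yields null; skip it instead of throwing
    if (!match) return null;
    return {
       "SL NO": match[1],
       "POLICY NO": match[2],
       "AMOUNT": match[3],
       "NAME": match[4],
       "CGST TAX": match[5]
    };
}).filter(Boolean);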
console.log(result);

But this is error-prone: as soon as the format varies slightly, it all breaks. I'd echo what others said in the comments: get a data format that is less ambiguous.
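If the text really is column-aligned the way the sample in the question appears to be, one alternative is to slice each row at the offsets where the header labels start. This is only a sketch under that assumption: parseFixedWidth is a made-up name, header labels are assumed to be separated by at least two spaces, and every value is assumed to start under its own header.

```javascript
// Sketch: cut each row at the column offsets found in the header line.
// Assumes labels are separated by 2+ spaces and values start under their label.
function parseFixedWidth(text) {
  const [header, ...rows] = text
    .split(/\r?\n/)
    .filter(line => line.trim() !== "");
  // A label is a run of words joined by single spaces, e.g. "POLICY NO".
  const columns = [...header.matchAll(/\S+(?: \S+)*/g)].map(m => ({
    name: m[0],
    start: m.index
  }));
  return rows.map(row => {
    const record = {};
    columns.forEach((col, i) => {
      // A column runs until the next label starts, or to the end of the row.
      const end = i + 1 < columns.length ? columns[i + 1].start : row.length;
      record[col.name] = row.slice(col.start, end).trim();
    });
    return record;
  });
}

const sample = [
  "SL NO  POLICY NO  AMOUNT  NAME            CGST TAX",
  "02     33051090   195.0   D BL ESSENTIAL  9.00%",
  "03     33051091   195.1   D HRFL COD      9.00%"
].join("\n");

console.log(JSON.stringify(parseFixedWidth(sample), null, 2));
```

The property names come from the header instead of being hard-coded, and names with embedded spaces need no special handling. The trade-off is that a value wider than its column, or a row that is not aligned with the header, will be cut in the wrong place.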

Upvotes: 1
