Regex: How to return an array of matches that need different regexes?

I have to convert txt files to JSON files, so I use regex to parse different types of data. I want to record data such as first name, last name, birthday, and so on.

The data is formatted this way:

/Indicator /
/[A-Z][a-z]+ /
/[A-Z][a-z]+ /
/[0-9]{2}\/[0-9]{2}\/[0-9]{4}/

A more specific example:

Indicator Tom Smith 01/01/2001

So I know where my info begins (it always starts with "Indicator "), and that it is followed by the first name, then the last name, then the birthday. I also know which regex to use for each of these types of data individually, but not how to combine them.

This is what I do at the moment, and I doubt it is optimal or recommended:

let first_name = "";
let last_name = "";
let birthday = "";
let j = 10; // Length of "Indicator "
let regex = /Indicator /;
let match = regex.exec(data);

j += match.index;
while (data[j] !== ' ')
    first_name += data[j++];
j++;
while (data[j] !== ' ')
    last_name += data[j++];
j++;
while (data[j] !== '<')
    birthday += data[j++];
console.log(first_name);
console.log(last_name);
console.log(birthday);

My question is: what regex should I use to get the array ['Tom', 'Smith', '01/01/2001'] with a single call to regex.exec?

Upvotes: 0

Views: 64

Answers (2)

As Dhaval Chaudhary said in his answer, in this case, you don't even need to use regex.

But suppose you do want to use one (perhaps because the entries are more complicated, with more than one kind of separator between words).

Then one simple approach, which works if you know the order of the elements in each entry, is:

line="string1<element1>string2<element2>...stringN<elementN>"
strArray = line.match(/(regex1)|(regex2)|...|(regexM)/g)

where each regexI may match more than one element (so M may be different from N).

In your simple example, it would look like this:

const line = "Indicator Tom Smith 01/01/2001"; /* four elements */
const strArray = line.match(/(Indicator)|([A-Z][a-z]*)|([0-9]{2}\/[0-9]{2}\/[0-9]{4})/g); /* three regexes */
console.log(strArray);

which prints

Array [ "Indicator", "Tom", "Smith", "01/01/2001" ]
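Note that the alternation above also returns "Indicator". If the goal is exactly the array ['Tom', 'Smith', '01/01/2001'] from one call to regex.exec, one option is a single pattern with three capture groups. This is only a minimal sketch, assuming every record has the form shown in the question; the names recordRegex and fields are illustrative, not from the original post.

```javascript
// Sample record from the question.
const line = "Indicator Tom Smith 01/01/2001";

// One pattern, three capture groups: first name, last name, birthday.
const recordRegex = /Indicator ([A-Z][a-z]+) ([A-Z][a-z]+) ([0-9]{2}\/[0-9]{2}\/[0-9]{4})/;

const match = recordRegex.exec(line);

// exec returns null when nothing matches; index 0 holds the whole match,
// so slice(1) keeps only the captured groups.
const fields = match === null ? [] : match.slice(1);

console.log(fields); // [ 'Tom', 'Smith', '01/01/2001' ]
```

The null check matters because exec returns null rather than an empty array when the input does not contain a record.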

Upvotes: 0

Dhaval Chaudhary

Reputation: 5815

The first question is: why do you want to use regex?

My suggestion: you can simply use str.split(" "), which returns the array ['Indicator', 'Tom', 'Smith', '01/01/2001'], and then process it as you want.

For a big file with data like this:

Indicator Tom Smith 01/01/2001 Indicator xyz abc 11/02/2002

you can do something like this:

var str = "Indicator Tom Smith 01/01/2001 Indicator xyz abc 11/02/2002";
// Break the text into space-separated tokens.
var strArray = str.split(" ");
for (var i = 0; i < strArray.length; i++) {
  if (strArray[i] === 'Indicator') {
    var firstname = strArray[i + 1];
    var lastname = strArray[i + 2];
    var dob = strArray[i + 3];
    // use them as you want
    i += 3; // skip the three fields just read
  }
}
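
Since the end goal in the question is JSON, the same kind of loop can collect each record into an object. This is a sketch under the assumption that fields are separated by single spaces and that every "Indicator" is followed by three fields; the records array and the property names firstName, lastName and birthday are hypothetical choices, not something from the question.

```javascript
const str = "Indicator Tom Smith 01/01/2001 Indicator xyz abc 11/02/2002";
const tokens = str.split(" ");
const records = [];

// Stop early enough that tokens[i + 3] always exists.
for (let i = 0; i + 3 < tokens.length; i++) {
  if (tokens[i] === "Indicator") {
    records.push({
      firstName: tokens[i + 1],
      lastName: tokens[i + 2],
      birthday: tokens[i + 3],
    });
    i += 3; // skip the three fields just consumed
  }
}

// Serialize the collected records as a JSON array.
console.log(JSON.stringify(records, null, 2));
```

The loop bound drops a trailing "Indicator" that lacks its three fields instead of producing undefined values.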

Upvotes: 1
