Dumitru Daniel

Reputation: 549

JavaScript: parsing chat text

I'm trying to parse chat text and I'm having some issues. My goal is to split each line into fields (Date, Time, Name, Text), then compute statistics by Name and Date, including the total number of words and letters for each person.

A sample of the data:

 10/06/2019, 23:17 - Nasu Alex Taranu Gilmeanu: De iesit iesim cu siguranta. Dar tre sa cadem de acord la o varianta
 10/06/2019, 23:17 - Dura Stefanel: Serios acum
 10/06/2019, 23:18 - Dura Stefanel: E cea mai frumoasa cazare de pe site: din câte am văzut pana acum
 11/06/2019, 00:04 - Nasu Alex Taranu Gilmeanu: http://www.booking.com/Share-CJY
 11/06/2019, 18:31 - Danutz: Sa îl mănânci - cu Botu :)

The code I'm using is below, but I can't figure out what the regex should be in order to split each row into those fields.

I load the stringData variable from a text file using Ajax; str is just a small sample of the data that I added for testing:

var stringData = $.ajax({
                    url: "http://localhost/_FunStuff/_ChatCounter/2021.01.04_textFile.txt",
                    async: false
                 }).responseText;

const str = `10/06/2019, 23:17 - Nasu Alex Taranu Gilmeanu: De iesit iesim cu siguranta. Dar tre sa cadem de acord la o varianta
 10/06/2019, 23:17 - Dura Stefanel: Serios acum
 10/06/2019, 23:18 - Dura Stefanel: E cea mai frumoasa cazare de pe site: din câte am văzut pana acum
 11/06/2019, 00:04 - Nasu Alex Taranu Gilmeanu: http://www.booking.com/Share-CJY
 11/06/2019, 18:31 - Danutz: Sa îl mănânci - cu Botu :)`;

function splitImportedData(stringData) {
  $arrRows = stringData.split("\n");
  for (rowi = 0; rowi < $arrRows.length; rowi++) {
    var $strRow = $arrRows[rowi]
    var $arrRow = $strRow.split(/[,:\s+\-]/);
    if (rowi == 1) {
      console.log($arrRow);
      //alert($arrRow);
    }
  }
}

splitImportedData(str);

Upvotes: 2

Views: 203

Answers (3)

D. Seah

Reputation: 4592

You don't need to split each row. Match it against a single regex with capture groups instead.

const str = `- 10/06/2019, 23:17 - Nasu Alex Taranu Gilmeanu: De iesit iesim cu siguranta. Dar tre sa cadem de acord la o varianta
 - 10/06/2019, 23:17 - Dura Stefanel: Serios acum
 - 10/06/2019, 23:18 - Dura Stefanel: E cea mai frumoasa cazare de pe site: din câte am văzut pana acum
 - 11/06/2019, 00:04 - Nasu Alex Taranu Gilmeanu: http://www.booking.com/Share-CJY
 - 11/06/2019, 18:31 - Danutz: Sa îl mănânci - cu Botu :)`;

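const splitImportedData = (stringData) => {
  return stringData
    .split("\n")
    .map(row => {
      // The leading "- " is optional, so rows without it also match.
      const m = row.match(/^\s*(?:- )?(.+?), (.+?) - (.+?): (.+)/);
      // Rows that do not match (e.g. blank lines) yield null instead of throwing.
      return m && {
        date: m[1],
        time: m[2],
        name: m[3],
        text: m[4],
      };
    })
    .filter(Boolean); // drop the rows that did not match
};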

console.log(splitImportedData(str));
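
Once each row is an object with date, time, name and text, the statistics the question asks for can be accumulated in a single pass. This is only a minimal sketch, assuming rows shaped like the objects returned above. The names collectStats and sampleRows are hypothetical, and the definitions of a "word" (a run of non-whitespace characters) and a "letter" (any Unicode letter) are assumptions that may need adjusting:

```javascript
// Hypothetical sample rows in the { date, time, name, text } shape used above.
const sampleRows = [
  { date: "10/06/2019", time: "23:17", name: "Dura Stefanel", text: "Serios acum" },
  { date: "10/06/2019", time: "23:18", name: "Dura Stefanel", text: "E cea mai frumoasa cazare de pe site: din câte am văzut pana acum" },
  { date: "11/06/2019", time: "18:31", name: "Danutz", text: "Sa îl mănânci - cu Botu :)" },
];

// Accumulate per-name totals plus a per-date message count for each name.
function collectStats(rows) {
  const stats = new Map();
  for (const { date, name, text } of rows) {
    if (!stats.has(name)) {
      stats.set(name, { messages: 0, words: 0, letters: 0, byDate: {} });
    }
    const entry = stats.get(name);
    entry.messages += 1;
    // Assumption: a "word" is any run of non-whitespace characters.
    entry.words += text.trim().split(/\s+/).filter(Boolean).length;
    // Assumption: a "letter" is any Unicode letter, so accented letters count too.
    entry.letters += (text.match(/\p{L}/gu) || []).length;
    entry.byDate[date] = (entry.byDate[date] || 0) + 1;
  }
  return stats;
}

console.log(collectStats(sampleRows));
```

A Map is used here so that unusual names cannot collide with built-in object properties; a plain object would work as well for typical chat exports.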

Upvotes: 4

Randy Casburn

Reputation: 14175

Might I encourage you to use the .match() method instead of splitting and then parsing further? With named capture groups you get the result you are after directly:

const str = `- 10/06/2019, 23:17 - Nasu Alex Taranu Gilmeanu: De iesit iesim cu siguranta. Dar tre sa cadem de acord la o varianta\n
 - 10/06/2019, 23:17 - Dura Stefanel: Serios acum\n
 - 10/06/2019, 23:18 - Dura Stefanel: E cea mai frumoasa cazare de pe site: din câte am văzut pana acum\n
 - 11/06/2019, 00:04 - Nasu Alex Taranu Gilmeanu: http://www.booking.com/Share-CJY\n
 - 11/06/2019, 18:31 - Danutz: Sa îl mănânci - cu Botu :)\n`;

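function splitImportedData(stringData) {
  const result = [];
  // [^:]+ stops the author at the first colon, so a ": " inside the comment
  // (e.g. "site: din") is not swallowed into the author name.
  const regex = /(?<date>\d{2}\/\d{2}\/\d{4}), (?<time>\d{2}:\d{2}) - (?<author>[^:]+):\s(?<comment>.*)/;
  stringData.split("\n").filter(s => s.trim() !== '').forEach(r => {
    const m = r.match(regex);
    if (m) result.push(m.groups); // skip rows that do not match
  });
  return result;
}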

let res = splitImportedData(str);
console.log(res);
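
An alternative to splitting on newlines first is to let the regex find every message in the whole string, using String.prototype.matchAll with the g and m flags. This is a sketch under the assumption that every message fits on one line; parseChat and sample are hypothetical names introduced for illustration:

```javascript
const sample = `10/06/2019, 23:17 - Dura Stefanel: Serios acum
 10/06/2019, 23:18 - Dura Stefanel: E cea mai frumoasa cazare de pe site: din câte am văzut pana acum
 11/06/2019, 18:31 - Danutz: Sa îl mănânci - cu Botu :)`;

function parseChat(stringData) {
  // g: find every match; m: let ^ and $ anchor at each line.
  // [ \t]* skips indentation, (?:- )? tolerates an optional leading dash,
  // [^:\n]+ keeps the author on one line and stops at the first colon.
  const lineRegex = /^[ \t]*(?:- )?(?<date>\d{2}\/\d{2}\/\d{4}), (?<time>\d{2}:\d{2}) - (?<author>[^:\n]+): (?<comment>.*)$/gm;
  // Copy the named groups into plain objects, one per matched line.
  return Array.from(stringData.matchAll(lineRegex), (m) => ({ ...m.groups }));
}

console.log(parseChat(sample));
```

Lines that do not fit the pattern are simply skipped, so no null check is needed with this variant.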

Upvotes: 2

Ignacio Martínez

Reputation: 891

You can use non-capturing groups in your split regex, e.g. /(?:, )|(?:- )|(?:: )/. Obviously you can make something smarter, but this could help as a basic example.

For reference, see: Regular_Expressions/Groups_and_Ranges
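
One thing to watch with a split-based approach is that the message text can itself contain ", ", "- " or ": ", which would cut it into extra pieces. A rough sketch of one way around that, assuming the name contains none of those delimiters; splitRow is a hypothetical helper, not part of the answer above:

```javascript
// Hypothetical helper illustrating the split-based approach and its main pitfall.
function splitRow(row) {
  const trimmed = row.trim();
  const parts = trimmed.split(/(?:, )|(?:- )|(?:: )/);
  if (parts.length < 4) return null; // not a "date, time - name: text" row
  const [date, rawTime, name] = parts;
  // The text may contain the delimiters too, so instead of using parts[3]
  // it is sliced from the original row. Each delimiter is 2 characters long.
  const textStart = date.length + 2 + rawTime.length + 2 + name.length + 2;
  // rawTime keeps the space that precedes "- ", hence the trim().
  return { date, time: rawTime.trim(), name, text: trimmed.slice(textStart) };
}

console.log(splitRow("10/06/2019, 23:18 - Dura Stefanel: E cea mai frumoasa cazare de pe site: din câte am văzut pana acum"));
console.log(splitRow("11/06/2019, 18:31 - Danutz: Sa îl mănânci - cu Botu :)"));
```

Only the first three pieces of the split are trusted here; everything after the third delimiter is treated as the message text.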

Upvotes: 1
