Reputation: 494
I have a very long log file in which each entry begins on a new line, but some entries contain line breaks themselves. I split the log file using the code below and then run different regex rules on the result, and everything works fine:
var str = data.split('\n');
However, once an entry contains more complex text that includes line breaks, my code breaks. Below is a sample of the log file. The first entry is a normal single line; the second entry spans several lines and ends at "ends here".
3708 07:11:59 INFO (username): SAVE: master:/url_path, language: en, version: 1, id: {1846518641516}
908 07:11:40 INFO (username): SAVE: master:/url_path, language: en, version: 1, id: {148815184185}, ** [Content]: new: Please note the following when using this app:
▪ Some text
▪ Some text
▪ Some text
▪ Some more and more text., old: Please note the following when using this app:
▪ Some text
▪ Some text
▪ Some text
▪ Some text
▪ Some text
▪ Some text
ends here
Hopefully my question is clear.
How should I refactor my var str = data.split('\n'); so that it works for both kinds of entries?
Thank you for your help.
Upvotes: 2
Views: 3567
Reputation: 626758
You need to split at a \n that is followed by a string of digits, a space, and a time-like string:
s.split(/\n(?=\d+ \d{2}:\d{2}:\d{2}\b)/)
See the regex demo
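To illustrate why the lookahead matters, here is a minimal sketch on a made-up two-entry string (the `sample` text is hypothetical, not taken from the real log). Splitting with the lookahead consumes only the newline, while the same pattern without the lookahead also swallows the number and time at the start of the next entry:

```javascript
// Hypothetical sample: two entries, the second one spans two lines.
const sample = "1 07:00:00 INFO first\n2 07:00:01 INFO second\ncontinued";

// With the lookahead, only the newline is the separator,
// so every entry keeps its leading number and time.
const withLookahead = sample.split(/\n(?=\d+ \d{2}:\d{2}:\d{2}\b)/);

// Without the lookahead, the number and time become part of the separator
// and are dropped from the second entry.
const withoutLookahead = sample.split(/\n\d+ \d{2}:\d{2}:\d{2}\b/);

console.log(withLookahead);
console.log(withoutLookahead);
```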
Details:
\n - a newline followed by...
(?=\d+ \d{2}:\d{2}:\d{2}\b) - a positive lookahead that only requires that the text immediately to the right matches the pattern (otherwise the match fails), without consuming that text:
  \d+ - 1 or more digits
  (space) - a space
  \d{2}:\d{2}:\d{2} - 2 digits followed by :, twice, and then 2 more digits
  \b - a trailing word boundary
var s = "3708 07:11:59 INFO (username): SAVE: master:/url_path, language: en, version: 1, id: {1846518641516} \r\n908 07:11:40 INFO (username): SAVE: master:/url_path, language: en, version: 1, id: {148815184185}, ** [Content]: new: Please note the following when using this app:\r\n\r\n▪ Some text\r\n▪ Some text\r\n▪ Some text\r\n▪ Some more and more text., old: Please note the following when using this app:\r\n\r\n▪ Some text\r\n▪ Some text\r\n▪ Some text\r\n▪ Some text\r\n▪ Some text\r\n▪ Some text\r\nends here";
var res = s.split(/\n(?=\d+ \d{2}:\d{2}:\d{2}\b)/);
console.log(res);
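Since the sample string uses \r\n line endings, splitting on \n alone leaves a trailing \r at the end of each entry. A possible variation, as a sketch only, is to make the carriage return optional in the separator. The helper names `splitEntries` and `parseEntry`, the field names `num`, `time` and `rest`, and the shortened `log` string are all assumptions made for illustration:

```javascript
// Split a raw log into entries. An entry starts with "<digits> <HH:MM:SS>".
// \r?\n also consumes the carriage return of Windows-style line endings.
function splitEntries(data) {
  return data.split(/\r?\n(?=\d+ \d{2}:\d{2}:\d{2}\b)/);
}

// Hypothetical helper: separate the leading number, the time, and the
// remaining text. [\s\S]* is used so the remainder may span several lines.
function parseEntry(entry) {
  const m = /^(\d+) (\d{2}:\d{2}:\d{2}) ([\s\S]*)$/.exec(entry);
  return m ? { num: m[1], time: m[2], rest: m[3] } : null;
}

// Shortened, made-up version of the log from the question.
const log =
  "3708 07:11:59 INFO (username): SAVE\r\n" +
  "908 07:11:40 INFO (username): SAVE, new:\r\n" +
  "▪ Some text\r\n" +
  "ends here";

const entries = splitEntries(log).map(parseEntry);
console.log(entries.length);
console.log(entries[1].time);
```

Each parsed entry can then be passed to the other regex rules individually, with the multi-line text kept together in `rest`.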
Upvotes: 3