Reputation: 191
I want to count the number of words in a particular line that contains a specific ID (e.g. *AUY). So far I have tried the regex below to find the line, but it does not take the "*" at the start into account:
^ *(.*\b(?:\\*AUY)\b.*) *$
I have the following test string:
*AUY: today is holiday so Peter and Mary do not need to go to work .
%mor: n|today cop|be&3s n|holiday conj|so n:prop|Peter conj|and n:prop|Mary v|do neg|not v|need inf|to v|go prep|to n|work .
%snd: <00:00:00><00:07:37>
%AUY: ok_pfp (0.40) er today is holiday errfr ::: so er Peter and Mary {is} ~ er do not need errfr ::: to go to work . errfr :;:a |
The result should be only the first line, but the regex returns both the first and the last line as matches. See this Rubular
Upvotes: 0
Views: 94
Reputation: 10466
Try this:
/^.*?\*AUY:(.*?)$/gmi
Code Sample:
function countWord() {
  // Capture the text after "*AUY:" on every matching line (g: all matches, m: ^/$ per line, i: ignore case)
  const regex = /^.*?\*AUY:(.*?)$/gmi;
  const str = `*AUY: today is holiday so Peter and Mary do not need to go to work .
%mor: n|today cop|be&3s n|holiday conj|so n:prop|Peter conj|and n:prop|Mary v|do neg|not v|need inf|to v|go prep|to n|work .
%snd: <00:00:00><00:07:37>
%AUY: ok_pfp (0.40) er today is holiday errfr ::: so er Peter and Mary {is} ~ er do not need errfr ::: to go to work . errfr :;:a |`;
  let m;
  while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
      regex.lastIndex++;
    }
    // match() returns null when the captured text has no words, so fall back to an empty array
    alert((m[1].match(/\b(\w+)\b/g) || []).length);
  }
}
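As a side note on why the pattern from the question also matched the %AUY line: in a regex, `\\*` is an escaped backslash followed by the `*` quantifier, i.e. "zero or more backslashes", so it never requires an asterisk, and `\b` is satisfied between `%` and `A` just as it is between `*` and `A`. A minimal sketch, assuming two shortened sample lines (trimmed from the test string for illustration only):

```javascript
// Two shortened sample lines (trimmed from the test string, for illustration).
const lines = [
  "*AUY: today is holiday .",
  "%AUY: ok_pfp (0.40) er today is holiday",
];

// Pattern from the question: `\\*` means "zero or more backslashes".
const original = /^ *(.*\b(?:\\*AUY)\b.*) *$/;
// A literal asterisk needs a single backslash: `\*`.
const fixed = /^.*?\*AUY:(.*?)$/;

// No `g` flag here, so test() is stateless and safe to call repeatedly.
const originalHits = lines.filter((line) => original.test(line));
const fixedHits = lines.filter((line) => fixed.test(line));

console.log(originalHits.length); // 2
console.log(fixedHits.length);    // 1
```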
Upvotes: 2
Reputation: 41
Use the following regex:
(^.*\*AUY.*$)
You can check it here
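If you use this pattern in JavaScript, note that `^` and `$` only match at line breaks when the `m` flag is set. A minimal sketch, assuming a shortened three-line sample rather than the full test string:

```javascript
// Shortened sample input (illustrative, not the full test string).
const text = [
  "*AUY: today is holiday .",
  "%snd: <00:00:00><00:07:37>",
  "%AUY: ok_pfp (0.40) er today is holiday",
].join("\n");

// Without `m`, ^ and $ only match at the start and end of the whole string,
// so no single line can satisfy the pattern here and match() returns null.
const withoutMultiline = text.match(/(^.*\*AUY.*$)/g);

// With `m`, ^ and $ match at every line break; `g` collects every matching line.
const withMultiline = text.match(/(^.*\*AUY.*$)/gm) || [];

console.log(withoutMultiline); // null
console.log(withMultiline);    // [ '*AUY: today is holiday .' ]
```

Keep in mind that this pattern accepts `*AUY` anywhere in the line, not only at the start.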
Upvotes: 0
Reputation: 6552
Let x be your string. Then
(x.match(/(^|\n)\*AUY[^\r\n]*/g) || [])
  .map(function(s) { return s.match(/\S+/g).length; });
will return an array of the number of word-like constructs within the respective lines which begin with the string '*AUY'.
Explanation:
The regular expression looks for the string *AUY at the beginning of the string or directly after any newline (i.e., at the beginning of a line, even if that line is not the first one in the string). It then consumes every following character that is not a carriage return or line feed (i.e., the rest of that line).
The idiom || [] after a match is performed will return an empty array if the match value is null, thus preventing an error when an array is expected instead of a null value.
The final step, .map, operates on each element of the matched array, counts the non-whitespace matches, and returns these counts as a new array. Note that we do not need to protect this match with the || [] idiom because a null match is impossible, since the line contains at minimum the non-whitespace string *AUY.
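To make this concrete, here is a sketch that runs the expression on the test string from the question. The names `tokenCounts` and `wordCounts` are mine, and the second variant (dropping the ID and counting only `\w+` runs) is one possible reading of "words", not the only one:

```javascript
// Test string from the question, joined with newlines.
const x = [
  "*AUY: today is holiday so Peter and Mary do not need to go to work .",
  "%mor: n|today cop|be&3s n|holiday conj|so n:prop|Peter conj|and n:prop|Mary v|do neg|not v|need inf|to v|go prep|to n|work .",
  "%snd: <00:00:00><00:07:37>",
  "%AUY: ok_pfp (0.40) er today is holiday errfr ::: so er Peter and Mary {is} ~ er do not need errfr ::: to go to work . errfr :;:a |",
].join("\n");

// Lines that start with *AUY (the %AUY line is not matched).
const auyLines = x.match(/(^|\n)\*AUY[^\r\n]*/g) || [];

// Count every whitespace-separated token, including "*AUY:" and the final ".".
const tokenCounts = auyLines.map((s) => s.match(/\S+/g).length);

// Variant (assumption): strip the leading ID, then count only \w+ words.
const wordCounts = auyLines.map(
  (s) => (s.replace(/^\s*\*AUY:?/, "").match(/\b\w+\b/g) || []).length
);

console.log(tokenCounts); // [ 16 ]
console.log(wordCounts);  // [ 14 ]
```

The difference of two comes from the `*AUY:` token and the trailing period, which `\S+` counts but `\w+` does not.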
You can work with this code as a starting point to do what you actually want to do. Good luck!
Upvotes: 3