Reputation: 188
I have been getting all the phrases that have numbers with words beside them by using this expression.
[\d+](?:[^a-zA-Z'-]+[a-zA-Z'-]+){0,1}
Input:
50 cases are confirmed.
On January 30, there are confirmed cases of the virus.
1,300 women are suscpected.
Matches:
50 Men are involved.
On January 30, there are confirmed cases of the virus.
1,300 women are suscpected.
The problem is that some of the matches are dates, which I'm not interested in getting. So my expected output is actually just two matches, on the first and third lines:
Expected:
50 Men are involved.
On January 30, there are confirmed cases of the virus.
1,300 women are suscpected.
How do I ignore numbers that end with a comma?
I tried adding , to the negated character class so that it would be ignored, but then the digits are just matched individually.
Attempt:
[\d+](?:[^a-zA-Z'-,]+[a-zA-Z'-]+){0,1}
Output:
50 Men are involved.
On January 3 0, there are confirmed cases of the virus.
1,300 women are suscpected.
Upvotes: 1
Views: 40
Reputation: 37347
Try \d+(?:,\d+)?\s+[a-zA-Z]+
Explanation:
\d+ - match 1+ digits
(?:...) - non-capturing group
,\d+ - match a comma , and 1+ digits
? - match the preceding pattern 0 or 1 time (equivalent to {0,1})
\s+ - match 1+ whitespace characters
[a-zA-Z]+ - match 1+ lowercase or uppercase letters
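To check the pattern against the sample lines from the question, here is a minimal sketch. The question does not say which language is being used, so Python's re module is an assumption here, and the names PATTERN, lines and matches are only illustrative.

```python
import re

# Pattern from the answer: digits, an optional ",digits" group,
# whitespace, then a run of letters.
PATTERN = re.compile(r"\d+(?:,\d+)?\s+[a-zA-Z]+")

# Sample lines copied from the question's input.
lines = [
    "50 cases are confirmed.",
    "On January 30, there are confirmed cases of the virus.",
    "1,300 women are suscpected.",
]

# Collect every match per line; the date line should produce none,
# because "30" is followed by a comma rather than whitespace.
matches = [PATTERN.findall(line) for line in lines]

for line, found in zip(lines, matches):
    print(f"{line!r} -> {found}")
```

The date line is skipped because \s+ requires whitespace straight after the number, and "30" is followed by a comma instead.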
Upvotes: 2