M.Huntz

Reputation: 253

Extract digit after or before a specific word

I have several articles about terrorist attacks that include the number of people killed and wounded. I am trying to extract the number of people wounded.

This is a sample of the sentences to target:

at least 22 others were wounded
additional 20 soldiers were wounded
more than 40 people had been wounded
wounding at least six people
injuring at least 60 others
wounding more than 25
27 others were wounded 
wounding 14
wounding 33
185 people were wounded
28 people wounded

As you can see, the words wounded, wounding, and injuring appear either before or after the number I want to extract, usually within 3 or 4 words of it.

At this link you can find a sample of the articles and the regular expression that I am trying to apply, without success: [Regex](https://regex101.com/r/0DRayP/10)

Upvotes: 1

Views: 1225

Answers (1)

Dalorzo

Reputation: 20014

You need capturing groups to pull out the parts of the match you want, for example:

(\d+)?.*?(wound(?:ed|ing)|injur(?:ed|ing)).*?(\d+)

You are interested in groups $1 (a number before the keyword, when present), $2 (the keyword itself) and $3 (a number after the keyword).

Here is an example:

Online Demo
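As written, the single pattern above requires a digit after the keyword on the same line, so a line such as "22 others were wounded" would not match. One possible alternative is to use two patterns, one per word order. The following is a minimal Python sketch, not a definitive solution. It assumes, based only on the sample sentences in the question, that "wounded"/"injured" follow the number and "wounding"/"injuring" precede it. The names `NUMBER_WORDS`, `GAP`, `to_int` and `extract_wounded` are hypothetical, and the small spelled-out number table is an added assumption to cover cases like "six people".

```python
import re

# Assumption: a small table of spelled-out numbers; extend as needed.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}
NUMBER = r"\d+|" + "|".join(NUMBER_WORDS)

# Up to four filler words between number and keyword. The lookahead keeps a
# filler word from being another number, so "10 killed and 20 wounded"
# does not pair 10 with "wounded".
GAP = rf"(?:\s+(?!(?:{NUMBER})\b)\w+){{0,4}}?"

# Number first: "22 others were wounded", "28 people wounded".
NUMBER_BEFORE = re.compile(
    rf"\b({NUMBER})\b{GAP}\s+(?:wound|injur)ed\b", re.IGNORECASE
)

# Keyword first: "wounding at least six people", "injuring at least 60 others".
NUMBER_AFTER = re.compile(
    rf"\b(?:wound|injur)ing\b{GAP}\s+({NUMBER})\b", re.IGNORECASE
)


def to_int(token):
    """Convert a matched digit string or spelled-out number to an int."""
    token = token.lower()
    return int(token) if token.isdigit() else NUMBER_WORDS[token]


def extract_wounded(text):
    """Return all wounded/injured counts found in text, in order of appearance."""
    hits = []
    for pattern in (NUMBER_BEFORE, NUMBER_AFTER):
        for match in pattern.finditer(text):
            hits.append((match.start(), to_int(match.group(1))))
    return [value for _, value in sorted(hits)]


if __name__ == "__main__":
    for line in ["at least 22 others were wounded",
                 "wounding at least six people",
                 "killing 10 and wounding 14"]:
        print(line, "->", extract_wounded(line))
```

This sketch would miss an active past-tense sentence such as "the blast wounded 20 people", as well as numbers written with thousands separators, so the patterns would need adjusting for real articles.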

Upvotes: 1
