Reputation: 41
I need a regular expression to extract all prices from a text. Prices use "," as the decimal separator, have no thousands separator, and are followed by " USD". For example:
1500 USD
9 USD
0,53 USD
12,01 USD
My current pattern is:
^[^0]\d+(\,)?[0-9]{0,2} USD
It works for:
1500 USD
12,01 USD
but it does not work for:
9 USD
0,53 USD
Upvotes: 1
Views: 239
Reputation: 163277
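In your pattern ^[^0]\d+(\,)?[0-9]{0,2} USD, look at the part ^[^0] first.
The first ^ is an anchor asserting the start of the string.
The second ^ is the first character inside a character class, where its meaning is different: it creates a negated character class. [^0] matches any single character except 0 (not only digits), so the match cannot start with 0. That is why 0,53 USD fails. The \d+ that follows requires at least one more digit, so a single-digit value such as 9 USD fails as well.
The following part, (\,)?[0-9]{0,2}, is an optional group matching a comma (note that you don't have to escape it) followed by 0-2 digits. Because the comma and the digits are optional independently of each other, a value like 11, would also match.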
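To make the points above concrete, here is a small JavaScript sketch that runs the pattern from the question against the sample values. The extra inputs "11, USD" and "x5 USD" are illustrative additions, not taken from the question.

```javascript
// The pattern from the question, unchanged.
const original = /^[^0]\d+(\,)?[0-9]{0,2} USD/;

// The four samples from the question plus two illustrative edge cases.
const samples = ["1500 USD", "12,01 USD", "9 USD", "0,53 USD", "11, USD", "x5 USD"];

// Pair each sample with whether the pattern matches it.
const originalResults = samples.map((s) => [s, original.test(s)]);

for (const [sample, matched] of originalResults) {
  console.log(`${sample} -> ${matched}`);
}
```

The last two inputs match even though they are not valid prices: the comma and the decimals are optional separately, and [^0] accepts any character other than 0.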
No language is tagged, but if a positive lookahead and a negative lookbehind are supported, you might use the pattern below to extract prices from a text. The lookbehind (?<!\S) asserts that what is directly to the left is not a non-whitespace character, and the word boundary \b prevents USD from being part of a larger word.
If you want the whole match, including USD, instead of only the price, you can match USD instead of using the positive lookahead.
(?<!\S)\d+(?:,\d{1,2})?(?= USD\b)
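As a rough illustration, the lookaround pattern could be applied in JavaScript like this. It assumes an engine with lookbehind support (for example a recent Node.js), and the sample sentence is made up for the demonstration.

```javascript
// Hypothetical sample text; "x12 USD" and "5 USDT" should be rejected.
const text = "Items: 1500 USD, 9 USD, 0,53 USD and 12,01 USD; not x12 USD or 5 USDT.";

// Lookbehind: no non-whitespace character directly before the digits.
// Lookahead: " USD" must follow, ending at a word boundary.
const lookaroundPattern = /(?<!\S)\d+(?:,\d{1,2})?(?= USD\b)/g;

// With the g flag, match() returns all matches, or null when there are none.
const lookaroundPrices = text.match(lookaroundPattern) || [];

console.log(lookaroundPrices);
```

Only the numeric part is returned because " USD" is checked by the lookahead without being consumed.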
Another option is to use a capturing group instead of a lookahead. (?:^|\s) asserts the start of the string or matches a whitespace character. The price is then available in capture group 1.
(?:^|\s)(\d+(?:,\d{1,2})?) USD\b
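A minimal sketch of reading the capture group in JavaScript, assuming String.prototype.matchAll is available; the input is the sample line used elsewhere in this thread.

```javascript
// Group 1 holds the price; the leading whitespace and " USD" are matched but not captured.
const capturePattern = /(?:^|\s)(\d+(?:,\d{1,2})?) USD\b/g;

const sampleText = "1500 USD 9 USD 0,53 USD 12,01 USD";

// matchAll yields one match object per hit; keep only capture group 1.
const capturedPrices = Array.from(sampleText.matchAll(capturePattern), (m) => m[1]);

console.log(capturedPrices);
```

This variant does not need lookbehind support, which may matter in older engines.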
Upvotes: 2
Reputation: 1325
In JavaScript
/^\d{1,}(,\d{2}){0,1} USD$/
var regex = /^\d{1,}(,\d{2}){0,1} USD$/;
// true result
console.log(regex.test('9 USD'));
console.log(regex.test('0,53 USD'));
console.log(regex.test('12,01 USD'));
console.log(regex.test('1500 USD'));
// false result
console.log(regex.test(' USD'));
console.log(regex.test('0,5,3 USD'));
console.log(regex.test('12,0124 USD'));
console.log(regex.test('1s500 USD'));
Or with sed:
% echo "1500 USD 9 USD 0,53 USD 12,01 USD" |sed -E 's/[0-9]+(,[0-9][0-9]){0,1} USD/TRUE/g'
TRUE TRUE TRUE TRUE
The -E option enables extended regular expressions.
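The anchored JavaScript pattern above tests a whole string, while the sed command finds every occurrence in a line. A possible JavaScript counterpart of the sed command, sketched here by dropping the ^ and $ anchors and adding the g flag (the variable names are only illustrative), might look like this:

```javascript
// Same shape as the sed expression: digits, optional ",dd", then " USD".
const priceWithCurrency = /\d+(?:,\d{2})? USD/g;

const line = "1500 USD 9 USD 0,53 USD 12,01 USD";

// Extract every occurrence, including the " USD" suffix.
const found = line.match(priceWithCurrency) || [];

// Mirror the sed substitution.
const replaced = line.replace(priceWithCurrency, "TRUE");

console.log(found);
console.log(replaced);
```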
Upvotes: 2
Reputation: 27723
My guess is that this simple expression would return what we might want:
([0-9,.]+)
regardless of the other text content, since validation is not required here, assuming that the prices are valid. Note that it matches any run of digits, commas, and periods, whether or not USD follows.
In C#:
using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string pattern = @"([0-9,.]+)";
        string input = @"500 USD 9 USD 0,53 USD 12,01 USD
1500 USD 12,01 USD 9 USD 0,53 USD 1500 USD 12,01 USD 9 USD 0,53 USD ";
        RegexOptions options = RegexOptions.Multiline;

        foreach (Match m in Regex.Matches(input, pattern, options))
        {
            Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
        }
    }
}
The same in JavaScript:
const regex = /([0-9,.]+)/gm;
const str = `500 USD 9 USD 0,53 USD 12,01 USD
1500 USD 12,01 USD 9 USD 0,53 USD 1500 USD 12,01 USD 9 USD 0,53 USD `;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }

    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}
Upvotes: -1