Reputation: 79
I have a text in Arabic and I want to use Regex to extract numbers from it. Here is my attempt.
String:
"ما المجموع:
1+2"
Match match = Regex.Match(text, "المجموع: ([^\\r\\n]+)", RegexOptions.IgnoreCase);
The match always fails, and the group value always comes back null.
expected output:
match.Groups[1].Value //returns (1+2)
Upvotes: 3
Views: 220
Reputation: 627343
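The regex you wrote matches the word, then a colon, then a single space, and then one or more chars other than CR and LF (if the same pattern were written in a verbatim @"..." string, [^\\r\\n] would instead exclude a backslash, r and n). Your text has a line break after the colon rather than a space, so the pattern never matches.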
You want to match the whole line after the word, colon and any amount of whitespace chars:
var text = "ما المجموع:\n1+2";
var result = Regex.Match(text, @"المجموع:\s*(.+)")?.Groups[1].Value;
Console.WriteLine(result); // => 1+2
See the C# demo
Other possible patterns:
@"المجموع:\r?\n(.+)" // To match CRLF or LF line ending only
@"المجموع:\n(.+)" // To match just LF ending only
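To see how the three patterns differ, here is a small sketch that runs each of them against the sample text with an LF ending and with a CRLF ending. The class name LineEndingDemo and the helper Matches are made up for illustration; only Regex.IsMatch comes from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

class LineEndingDemo
{
    // Hypothetical helper: true if the pattern finds a match in the text.
    public static bool Matches(string text, string pattern) =>
        Regex.IsMatch(text, pattern);

    static void Main()
    {
        var lf = "ما المجموع:\n1+2";     // LF line ending
        var crlf = "ما المجموع:\r\n1+2"; // CRLF line ending

        string[] patterns =
        {
            @"المجموع:\s*(.+)",   // any amount of whitespace
            @"المجموع:\r?\n(.+)", // CRLF or LF only
            @"المجموع:\n(.+)",    // LF only
        };

        foreach (var p in patterns)
        {
            // Prints: True True / True True / True False
            Console.WriteLine($"{Matches(lf, p)} {Matches(crlf, p)}");
        }
    }
}
```

Only the LF-only pattern fails on the CRLF text, because the CR sits between the colon and the LF.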
Also, if you run the regex against a long multiline text with CRLF endings, it makes sense to replace .+ with [^\r\n]+ since . in a .NET regex matches any char except LF (the newline char), and thus also matches the CR symbol.
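A minimal sketch of that CR issue, assuming the question's text is followed by another line and uses CRLF endings. The sample string, the class name CrlfDemo and the helper Capture are invented for the demo.

```csharp
using System;
using System.Text.RegularExpressions;

class CrlfDemo
{
    // Hypothetical helper: returns the text captured by group 1.
    public static string Capture(string text, string pattern) =>
        Regex.Match(text, pattern).Groups[1].Value;

    static void Main()
    {
        // Assumed input: CRLF endings with one more line after the expression.
        var text = "ما المجموع:\r\n1+2\r\nnext line";

        // '.' excludes only LF, so the capture keeps the CR that precedes it.
        var withDot = Capture(text, @"المجموع:\s*(.+)");
        // The negated class stops before CR as well as LF.
        var withClass = Capture(text, @"المجموع:\s*([^\r\n]+)");

        Console.WriteLine(withDot.Length);   // 4: "1+2" plus the trailing CR
        Console.WriteLine(withClass.Length); // 3: "1+2"
    }
}
```

Calling Trim() on the result would also drop the stray CR, but excluding it in the pattern keeps the capture clean at the source.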
Upvotes: 1