Reputation: 31
Sample text is:
Statement_10125229_20170807.pdf
I would like to get the date, 20170807.
I was able to extract the statement ID, 10125229, using (?<=_).*?(?=_)
Now I would like to extract the date. I have tried (?<=_)\d*
but I am still getting the statement ID back as well.
Upvotes: 2
Views: 418
Reputation: 626747
In general, (?<=_)\d+(?=\.) and (?<=_)[^_]*(?=\.pdf) would solve your issue.
The (?<=_)\d+(?=\.) pattern matches one or more digits that are immediately preceded by a _ and immediately followed by a literal dot (.).
The (?<=_)[^_]*(?=\.pdf) pattern matches zero or more characters other than _ that are immediately preceded by a _ and immediately followed by .pdf
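To see why the lookahead matters, here is a minimal sketch that runs the pattern from the question and the two patterns above against the sample file name. The class name LookaroundDemo and the helper FirstMatch are made up for this illustration; only Regex.Match comes from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Hypothetical helper: returns the text of the first match,
    // or an empty string when the pattern does not match.
    public static string FirstMatch(string input, string pattern) =>
        Regex.Match(input, pattern).Value;

    public static void Main()
    {
        const string text = "Statement_10125229_20170807.pdf";

        // Pattern from the question: the first position preceded by '_'
        // is the start of the statement ID, so that is what comes back.
        Console.WriteLine(FirstMatch(text, @"(?<=_)\d*"));            // 10125229

        // The lookahead rejects the ID because it is followed by '_', not '.'.
        Console.WriteLine(FirstMatch(text, @"(?<=_)\d+(?=\.)"));      // 20170807

        // Same idea, but accepts any non-underscore characters before ".pdf".
        Console.WriteLine(FirstMatch(text, @"(?<=_)[^_]*(?=\.pdf)")); // 20170807
    }
}
```

The engine scans left to right and returns the first position where the whole pattern succeeds, so without a trailing condition the ID always wins.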
However, in C#, you can get the substring you need without a regex:
// Requires: using System.IO; using System.Linq;
var text = "Statement_10125229_20170807.pdf";
// Strip the extension, split on '_', and take the last piece.
var result = Path.GetFileNameWithoutExtension(text).Split('_').LastOrDefault();
With a regex, you can also go for a capturing approach:
// Regex.Match never returns null; if nothing matches, Groups[1].Value is an empty string.
var result = Regex.Match(text, @"_(\d+)\.pdf$").Groups[1].Value;
See the C# demo online; both approaches yield 20170807.
Upvotes: 1