Alex

Reputation: 38529

RegEx for extracting number from a string

I have a bunch of files in a directory, mostly labeled something like...

PO1000000100.doc or .pdf or .txt. Some of them are PurchaseOrderPO1000000109.pdf

What I need to do is extract the PO1000000109 part of it. So basically PO with 10 digits after it... How can I do this with a regex?

(What I'll do is a foreach loop on the files in the directory, get the filename, and run it through the regex to get the PO number...)

I'm using C# - not sure if this is relevant.

Upvotes: 0

Views: 785

Answers (7)

erikkallen

Reputation: 34421

var re = new System.Text.RegularExpressions.Regex("(?<=^PurchaseOrder)PO\\d{10}(?=\\.pdf$)");
Assert.IsTrue(re.IsMatch("PurchaseOrderPO1234567890.pdf"));
Assert.IsFalse(re.IsMatch("some PurchaseOrderPO1234567890.pdf"));
Assert.IsFalse(re.IsMatch("OrderPO1234567890.pdf"));
Assert.IsFalse(re.IsMatch("PurchaseOrderPO1234567890.pdf2"));

Upvotes: 0

YOU

Reputation: 123917

string data="PurchaseOrderPO1000000109.pdf\nPO1000000100.doc";
MatchCollection matches = Regex.Matches(data, @"PO[0-9]{10}");
foreach(Match m in matches){
    Console.WriteLine(m.Value);
}

Results

PO1000000109
PO1000000100
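
To connect this with the loop over a directory described in the question, a minimal sketch might look like the following. The class name `PoExtractor`, the helper `TryExtractPo`, and the `directory` argument are hypothetical names chosen for illustration; they are not from the question.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class PoExtractor
{
    // "PO" followed by ten digits, as described in the question.
    private static readonly Regex PoPattern = new Regex(@"PO\d{10}");

    // Sets po to the first PO number found in fileName and returns true;
    // returns false (with po set to an empty string) when there is no match.
    public static bool TryExtractPo(string fileName, out string po)
    {
        Match m = PoPattern.Match(fileName);
        po = m.Value; // Match.Value is "" when the match failed
        return m.Success;
    }

    // Prints the PO number of every file in the given directory (hypothetical path).
    public static void PrintAll(string directory)
    {
        foreach (string path in Directory.GetFiles(directory))
        {
            // Match against the file name only, not the full path.
            if (TryExtractPo(Path.GetFileName(path), out string po))
                Console.WriteLine(po);
        }
    }
}
```

If a name could contain more than ten digits after PO, this pattern would still match the first ten; appending `(?!\d)` to the pattern is one way to reject those.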

Upvotes: 1

StuffHappens

Reputation: 6557


Regex.Replace(fileName, @"^.*PO(\d{10}).*$", "$1");
Note that "$1" here is only the ten digits; put PO inside the parentheses if you want to keep the prefix.

Upvotes: 1

Konamiman

Reputation: 50323

If the PO part is always the same, you can just get the number without needing to use a regex:

new string(theString.Where(c => char.IsDigit(c)).ToArray());

Later you can prepend the PO part manually.

NOTE: I'm assuming that you have only one single run of numbers in your strings. If you have for example "abc12345def678" you will get "12345678", which may not be what you want.
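
To make the caveat concrete, here is a small sketch of the same digit filter wrapped in a helper. The names `DigitFilter` and `DigitsOnly` are made up for this example.

```csharp
using System;
using System.Linq;

public static class DigitFilter
{
    // Keeps every digit character of s, in order, and drops everything else.
    public static string DigitsOnly(string s)
    {
        return new string(s.Where(c => char.IsDigit(c)).ToArray());
    }
}
```

With this, `DigitFilter.DigitsOnly("PurchaseOrderPO1000000109.pdf")` gives "1000000109", but a name with a second run of digits, such as "abc12345def678", gives "12345678", so this approach is only safe when the PO number is the only run of digits in the file name.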

Upvotes: 2

Don

Reputation: 9661

Try this

String data = 
  Regex.Match("PurchaseOrderPO1000000109.pdf", @"PO\d{10}", 
    RegexOptions.IgnoreCase).Value;

You could also add a Regex.IsMatch check with the same arguments, of course :)

Upvotes: 2

DerKlops

Reputation: 1249

A possible regexp could be:

^.*(\d{10})\.\D{3}$
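
As a rough sketch of how this pattern could be used from C#: the ten digits land in capture group 1, so the PO prefix has to be added back by hand. The names `CaptureExample` and `PoFromCapture` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureExample
{
    // Returns "PO" plus the ten digits captured just before the extension,
    // or an empty string when the file name does not fit the pattern.
    public static string PoFromCapture(string fileName)
    {
        Match m = Regex.Match(fileName, @"^.*(\d{10})\.\D{3}$");
        return m.Success ? "PO" + m.Groups[1].Value : string.Empty;
    }
}
```

Because of `\.\D{3}$`, this only matches names whose extension is exactly three non-digit characters (.doc, .pdf, .txt); something like .docx would not match.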

Upvotes: 0

Oded

Reputation: 499352

This RegEx will pick up every run of digits in a string: \d+ (use + rather than *, since \d* also matches the empty string).

As described here.

Upvotes: 0
