John

Reputation: 35

Extracting String, number group, next string, next number group

I am a newbie to Regex and I am struggling to find a solution to my problem. I have a file with multiple entries. Here is an example:

1)Hello my is blah blah blah. Blah blah Building 5677 - Door 98 blah blah blah.

2)Hi, the name of my dog is blah blah Building 36767 & Door 898900 blah blah blah.

3)Hey now, blah blah Building 345 DR 898. Blah Blah blag Building 333 - Door 89797 blah.

I need to extract each instance of Building number and Door number from each line. The only pattern that is constant throughout each entry is:

1) the word "Building" is always present.

2) "Building" is always followed by a group of digits, then the letter "D" or "d", then a second group of digits (followed by a non-digit).

All I want is to pull out the Building number and Door number and print them to the console, but I am having trouble translating this into a regex pattern. I am using Java.

Upvotes: 1

Views: 377

Answers (2)

Mark Peters

Reputation: 81064

I think this should work:

Building.+?(\d+).+?[Dd].+?(\d+)

Your numbers will be in groups 1 and 2.

Building //start by matching "Building"
.+?      //then skip over the least number of characters that allows the match
(\d+)    //then read as many digits as possible and put them in group one
.+?      //then skip over the least number of characters that allows the match
[Dd]     //then match an upper- or lower-case 'D' 
.+?      //then skip over the least number of characters that allows the match
(\d+)    //then read as many digits as possible and put them in group two

So in Java:

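import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Backslashes are doubled because the regex is written as a Java string literal
Pattern pat = Pattern.compile("Building.+?(\\d+).+?[Dd].+?(\\d+)");
Matcher matcher = pat.matcher(
        "Hello my is blah blah blah. Blah blah Building 5677 - Door 98 blah blah blah. ");
if (matcher.find()) {
    System.out.println(matcher.group(1)); // Building number
    System.out.println(matcher.group(2)); // Door number
}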

Edit

To extract more than one set of numbers from one input, as in your third example, loop with

while (matcher.find()) {

instead of using if, which stops after the first match.
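For completeness, here is a minimal sketch of that loop applied to the third example line. The class name BuildingDoorExtractor, the method extract, and the "building/door" output format are only illustrative choices, not part of the original answer; the pattern is the same one as above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BuildingDoorExtractor {
    // Same pattern as above, compiled once and reused for every line.
    private static final Pattern PAT =
            Pattern.compile("Building.+?(\\d+).+?[Dd].+?(\\d+)");

    // Hypothetical helper: returns one "building/door" string per match in the line.
    static List<String> extract(String line) {
        List<String> pairs = new ArrayList<>();
        Matcher matcher = PAT.matcher(line);
        // find() resumes after the previous match, so this visits every pair.
        while (matcher.find()) {
            pairs.add(matcher.group(1) + "/" + matcher.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String line = "Hey now, blah blah Building 345 DR 898. "
                + "Blah Blah blag Building 333 - Door 89797 blah.";
        for (String pair : extract(line)) {
            System.out.println(pair);
        }
    }
}
```

Each call to find() picks up where the previous match ended, so the second Building/Door pair on the line is found on the second pass through the loop.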

Upvotes: 2

Amol Katdare

Reputation: 6760

Regex to find Building numbers (written as a Java string literal, hence the doubled backslash) -

(?<=Building\\s)[0-9]+

Similarly for Door numbers -

(?<=Door\\s)[0-9]+

To put it together -

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BuildingDoorDemo {
    public static void main(String[] args) {
        String inputStr = "Hello my is blah blah blah. Blah blah Building 5677 - Door 98 blah blah blah";

        // Lookbehind: match digits only when preceded by the keyword and one whitespace character
        Pattern patternBuilding = Pattern.compile("(?<=Building\\s)[0-9]+");
        Pattern patternDoor = Pattern.compile("(?<=Door\\s)[0-9]+");
        Matcher matcherBuilding = patternBuilding.matcher(inputStr);
        Matcher matcherDoor = patternDoor.matcher(inputStr);
        if (matcherBuilding.find()) {
            System.out.println("Building number is " + matcherBuilding.group());
        }
        if (matcherDoor.find()) {
            System.out.println("Door number is " + matcherDoor.group());
        }
    }
}
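If a line can contain several entries, the same lookbehind idea can be run in a loop. The following is only a sketch: the class LookbehindExtractor and the helper numbersAfter are hypothetical names, and it assumes the keyword is followed by exactly one whitespace character and then the digits. Under that assumption, an abbreviation such as "DR 898" in the third example is not picked up by the "Door" pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookbehindExtractor {
    // Hypothetical helper: collects every run of digits that directly
    // follows the given keyword plus one whitespace character.
    static List<String> numbersAfter(String keyword, String input) {
        // Pattern.quote keeps the keyword literal inside the lookbehind.
        Pattern pattern =
                Pattern.compile("(?<=" + Pattern.quote(keyword) + "\\s)[0-9]+");
        Matcher matcher = pattern.matcher(input);
        List<String> numbers = new ArrayList<>();
        while (matcher.find()) {
            numbers.add(matcher.group());
        }
        return numbers;
    }

    public static void main(String[] args) {
        String inputStr = "Hey now, blah blah Building 345 DR 898. "
                + "Blah Blah blag Building 333 - Door 89797 blah.";
        System.out.println("Building numbers: " + numbersAfter("Building", inputStr));
        System.out.println("Door numbers: " + numbersAfter("Door", inputStr));
    }
}
```

Because the two patterns are matched independently, this variant does not pair a Building number with its Door number; it only lists each kind.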

Upvotes: 0
