Omnia

Reputation: 887

Matching two or three words after Different Arabic Regex Patterns in Java

Greetings all,

I am a beginner with regex. I want to extract the two or three Arabic words that follow a certain pattern.

For example, given the Arabic string

inputtext = "تكريم الدكتور احمد زويل والدكتورة سميرة موسي عن ابحاثهم العلمية "

I need to extract the names after

الدكتور

and

والدكتورة

so the output should be:

احمد زويل
سميرة موسي

What I have done so far is the following:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String inputtext = "تكريم الدكتور احمد زويل والدكتورة سميرة موسي عن ابحاثهم العلمية ";
// Lookbehind: start right after the title, then take everything that follows
Pattern pattern = Pattern.compile("(?<=الدكتور).*");
Matcher matcher = pattern.matcher(inputtext);
boolean found = false;
while (matcher.find()) {
    // Get the matching string
    String match = matcher.group();
    System.out.println("the match is: " + match);
    found = true;
}
if (!found) {
    System.out.println("I didn't find the text");
}

But it returns:

احمد زويل والدكتورة سميرة موسي عن ابحاثهم العلمية

I don't know how to add a second pattern, or how to stop the match after two words.

Could you please help me with any ideas?

Upvotes: 3

Views: 1563

Answers (1)

stema

Reputation: 93036

To match only the two words that follow, try this:

(?<=الدكتور)\s[^\s]+\s[^\s]+

.* matches everything up to the end of the string, so that is not what you want.

\s matches a whitespace character.

[^\s] is a negated character class that matches anything but whitespace.

So my solution matches a whitespace character, then at least one non-whitespace character (the first word), then another whitespace character, and once more at least one non-whitespace character (the second word). Note that the match itself begins with that leading whitespace, and that in a Java string literal each backslash must be doubled (\\s).
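As a rough sketch of how this pattern could be used from Java: the class name TwoWordsAfterTitle and the method extract are made up for illustration, and trimming the match is my own assumption about what the asker wants, since the regex itself keeps the whitespace that follows the title.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question.
final class TwoWordsAfterTitle {
    // Same regex as above; backslashes are doubled for the Java string literal.
    private static final Pattern AFTER_TITLE =
            Pattern.compile("(?<=الدكتور)\\s[^\\s]+\\s[^\\s]+");

    // Returns the two words that follow each occurrence of the title.
    static List<String> extract(String text) {
        List<String> names = new ArrayList<>();
        Matcher matcher = AFTER_TITLE.matcher(text);
        while (matcher.find()) {
            // The match starts with the whitespace after the title, so trim it.
            names.add(matcher.group().trim());
        }
        return names;
    }

    public static void main(String[] args) {
        String inputtext = "تكريم الدكتور احمد زويل والدكتورة سميرة موسي عن ابحاثهم العلمية ";
        for (String name : extract(inputtext)) {
            System.out.println("the match is: " + name);
        }
    }
}
```

With the sample sentence this should find only the first name. The feminine form does contain the masculine title, but the character right after it is a letter rather than whitespace, so the \s fails there.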

To match your second pattern, I would simply use a second regex (swap the word inside the lookbehind) and run it as a separate step. Each regular expression is easier to read that way.

Alternatively, you can combine both lookbehinds in a single alternation:

(?<=الدكتور)\s[^\s]+\s[^\s]+|(?<=والدكتورة)\s[^\s]+\s[^\s]+
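A minimal sketch of the combined pattern in Java, assuming the asker wants the names without the leading whitespace. The class name NamesAfterTitles and the method extract are hypothetical; only the regex comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question.
final class NamesAfterTitles {
    // Two alternatives, one lookbehind per title; backslashes doubled for Java.
    private static final Pattern AFTER_EITHER_TITLE = Pattern.compile(
            "(?<=الدكتور)\\s[^\\s]+\\s[^\\s]+|(?<=والدكتورة)\\s[^\\s]+\\s[^\\s]+");

    // Returns the two words that follow either title, in order of appearance.
    static List<String> extract(String text) {
        List<String> names = new ArrayList<>();
        Matcher matcher = AFTER_EITHER_TITLE.matcher(text);
        while (matcher.find()) {
            // Each match begins with the whitespace after the title.
            names.add(matcher.group().trim());
        }
        return names;
    }

    public static void main(String[] args) {
        String inputtext = "تكريم الدكتور احمد زويل والدكتورة سميرة موسي عن ابحاثهم العلمية ";
        for (String name : extract(inputtext)) {
            System.out.println("the match is: " + name);
        }
    }
}
```

Matcher.find() resumes after the end of the previous match, so one loop is enough to collect both names from the sample sentence.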

Upvotes: 2
