Muhammad Umar

Reputation: 11782

Get an array of Strings matching a pattern from a String

I have a long string, for example:

I like this #computer and I want to buy it from #XXXMall.

I know the regular expression pattern is

Pattern tagMatcher = Pattern.compile("[#]+[A-Za-z0-9-_]+\\b");

Now I want to get all the hashtags in an array. How can I use this expression to get a list of all hashtags from the string, with something like:

ArrayList hashtags = getArray(pattern, str)

Upvotes: 1

Views: 8642

Answers (5)

Nikhil Kumar K

Reputation: 1117

You can use the following code to extract the names:

    String saa = "#{akka}nikhil#{kumar}aaaaa";
    Pattern regex = Pattern.compile("#\\{(.*?)\\}");
    Matcher m = regex.matcher(saa);
    while(m.find()) {
        String s = m.group(1); 
        System.out.println(s);
    }

It will print

akka
kumar

Upvotes: 0

Sujith PS

Reputation: 4864

You can use:

String val="I like this #computer and I want to buy it from #XXXMall.";
String REGEX = "(?<=#)[A-Za-z0-9-_]+";
List<String> list = new ArrayList<String>();
Pattern pattern = Pattern.compile(REGEX);
Matcher matcher = pattern.matcher(val);
while(matcher.find()){
    list.add(matcher.group());
}

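(?<=#) is a positive lookbehind: it asserts that the match is immediately preceded by a literal #, without including the # in the match. The list therefore contains computer and XXXMall rather than #computer and #XXXMall.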
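To make the effect of the lookbehind concrete, here is a minimal sketch that runs the same input through the lookbehind pattern and through a plain `#` prefix. The class name `LookbehindDemo` and the helper `findAll` are hypothetical names used only for this illustration, and the hyphen is moved to the end of the character class, which matches the same characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookbehindDemo {
    // Hypothetical helper: collects every match of the regex found in the text.
    public static List<String> findAll(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String val = "I like this #computer and I want to buy it from #XXXMall.";
        // The lookbehind requires a preceding '#', but does not consume it.
        System.out.println(findAll("(?<=#)[A-Za-z0-9_-]+", val)); // [computer, XXXMall]
        // Without the lookbehind, the '#' is part of the match.
        System.out.println(findAll("#[A-Za-z0-9_-]+", val));      // [#computer, #XXXMall]
    }
}
```

Which form to prefer depends on whether the caller wants the `#` kept in each result.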

Upvotes: 0

Justin

Reputation: 25327

Here is one way, using Matcher:

Pattern tagMatcher = Pattern.compile("#+[-\\w]+\\b");
Matcher m = tagMatcher.matcher(stringToMatch);

ArrayList<String> hashtags = new ArrayList<>();

while (m.find()) {
    hashtags.add(m.group());
}

I took the liberty of simplifying your regex. # does not need to be in a character class, and [A-Za-z0-9_] is the same as \w (with Java's default flags), so [A-Za-z0-9-_] is the same as [-\w].
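As a quick sanity check of that simplification, the sketch below runs the question's pattern and the simplified one over the same text and compares the results. The class name `RegexEquivalenceDemo`, the helper `findAll`, and the extra tags in the sample text are assumptions added for illustration; this only tests one input and is not a proof of equivalence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexEquivalenceDemo {
    // Pattern from the question and the simplified form from this answer.
    public static final Pattern ORIGINAL = Pattern.compile("[#]+[A-Za-z0-9-_]+\\b");
    public static final Pattern SIMPLIFIED = Pattern.compile("#+[-\\w]+\\b");

    // Hypothetical helper: collects every match of the pattern found in the text.
    public static List<String> findAll(Pattern p, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample text extended with a hyphenated tag and a double-hash tag.
        String text = "I like this #computer from #XXXMall. Also #foo-bar_1 and ##double";
        List<String> a = findAll(ORIGINAL, text);
        List<String> b = findAll(SIMPLIFIED, text);
        System.out.println(a);
        System.out.println(a.equals(b)); // prints true for this input
    }
}
```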

Upvotes: 0

sanbhat

Reputation: 17622

You could write it like this:

private static List<String> getArray(Pattern tagMatcher, String str) {
    Matcher m = tagMatcher.matcher(str);
    List<String> l = new ArrayList<String>();
    while(m.find()) {
        String s = m.group(); //will give you "#computer"
        s = s.substring(1); // will give you just "computer"
        l.add(s);
    }
    return l;
}

You can also use \\w instead of A-Za-z0-9_, keeping the hyphen in the class, which makes the regex [#]+[\\w-]+\\b
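One thing to watch: because the pattern allows several `#` characters in a row, `substring(1)` removes only the first. A possible variation, not part of the original answer, is to put the tag name in a capturing group and read `group(1)`. The class name `HashtagExtractor`, the constant `TAG`, and the `##sale` sample are assumptions for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HashtagExtractor {
    // Group 1 captures the tag name without any leading '#' characters.
    public static final Pattern TAG = Pattern.compile("#+([\\w-]+)\\b");

    // Same shape as getArray above, but reads the capturing group
    // instead of trimming the match with substring(1).
    public static List<String> getArray(Pattern tagMatcher, String str) {
        Matcher m = tagMatcher.matcher(str);
        List<String> tags = new ArrayList<>();
        while (m.find()) {
            tags.add(m.group(1));
        }
        return tags;
    }

    public static void main(String[] args) {
        String str = "I like this #computer and I want to buy it from #XXXMall. Now ##sale";
        System.out.println(getArray(TAG, str)); // [computer, XXXMall, sale]
    }
}
```

The pattern passed in must define a group 1, so this version is less general than the original method, which works with any pattern.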

Upvotes: 2

Chintan Soni

Reputation: 25267

This link would surely be helpful for achieving what you want.

It says:

The find() method searches for occurrences of the regular expressions in the text passed to the Pattern.matcher(text) method, when the Matcher was created. If multiple matches can be found in the text, the find() method will find the first, and then for each subsequent call to find() it will move to the next match.

The methods start() and end() will give the indexes into the text where the found match starts and ends.

Example:

String text    =
        "This is the text which is to be searched " +
        "for occurrences of the word 'is'.";

String patternString = "is";

Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(text);

int count = 0;
while(matcher.find()) {
    count++;
    System.out.println("found: " + count + " : "
            + matcher.start() + " - " + matcher.end());
}

That should give you the idea: call find() in a loop and collect each match.
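Applied to the hashtag question, the same find() loop might look like the sketch below, which records each tag together with its start() and end() indexes. The class name `HashtagPositions`, the helper `locate`, and the `tag @ start-end` output format are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HashtagPositions {
    // Hypothetical helper: describes each hashtag with its start index
    // (inclusive) and end index (exclusive) in the text.
    public static List<String> locate(String text) {
        Matcher m = Pattern.compile("#[\\w-]+").matcher(text);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group() + " @ " + m.start() + "-" + m.end());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "I like this #computer and I want to buy it from #XXXMall.";
        for (String entry : locate(text)) {
            System.out.println(entry);
        }
        // Prints:
        // #computer @ 12-21
        // #XXXMall @ 48-56
    }
}
```

Note that end() is exclusive, so `text.substring(m.start(), m.end())` returns the same string as `m.group()`.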

Upvotes: 0
