user656710

Reputation: 21

Extracting two strings from quotations in Java using regex?

I'm new to using regex patterns and have searched everywhere online for an explanation of this problem.

Say I have a string: String info = "Data I need to extract is 'here' and 'also here'";

How would I extract the words:

here
also here

without the single quotes using a pattern?

This is what I have so far...

Pattern p = Pattern.compile("(?<=\').*(?=\')");

But it returns ( here' and 'also here ); the brackets are only there for readability. It skips over the closing quote of the first value and matches all the way to the last quote...

Thank you!

EDIT:

Thank you for your replies everyone! How would it be possible to alter the pattern so that here is stored in matcher.group(1) and also here is stored in matcher.group(2)? I need these values for different reasons, and splitting them from 1 group seems inefficient...

Upvotes: 1

Views: 808

Answers (4)

limc

Reputation: 40176

This should work for you:

    Pattern p = Pattern.compile("'([\\w\\s]+)'");
    String info = "Data I need to extract is 'here' and 'also here'";
    Matcher m = p.matcher(info);
    while (m.find()) {
        System.out.println(m.group(1));
    }

Here's the printout:

here
also here

If you want the data in two separate groups, you could do something like this:

    // Anchored with ^ and $, so the pattern matches the whole string at most once
    Pattern p = Pattern.compile("^[\\w\\s]*?'([\\w\\s]+)'[\\w\\s]*?'([\\w\\s]+)'$");
    String info = "Data I need to extract is 'here' and 'also here'";
    Matcher m = p.matcher(info);
    if (m.find()) {
        System.out.println("Group 1: " + m.group(1));
        System.out.println("Group 2: " + m.group(2));
    }

Here's the printout:

Group 1: here
Group 2: also here

Upvotes: 1

codaddict

Reputation: 455282

Try making your regex non-greedy:

Pattern p = Pattern.compile("(?<=')(.*?)(?=')");

EDIT:

The pattern above does not work. It gives the following matches:

here
 and 
also here

This is because the lookahead and lookbehind do not consume the ', so the closing quote of one value also serves as the opening quote of the next match.

To fix this use the regex:

Pattern p = Pattern.compile("'(.*?)'");

or, even better (and faster):

Pattern p = Pattern.compile("'([^']*)'");
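
To make the difference concrete, here is a minimal sketch that runs the lookaround pattern and the quote-consuming pattern against the string from the question. The class name LookaroundDemo and the helper findAll are made up for this illustration; only the two patterns come from the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // Hypothetical helper: returns group(1) of every match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group(1));
        }
        return matches;
    }

    public static void main(String[] args) {
        String info = "Data I need to extract is 'here' and 'also here'";

        // Lookarounds leave the quotes unconsumed, so the text between
        // the two quoted values is matched as well.
        System.out.println(findAll("(?<=')(.*?)(?=')", info));
        // prints [here,  and , also here]

        // Consuming both quotes pairs them up correctly.
        System.out.println(findAll("'([^']*)'", info));
        // prints [here, also here]
    }
}
```

The negated character class [^'] can never run past a quote, so the engine has nothing to backtrack over; that is the usual reason it is preferred to the lazy .*? form.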

Upvotes: 3

Johan Sjöberg

Reputation: 49197

I think you're making it too complicated; try

Pattern.compile("'([^']+)'");

or

Pattern.compile("'(.*?)'");

They will both work. Then you can extract the result from the first group matcher.group(1) after performing a matcher.find().
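
For the follow-up in the question's EDIT (keeping the two values apart), one possible approach is to call matcher.find() repeatedly and collect each group(1) into a list, then read the values by index. This is only a sketch: the class name QuotedValues and the method extract are hypothetical, and the code assumes the input contains at least two quoted values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedValues {
    // One or more non-quote characters between a pair of single quotes.
    private static final Pattern QUOTED = Pattern.compile("'([^']+)'");

    // Hypothetical helper: returns the quoted values in order of appearance.
    static List<String> extract(String input) {
        List<String> values = new ArrayList<>();
        Matcher matcher = QUOTED.matcher(input);
        while (matcher.find()) {
            values.add(matcher.group(1)); // text between the quotes
        }
        return values;
    }

    public static void main(String[] args) {
        String info = "Data I need to extract is 'here' and 'also here'";
        List<String> values = extract(info);

        // Assumes at least two quoted values are present.
        String first = values.get(0);
        String second = values.get(1);
        System.out.println(first);  // prints here
        System.out.println(second); // prints also here
    }
}
```

Note that the two patterns differ on empty quotes: '([^']+)' skips '' because + needs at least one character, while '(.*?)' matches it with an empty group.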

Upvotes: 1

CAFxX

Reputation: 30331

Why not simply use the following?

'.*?'
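
One thing to keep in mind with this pattern is that it has no capturing group, so matcher.group() returns the whole match, quotes included. A rough sketch of stripping them afterwards follows; the class name WholeMatchDemo and the method extractUnquoted are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeMatchDemo {
    // Hypothetical helper: matches '...' and trims the surrounding quotes by hand.
    static List<String> extractUnquoted(String input) {
        List<String> values = new ArrayList<>();
        Matcher m = Pattern.compile("'.*?'").matcher(input);
        while (m.find()) {
            String withQuotes = m.group(); // whole match, e.g. 'here'
            // Drop the first and last character (the two quotes).
            values.add(withQuotes.substring(1, withQuotes.length() - 1));
        }
        return values;
    }

    public static void main(String[] args) {
        String info = "Data I need to extract is 'here' and 'also here'";
        System.out.println(extractUnquoted(info)); // prints [here, also here]
    }
}
```

Adding a capturing group, as in '(.*?)', avoids the substring step.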

Upvotes: 0
