Bob

Reputation: 33

Extract words between double quotes based on position

I have a single string that contains several quoted values, e.g.:

"Bruce Wayne" "43" "male" "Gotham"

I want to create a method using regex that extracts certain values from the String based on their position.

For example, if I pass the int values 1 and 3, it should return the String: "Bruce Wayne" "male"

Please note that the double quotes are part of the String and are escaped characters (\").

Upvotes: 0

Views: 1183

Answers (3)

Quinn

Reputation: 4504

The function to extract words based on position:

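import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// i and j are the 1-based positions of the quoted values to return.
public String getString(String input, int i, int j) {
    List<String> list = new ArrayList<>();
    // Each match is one quoted value, including its surrounding quotes.
    Matcher m = Pattern.compile("(\"[^\"]+\")").matcher(input);
    while (m.find()) {
        list.add(m.group(1));
    }
    // Join with a space so the result has the format asked for in the question.
    return list.get(i - 1) + " " + list.get(j - 1);
}

Then the words can be extracted like this:

String input = "\"Bruce Wayne\" \"43\" \"male\" \"Gotham\"";
String res = getString(input, 1, 3);
System.out.println(res);

Output:

"Bruce Wayne" "male"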

Upvotes: 0

arc

Reputation: 4701

My function uses zero-based positions. You asked for 1 and 3, but indices usually start at 0 when working with arrays, so to get "Bruce Wayne" you would ask for 0, not 1 (you can change that if you like).

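import java.util.regex.Matcher;
import java.util.regex.Pattern;

// positions are zero-based and must be given in ascending order;
// a position that is never reached leaves a null entry in the result.
String[] getParts(String text, int... positions) {
    String[] results = new String[positions.length];

    Matcher m = Pattern.compile("\"[^\"]*\"").matcher(text);

    // i counts the matches found so far, j is the next requested position.
    for (int i = 0, j = 0; m.find() && j < positions.length; i++) {
        if (i != positions[j]) continue;
        results[j] = m.group();
        j++;
    }

    return results;
}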

// Usage
public Test() {

     String[] parts = getParts(" \"Bruce Wayne\" \"43\" \"male\" \"Gotham\" ", 0, 2);
     System.out.println(Arrays.toString(parts));
     // = ["Bruce Wayne", "male"]

}

The method accepts as many parameters as you like.

getParts(" \"a\" \"b\" \"c\" \"d\" ", 0, 2, 3); // = ["a", "c", "d"]
// or
getParts(" \"a\" \"b\" \"c\" \"d\" ", 3); // = ["d"]

Upvotes: 0

Thomas

Reputation: 88757

If the number of (possible) groups is known, you could use a regular expression like "(.*?)"\s*"(.*?)"\s*"(.*?)"\s*"(.*?)" along with Pattern and Matcher and access the groups by number (group 0 is always the entire match, group 1 is the first capturing group in the expression, and so on).
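A minimal sketch of that fixed-group approach, assuming the input always holds exactly four quoted fields. The class name FixedGroupsExample and the method field are made up for illustration; only Pattern and Matcher come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FixedGroupsExample {
    // Assumption: exactly four quoted fields, separated by optional whitespace.
    private static final Pattern FOUR_FIELDS =
            Pattern.compile("\"(.*?)\"\\s*\"(.*?)\"\\s*\"(.*?)\"\\s*\"(.*?)\"");

    // Returns the field at the given 1-based position without its quotes,
    // or null if the input does not contain four quoted fields.
    // Position 0 would return the entire match (group 0).
    static String field(String input, int position) {
        Matcher m = FOUR_FIELDS.matcher(input);
        return m.find() ? m.group(position) : null;
    }

    public static void main(String[] args) {
        String input = "\"Bruce Wayne\" \"43\" \"male\" \"Gotham\"";
        System.out.println(field(input, 1)); // Bruce Wayne
        System.out.println(field(input, 3)); // male
    }
}
```

Because the capturing groups sit inside the quotes, the returned values do not include them; add the quotes back if you need them in the result.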

If the number of groups is not known, you could just use the expression "(.*?)" and call Matcher#find() to apply the expression in a loop, collecting all the matches into a list (group 0 in that case, which includes the quotes; group 1 is the text without them). Then use your indices to access the list elements (element 1 would be at index 0).
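The loop described above could look roughly like this. It is a sketch, not a definitive implementation: the names FindLoopExample, quotedFields and pick are hypothetical, and positions are assumed to be 1-based as in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindLoopExample {
    private static final Pattern QUOTED = Pattern.compile("\"(.*?)\"");

    // Collects every quoted field, quotes included (group 0 of each match).
    static List<String> quotedFields(String input) {
        List<String> fields = new ArrayList<>();
        Matcher m = QUOTED.matcher(input);
        while (m.find()) {
            fields.add(m.group());
        }
        return fields;
    }

    // Joins the fields at the given 1-based positions with single spaces.
    // Throws IndexOutOfBoundsException if a position has no matching field.
    static String pick(String input, int... positions) {
        List<String> fields = quotedFields(input);
        StringBuilder sb = new StringBuilder();
        for (int p : positions) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(fields.get(p - 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "\"Bruce Wayne\" \"43\" \"male\" \"Gotham\"";
        System.out.println(pick(input, 1, 3)); // "Bruce Wayne" "male"
    }
}
```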

Another alternative would be to use string.replaceAll("^[^\"]*\"|\"[^\"]*$","").split("\"\\s*\""), i.e. remove the leading and trailing double quotes with any text before or after and then split on quotes with optional whitespace in between.

Example:

  • assume the string is: optional crap before "Bruce Wayne" "43" "male" "Gotham" optional crap after
  • string.replaceAll("^[^\"]*\"|\"[^\"]*$","") will result in Bruce Wayne" "43" "male" "Gotham
  • applying split("\"\\s*\"") on the result of the step before will yield the array [Bruce Wayne, 43, male, Gotham]
  • then just access the array elements by index (zero-based)
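Put together, the steps above might look like the following sketch. The class name SplitExample and the method fields are illustrative only, and the code assumes the input contains at least one pair of double quotes.

```java
import java.util.Arrays;

class SplitExample {
    // Strips everything up to the first quote and from the last quote onwards,
    // then splits on a closing quote, optional whitespace, and an opening quote.
    // Assumption: the input contains at least one quoted field.
    static String[] fields(String input) {
        String trimmed = input.replaceAll("^[^\"]*\"|\"[^\"]*$", "");
        return trimmed.split("\"\\s*\"");
    }

    public static void main(String[] args) {
        String input = "optional crap before \"Bruce Wayne\" \"43\" \"male\" \"Gotham\" optional crap after";
        String[] parts = fields(input);
        System.out.println(Arrays.toString(parts)); // [Bruce Wayne, 43, male, Gotham]
        System.out.println(parts[0] + ", " + parts[2]); // Bruce Wayne, male
    }
}
```

Unlike the Matcher-based variants, this one drops the quotes from every element, so they have to be re-added if the result should keep them.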

Upvotes: 1
