Filipe R.

Reputation: 43

Parse a String of key=value to a Map

I'm using an API that returns XML, and I need to build a map from one tag whose content is actually a string. Example:

Having

Billable=7200,Overtime=false,TransportCosts=20$

I need

["Billable"="7200","Overtime"="false","TransportCosts"="20$"]

The problem is that the string is completely dynamic, so it can also look like

Overtime=true,TransportCosts=one, two, three
Overtime=true,TransportCosts=1= 1,two, three,Billable=7200

So I cannot just split by comma and then by equals sign. Is it possible to convert strings like these to a map using a regex?

My code so far is:

private Map<String, String> getAttributes(String attributes) {
    final Map<String, String> attr = new HashMap<>();
    if (attributes.contains(",")) {
        final String[] pairs = attributes.split(",");
        for (String s : pairs) {
            if (s.contains("=")) {
                final String pair = s;
                final String[] keyValue = pair.split("=");
                attr.put(keyValue[0], keyValue[1]);
            }
        }
        return attr;
    }
    return attr;
}
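
To illustrate, here is a minimal sketch of what splitting by comma and then by equals sign produces for the second string. The class and method names are only for this example, and the comma check from my method is left out:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NaiveSplitDemo {
    // Same idea as getAttributes above: split on ',' and then on '='.
    static Map<String, String> naiveSplit(String attributes) {
        Map<String, String> attr = new LinkedHashMap<>();
        for (String pair : attributes.split(",")) {
            if (pair.contains("=")) {
                String[] keyValue = pair.split("=");
                // Only the first two parts are used; anything after a second '=' is lost.
                attr.put(keyValue[0], keyValue[1]);
            }
        }
        return attr;
    }

    public static void main(String[] args) {
        // prints {Overtime=true, TransportCosts=1, Billable=7200}
        System.out.println(naiveSplit("Overtime=true,TransportCosts=1= 1,two, three,Billable=7200"));
    }
}
```

The value of TransportCosts is cut down to 1, and the pieces "two" and " three" are dropped because they contain no equals sign.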

Thank you in advance

Upvotes: 4

Views: 2937

Answers (4)

Yan Khonski

Reputation: 13083

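I have seen this done with Guava:

import com.google.common.base.Splitter;
// Assumption: StringUtils comes from Apache Commons Lang 3
import org.apache.commons.lang3.StringUtils;

import java.util.Collections;
import java.util.Map;

/**
 * Parses a string such as 'prop1=val1; prop2=val2' into a map.
 */
public static Map<String, String> parseMap(final String keyValueString) {
    if (StringUtils.isEmpty(keyValueString)) {
        return Collections.emptyMap();
    }

    // Split into entries on ';', trim each entry, then split each entry on '='.
    return Splitter.on(";")
            .trimResults()
            .withKeyValueSeparator('=')
            .split(keyValueString);
}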

One note: IntelliJ IDEA shows a warning because the Splitter API used here is annotated with com.google.common.annotations.Beta. That is not a problem in itself, but it may require some rework when the Guava library version is updated.
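
If adding Guava is not an option, the same idea can be sketched with the JDK alone. This is only an approximation under my own assumptions: the class name is made up, the key is taken to end at the first '=' of each entry, and Guava's exact validation rules are not reproduced.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PlainSplit {
    // Splits "prop1=val1; prop2=val2" on ';', trims each entry,
    // then cuts each entry at its first '='.
    static Map<String, String> parseMap(String keyValueString) {
        Map<String, String> result = new LinkedHashMap<>();
        if (keyValueString == null || keyValueString.isEmpty()) {
            return result;
        }
        for (String entry : keyValueString.split(";")) {
            String trimmed = entry.trim();
            int eq = trimmed.indexOf('=');
            if (eq < 0) {
                // Assumption: an entry without '=' is treated as an error.
                throw new IllegalArgumentException("Not a key=value entry: " + trimmed);
            }
            result.put(trimmed.substring(0, eq), trimmed.substring(eq + 1));
        }
        return result;
    }
}
```

Like the Guava version, this relies on a fixed entry separator, so it only helps when the separator cannot appear inside a value.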

Upvotes: 0

walen

Reputation: 7273

Alternative, IMHO simpler regex: ([^,]+=[^=]+)(,|$)

([^,]+=[^=]+) → Groups of: anything but a comma, followed by 1 equals sign, followed by anything but an equals sign...
(,|$) → ... separated by either a comma or end-of-line

Tests:

public static void main(String[] args) {
    Pattern pattern = Pattern.compile("([^,]+=[^=]+)(,|$)");

    String test1 = "abc=def,jkl,nm=ghi,egrh=jh=22,kdfka,92,kjasd=908@0982";
    System.out.println("Test 1: "+test1);
    Matcher matcher = pattern.matcher(test1);
    while (matcher.find()) {
        System.out.println(matcher.group(1));
    }
    System.out.println();
    String test2 = "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200";
    System.out.println("Test 2: "+test2);
    matcher = pattern.matcher(test2);
    while (matcher.find()) {
        System.out.println(matcher.group(1));
    }
}

Output:

Test 1: abc=def,jkl,nm=ghi,egrh=jh=22,kdfka,92,kjasd=908@0982
abc=def,jkl
nm=ghi
egrh=jh=22,kdfka,92
kjasd=908@0982

Test 2: Overtime=true,TransportCosts=1= 1,two, three,Billable=7200
Overtime=true
TransportCosts=1= 1,two, three
Billable=7200
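
The tests above only print each matched pair. To get a map, each pair still has to be cut into key and value. A minimal sketch, assuming the key ends at the first equals sign of the pair (the class and method names are made up for this example):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PairsToMap {
    private static final Pattern PAIR = Pattern.compile("([^,]+=[^=]+)(,|$)");

    static Map<String, String> toMap(String input) {
        Map<String, String> map = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(input);
        while (matcher.find()) {
            String pair = matcher.group(1);
            // Assumption: everything before the first '=' is the key,
            // everything after it (including further '=' signs) is the value.
            int eq = pair.indexOf('=');
            map.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return map;
    }
}
```

For Test 2 this gives the keys Overtime, TransportCosts and Billable, with "1= 1,two, three" as the value of TransportCosts.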

Upvotes: 0

Wiktor Stribiżew

Reputation: 626758

You may use

(\w+)=(.*?)(?=,\w+=|$)

See the regex demo.

Details

  • (\w+) - Group 1: one or more word chars
  • = - an equal sign
  • (.*?) - Group 2: any zero or more chars other than line break chars, as few as possible
  • (?=,\w+=|$) - a positive lookahead that requires a ,, then 1+ word chars, and then =, or end of string immediately to the right of the current location.

Java code:

public static Map<String, String> getAttributes(String attributes) {
    Map<String, String> attr = new HashMap<>();
    Matcher m = Pattern.compile("(\\w+)=(.*?)(?=,\\w+=|$)").matcher(attributes);
    while (m.find()) {
        attr.put(m.group(1), m.group(2));
    }
    return attr;
}

Java test:

String s = "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200";
Map<String,String> map = getAttributes(s);
for (Map.Entry<String, String> entry : map.entrySet()) {
    System.out.println(entry.getKey() + "=" + entry.getValue());
}

Result:

Overtime=true
Billable=7200
TransportCosts=1= 1,two, three
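
Note that the result is not in input order because HashMap does not keep insertion order. If the order matters, a variant with LinkedHashMap and a precompiled pattern could look like this sketch (the class name is hypothetical, the regex is unchanged):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OrderedAttributes {
    // Compiled once and reused across calls.
    private static final Pattern ATTRIBUTE =
            Pattern.compile("(\\w+)=(.*?)(?=,\\w+=|$)");

    static Map<String, String> getAttributes(String attributes) {
        // LinkedHashMap iterates in insertion order, i.e. the order of the input string.
        Map<String, String> attr = new LinkedHashMap<>();
        Matcher m = ATTRIBUTE.matcher(attributes);
        while (m.find()) {
            attr.put(m.group(1), m.group(2));
        }
        return attr;
    }

    public static void main(String[] args) {
        // prints {Overtime=true, TransportCosts=1= 1,two, three, Billable=7200}
        System.out.println(getAttributes("Overtime=true,TransportCosts=1= 1,two, three,Billable=7200"));
    }
}
```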

Upvotes: 3

Shar1er80

Reputation: 9041

The first thing I noticed is that no delimiter is easily identifiable in the data you've given; what does appear to hold is that a comma followed by a capital letter separates each field.

This allows changing the delimiter to something that is easily identifiable with a regex, using String.replaceAll("(?<=,)([A-Z])", ",$1"). Now you have a delimiter you can identify (,,), so you can split the data and insert the quotes where needed.

Something like:

public class StackOverflow {
    public static void main(String[] args) {
        String [] data = {
                "Overtime=true,TransportCosts=one, two, three",
                "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200"
        };

        for (int i = 0; i < data.length; i++) {
            data[i] = data[i].replaceAll("(?<=,)([A-Z])", ",$1");
            String[] pieces = data[i].split(",,");
            for (int j = 0; j < pieces.length; j++) {
                int equalIndex = pieces[j].indexOf("=");
                StringBuilder sb = new StringBuilder(pieces[j]);
                // Insert quotes around the = sign
                sb.insert(equalIndex, "\"");
                sb.insert(equalIndex + 2, "\"");
                // Insert quotes at the beginning and end of the string
                sb.insert(0, "\"");
                sb.append("\"");
                pieces[j] = sb.toString();              
            }

            // Join the pieces back together delimited by a comma
            data[i] = String.join(",", pieces);
            System.out.println(data[i]);
        }
    }
}

Results

"Overtime"="true","TransportCosts"="one, two, three"
"Overtime"="true","TransportCosts"="1= 1,two, three","Billable"="7200"
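
If an actual Map is wanted rather than a quoted string, the same observation can be used more directly. A minimal sketch, assuming every key starts with a capital letter and no value contains a comma directly followed by a capital letter (the class and method names are made up):

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CapitalDelimiter {
    static Map<String, String> parse(String data) {
        Map<String, String> map = new LinkedHashMap<>();
        // Split only on commas that are directly followed by a capital letter;
        // the lookahead keeps the letter itself in the next piece.
        for (String piece : data.split(",(?=[A-Z])")) {
            // Limit 2: cut at the first '=' only, so later '=' signs stay in the value.
            String[] keyValue = piece.split("=", 2);
            if (keyValue.length == 2) {
                map.put(keyValue[0], keyValue[1]);
            }
        }
        return map;
    }
}
```

The lookahead replaces the replaceAll step, since the split no longer needs a doubled comma as a marker.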

Upvotes: 1
