Reputation: 43
I'm using an API that gives me an XML document, and I need to build a map from one tag whose content is actually a string. Example:
Having
Billable=7200,Overtime=false,TransportCosts=20$
I need
["Billable"="7200","Overtime"="false","TransportCosts"="20$"]
The problem is that the string is totally dynamic, so it can also look like
Overtime=true,TransportCosts=one, two, three
Overtime=true,TransportCosts=1= 1,two, three,Billable=7200
So I cannot just split by comma and then by equals sign. Is it possible to convert a string like those to a map using a regex?
My code so far is:
private Map<String, String> getAttributes(String attributes) {
    final Map<String, String> attr = new HashMap<>();
    if (attributes.contains(",")) {
        final String[] pairs = attributes.split(",");
        for (String s : pairs) {
            if (s.contains("=")) {
                final String pair = s;
                final String[] keyValue = pair.split("=");
                attr.put(keyValue[0], keyValue[1]);
            }
        }
        return attr;
    }
    return attr;
}
Thank you in advance
Upvotes: 4
Views: 2937
Reputation: 13083
I saw this code using Guava
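import com.google.common.base.Splitter;
import java.util.Collections;
import java.util.Map;
// StringUtils is not part of Guava; it is assumed to come from a utility
// library such as Apache Commons Lang.

/**
 * parse string 'prop1=val1; prop2=val2' to map
 */
public static Map<String, String> parseMap(final String keyValueString) {
    if (StringUtils.isEmpty(keyValueString)) return Collections.emptyMap();
    return Splitter.on(";")
            .trimResults()
            .withKeyValueSeparator('=')
            .split(keyValueString);
}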
One note: IntelliJ IDEA shows a warning because this Splitter API is annotated with com.google.common.annotations.Beta. That is not a problem in itself, but it may require some rework when you update the Guava version.
Upvotes: 0
Reputation: 7273
Alternative, IMHO simpler regex: ([^,]+=[^=]+)(,|$)
([^,]+=[^=]+) → Groups of: anything but a comma, followed by 1 equals sign, followed by anything but an equals sign...
(,|$) → ... separated by either a comma or end-of-line
Tests:
public static void main(String[] args) {
    Pattern pattern = Pattern.compile("([^,]+=[^=]+)(,|$)");
    String test1 = "abc=def,jkl,nm=ghi,egrh=jh=22,kdfka,92,kjasd=908@0982";
    System.out.println("Test 1: "+test1);
    Matcher matcher = pattern.matcher(test1);
    while (matcher.find()) {
        System.out.println(matcher.group(1));
    }
    System.out.println();
    String test2 = "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200";
    System.out.println("Test 2: "+test2);
    matcher = pattern.matcher(test2);
    while (matcher.find()) {
        System.out.println(matcher.group(1));
    }
}
Output:
Test 1: abc=def,jkl,nm=ghi,egrh=jh=22,kdfka,92,kjasd=908@0982
abc=def,jkl
nm=ghi
egrh=jh=22,kdfka,92
kjasd=908@0982
Test 2: Overtime=true,TransportCosts=1= 1,two, three,Billable=7200
Overtime=true
TransportCosts=1= 1,two, three
Billable=7200
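The tests above only print the matched chunks. As a minimal sketch of how they could be turned into the map the question asks for, each chunk can be cut at its first equals sign. The class name PairRegexToMap and the helper toMap are hypothetical names for this illustration, and LinkedHashMap is chosen only to keep the input order.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PairRegexToMap {
    // Same pattern as in the answer above.
    private static final Pattern PAIR = Pattern.compile("([^,]+=[^=]+)(,|$)");

    // Hypothetical helper: collects every matched "key=value" chunk into a map.
    static Map<String, String> toMap(String input) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(input);
        while (matcher.find()) {
            String chunk = matcher.group(1);
            // Cut at the FIRST '=' so later '=' characters stay in the value.
            int eq = chunk.indexOf('=');
            result.put(chunk.substring(0, eq), chunk.substring(eq + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(toMap("Overtime=true,TransportCosts=1= 1,two, three,Billable=7200"));
    }
}
```

Note that, as Test 1 shows, elements without an equals sign (such as jkl) end up appended to the previous value rather than being dropped.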
Upvotes: 0
Reputation: 626758
You may use
(\w+)=(.*?)(?=,\w+=|$)
See the regex demo.
Details
- (\w+) - Group 1: one or more word chars
- = - an equal sign
- (.*?) - Group 2: any zero or more chars other than line break chars, as few as possible
- (?=,\w+=|$) - a positive lookahead that requires a comma, then 1+ word chars, and then =, or the end of string, immediately to the right of the current location.
Java code:
public static Map<String, String> getAttributes(String attributes) {
    Map<String, String> attr = new HashMap<>();
    Matcher m = Pattern.compile("(\\w+)=(.*?)(?=,\\w+=|$)").matcher(attributes);
    while (m.find()) {
        attr.put(m.group(1), m.group(2));
    }
    return attr;
}

String s = "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200";
Map<String,String> map = getAttributes(s);
for (Map.Entry<String, String> entry : map.entrySet()) {
    System.out.println(entry.getKey() + "=" + entry.getValue());
}
Result:
Overtime=true
Billable=7200
TransportCosts=1= 1,two, three
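The result above comes out in HashMap order, not in the order of the input string. If the original order matters, a possible variant (a sketch, not part of the original answer) swaps in a LinkedHashMap and compiles the pattern once. The class name OrderedAttributes is a hypothetical name for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OrderedAttributes {
    // Same regex as above, compiled once instead of on every call.
    private static final Pattern ATTRIBUTE =
            Pattern.compile("(\\w+)=(.*?)(?=,\\w+=|$)");

    static Map<String, String> getAttributes(String attributes) {
        // LinkedHashMap iterates in insertion order, i.e. the order in the input.
        Map<String, String> attr = new LinkedHashMap<>();
        Matcher m = ATTRIBUTE.matcher(attributes);
        while (m.find()) {
            attr.put(m.group(1), m.group(2));
        }
        return attr;
    }

    public static void main(String[] args) {
        String s = "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200";
        for (Map.Entry<String, String> entry : getAttributes(s).entrySet()) {
            System.out.println(entry.getKey() + "=" + entry.getValue());
        }
    }
}
```

With this variant the entries should print as Overtime, TransportCosts, Billable, matching the input.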
Upvotes: 3
Reputation: 9041
The first thing I noticed is that the delimiter is not easily identifiable in the data you're giving, but what does appear to be identifiable is that a comma followed by a capital letter separates each field.
This allows for an approach that changes the delimiter to something that is easily identifiable with regex, using String.replaceAll("(?<=,)([A-Z])", ",$1"). Now you'll have a delimiter that you can identify (,,) and can split the data to insert the quotes where needed.
Something like:
public class StackOverflow {
    public static void main(String[] args) {
        String [] data = {
            "Overtime=true,TransportCosts=one, two, three",
            "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200"
        };
        for (int i = 0; i < data.length; i++) {
            data[i] = data[i].replaceAll("(?<=,)([A-Z])", ",$1");
            String[] pieces = data[i].split(",,");
            for (int j = 0; j < pieces.length; j++) {
                int equalIndex = pieces[j].indexOf("=");
                StringBuilder sb = new StringBuilder(pieces[j]);
                // Insert quotes around the = sign
                sb.insert(equalIndex, "\"");
                sb.insert(equalIndex + 2, "\"");
                // Insert quotes at the beginning and end of the string
                sb.insert(0, "\"");
                sb.append("\"");
                pieces[j] = sb.toString();
            }
            // Join the pieces back together delimited by a comma
            data[i] = String.join(",", pieces);
            System.out.println(data[i]);
        }
    }
}
Results
"Overtime"="true","TransportCosts"="one, two, three"
"Overtime"="true","TransportCosts"="1= 1,two, three","Billable"="7200"
Upvotes: 1
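The last answer produces a quoted string, while the question asks for a Map. A minimal sketch of the same comma-plus-capital-letter heuristic that fills a map directly might look like this. It assumes, as that answer does, that every key starts with a capital letter and directly follows a comma; the class name CapitalKeySplit and the helper toMap are hypothetical names for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CapitalKeySplit {
    // Hypothetical helper. Assumption: a comma immediately followed by a
    // capital letter always starts a new key.
    static Map<String, String> toMap(String input) {
        Map<String, String> result = new LinkedHashMap<>();
        // The lookahead keeps the capital letter in the next piece.
        for (String piece : input.split(",(?=[A-Z])")) {
            int eq = piece.indexOf('=');
            if (eq < 0) {
                continue; // no '=' in this piece, so there is no key/value pair
            }
            result.put(piece.substring(0, eq), piece.substring(eq + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(toMap("Overtime=true,TransportCosts=one, two, three"));
        System.out.println(toMap("Overtime=true,TransportCosts=1= 1,two, three,Billable=7200"));
    }
}
```

The heuristic breaks if a value itself contains a comma directly followed by a capital letter, so it only fits data where that cannot happen.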