Reputation: 1068
I am trying to extract a URL from a string, but I am unable to exclude the surrounding double quotes from the output.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
class Main {
public static void main(String[] args) {
String s1 = "<a id=\"BUTTON_LINK\" style=\"%%BUTTON_LINK%%\" target=\"_blank\" href=\"https://||domainName||/basketReviewPageLoadAction.do\">%%CHECKOUT%%</a>";
//System.out.println(s1);
Pattern pattern = Pattern.compile("\\s*(?i)href\\s*=\\s*(\"([^\"]*\")|'[^']*'|([^'\">\\s]+))");
Matcher matcher = pattern.matcher(s1);
if(matcher.find()){
String url = matcher.group(1);
System.out.println(url);
}
}
}
My Output is:
"https://||domainName||/basketReviewPageLoadAction.do"
Expected Output is:
https://||domainName||/basketReviewPageLoadAction.do
I cannot use a string replace to strip the quotes. I have to add a few GET parameters to this URL and then put it back into the original string.
Upvotes: 0
Views: 306
Reputation: 3405
Regex: (?<=href=")([^\"]*)
Substitution: $1?params...
Details:
(?<=) Positive lookbehind
() Capturing group
[^] Match a single character not present in the list
* Matches between zero and unlimited times
$1 Group 1
Java code:
By using the replaceAll function, you can add your params (?abc=12) to the end of the capturing group $1, which in this case is the href value.
String text = "<a id=\"BUTTON_LINK\" style=\"%%BUTTON_LINK%%\" target=\"_blank\" href=\"https://||domainName||/basketReviewPageLoadAction.do\">%%CHECKOUT%%</a>";
text = text.replaceAll("(?<=href=\")([^\"]*)", String.format("$1%s", "?abc=12"));
System.out.print(text);
Output:
<a id="BUTTON_LINK" style="%%BUTTON_LINK%%" target="_blank" href="https://||domainName||/basketReviewPageLoadAction.do?abc=12">%%CHECKOUT%%</a>
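One caveat with building the replacement string this way: replaceAll treats `$` and `\` in the replacement as special characters, so parameters containing them would be misread. A minimal sketch of a safer variant, assuming the same lookbehind regex as above; the class name AppendParams, the method name appendParams and the sample parameter are hypothetical and only for illustration:

```java
import java.util.regex.Matcher;

public class AppendParams {
    // Appends `params` to the value of every double-quoted href="..." attribute.
    static String appendParams(String html, String params) {
        // "$1" stays a group reference; quoteReplacement makes any '$' or '\'
        // inside params literal instead of being parsed by replaceAll.
        return html.replaceAll("(?<=href=\")([^\"]*)",
                "$1" + Matcher.quoteReplacement(params));
    }

    public static void main(String[] args) {
        String text = "<a target=\"_blank\" href=\"https://||domainName||/basketReviewPageLoadAction.do\">%%CHECKOUT%%</a>";
        System.out.println(appendParams(text, "?price=$5"));
    }
}
```

Like the original regex, this sketch only handles `href="` written exactly that way (lowercase, no spaces around `=`).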
Upvotes: 1
Reputation: 1068
This solution worked for now.
Pattern pattern = Pattern.compile("\\s*(?i)href\\s*=\\s*\"([^\"]*)");
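Since the quotes now sit outside the capturing group, group(1) is the bare URL, and Matcher.end(1) gives the position where extra parameters can be inserted back into the original string. A rough sketch of both steps, assuming a double-quoted href with no existing query string; the class name HrefParams, the method names and the parameter abc=12 are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HrefParams {
    // Same idea as the pattern above: the quotes are matched outside group 1.
    private static final Pattern HREF =
            Pattern.compile("(?i)href\\s*=\\s*\"([^\"]*)\"");

    // Returns the first href value without quotes, or null if none is found.
    static String extractHref(String html) {
        Matcher m = HREF.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    // Inserts `suffix` directly after the first double-quoted href value.
    static String appendToHref(String html, String suffix) {
        Matcher m = HREF.matcher(html);
        if (!m.find()) {
            return html; // no href found, leave the string unchanged
        }
        // end(1) is the index just past the last character of group 1,
        // i.e. the position of the closing quote.
        return new StringBuilder(html).insert(m.end(1), suffix).toString();
    }

    public static void main(String[] args) {
        String s1 = "<a target=\"_blank\" href=\"https://||domainName||/basketReviewPageLoadAction.do\">%%CHECKOUT%%</a>";
        System.out.println(extractHref(s1));
        System.out.println(appendToHref(s1, "?abc=12"));
    }
}
```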
Upvotes: 0
Reputation: 1589
It's ugly, but it seems to work. Hope this helps.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;
class Main {
public static void main(String[] args) {
String s1 = "<a id=\"BUTTON_LINK\" style=\"%%BUTTON_LINK%%\" target=\"_blank\" href= \"https://||domainName||/basketReviewPageLoadAction.do\">%%CHECKOUT%%</a>";
//System.out.println(s1);
Pattern pattern = Pattern.compile("\\s*(?i)href\\s*=\\s*(\"([^\"]*)\"|'([^']*)'|([^'\">\\s]+))");
Matcher matcher = pattern.matcher(s1);
if (matcher.find()) {
String url = Stream.of(matcher.group(2), matcher.group(3),
matcher.group(4)).filter(s -> s != null).collect(Collectors.joining());
System.out.print(url);
}
}
}
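The stream in the code above simply picks whichever of the three alternatives (double-quoted, single-quoted, unquoted) actually matched. The same selection could also be written as a plain loop over the groups. A sketch under that assumption, with a non-capturing outer group so the value groups are numbered 1 to 3; the class name HrefValue and the method name hrefValue are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HrefValue {
    // Group 1: double-quoted value, group 2: single-quoted, group 3: unquoted.
    private static final Pattern HREF = Pattern.compile(
            "(?i)href\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^'\">\\s]+))");

    // Returns the first href value without quotes, or null if no href is found.
    static String hrefValue(String html) {
        Matcher m = HREF.matcher(html);
        if (!m.find()) {
            return null;
        }
        // Only the group belonging to the alternative that matched is non-null.
        for (int i = 1; i <= m.groupCount(); i++) {
            if (m.group(i) != null) {
                return m.group(i);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String s1 = "<a target=\"_blank\" href= \"https://||domainName||/basketReviewPageLoadAction.do\">%%CHECKOUT%%</a>";
        System.out.println(hrefValue(s1));
    }
}
```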
Upvotes: 0
Reputation: 6910
You can try one of these options:
System.out.println(url.replaceAll("^\"|\"$", ""));
System.out.println(url.substring(1, url.length()-1));
Upvotes: 1