Reputation: 693
I know I can use this:
public String RemoveTag(String html){
    html = html.replaceAll("\\<.*?>","");
    html = html.replaceAll(" ","");
    html = html.replaceAll("&","");
    return html;
}
This removes all tags from an HTML string. My question is how the wildcard part between the angle brackets, <.*?>, actually works. Could someone give me a more detailed explanation of how wildcards are matched in a String?
The main reason I ask is that my text still contains runs of characters that start with "@" and end with "}", and I want to get rid of everything between the "@" and the "}".
Upvotes: 1
Views: 2374
Reputation: 178431
Regular expressions can be implemented by building a finite automaton, since every regular expression has an equivalent deterministic finite automaton and vice versa.
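To make the automaton idea a bit more concrete, here is a rough hand-written sketch of a two-state scanner that removes every run from "@" to the next "}". The class name BraceScanner and the method strip are made up for this illustration, and it assumes single-line input (the regex dot does not cross line breaks by default, which this sketch ignores). It is not how java.util.regex works internally, only the same idea in miniature.

```java
public class BraceScanner {
    // Two states: OUTSIDE copies characters, INSIDE buffers them after an '@'.
    private enum State { OUTSIDE, INSIDE }

    // Hypothetical helper: drops every run that starts at '@' and ends at the next '}'.
    // An '@' that is never closed by a '}' is kept unchanged.
    public static String strip(String input) {
        StringBuilder out = new StringBuilder();
        StringBuilder pending = new StringBuilder(); // text seen since the opening '@'
        State state = State.OUTSIDE;
        for (char c : input.toCharArray()) {
            if (state == State.OUTSIDE) {
                if (c == '@') {
                    state = State.INSIDE;
                    pending.setLength(0);
                    pending.append(c);
                } else {
                    out.append(c);
                }
            } else {
                pending.append(c);
                if (c == '}') {
                    // Run is complete: discard the buffered text and resume copying.
                    state = State.OUTSIDE;
                    pending.setLength(0);
                }
            }
        }
        out.append(pending); // unterminated "@..." is emitted as-is
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("Start @some text} finish."));
    }
}
```

For single-line input this should give the same result as replacing the matches of the regex with an empty string.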
The regex you are looking for is @.*?} . If you want to keep the two delimiter characters, replace each match with "@}" instead of with "". It will be something like s.replaceAll("@.*?}", "@}"), where s is your String. Remember to use the returned value, since replaceAll does not modify s itself.
Strictly speaking the brace should be escaped, giving the regex @.*?\} , though the special } char should be treated as a literal by the pattern recognizer when it fails to see a preceding { . Inside a Java string literal the backslash itself has to be doubled, so to be on the safe side: "@.*?\\}" should work either way, as @WayneBaylor posted.
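A small sketch of the two replacement choices, using the escaped form of the pattern. The class and method names (BraceEscapeDemo, removeAll, keepDelimiters) are invented for this example, and the sample string is an assumption, not the asker's data.

```java
public class BraceEscapeDemo {
    // Removes each "@...}" run, delimiters included.
    public static String removeAll(String s) {
        return s.replaceAll("@.*?\\}", "");
    }

    // Empties each "@...}" run but keeps the '@' and '}' delimiters.
    public static String keepDelimiters(String s) {
        return s.replaceAll("@.*?\\}", "@}");
    }

    public static void main(String[] args) {
        String s = "a @x} b @y} c";
        System.out.println(removeAll(s));      // a  b  c
        System.out.println(keepDelimiters(s)); // a @} b @} c
        // Compares the unescaped pattern against the escaped one; on a standard
        // JVM both are expected to behave the same way.
        System.out.println(s.replaceAll("@.*?}", "").equals(removeAll(s)));
    }
}
```

Both methods return a new String; the argument is left unchanged.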
You might want to read more on regular expressions.
Upvotes: 2
Reputation: 169
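The first parameter to replaceAll(...) is a regex string. The .*? in your example is the part that matches anything: the . matches any single character (except a line terminator, by default), the * repeats it zero or more times, and the ? makes the repetition reluctant, so it stops at the first point where the rest of the pattern can match. So, if you want a regular expression that will get rid of everything between "@" and "}", you would use something like: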
String exampleText = "Start @some text} finish.";
// Strings are immutable, so the result of replaceAll must be assigned.
exampleText = exampleText.replaceAll("@(.*?)\\}", "@}");
System.out.println(exampleText); // prints "Start @} finish."
Notice the same pattern: .*? . The parentheses, which are optional here, are just used for grouping. Also notice that the } is escaped with backslashes, since it can have special meaning within regular expressions.
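To see why the ? after .* matters, here is a short comparison of the reluctant form with the greedy one on a string that contains two "@...}" runs. The class name LazyVsGreedy and the sample text are only assumptions for the demonstration.

```java
public class LazyVsGreedy {
    // Reluctant: each match ends at the first '}' after the '@'.
    public static String lazy(String s) {
        return s.replaceAll("@.*?\\}", "@}");
    }

    // Greedy: the match runs from the first '@' to the last '}' on the line.
    public static String greedy(String s) {
        return s.replaceAll("@.*\\}", "@}");
    }

    public static void main(String[] args) {
        String s = "x @a} y @b} z";
        System.out.println(lazy(s));   // x @} y @} z
        System.out.println(greedy(s)); // x @} z
    }
}
```

The greedy version swallows the text between the two runs as well, which is usually not what you want here.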
For more info on Java's regex support see the Pattern class.
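If you ever need the text between the markers rather than just removing it, the capturing group can be read through Pattern and Matcher. This is only a sketch; BetweenExtractor and extract are names made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BetweenExtractor {
    // Compiled once and reused; group 1 is the text between '@' and '}'.
    private static final Pattern BETWEEN = Pattern.compile("@(.*?)\\}");

    // Hypothetical helper: collects the contents of every "@...}" run, in order.
    public static List<String> extract(String s) {
        List<String> found = new ArrayList<>();
        Matcher m = BETWEEN.matcher(s);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(extract("Start @some text} and @more} finish."));
        // prints [some text, more]
    }
}
```

This is the one place where the parentheses are not optional: without them there is no group 1 to read.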
Upvotes: 2