Reputation: 23
I would like to remove character sequences like "htsap://" or "ftsap://" from a String. Is it possible?
Let me illustrate my needs with an example.
Actual input String:
"Every Web page has a http unique address called a URL (Uniform Resource Locator) which identifies where it is located on the Web. For "ftsap://"example, the URL for CSM Library's home page is: "htsap://"www.smccd.edu/accounts/csmlibrary/index.htm The basic parts of a URL often provide \"clues\" to htsap://where a web page originates and who might be responsible for the information at that page or site."
Expected resulting String:
"Every Web page has a http unique address called a URL (Uniform Resource Locator) which identifies where it is located on the Web. For example, the URL for CSM Library's home page is: www.smccd.edu/accounts/csmlibrary/index.htm The basic parts of a URL often provide \"clues\" to where a web page originates and who might be responsible for the information at that page or site."
Patterns I tried (I am not sure this is the right approach):
((.*?)(?=("htsap://|ftsap://")))
and:
((.*?)(?=("htsap://|ftsap://")))(.*)
Could anyone suggest a solution?
Upvotes: 2
Views: 82
Reputation: 9606
Since you're escaping the quotes within your sample Strings, I'll assume you're working in Java.
You should try:
final String res = input.replaceAll("\"?\\w+://\"?", "");
Here is a link to a working example showing exactly what this regex matches.
How it works:
It matches and removes any sequence of alphanumeric characters (and underscores) followed by :// and optionally preceded and/or followed by a double quote (").
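As a minimal sketch of the explanation above, the snippet below applies the same pattern to a shortened version of the question's input. The class name SchemeStripDemo, the helper stripSchemes, and the sample sentence are illustrative assumptions, not part of the original answer. Note that \w+ is not limited to "htsap" and "ftsap", so a real scheme such as http:// would be removed as well.

```java
class SchemeStripDemo {
    // Hypothetical helper: wraps the replaceAll call shown in the answer.
    static String stripSchemes(String input) {
        // "?    optional leading double quote
        // \w+   one or more letters, digits or underscores (the fake scheme)
        // ://   the literal separator
        // "?    optional trailing double quote
        return input.replaceAll("\"?\\w+://\"?", "");
    }

    public static void main(String[] args) {
        // Shortened, made-up sample based on the question's input.
        String sample = "For \"ftsap://\"example, see \"htsap://\"www.smccd.edu and htsap://where";
        // Prints: For example, see www.smccd.edu and where
        System.out.println(stripSchemes(sample));
    }
}
```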
EDIT: How to achieve the same result using a Matcher?
final String input = "Every Web page has a http unique address called a URL (Uniform Resource Locator) which identifies where it is located on the Web. For \"ftsap://\"example, the URL for CSM Library's home page is: \"htsap://\"www.smccd.edu/accounts/csmlibrary/index.htm The basic parts of a URL often provide \"clues\" to htsap://where a web page originates and who might be responsible for the information at that page or site.";
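// Requires java.util.regex.Pattern and java.util.regex.Matcher.
final Pattern p = Pattern.compile("\"?\\w+://\"?");
final StringBuilder b = new StringBuilder(input);
Matcher m;
// Re-scan the current contents after each removal until no match is left.
while ((m = p.matcher(b.toString())).find()) {
    b.delete(m.start(), m.end());
}
System.out.println(b.toString());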
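The loop above builds a new Matcher and a new String for every removal. A possible single-pass alternative, sketched here with the standard appendReplacement/appendTail methods of java.util.regex.Matcher, scans the input only once. The class name SinglePassStrip and the method strip are hypothetical names chosen for illustration. Unlike the rescanning loop, this version would not remove a match that only comes into existence after an earlier removal joins two pieces of text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SinglePassStrip {
    // Same pattern as in the answer, compiled once and reused.
    private static final Pattern SCHEME = Pattern.compile("\"?\\w+://\"?");

    // Hypothetical helper: removes every match in one left-to-right scan.
    static String strip(String input) {
        Matcher m = SCHEME.matcher(input);
        // StringBuffer is used because the StringBuilder overloads
        // of appendReplacement/appendTail only exist since Java 9.
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Copies the text before the match, then appends "" in place of the match.
            m.appendReplacement(out, "");
        }
        // Copies whatever follows the last match.
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up sample based on the question's input.
        System.out.println(strip("to htsap://where and \"ftsap://\"example"));
    }
}
```

For a plain removal like this, SCHEME.matcher(input).replaceAll("") gives the same result in one call; the explicit loop is mainly useful when each match needs its own replacement logic.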
Upvotes: 1
Reputation: 59282
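Use this regex:
"?(ftsap|htsap)://"?
and replace each match with an empty string ('').
Regex explained:
"?(ftsap|htsap)://"? with flag g. The quotes are optional ("?) so that the unquoted htsap://where is matched as well, and the colon is written literally because . would match any character.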
Upvotes: 0
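For the approach of naming the two prefixes explicitly, a Java sketch might look like the following. It assumes the surrounding quotes should be optional and the colon literal, so that the unquoted occurrence in the question is covered too; the class name TwoSchemeStrip, the method strip, and the sample sentence are made up for illustration. Java's replaceAll already replaces every match, which corresponds to the g flag in other regex flavours.

```java
import java.util.regex.Pattern;

class TwoSchemeStrip {
    // Only the two prefixes from the question are targeted;
    // (?:...) groups the alternatives without capturing them.
    private static final Pattern TWO_SCHEMES =
            Pattern.compile("\"?(?:ftsap|htsap)://\"?");

    // Hypothetical helper: removes every occurrence of either prefix.
    static String strip(String input) {
        return TWO_SCHEMES.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // Made-up sample: the genuine http:// scheme is left untouched.
        String sample = "Visit http://example.org or \"htsap://\"www.smccd.edu and ftsap://files";
        // Prints: Visit http://example.org or www.smccd.edu and files
        System.out.println(strip(sample));
    }
}
```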