Reputation: 687
I'm trying to clean up text fetched from Wikipedia via the API. I want to remove strings such as 'Template:sfn', but I'm having trouble doing that with a regular expression:
The text is something like: ...a private boarding school, after his parents discovered that he had made frequent trips into Manhattan without their permission.Template:sfn
With the help of: https://regex101.com I found out that my regexp needs to be something like: \Template:.*\s
If I copy and paste that into Android Studio:
plainStr = plainStr.replaceAll("\\Template:.*\\s", "");
It tells me the two backslashes are not correct (illegal/unsupported escape sequence).
How do I rewrite my expression so that Android Studio accepts it?
On top of that, I want to remove whatever name follows the colon. Wikipedia sometimes has Template:Nowrap, Template:Main, etc. That's the output I get when I use the Bliki library (ConvertWikiToHtml), and I want to remove all of these combinations. To make it more complex, the name can also be more than one word, like Template:dead link or Template:cite press release, but I don't think this can be handled with a regex.
Kind regards,
Mike
Upvotes: 0
Views: 815
Reputation: 3191
In my case it was an Android Studio issue. At least I can't explain it any other way, because the code below compiled and ran fine in Eclipse:
Pattern p = Pattern.compile(".*\\R|.+\\z");
However, after I copied and pasted it into Android Studio, \\R was underlined with the error message "illegal/unsupported escape sequence". Even so, the code compiled and ran without any problems.
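To check whether a warning like this is only an editor inspection, it can help to run the pattern in a tiny standalone program. The sketch below assumes a Java 8 or later JVM, where \R is a documented line-break matcher; the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class LineBreakDemo {
    // Same pattern as above: a line ending in a line break, or a final line without one.
    private static final Pattern LINE = Pattern.compile(".*\\R|.+\\z");

    // Counts how many lines the pattern finds in the text.
    static int countLines(String text) {
        Matcher m = LINE.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Mixed \n and \r\n line endings, last line without a terminator.
        System.out.println(countLines("a\nb\r\nc")); // prints 3
    }
}
```

If this compiles and runs, the pattern itself is fine for that runtime and the underline comes from the IDE's regex inspection.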
Upvotes: 1
Reputation: 98398
The backslash before the T is the problem: \T is not a defined escape in Java's regex syntax, and no backslash is needed to match a literal T.
Just remove it:
plainStr = plainStr.replaceAll("Template:.*\\s", "");
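Note that .*\\s needs whitespace after the template name, so a Template:sfn at the very end of the text is left alone, and the greedy .* can eat the rest of the line. Here is a rough sketch of a narrower pattern that also covers the two multi-word names mentioned in the question. The class name is made up, and the list of multi-word names is an assumption that would have to be extended by hand, since a regex cannot know where an arbitrary multi-word name ends.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Bliki or any other library.
class TemplateCleaner {
    // Known multi-word names come first so they win over the single-word fallback.
    // The two names listed are the examples from the question (assumed list).
    private static final Pattern TEMPLATE = Pattern.compile(
            "Template:(?:cite press release|dead link|\\w+)");

    // Removes every "Template:<name>" marker from the text.
    static String clean(String text) {
        return TEMPLATE.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(clean("without their permission.Template:sfn"));
        // prints: without their permission.
        System.out.println(clean("a source.Template:dead link Next sentence."));
        // prints: a source. Next sentence.
    }
}
```

A multi-word name that is not in the list is only partly removed (for example, only the first word after the colon), so this is a starting point and not a complete solution.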
Upvotes: 1