Reputation: 59
My string:
BByTTheWay
I want to split the string as B By T The Way BByTTheWay.
That is, I want to split the string at every capital letter and then append the original string unchanged at the end. This is what I have tried so far in Java:
public String breakWord(String fileAsString) throws FileNotFoundException, IOException {
String allWord = "";
String allmethod = "";
String[] splitString = fileAsString.split(" ");
for (int i = 0; i < splitString.length; i++) {
String k = splitString[i].replaceAll("([A-Z])(?![A-Z])", " $1").trim();
allWord = k.concat(" " + splitString[i]);
allWord = Arrays.stream(allWord.split("\\s+")).distinct().collect(Collectors.joining(" "));
allmethod = allmethod + " " + allWord;
// System.out.print(allmethod);
}
return allmethod;
}
It gives me the output: B ByT The Way BByTTheWay
I hope the Stack Overflow community can help me solve this.
Upvotes: 2
Views: 116
Reputation: 6460
As per the requirements, you can do it this way, checking whether each character is an uppercase letter:
char[] chars = fileAsString.toCharArray();
StringBuilder fragment = new StringBuilder();
for (char ch : chars) {
if (Character.isLetter(ch) && Character.isUpperCase(ch)) { // Unicode-aware check, not limited to A-Z
fragment.append(" ");
}
fragment.append(ch);
}
String result = fragment.toString().trim().concat(" " + fileAsString); // B By T The Way BByTTheWay
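As a minimal sketch, the loop above can be wrapped in a method and applied to each whitespace-separated token, which is how the question's breakWord processes its input. The class name CapitalSplitter and the method breakWords are hypothetical, chosen only for this illustration.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class CapitalSplitter {
    // Hypothetical helper: puts a space before every uppercase character,
    // then appends the original token unchanged.
    static String breakWord(String word) {
        StringBuilder fragment = new StringBuilder();
        for (char ch : word.toCharArray()) {
            if (Character.isUpperCase(ch)) {
                fragment.append(' ');
            }
            fragment.append(ch);
        }
        return (fragment.toString().trim() + " " + word).trim();
    }

    // Hypothetical helper: applies breakWord to each whitespace-separated token.
    static String breakWords(String text) {
        return Arrays.stream(text.trim().split("\\s+"))
                .map(CapitalSplitter::breakWord)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(breakWord("BByTTheWay"));         // B By T The Way BByTTheWay
        System.out.println(breakWords("BByTTheWay FooBar")); // B By T The Way BByTTheWay Foo Bar FooBar
    }
}
```

A token with a single capital, such as Way, comes out as Way Way in this sketch; the distinct() step in the question's code would remove that duplicate.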
Upvotes: 1
Reputation: 18621
Use
String s = "BByTTheWay";
Pattern p = Pattern.compile("[A-Z][a-z]*");
Matcher m = p.matcher(s);
String r = "";
while (m.find()) {
r = r + m.group(0) + " ";
}
System.out.println(r + s);
See Java proof.
Results: B By T The Way BByTTheWay
EXPLANATION
--------------------------------------------------------------------------------
  [A-Z]     any character of: 'A' to 'Z'
--------------------------------------------------------------------------------
  [a-z]*    any character of: 'a' to 'z' (0 or more times (matching the most amount possible))
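Note that [A-Z] and [a-z] cover ASCII letters only. The following sketch, with a hypothetical class AsciiVsUnicode and helper findAll, compares that pattern with the Unicode property classes \p{Lu} and \p{Ll} on an input that starts with an accented capital (written as an escape so the source file encoding does not matter).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AsciiVsUnicode {
    // Hypothetical helper: collects every match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> parts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            parts.add(m.group());
        }
        return parts;
    }

    public static void main(String[] args) {
        String s = "\u00C9coleDeParis"; // E with acute accent, then "coleDeParis"

        // ASCII classes skip the accented first word entirely.
        System.out.println(findAll("[A-Z][a-z]*", s));     // [De, Paris]

        // Unicode classes yield three words, including the accented one.
        System.out.println(findAll("\\p{Lu}\\p{Ll}*", s).size()); // 3
    }
}
```

For plain ASCII input such as BByTTheWay, both patterns produce the same five pieces.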
Upvotes: 1
Reputation: 785521
You may use this code (Matcher.results() requires Java 9 or later):
Code 1
String s = "BByTTheWay";
Pattern p = Pattern.compile("\\p{Lu}\\p{Ll}*");
String out = p.matcher(s)
.results()
.map(MatchResult::group)
.collect(Collectors.joining(" "))
+ " " + s;
//=> "B By T The Way BByTTheWay"
The regex \\p{Lu}\\p{Ll}* matches any Unicode uppercase letter followed by 0 or more lowercase letters.
Or use String.split with a lookahead that splits before every uppercase letter, and join the pieces back later:
Code 2
String out = Arrays.stream(s.split("(?=\\p{Lu})"))
.collect(Collectors.joining(" ")) + " " + s;
//=> "B By T The Way BByTTheWay"
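The two variants agree on BByTTheWay but can differ on other inputs: the match-based version keeps only the text the regex matches, while the split-based version keeps every character. A small sketch of the difference, with hypothetical names FindVsSplit, viaFind and viaSplit (Java 9 or later is assumed because of Matcher.results()):

```java
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class FindVsSplit {
    // Code 1 style: collect every "uppercase letter + lowercase letters" match.
    static String viaFind(String s) {
        return Pattern.compile("\\p{Lu}\\p{Ll}*").matcher(s)
                .results()
                .map(MatchResult::group)
                .collect(Collectors.joining(" "));
    }

    // Code 2 style: split before each uppercase letter, keeping all characters.
    static String viaSplit(String s) {
        return String.join(" ", s.split("(?=\\p{Lu})"));
    }

    public static void main(String[] args) {
        System.out.println(viaFind("BByTTheWay"));     // B By T The Way
        System.out.println(viaSplit("BByTTheWay"));    // B By T The Way

        // A leading lowercase run is dropped by the match-based version.
        System.out.println(viaFind("camelCaseWord"));  // Case Word
        System.out.println(viaSplit("camelCaseWord")); // camel Case Word

        // Digits are neither \p{Lu} nor \p{Ll}, so only the split keeps them.
        System.out.println(viaFind("Way2Go"));         // Way Go
        System.out.println(viaSplit("Way2Go"));        // Way2 Go
    }
}
```

Which behaviour is preferable depends on whether the input can contain anything other than capitalized words.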
Upvotes: 4