Sanzida

Reputation: 59

Split a string at every capital letter

My string is BByTTheWay. I want to split it into B By T The Way BByTTheWay. That is, I want to split the string at every capital letter and then append the original string unchanged at the end. This is what I have tried so far in Java:

public String breakWord(String fileAsString) throws FileNotFoundException, IOException {

    String allWord = "";
    String allmethod = "";
    String[] splitString = fileAsString.split(" ");
    for (int i = 0; i < splitString.length; i++) {
        String k = splitString[i].replaceAll("([A-Z])(?![A-Z])", " $1").trim();
        allWord = k.concat(" " + splitString[i]);
        allWord = Arrays.stream(allWord.split("\\s+")).distinct().collect(Collectors.joining(" "));
        allmethod = allmethod + " " + allWord;
        //  System.out.print(allmethod);
    }
    return allmethod;

}

It gives me the output: B ByT The Way BByTTheWay. I hope the Stack Overflow community can help me solve this.

Upvotes: 2

Views: 116

Answers (3)

invzbl3

Reputation: 6460

As per the requirements, you can do it this way, checking whether each character is an uppercase letter:

char[] chars = fileAsString.toCharArray();
StringBuilder fragment = new StringBuilder();
for (char ch : chars) {
    if (Character.isLetter(ch) && Character.isUpperCase(ch)) { // Unicode-aware check, not limited to A-Z
        fragment.append(" ");
    }
    fragment.append(ch);
}
String result = (fragment + " " + fileAsString).trim(); // for "BByTTheWay": B By T The Way BByTTheWay

Upvotes: 1

Ryszard Czech

Reputation: 18621

Use a Matcher to collect each capital letter together with the lowercase letters that follow it:

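import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "BByTTheWay";
Pattern p = Pattern.compile("[A-Z][a-z]*"); // one capital plus the lowercase run after it
Matcher m = p.matcher(s);
String r = "";
while (m.find()) {
    r = r + m.group(0) + " "; // append each piece followed by a space
}
System.out.println(r + s); // pieces first, then the original string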

Results: B By T The Way BByTTheWay

EXPLANATION

--------------------------------------------------------------------------------
  [A-Z]                    any character of: 'A' to 'Z'
--------------------------------------------------------------------------------
  [a-z]*                   any character of: 'a' to 'z' (0 or more
                           times (matching the most amount possible))
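To see why the pattern in the question misses some capitals, a small side-by-side comparison may help. This is only a sketch: the class and method names (SplitComparison, withLookahead, withMatches) are made up for illustration. The negative lookahead (?![A-Z]) in the question's replaceAll refuses to insert a space before a capital that is directly followed by another capital, which is why the first B and the first T stay glued to what follows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitComparison {
    // The replaceAll call from the question: the negative lookahead (?![A-Z])
    // skips any capital that is directly followed by another capital.
    static String withLookahead(String s) {
        return s.replaceAll("([A-Z])(?![A-Z])", " $1").trim();
    }

    // Collecting matches of [A-Z][a-z]* treats every capital as the start of a piece.
    static String withMatches(String s) {
        List<String> pieces = new ArrayList<>();
        Matcher m = Pattern.compile("[A-Z][a-z]*").matcher(s);
        while (m.find()) {
            pieces.add(m.group());
        }
        return String.join(" ", pieces);
    }

    public static void main(String[] args) {
        String s = "BByTTheWay";
        System.out.println(withLookahead(s)); // B ByT The Way
        System.out.println(withMatches(s));   // B By T The Way
    }
}
```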

Upvotes: 1

anubhava

Reputation: 785521

You may use this code:

Code 1

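import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

String s = "BByTTheWay";
Pattern p = Pattern.compile("\\p{Lu}\\p{Ll}*");

// Matcher.results() requires Java 9 or later
String out = p.matcher(s)
     .results()
     .map(MatchResult::group)
     .collect(Collectors.joining(" "))
     + " " + s;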

//=> "B By T The Way BByTTheWay"

The regex \\p{Lu}\\p{Ll}* matches any Unicode uppercase letter followed by zero or more lowercase letters.
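The difference from an ASCII-only class such as [A-Z][a-z]* only shows up on input with non-ASCII letters. As a rough illustration (the class name UnicodeSplitDemo, the helper pieces, and the sample string are invented for this sketch), here is the same matching loop run with both patterns. It uses Matcher.find() rather than results() so that it also runs on Java 8.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodeSplitDemo {
    // Returns every match of the given regex in s, in order.
    static List<String> pieces(String regex, String s) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(s);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // "ÉcoleÀParis", written with escapes so the source file encoding does not matter
        String s = "\u00C9cole\u00C0Paris";
        // ASCII-only classes skip the accented capitals entirely
        System.out.println(pieces("[A-Z][a-z]*", s));     // [Paris]
        // Unicode categories pick up all three pieces (École, À, Paris)
        System.out.println(pieces("\\p{Lu}\\p{Ll}*", s));
    }
}
```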


Or use String.split with a lookahead for an uppercase letter, (?=\\p{Lu}), and join the pieces back together afterwards:

Code 2

String out = Arrays.stream(s.split("(?=\\p{Lu})"))
    .collect(Collectors.joining(" ")) + " " + s;
//=> "B By T The Way BByTTheWay"
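The method in the question takes text that may contain several space-separated words, so the split approach could be wrapped along the following lines. This is a sketch under assumptions, not a drop-in replacement: the class name CamelSplitter and the method breakWords are hypothetical, and it assumes that a word containing no split point should be emitted once rather than twice (the question's code got a similar effect with distinct()).

```java
import java.util.ArrayList;
import java.util.List;

class CamelSplitter {
    // Hypothetical helper: splits each whitespace-separated token before every
    // uppercase letter and appends the original token after its pieces.
    static String breakWords(String input) {
        List<String> out = new ArrayList<>();
        for (String token : input.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // happens only when the input is blank
            }
            String[] parts = token.split("(?=\\p{Lu})");
            for (String part : parts) {
                out.add(part);
            }
            // Assumption: repeat the original token only if it was actually split
            if (parts.length > 1) {
                out.add(token);
            }
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        System.out.println(breakWords("BByTTheWay"));
        // B By T The Way BByTTheWay
        System.out.println(breakWords("BByTTheWay fooBar"));
        // B By T The Way BByTTheWay foo Bar fooBar
    }
}
```

Unlike the Matcher-based variants, split keeps a leading lowercase run such as foo in fooBar, which may or may not be what you want.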

Upvotes: 4
