Yahya Uddin

Reputation: 28851

Java: Finding all words beginning with a letter

I am trying to get all the words that begin with a given letter from a long string. How would you do this in Java? I don't want to loop through every letter or do anything similarly inefficient.

EDIT: I also can't use any built-in data structures (except arrays, of course); it's for a CS class. I can, however, use my own data structures (I have created several).

Upvotes: 0

Views: 15485

Answers (7)

Stathis Andronikos

Reputation: 1259

You can use the split() method. Here is an example:

String string = "your string";
String[] parts = string.split(" C");

for (int i = 0; i < parts.length; i++) {
    String[] word = parts[i].split(" ");
    int start = 0;

    if (i > 0) {
        // The split consumed " C", so the first word of every later part began with "C".
        System.out.println("C" + word[0]);
        start = 1;
    }

    // Check the remaining words explicitly (for parts[0], that is every word).
    for (int j = start; j < word.length; j++) {
        if (word[j].startsWith("c") || word[j].startsWith("C")) {
            System.out.println(word[j]);
        }
    }
}

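Here "C" is your letter. Every part after the first began with "C" (the split consumed it), so it is added back when printing. The words in parts[0] have to be checked explicitly to see whether they start with "C". The loop must start at i = 0; starting it at i = 1 was my earlier mistake.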

Upvotes: 0

stema

Reputation: 92986

You need to be clear about a few things. What is a "word"? You want to find only "words" starting with a letter, so I assume that words can contain other characters too. But which characters are allowed? What defines the start of such a word? Whitespace, any non-letter, any non-letter/non-digit, ...?

e.g.:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String testInput = "test séntènce îwhere I'm want,to üfind 1words starting $with le11ers.";
String regex = "(?<=^|\\s)\\pL\\w*";

Pattern p = Pattern.compile(regex, Pattern.UNICODE_CHARACTER_CLASS);

Matcher matcher = p.matcher(testInput);
while (matcher.find()) {
    System.out.println(matcher.group());
}

The regex (?<=^|\s)\pL\w* finds sequences that start with a letter (\pL is the Unicode property for letters), followed by zero or more "word" characters (Unicode letters and digits, plus a few others such as the underscore, because of the Pattern.UNICODE_CHARACTER_CLASS flag).
The lookbehind assertion (?<=^|\s) ensures that the sequence is preceded by the start of the string or by whitespace.

So my code will print:

test
séntènce ==> contains non ASCII letters
îwhere   ==> starts with a non ASCII letter
I        ==> 'm is missing, because `'` is not in `\w`
want
üfind    ==> starts with a non ASCII letter
starting
le11ers  ==> contains digits

Missing words:

,to     ==> starting with a ","
1words  ==> starting with a digit
$with   ==> starting with a "$"
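
If only words beginning with one particular letter are wanted, the same lookbehind idea can be narrowed. This is a minimal sketch, not part of the code above: the class and method names (LetterWords, wordsStartingWith) are made up for illustration, it assumes the match should ignore case, and it collects results into a plain array by running the matcher twice. Whether Pattern and Matcher are allowed under the "no built-in data structures" rule is something to check with the instructor.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LetterWords {
    // Hypothetical helper: returns every whitespace-delimited word that starts with `letter`.
    static String[] wordsStartingWith(String text, char letter) {
        // Pattern.quote guards against `letter` being a regex metacharacter.
        String regex = "(?<=^|\\s)" + Pattern.quote(String.valueOf(letter)) + "\\w*";
        Pattern p = Pattern.compile(regex,
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE | Pattern.UNICODE_CHARACTER_CLASS);
        Matcher m = p.matcher(text);

        // First pass: count matches so the array can be sized exactly.
        int count = 0;
        while (m.find()) {
            count++;
        }

        // Second pass: rewind the matcher and copy the matches.
        String[] result = new String[count];
        m.reset();
        for (int i = 0; m.find(); i++) {
            result[i] = m.group();
        }
        return result;
    }

    public static void main(String[] args) {
        for (String w : wordsStartingWith("Test sentence to find the Terms", 't')) {
            System.out.println(w);
        }
    }
}
```

As with the original pattern, a word preceded by punctuation (such as ",to") is not matched, because the lookbehind only accepts the start of the string or whitespace.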

Upvotes: 2

Andrés Oviedo

Reputation: 1428

Regexp way:

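import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void main(String[] args) {
    String text = "my very long string to test";
    // Group 2 must start with an ASCII letter, so empty matches and
    // words starting with a digit are skipped.
    Matcher m = Pattern.compile("(^|\\W)([A-Za-z]\\w*)").matcher(text);
    while (m.find()) {
        System.out.println("Found: " + m.group(2));
    }
}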

Upvotes: 0

mig

Reputation: 152

import java.util.Scanner;

Scanner scan = new Scanner(text); // text is the string you are searching
char test = 'x'; // whatever letter you are looking for
while (scan.hasNext()) {
    String wordFound = scan.next();
    if (wordFound.charAt(0) == test) { // case-sensitive comparison
        // do something with wordFound
    }
}

This does what you are looking for; inside the if statement, do whatever you want with the word.

Upvotes: 0

Gaurav Gupta

Reputation: 4691

You can take the first character of each word and use the Character.isLetter() API method to check whether it is a letter.

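String input = "jkk ds 32";
// Split on runs of whitespace so repeated spaces do not produce empty words.
String[] array = input.trim().split("\\s+");
for (String word : array) {
    if (word.isEmpty()) {
        continue; // only happens when the input is empty or all whitespace
    }
    char c = word.charAt(0);
    if (Character.isLetter(c)) {
        System.out.println(word + "\t isLetter");
    } else {
        System.out.println(word + "\t not Letter");
    }
}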

Sample output:

jkk  isLetter
ds   isLetter
32   not Letter

Upvotes: 0

Levenal

Reputation: 3806

You could try splitting your String into an array of words and then iterating through it:

String s = "my very long string to test";

for(String st : s.split(" ")){
    if(st.startsWith("t")){
        System.out.println(st);
    }
}
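
If the matches need to be kept rather than printed, and only arrays are allowed, one option is to count first and fill second. This is a rough sketch under a few assumptions of mine: the class and method names (StartsWithFilter, filter) are invented, the comparison ignores case, and words are separated by runs of whitespace.

```java
class StartsWithFilter {
    // Hypothetical helper: returns the words of `text` that begin with `letter`, ignoring case.
    static String[] filter(String text, char letter) {
        // trim() plus "\\s+" avoids empty tokens from leading or repeated whitespace.
        String[] words = text.trim().split("\\s+");
        char wanted = Character.toLowerCase(letter);

        // First pass: count the matches so the result array can be sized exactly.
        int count = 0;
        for (String w : words) {
            if (!w.isEmpty() && Character.toLowerCase(w.charAt(0)) == wanted) {
                count++;
            }
        }

        // Second pass: copy the matches into the result array.
        String[] result = new String[count];
        int next = 0;
        for (String w : words) {
            if (!w.isEmpty() && Character.toLowerCase(w.charAt(0)) == wanted) {
                result[next++] = w;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String w : filter("my very long string to  Test", 't')) {
            System.out.println(w);
        }
    }
}
```

Two passes over the word array cost a little more time than one, but they avoid any resizable collection.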

Upvotes: 2

Ankit Rustagi

Reputation: 5637

You could build a HashMap:

HashMap<String,String> map = new HashMap<String,String>();

Example:

ant, bat, art, cat

HashMap
a -> ant,art
b -> bat
c -> cat

To find all words that begin with "a", just call

map.get("a")
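
Since the question rules out built-in data structures, the same first-letter index could be approximated with arrays only. The following is a sketch under assumptions I am adding: the LetterIndex class is hypothetical, it only indexes words starting with a-z (case-insensitively), and each bucket is a plain array that is doubled by hand when full.

```java
class LetterIndex {
    // One bucket per letter 'a'..'z'; each bucket is a plain array grown by hand.
    private final String[][] buckets = new String[26][];
    private final int[] sizes = new int[26];

    void add(String word) {
        if (word.isEmpty()) {
            return;
        }
        int b = Character.toLowerCase(word.charAt(0)) - 'a';
        if (b < 0 || b >= 26) {
            return; // this sketch ignores words that do not start with a-z
        }
        if (buckets[b] == null) {
            buckets[b] = new String[4];
        } else if (sizes[b] == buckets[b].length) {
            // Bucket is full: copy its contents into an array twice the size.
            String[] bigger = new String[sizes[b] * 2];
            System.arraycopy(buckets[b], 0, bigger, 0, sizes[b]);
            buckets[b] = bigger;
        }
        buckets[b][sizes[b]++] = word;
    }

    // Returns a copy of the words stored under `letter` (empty array if none).
    String[] get(char letter) {
        int b = Character.toLowerCase(letter) - 'a';
        if (b < 0 || b >= 26 || buckets[b] == null) {
            return new String[0];
        }
        String[] out = new String[sizes[b]];
        System.arraycopy(buckets[b], 0, out, 0, sizes[b]);
        return out;
    }

    public static void main(String[] args) {
        LetterIndex index = new LetterIndex();
        for (String w : "ant bat art cat".split(" ")) {
            index.add(w);
        }
        for (String w : index.get('a')) {
            System.out.println(w);
        }
    }
}
```

Building the index still touches every word once, so it only pays off when the same text is queried for several different letters.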

Upvotes: 0
