Omar

Reputation: 98

Split string into array of words

I want to split a string into an array of words without using string.Split. I already tried this code and it mostly works, but I can't assign the results to the array:

string str = "Hello, how are you?";
string tmp = "";
int word_counter = 0;
for (int i = 0; i < str.Length; i++)
{
     if (str[i] == ' ')
     {
         word_counter++;
     }
}
string[] words = new string[word_counter+1];

for (int i = 0; i < str.Length; i++)
{
    if (str[i] != ' ')
    {
        tmp = tmp + str[i];
        continue;
    }
    // here is the problem: I can't assign each tmp to its own slot in the array
    for (int j = 0; j < words.Length; j++)
    {
        words[j] = tmp;
    }
    tmp = "";
}

Upvotes: 2

Views: 4046

Answers (3)

Lorenzo

Reputation: 3387

You can also use a List<string> to collect the words:

    // requires: using System.Collections.Generic;
    string str = "Hello, how are you?";
    string tmp = "";
    List<string> ListOfWords = new List<string>();

    for (int i = 0; i < str.Length; i++)
    {
        if (str[i] != ' ')
        {
            // still inside a word: append the character
            tmp = tmp + str[i];
            continue;
        }
        // a space ends the current word
        ListOfWords.Add(tmp);
        tmp = "";
    }
    // add the last word, which has no trailing space
    ListOfWords.Add(tmp);

This way you don't need to count the words first, and the code is simpler. Use ListOfWords[x] to read a word by index.
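
As a minimal sketch, the loop above could be wrapped in a reusable method. The names WordSplitter and SplitOnSpaces are hypothetical, and this version swaps string concatenation for a StringBuilder and skips the empty entries that repeated spaces would otherwise produce, which the code above does not do:

```csharp
using System.Collections.Generic;
using System.Text;

public static class WordSplitter
{
    // Hypothetical helper: splits on the space character only, without string.Split.
    public static List<string> SplitOnSpaces(string text)
    {
        var words = new List<string>();
        var current = new StringBuilder();

        foreach (char c in text)
        {
            if (c != ' ')
            {
                // Still inside a word: keep collecting characters.
                current.Append(c);
                continue;
            }
            // A space ends the current word; ignore empty words from repeated spaces.
            if (current.Length > 0)
            {
                words.Add(current.ToString());
                current.Clear();
            }
        }
        // The last word has no trailing space, so add it after the loop.
        if (current.Length > 0)
        {
            words.Add(current.ToString());
        }
        return words;
    }
}
```

Calling WordSplitter.SplitOnSpaces("Hello, how are you?") gives the four entries "Hello,", "how", "are" and "you?". Punctuation stays attached because only spaces are treated as separators.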

Upvotes: 0

Dmitrii Bychenko

Reputation: 186668

Try using regular expressions, like this:

  // requires: using System.Linq; using System.Text.RegularExpressions;
  string str = "Hello, how are you?";

  // words == ["Hello", "how", "are", "you"] 
  string[] words = Regex.Matches(str, "\\w+")
    .OfType<Match>()
    .Select(m => m.Value)
    .ToArray();

String.Split is not a good option since there are too many characters to split on: ' ' (space), '.', ',', ';', '!' etc.

A word is not just whatever sits between spaces: there is punctuation to consider, non-breaking spaces, and so on. Consider input like this:

  string str = "Bad(very bad) input to test. . .";

Note:

  1. No space after "Bad"
  2. A non-breaking space
  3. Additional spaces after the full stops

The correct output should be:

  ["Bad", "very", "bad", "input", "to", "test"] 
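
To try this on the test input, the Regex.Matches call can be wrapped in a small helper. This is only a sketch: the names RegexWords and Extract are hypothetical, and Cast<Match>() is used in place of OfType<Match>(), which behaves the same here.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexWords
{
    // Hypothetical helper: returns every run of word characters in the text.
    public static string[] Extract(string text)
    {
        return Regex.Matches(text, @"\w+")
            .Cast<Match>()          // MatchCollection -> IEnumerable<Match>
            .Select(m => m.Value)   // keep only the matched text
            .ToArray();
    }
}
```

Assuming the non-breaking space sits between "very" and "bad" (written as \u00A0 so it is visible), RegexWords.Extract("Bad(very\u00A0bad) input to test. . .") returns "Bad", "very", "bad", "input", "to", "test". Keep in mind that \w also matches digits and underscores, and that an apostrophe splits a word such as "don't" in two.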

Upvotes: 4

Ian

Reputation: 30813

You just need an index pointer to place the items into the array one by one:

string str = "Hello, how are you?";
string tmp = "";
int word_counter = 0;
for (int i = 0; i < str.Length; i++) {
    if (str[i] == ' ') {
        word_counter++;
    }
}
string[] words = new string[word_counter + 1];
int currentWordNo = 0; //at this index pointer
for (int i = 0; i < str.Length; i++) {
    if (str[i] != ' ') {
        tmp = tmp + str[i];
        continue;
    }
    words[currentWordNo++] = tmp; //change your loop to this
    tmp = "";
}
words[currentWordNo++] = tmp; //do this for the last assignment

In my example the index pointer is named currentWordNo.
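
One thing to watch: counting spaces gives the wrong size when the input has repeated spaces, so the array ends up with empty entries. A possible variation, sketched below with the hypothetical names ArraySplitter and SplitOnSpaces, counts word starts instead and uses Substring rather than building tmp character by character. It keeps the same index-pointer idea:

```csharp
public static class ArraySplitter
{
    // Hypothetical helper: two passes over the text, no List and no string.Split.
    public static string[] SplitOnSpaces(string text)
    {
        // Pass 1: count word starts (a non-space at index 0 or right after a space).
        int wordCount = 0;
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] != ' ' && (i == 0 || text[i - 1] == ' '))
            {
                wordCount++;
            }
        }

        string[] words = new string[wordCount];
        int currentWordNo = 0; // index pointer into words
        int start = -1;        // start index of the word being read, -1 if none

        // Pass 2: copy each word into the next free slot.
        // The loop runs one step past the end so the last word is flushed too.
        for (int i = 0; i <= text.Length; i++)
        {
            bool atEnd = i == text.Length || text[i] == ' ';
            if (!atEnd && start < 0)
            {
                start = i; // first character of a new word
            }
            else if (atEnd && start >= 0)
            {
                words[currentWordNo++] = text.Substring(start, i - start);
                start = -1;
            }
        }
        return words;
    }
}
```

With this version, ArraySplitter.SplitOnSpaces("Hello,  how are you?") still returns four entries even though there are two spaces after the comma.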

Upvotes: 5
