Silvestru

Reputation: 13

Extracting words from a string into an array

I want to extract all the words from a string and put them in an array.

By "word" I mean a series of consecutive letters. If I have a space or other characters, the word ends there.

For example, if I have this string:

"my name is sil/ves tru, what?is." 

I want an array like this:

arr[0] = "my";
arr[1] = "name";
arr[2] = "is";
arr[3] = "sil";
arr[4] = "ves";
arr[5] = "tru";
arr[6] = "what";
arr[7] = "is";

This is what I currently have:

var str = "my name is sil/ves tru, what?is."; //my string
var i;
var arr = []; 
for (i = 0; i < str.length; i++) { //check all positions
    if ((str[i] >= "a") && (str[i] <= "z")) { //true while the current character is a letter; a non-letter ends the word, which should then go into arr
       //help me here
    } 
}
console.log(arr); //my array with words

Upvotes: 1

Views: 1227

Answers (6)

robinCTS

Reputation: 5886

The most elegant/shortest solution is, of course, to use a regex.
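
As a minimal sketch of that regex approach, assuming "letter" means an ASCII letter as in the question, String.prototype.match() with the g flag returns every run of letters in one call. The helper name extractWords is hypothetical and only used for illustration:

```javascript
// Hypothetical helper name, not taken from the question.
function extractWords(str) {
    // With the g flag, match() returns an array of all matches,
    // or null when nothing matches, so fall back to an empty array.
    return str.match(/[a-z]+/gi) || [];
}

console.log(extractWords("my name is sil/ves tru, what?is."));
// [ 'my', 'name', 'is', 'sil', 'ves', 'tru', 'what', 'is' ]
```

Matching the letters directly, rather than splitting on the non-letters, means no empty strings have to be filtered out afterwards.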

A better "loop" solution, based on your answer, with the nested ifs refactored into the standard else if style and the loose comparisons replaced by the recommended strict operators === and !==, would be:

var str = "what am/i doing.      .   j  here 9what";
var i;
var j = 0;
var arr = [];
var temp = "";

for (i = 0; i < str.length; i++) {

    if ((str[i] >= "a") && (str[i] <= "z")) {
        temp = temp + str[i];
        if (i === str.length - 1) { arr[j] = temp; } 
    } else if (temp !== "") {
        arr[j] = temp;
        temp = "";
        j++;
    }
}
console.log(str.length);
console.log(arr);

Note that the check for the end of the input string (with the addition of the last word to the array if it has been reached) is only required if the last character is a letter. If the last character is not a letter, then the last word has already been added to the array, and temp is empty.


An even better loop solution using the shorter assignment operator +=, the string toLowerCase() method to allow for uppercase letters, and the array push() method, thus eliminating the second index variable j, is:

var str = "What am/I doing.      .   J  here 9what";
var i;
var arr = [];
var temp = "";

for (i = 0; i < str.length; i++) {

    if ((str[i].toLowerCase() >= "a") && (str[i].toLowerCase() <= "z")) {
        temp += str[i];
        if (i === str.length - 1) { arr.push(temp); } 
    } else if (temp !== "") {
        arr.push(temp);
        temp = "";
    }
}
console.log(str.length);
console.log(arr);
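
One possible variation, sketched here under the assumption of ASCII input, wraps the same loop in a function and adds the last word once after the loop instead of testing for the final index on every iteration. The function name collectWords is hypothetical:

```javascript
// Hypothetical function name; the loop logic mirrors the version above.
function collectWords(str) {
    var arr = [];
    var temp = "";
    var i, c;

    for (i = 0; i < str.length; i++) {
        c = str[i].toLowerCase();
        if (c >= "a" && c <= "z") {
            temp += str[i];      // still inside a word
        } else if (temp !== "") {
            arr.push(temp);      // a non-letter ends the current word
            temp = "";
        }
    }
    // If the string ended with a letter, the last word is still pending.
    if (temp !== "") {
        arr.push(temp);
    }
    return arr;
}

console.log(collectWords("What am/I doing.      .   J  here 9what"));
// [ 'What', 'am', 'I', 'doing', 'J', 'here', 'what' ]
```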

Upvotes: 0

Silvestru

Reputation: 13

Thank you all!!!

Here is how I solved this:

var str = "what am/i doing.      .   j  here 9what";
var i;
var j = 0;
var arr = [];
var temp = "";

for (i = 0; i < str.length; i++) {

    if ((str[i] >= "a") && (str[i] <= "z")) {
        temp = temp + str[i];
    }

    else {
        if (temp == "") {
            continue;
        }

        else {

            arr[j] = temp;
            temp = "";
            j++;
        }
    }

    if ((i == str.length - 1) && ((str[i] >= "a") && (str[i] <= "z"))) {
        arr[j] = temp;
    }
}
console.log(str.length);
console.log(arr);

Upvotes: 0

Scott Marcus

Reputation: 65806

You just need regular expressions used in conjunction with String.replace(), String.trim() and String.split():

var str =  "my name is sil/ves tru, what?is.";
var ary = str.replace(/[^a-zA-Z ]/g, " ").trim().split(/\s+/);
console.log(ary);

/*
  ^              not
  a-zA-Z         a through z or A through Z
  g              global find/replace
  
  .trim()        remove leading/trailing space from string
  .split(/\s+/)  split the string where there is one or more spaces and return an array 
*/
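
One edge case may be worth checking: if the input contains no letters at all, the trimmed string is empty, and splitting an empty string yields an array holding one empty string rather than an empty array. The sketch below uses a made-up input to show this, along with filter(Boolean) as one possible guard:

```javascript
var noLetters = "123 ?!"; // made-up input with no letters
var ary = noLetters.replace(/[^a-zA-Z ]/g, " ").trim().split(/\s+/);
console.log(ary.length); // 1, because ary is [""]

// filter(Boolean) drops empty strings, leaving an empty array.
var safe = ary.filter(Boolean);
console.log(safe.length); // 0
```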

Upvotes: 0

Ele

Reputation: 33726

Use the regex /[^a-z]+/gi to replace each run of undesired characters with a single space, then trim the result and split on spaces.

var str = "my name is sil/ves tru, what?is.".replace(/[^a-z]+/gi, " ").trim().split(" ");
console.log(str);

Upvotes: 1

user9366559

Reputation:

You can use a regex to split on any contiguous sequence of non-letters.

var str = "my name is sil/ves tru, what?is.";
var arr = str.split(/[^a-z]+/i).filter(Boolean); // or .filter(s=>s)
console.log(arr);

Upvotes: 2

Giannis Mp

Reputation: 1299

This will do it. The regex splits on every run of characters that are not letters, and filter(Boolean) removes the empty string that split() leaves at the start or end of the array when the sentence begins or ends with a non-letter.

let string = "my name is sil/ves tru, what?is.";

let array = string.split(/[^A-Za-z]+/).filter(Boolean);
console.log(array);

Upvotes: 0
