Reputation: 1593
What's a good strategy for getting full words into an array, each with its succeeding character?
Example.
This is an amazing sentence.
Array(
[0] => This
[1] => is
[2] => an
[3] => amazing
[4] => sentence.
)
Elements 0 - 3 would have a succeeding space, while a period succeeds the 4th element.
I need to split these on the spacing character; then, once the width of an element with the injected array elements reaches X, break onto a new line.
Please, gawd, don't give tons of code. I prefer to write my own; just tell me how you would do it.
Upvotes: 54
Views: 134974
Reputation: 4297
The following solution splits words not only on spaces but also on other kinds of whitespace and on punctuation characters. In addition, it works with non-ASCII characters.
It matches words by considering only characters that belong to certain Unicode categories. It allows letters (L), numbers (N), symbols (S) and marks (M), so it matches quite a broad set, but you can adjust it if you need a different set of characters. Other categories such as punctuation (P) and separators (Z) are not included and will therefore not match.
input.match(/[\p{L}\p{N}\p{S}\p{M}]+/gu)
Example
' \t a 件数 😀 ,;-asd'.match(/[\p{L}\p{N}\p{S}\p{M}]+/gu)
Returns ['a', '件数', '😀', 'asd']
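Since the question asks for each word together with its succeeding character, here is a minimal sketch of how the pattern above might be extended. The extension is an assumption, not part of the original answer: it appends a negated copy of the same character class so that each match also swallows the spaces or punctuation that follow the word. The names WORD_WITH_TRAILER and wordsWithTrailers are made up for illustration.

```javascript
// Assumed variant of the pattern above: the same word class, followed by
// any run of characters outside that class (spaces, punctuation, ...).
const WORD_WITH_TRAILER = /[\p{L}\p{N}\p{S}\p{M}]+[^\p{L}\p{N}\p{S}\p{M}]*/gu;

// Hypothetical helper name, used only for this illustration.
function wordsWithTrailers(input) {
  // match() returns null when nothing matches, so fall back to [].
  return input.match(WORD_WITH_TRAILER) || [];
}

console.log(wordsWithTrailers('This is an amazing sentence.'));
// ['This ', 'is ', 'an ', 'amazing ', 'sentence.']
```

Leading separators before the first word are dropped by this sketch, which may or may not be what you want.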
Upvotes: 4
Reputation: 703
It can be done with the split function:
"This is an amazing sentence.".split(' ')
Upvotes: 1
Reputation: 17408
Use split and filter to drop the empty strings produced by leading, trailing, and repeated spaces.
let str = ' This is an amazing sentence. ',
words = str.split(' ').filter(w => w !== '');
console.log(words);
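A possible alternative, sketched here as an assumption rather than as part of the answer above, is to trim the string first and split on a whitespace regex. That also handles tabs and newlines, which split(' ') does not. The function name splitOnWhitespace is hypothetical.

```javascript
// Hypothetical helper: split on any run of whitespace.
function splitOnWhitespace(str) {
  const trimmed = str.trim();
  // ''.split(/\s+/) would give [''], so handle the empty case explicitly.
  return trimmed === '' ? [] : trimmed.split(/\s+/);
}

console.log(splitOnWhitespace('  This is\tan   amazing sentence. '));
// ['This', 'is', 'an', 'amazing', 'sentence.']
```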
Upvotes: 8
Reputation: 718
If you need the spaces and the dots, the easiest would be:
"This is an amazing sentence.".match(/.*?[\.\s]+?/g);
The result would be:
['This ','is ','an ','amazing ','sentence.']
Upvotes: 9
Reputation: 17408
This can be done with lodash _.words:
var str = 'This is an amazing sentence.';
console.log(_.words(str, /[^, ]+/g));
<script src="https://cdnjs.cloudflare.com/ajax/libs/lodash.js/4.17.11/lodash.min.js"></script>
Upvotes: 2
Reputation: 10834
Similar to Ravi's answer, use match, but use the word boundary \b in the regex to split on word boundaries:
'This is a test. This is only a test.'.match(/\b(\w+)\b/g)
yields
["This", "is", "a", "test", "This", "is", "only", "a", "test"]
or
'This is a test. This is only a test.'.match(/\b(\w+\W+)/g)
yields
["This ", "is ", "a ", "test. ", "This ", "is ", "only ", "a ", "test."]
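For the line-breaking part of the question, tokens that keep their trailing characters can be fed into a simple greedy wrap. The following is only a sketch under assumptions: "width" is taken to mean character count (not rendered pixel width), trailing whitespace is not counted toward it, and \W+ is relaxed to \W* so that a final word without punctuation is still captured. The name wrapWords and its parameters are made up for illustration.

```javascript
// Hypothetical helper: greedy wrap at maxWidth characters.
function wrapWords(text, maxWidth) {
  // Each token is a word plus the non-word characters that follow it.
  const tokens = text.match(/\b(\w+\W*)/g) || [];
  const lines = [];
  let line = '';
  for (const token of tokens) {
    // Start a new line if adding this token would exceed the width.
    // Trailing whitespace does not count toward the width.
    if (line !== '' && (line + token).trimEnd().length > maxWidth) {
      lines.push(line.trimEnd());
      line = '';
    }
    line += token;
  }
  if (line !== '') lines.push(line.trimEnd());
  return lines;
}

console.log(wrapWords('This is a test. This is only a test.', 12));
// ['This is a', 'test. This', 'is only a', 'test.']
```

A single word longer than maxWidth is left on its own line rather than being broken, which seemed the least surprising choice for a sketch.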
Upvotes: 80
Reputation: 28196
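Try this:
var words = str.replace(/([ .,;]+)/g,'$1§sep§').split('§sep§');
This inserts the marker §sep§ after every run of the chosen delimiters [ .,;]+ and then splits on that marker. Note that if the string ends with a delimiter, the resulting array ends with an empty string.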
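If a sentinel string feels fragile (it would misbehave on input that already contains the marker), a similar result can probably be had with match alone. This is a sketch of a swapped-in technique, not the approach above: it matches a run of non-delimiters followed by an optional run of delimiters. The function name splitKeepingDelimiters is hypothetical, and delimiters at the very start of the string are dropped.

```javascript
// Hypothetical helper: keep each word together with its trailing delimiters.
function splitKeepingDelimiters(str) {
  // A run of non-delimiters followed by an optional run of delimiters.
  return str.match(/[^ .,;]+[ .,;]*/g) || [];
}

console.log(splitKeepingDelimiters('One, two; three. Four'));
// ['One, ', 'two; ', 'three. ', 'Four']
```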
Upvotes: 20
Reputation: 3406
Here is an option if you want to keep the trailing space on each word and finish in a single O(N) pass:
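var str = "This is an amazing sentence.";
var words = [];
var buf = "";
for (var i = 0; i < str.length; i++) {
  buf += str[i];
  // A space ends the current word; the space stays attached to it.
  if (str[i] === " ") {
    words.push(buf);
    buf = "";
  }
}
// Flush the last word, which has no trailing space.
if (buf.length > 0) {
  words.push(buf);
}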
Upvotes: 3
Reputation: 39532
Just use split:
var str = "This is an amazing sentence.";
var words = str.split(" ");
console.log(words);
//["This", "is", "an", "amazing", "sentence."]
And if you need it with the spaces, why not just add them back? (Use a loop afterwards.)
var str = "This is an amazing sentence.";
var words = str.split(" ");
for (var i = 0; i < words.length - 1; i++) {
  words[i] += " ";
}
console.log(words);
//["This ", "is ", "an ", "amazing ", "sentence."]
Oh, and sleep well!
Upvotes: 69