R-J

Reputation: 936

Regex result skips multiple spaces

Here is a minimal example of my problem:

http://jsfiddle.net/pm913emb/5/

var string = 'Question 6 of 7 '
+'Three, the patient suddenly develops shortness of breath and becomes hypotensive.    His heart rate is 100/min, with a normaI PR and    QRS intervaI.'

var sentencesMatch = string.match(/([\sa-zA-Z\d]){1}.+?[\.!\?]{1}([\s ]+|$)/g);

console.log(sentencesMatch);

This string contains multiple sentences, and I have added multiple spaces in two places: one at the end of a sentence, the other in the middle of a sentence. I then run a regex on this string.

The problem is that, as you can see in the console, the matched results do not contain these multiple spaces.

What could be the reason for this problem, and what is a possible solution?

Please help.. :/

Upvotes: 3

Views: 140

Answers (3)

Deftwun

Reputation: 1232

Your code is working

It's just that when you print the array itself, the browser trims the extra white space in the console. Try printing the individual array elements and (depending on your browser) you'll see that they do contain the extra spaces.

//You'll need to have the console open to see the results here

var string = 'Question 6 of 7 '
+'Three, the patient suddenly develops shortness of breath and becomes hypotensive.    His heart rate is 100/min, with a normaI PR and    QRS intervaI.'

var sentencesMatch = string.match(/([\sa-zA-Z\d]){1}.+?[\.!\?]{1}([\s ]+|$)/g);
console.log(sentencesMatch);

for (var i = 0; i < sentencesMatch.length; i++){
    //Add quotes so we can see trailing whitespace
    console.log('"' + sentencesMatch[i] + '"'); 
}
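If you'd rather not depend on how a particular console renders arrays, one option is to serialize each match with JSON.stringify, which quotes the string and keeps every space, and to print its length next to it. This is only a sketch reusing the string and regex from the question; the helper name showMatches is made up for illustration.

```javascript
var string = 'Question 6 of 7 '
  + 'Three, the patient suddenly develops shortness of breath and becomes hypotensive.    His heart rate is 100/min, with a normaI PR and    QRS intervaI.';

var sentencesMatch = string.match(/([\sa-zA-Z\d]){1}.+?[\.!\?]{1}([\s ]+|$)/g);

// Hypothetical helper: one quoted line per match, so whitespace stays visible.
function showMatches(matches) {
  return matches.map(function (m, i) {
    return i + ': ' + JSON.stringify(m) + ' (length ' + m.length + ')';
  });
}

showMatches(sentencesMatch).forEach(function (line) {
  console.log(line);
});

// If no characters were dropped, gluing the matches back together
// gives the original string again.
console.log(sentencesMatch.join('') === string);
```

The last line is a quick sanity check: it only prints true if every space from the input ended up in one of the matches.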

Extra white space is trimmed by default in HTML

If you want to actually put that string into an element then you will have the same issue. Here's how to fix it:

Use CSS

Probably the simplest solution: style the elements using the white-space property.

var string = 'Question 6 of 7 '
+'Three, the patient suddenly develops shortness of breath and becomes hypotensive.    His heart rate is 100/min, with a normaI PR and    QRS intervaI.'

var sentencesMatch = string.match(/([\sa-zA-Z\d]){1}.+?[\.!\?]{1}([\s ]+|$)/g);
for (var i = 0; i < sentencesMatch.length; i++){
  var p = document.createElement("p");
  document.body.appendChild(p);
  p.innerHTML = '"' + sentencesMatch[i] + '"';
  p.className = "keep-spaces";
}

/* CSS */
.keep-spaces{
  white-space: pre;
}
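A possible variation on the CSS approach, in case it helps: set the text with textContent instead of innerHTML, so characters like < in a sentence are not parsed as markup, and use pre-wrap instead of pre, so consecutive spaces are kept but long sentences can still wrap. This is a sketch, not a drop-in; the function name renderSentences and its doc and parent parameters are my own invention.

```javascript
// Hypothetical helper: append one <p> per sentence, keeping runs of spaces.
// `doc` and `parent` are passed in so the function does not rely on globals.
function renderSentences(doc, parent, sentences) {
  for (var i = 0; i < sentences.length; i++) {
    var p = doc.createElement('p');
    // textContent inserts the string as plain text, never as markup
    p.textContent = '"' + sentences[i] + '"';
    // pre-wrap keeps consecutive spaces but still wraps long lines
    p.style.whiteSpace = 'pre-wrap';
    parent.appendChild(p);
  }
}

// Only runs in a browser, where `document` exists.
if (typeof document !== 'undefined') {
  renderSentences(document, document.body, ['with a normaI PR and    QRS intervaI.']);
}
```

In the answer's loop you would call it as renderSentences(document, document.body, sentencesMatch).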

Or... replace white space with a non-breaking space

This solution replaces all whitespace characters with a non-breaking space. This is represented by the HTML entity &nbsp;, &#160;, or &#xA0;.

var string = 'Question 6 of 7 '
    +'Three, the patient suddenly develops shortness of breath and becomes hypotensive.    His heart rate is 100/min, with a normaI PR and    QRS intervaI.'
var sentencesMatch = string.match(/([\sa-zA-Z\d]){1}.+?[\.!\?]{1}([\s ]+|$)/g);

for (var i = 0; i < sentencesMatch.length; i++){
  var p = document.createElement("p");
  document.body.appendChild(p);
  //Replace whitespace with &nbsp; to preserve consecutive white space
  var str = sentencesMatch[i].replace(/\s/g,'&nbsp;');
  p.innerHTML = '"' + str + '"';
}
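One thing to watch with the entity approach: the result goes into innerHTML, so any &, < or > already present in the text would be interpreted as markup. A rough sketch of a helper that escapes those first and only then swaps in the entity could look like this. The name toHtmlWithSpaces is hypothetical, and it replaces plain spaces only, not every whitespace character.

```javascript
// Hypothetical helper: escape HTML-special characters, then make spaces
// non-breaking. The ampersand must be escaped first, and the &nbsp;
// replacement must come last, so entities are not escaped twice.
function toHtmlWithSpaces(str) {
  return str
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/ /g, '&nbsp;');
}

console.log(toHtmlWithSpaces('PR and    QRS <interval>'));
```

Keep in mind that text made entirely of non-breaking spaces cannot wrap at those spaces, so long sentences may overflow their container.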

Upvotes: 1

chris85

Reputation: 23892

Browsers don't show consecutive white-spaces; they collapse them into one. If you were to use entities, the spaces would be displayed. So for example

<-- 2 spaces

would display as

<-- one space

in a browser.

If you used entities for the spaces

&#160;&#160;

you would get

two white-spaces (note that even here they show up as one space).

Here's a longer write up on it.

Browser white space rendering
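To make the collapsing concrete, here is a very rough model of it in plain JavaScript. It is an approximation I am assuming for illustration, not what a browser actually runs: real rendering has more rules (for example around line starts and ends). The function name collapseWhitespace is made up.

```javascript
// Rough model of default HTML rendering (white-space: normal): each run of
// spaces, tabs and line breaks becomes a single space. The non-breaking
// space U+00A0 is deliberately left out of the character class.
function collapseWhitespace(str) {
  return str.replace(/[ \t\n\r\f]+/g, ' ');
}

var plain = 'PR and' + '    ' + 'QRS';
var withNbsp = 'PR and' + '\u00A0\u00A0\u00A0\u00A0' + 'QRS';

console.log(JSON.stringify(collapseWhitespace(plain)));               // "PR and QRS"
console.log(collapseWhitespace(withNbsp).length === withNbsp.length); // true
```

The second check shows why the entity works: non-breaking spaces are not part of the runs that get collapsed.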

I think this accomplishes what you want (probably not the cleanest, I don't write JS often)..

<script type="text/javascript">
var string = 'Question 6 of 7 '
+'Three, the patient suddenly develops shortness of breath and becomes hypotensive.    His heart rate is 100/min, with a normaI PR and    QRS intervaI.'
var sentencesMatch = string.match(/([\sa-zA-Z\d]){1}.+?[\.!\?]{1}([\s ]+|$)/g);
var output = '';
for(var x= 0; x < sentencesMatch.length; x++){
    output += sentencesMatch[x].replace(/ /g, '&#160;');
}
document.write(output);
</script>

Upvotes: 2

Just Curious

Reputation: 67

The problem is neither in your regex nor in your string. If you tried putting a '\n' in the string, you'd see that it is basically just replaced with one space, so the problem is in your browser. You might want to add a header like this to fix it:

content-type: text/html

Or try base64-encoding it and decoding it whenever you need it.

Upvotes: -1
