Reputation: 111

regex - replace 3rd dot and space with line break

I have a regular expression below that works fine. It looks for every 3rd "." and inserts a line break after it:

some_string.replace(/((?:\.[^\.]*){2})\./g, '$1\.<br/><br/>')

so this text:

some test. some other test. other 2 test. test nice text.

becomes:

some test. some other test. other 2 test.  
test nice text.

I need to change this to look for a dot and space too. In other words, currently:

some test. some other test. other 2.3 test. test nice text.

will look like:

some test. some other test. other 2.
3 test. test nice text.

and I need this text to look like this:

some test. some other test. other 2.3 test. 
test nice text.
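
The behaviour described above can be reproduced with a short sketch. The variable names (`original`, `input`, `output`) are only illustrative; the pattern and replacement are the ones from the question.

```javascript
// The question's pattern counts every dot, including the one in "2.3".
const original = /((?:\.[^\.]*){2})\./g;
const input = "some test. some other test. other 2.3 test. test nice text.";

// "$1." re-inserts the captured text and the third dot, then adds the breaks.
const output = input.replace(original, "$1.<br/><br/>");
console.log(output);
// some test. some other test. other 2.<br/><br/>3 test. test nice text.
```

The break lands inside "2.3" because the third dot counted is the decimal point, not the end of a sentence.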

Upvotes: 0

Answers (5)

3limin4t0r

Reputation: 21130

I would do something like this:

replace(/(([^.]|\.(?! ))*\. ){3}/g, '$&<br/><br/>')

Explanation

/([^.]|\.(?! ))*\. /

This matches a non . character (/[^.]/) or . character that is not followed by a space (/\.(?! )/). It continues matching (/*/) until it encounters a . followed by a space (in this case both /[^.]/ and /\.(?! )/ don't match, thus continuing to /\. /).

The reason I use a negative lookahead /(?! )/ is that I want to evaluate the text character by character. If I were to replace it with /[^ ]/, then it would also consume the "not a space" character. This means that if a sentence ends with two dots, as in Test sentence.. Test 2., it would not match Test sentence..<space>, because the second dot is consumed by /\.[^ ]/ and thus already passed by.
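
To make the two-dots case concrete, here is a small sketch comparing the lookahead with a consuming character class. The sample string and the names `withLookahead` and `withConsumingClass` are made up for illustration; only the single-sentence part of the pattern is used so the difference is easy to see.

```javascript
const sample = "Test sentence.. Test 2. End.";

// Lookahead: the first dot is skipped without consuming the second one.
const withLookahead = /([^.]|\.(?! ))*\. /;
// Consuming class: "\.[^ ]" swallows both dots at once.
const withConsumingClass = /([^.]|\.[^ ])*\. /;

const a = sample.match(withLookahead)[0];
const b = sample.match(withConsumingClass)[0];

console.log(JSON.stringify(a)); // "Test sentence.. "
console.log(JSON.stringify(b)); // "Test sentence.. Test 2. "
```

With the consuming class the first sentence boundary is missed, so the match runs on to the next dot followed by a space.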

The /{3}/ makes sure that the group matches 3 times.

'$&<br/><br/>' replaces the entire match with itself followed by 2 line breaks.

Note

I'm using capture groups although I never reference what they capture. If you prefer non-capture groups, you can safely replace all capture groups with non-capture groups.
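
As a quick check of that note, the sketch below runs the capturing pattern and a non-capturing rewrite of it over the two sample strings from the question. The names `capturing`, `nonCapturing` and `samples` are illustrative only.

```javascript
const capturing = /(([^.]|\.(?! ))*\. ){3}/g;
// Same pattern with every group turned into a non-capturing (?:...) group.
const nonCapturing = /(?:(?:[^.]|\.(?! ))*\. ){3}/g;

const samples = [
  "some test. some other test. other 2 test. test nice text.",
  "some test. some other test. other 2.3 test. test nice text."
];

const results = samples.map(function (s) {
  return {
    capturing: s.replace(capturing, "$&<br/><br/>"),
    nonCapturing: s.replace(nonCapturing, "$&<br/><br/>")
  };
});

console.log(results[1].capturing);
// some test. some other test. other 2.3 test. <br/><br/>test nice text.
```

Because the replacement only uses $& (the whole match), the kind of group makes no difference to the result.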

Edit

For a more readable solution see ctwheels' answer. It does exactly the same, but uses a lazy quantifier. Keep in mind that a lazy regex is often a bit slower on large text (I haven't tested the speed of either). But if readability is more important, I would go for his solution.

Upvotes: 1

ctwheels

Reputation: 22837

Brief

For viewing purposes, I've used the replacement of $1\n. In reality, you would change this to $1<br/><br/> ($&<br/><br/> with the edit).

Code

Original

See regex in use here

((?:.*?\. ){2}.*?)\.

Note: There's a space at the end of the pattern above.

Edit

Thanks to Johan Wentholt for the edit below.

(.*?\. ){3}

Replacement

$&\n

Usage

var s = [
  "some test. some other test. other 2 test. test nice text.",
  "some test. some other test. other 2.3 test. test nice text."
];

s.forEach(function(e) {
  var x = e.replace(/(.*?\. ){3}/g, "$&\n");
  console.log(x);
});

Explanation

((?:.*?\. ){2}.*?) Capture the following into capture group 1
- (?:.*?\. ){2} Match the following exactly twice
  - .*? Match any character any number of times, but as few as possible
  - \. Match the dot character . literally, followed by the space character literally
- .*? Match any character any number of times, but as few as possible
\. Match the dot character . literally, followed by the space character literally
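
As a rough illustration of the breakdown above, this sketch applies both the original pattern and the edited one to the question's second sample, using HTML breaks instead of \n. The replacement for the original pattern writes the ". " back explicitly, since that part sits outside capture group 1; that detail is my assumption about how it would be used with <br/>.

```javascript
const text = "some test. some other test. other 2.3 test. test nice text.";

// Original pattern: group 1 holds everything before the third ". ".
const originalPattern = /((?:.*?\. ){2}.*?)\. /g;
const viaGroup = text.replace(originalPattern, "$1. <br/><br/>");

// Edited pattern: the whole match already ends with the third ". ".
const editedPattern = /(.*?\. ){3}/g;
const viaWholeMatch = text.replace(editedPattern, "$&<br/><br/>");

console.log(viaGroup);
console.log(viaWholeMatch);
// Both lines print:
// some test. some other test. other 2.3 test. <br/><br/>test nice text.
```

The dot in "2.3" is skipped in both cases because it is not followed by a space.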

Upvotes: 1

Isti115

Reputation: 2748

Instead of overcomplicating the RegEx, you could use a little-known feature of .replace in JavaScript.

Its second argument can be a function instead of a string. For the full documentation on this, look here: Function as second parameter to replace

For a working example for your problem try this:

let i = 0
// Append the breaks after every 3rd ". "; replace returns a new string
const result = some_string.replace(/\. /g, () => {
    return ++i % 3 === 0 ? '. <br /><br />' : '. '
})

If you are unfamiliar with arrow functions (the () => {} thing), you can read about them here,
or in case you do not know what a ? b : c means, it's the ternary operator.

It does the job perfectly for your given examples as you can see in this demo:

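const break_lines = (some_string) => {
  let i = 0 // counts the ". " occurrences seen so far
  return some_string.replace(/\. /g, () => {
    // break after every 3rd ". ", not only after the first group of three
    return ++i % 3 === 0 ? '. <br />' : '. '
  })
}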

const texts = [
  'some test. some other test. other 2 test. test nice text.',
  'some test. some other test. other 2.3 test. test nice text.'
]

for (const text of texts) {
  document.body.innerHTML += `${text}<br /> --> <br />${break_lines(text)}<br /><br />`
}

body {
  font-family: Consolas;
}
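
If you need this in more than one place, the counter can be wrapped in a small helper. This is only a sketch: `breakEveryNth`, its parameters and the default `lineBreak` value are names I made up, not part of any library.

```javascript
// Hypothetical helper: appends lineBreak after every n-th ". " in text.
function breakEveryNth(text, n, lineBreak = "<br/><br/>") {
  let count = 0; // number of ". " occurrences seen so far
  return text.replace(/\. /g, (match) => {
    count += 1;
    return count % n === 0 ? match + lineBreak : match;
  });
}

const short = breakEveryNth("a. b. c. d. e. f. g.", 3);
const fromQuestion = breakEveryNth(
  "some test. some other test. other 2.3 test. test nice text.",
  3
);

console.log(short);        // a. b. c. <br/><br/>d. e. f. <br/><br/>g.
console.log(fromQuestion); // some test. some other test. other 2.3 test. <br/><br/>test nice text.
```

Keeping the counter inside the function means every call starts counting from zero again.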

Upvotes: 1

Regis

Reputation: 55

Can't you just add \s ? Like this:

(/((?:\.[^\.]*){2})\.\s/g, '$1\.<br/><br/>')

Upvotes: 0

Barr J

Reputation: 10927

All of these are for dots and spaces; I keep my snippets just in case:

/^(\s{0,1}\.{0,1}[a-zA-Z]+)+$/.test('space ..hello space')
false
/^(\s{0,1}\.{0,1}[a-zA-Z]+)+$/.test('space .hello space')
true
v2:

/^(\s?\.?[a-zA-Z]+)+$/.test('space .hello space')
true
/^(\s?\.?[a-zA-Z]+)+$/.test('space ..hello space')
false
v3: if you need something like one space or dot in between

/^([\s\.]?[a-zA-Z]+)+$/.test('space hello space')
true
/^([\s\.]?[a-zA-Z]+)+$/.test('space.hello space')
true
/^([\s\.]?[a-zA-Z]+)+$/.test('space .hello space')
false
v4:

/^([ \.]?[a-zA-Z]+)+$/.test('space hello space')
true
/^([ \.]?[a-zA-Z]+)+$/.test('space.hello space')
true
/^([ \.]?[a-zA-Z]+)+$/.test('space .hello space')
false
/^([ ]?\.?[a-zA-Z]+)+$/.test('space .hello space')
true

If you want to test them with your regex, I'd recommend Rubular.

Upvotes: 0
