Reputation: 111
I have the regular expression below, which works fine. It looks for every 3rd "." and inserts a line break after it:
some_string.replace(/((?:\.[^\.]*){2})\./g, '$1\.<br/><br/>')
so this text:
some test. some other test. other 2 test. test nice text.
becomes:
some test. some other test. other 2 test.
test nice text.
I need to change this so that it only counts a dot that is followed by a space. In other words, currently:
some test. some other test. other 2.3 test. test nice text.
will look like:
some test. some other test. other 2.
3 test. test nice text.
and I need this text to look like this:
some test. some other test. other 2.3 test.
test nice text.
Upvotes: 0
Views: 450
Reputation: 21130
I would do something like this:
replace(/(([^.]|\.(?! ))*\. ){3}/g, '$&<br/><br/>')
/([^.]|\.(?! ))*\. /
This matches a non-. character (/[^.]/) or a . character that is not followed by a space (/\.(?! )/). It continues matching (/*/) until it encounters a . followed by a space; in that case both /[^.]/ and /\.(?! )/ fail to match, so the engine continues to /\. /.
The reason I use a negative lookahead /(?! )/ is that I want to evaluate the text character by character. If I were to replace it with /[^ ]/, then the pattern would also consume the "not a space" character. This means that if a sentence ends with two dots, as in Test sentence.. Test 2., it would not match Test sentence..<space>, because the second dot is consumed by /\.[^ ]/ and has thus already been passed by.
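To make the difference concrete, here is a small sketch comparing the lookahead version with the /[^ ]/ variant. The sample string and the variable names are made up for illustration; only the two patterns come from the explanation above.

```javascript
// Hypothetical sample: the first sentence ends with two dots.
const sample = 'Test sentence.. Test 2. End.';

// Lookahead version: a dot is skipped only when no space follows it,
// and the lookahead itself consumes nothing.
const withLookahead = /([^.]|\.(?! ))*\. /;

// Variant: \.[^ ] consumes the dot AND the character after it,
// so the second dot of ".." is swallowed before "\. " can see it.
const withCharClass = /([^.]|\.[^ ])*\. /;

const lookaheadMatch = sample.match(withLookahead)[0];
const charClassMatch = sample.match(withCharClass)[0];

console.log(JSON.stringify(lookaheadMatch)); // "Test sentence.. "
console.log(JSON.stringify(charClassMatch)); // "Test sentence.. Test 2. "
```

The second pattern runs past the first sentence boundary and only stops at the next dot followed by a space.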
The /{3}/ makes sure that the group matches 3 times.
'$&<br/><br/>'
Will replace the entire match with itself followed by 2 line breaks.
I'm using capturing groups although I never reference them. If you prefer non-capturing groups, you can safely replace every capturing group with a non-capturing one.
For a more readable solution see ctwheels' answer. It does exactly the same, but uses a lazy quantifier. Keep in mind that a lazy regex is often a bit slower when used on large text (I haven't tested the speed of both). But if readability is more important I would go for his solution.
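As a quick check, the pattern can be run against the two strings from the question. This is only a sketch: the helper name addBreaks and the variable names are assumptions, and the second pattern is the same expression rewritten with non-capturing groups.

```javascript
// Pattern from this answer, plus the same pattern with non-capturing groups.
const capturing = /(([^.]|\.(?! ))*\. ){3}/g;
const nonCapturing = /(?:(?:[^.]|\.(?! ))*\. ){3}/g;

// Hypothetical helper; $& in the replacement stands for the whole match.
function addBreaks(text, pattern) {
  return text.replace(pattern, '$&<br/><br/>');
}

const inputs = [
  'some test. some other test. other 2 test. test nice text.',
  'some test. some other test. other 2.3 test. test nice text.'
];

for (const input of inputs) {
  console.log(addBreaks(input, capturing));
  console.log(addBreaks(input, nonCapturing));
}
// some test. some other test. other 2 test. <br/><br/>test nice text.
// some test. some other test. other 2 test. <br/><br/>test nice text.
// some test. some other test. other 2.3 test. <br/><br/>test nice text.
// some test. some other test. other 2.3 test. <br/><br/>test nice text.
```

Note that the space after the third dot is part of the match, so it ends up before the line breaks.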
Upvotes: 1
Reputation: 22837
For viewing purposes, I've used the replacement $1\n. In reality, you would change this to $1<br/><br/> ($&<br/><br/> with the edit).
((?:.*?\. ){2}.*?)\.
Note: There's a space at the end of the pattern above.
Thanks to Johan Wentholt for the edit below.
(.*?\. ){3}
Replacement
$&\n
var s = [
  "some test. some other test. other 2 test. test nice text.",
  "some test. some other test. other 2.3 test. test nice text."
];
s.forEach(function(e) {
  var x = e.replace(/(.*?\. ){3}/g, "$&\n");
  console.log(x);
});
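For the real use case, the same lazy pattern can be paired with the HTML replacement mentioned above. The following is a minimal sketch; the function name addBreaksLazy and the third test string are assumptions added for illustration, and the group is made non-capturing since it is never referenced.

```javascript
// Hypothetical helper: lazy pattern with the HTML replacement.
function addBreaksLazy(text) {
  // .*? grows one character at a time until a dot followed by a space is found.
  return text.replace(/(?:.*?\. ){3}/g, '$&<br/><br/>');
}

console.log(addBreaksLazy('some test. some other test. other 2.3 test. test nice text.'));
// some test. some other test. other 2.3 test. <br/><br/>test nice text.

// With the g flag, a break is added after every third sentence.
console.log(addBreaksLazy('a. b. c. d. e. f. g.'));
// a. b. c. <br/><br/>d. e. f. <br/><br/>g.
```

One caveat with this form: in JavaScript, . does not match line terminators unless the s flag is set, so a group of three sentences that spans a newline would not match as written.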
((?:.*?\. ){2}.*?) Capture the following into capture group 1:
    (?:.*?\. ){2} Match the following exactly twice:
        .*? Match any character any number of times, but as few as possible.
        \.  Match the dot character . literally, followed by the space character literally.
    .*? Match any character any number of times, but as few as possible.
\.  Match the dot character . literally, followed by the space character literally.
Upvotes: 1
Reputation: 2748
Instead of overcomplicating the regex, you could use a little-known feature of .replace in JavaScript: its second argument can be a function instead of a string.
For the full documentation on this look here:
Function as second parameter to replace
For a working example for your problem try this:
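let i = 0
// replace() returns a new string, so keep the result
const result = some_string.replace(/\. /g, () => {
  // i++ == 2 is true only for the third ". " match
  return i++ == 2 ? '. <br /><br />' : '. '
})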
If you are unfamiliar with arrow functions (the () => {} thing), you can read about them here, or in case you do not know what a ? b : c means, it's the ternary operator.
It does the job perfectly for your given examples as you can see in this demo:
const break_lines = (some_string) => {
  let i = 0
  return some_string.replace(/\. /g, () => {
    return i++ == 2 ? '. <br />' : '. '
  })
}
const texts = [
  'some test. some other test. other 2 test. test nice text.',
  'some test. some other test. other 2.3 test. test nice text.'
]
for (const text of texts) {
  document.body.innerHTML += `${text}<br /> --> <br />${break_lines(text)}<br /><br />`
}
body {
font-family: Consolas;
}
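The snippets above insert a break only at the third ". ", which is enough for the sample strings. If a break is wanted after every third sentence, as with the g-flagged regex in the question, the counter can be checked with a modulo instead. This is a sketch under that assumption; the name breakEveryThird and the test string are made up for illustration.

```javascript
// Hypothetical variant: break after every third ". ", not just the first third one.
function breakEveryThird(text) {
  let count = 0;
  return text.replace(/\. /g, (match) => {
    count += 1;
    // Every third match gets the line breaks appended; the others are kept as-is.
    return count % 3 === 0 ? '. <br /><br />' : match;
  });
}

console.log(breakEveryThird('a. b. c. d. e. f. g.'));
// a. b. c. <br /><br />d. e. f. <br /><br />g.

console.log(breakEveryThird('some test. some other test. other 2.3 test. test nice text.'));
// some test. some other test. other 2.3 test. <br /><br />test nice text.
```

Keeping the counter inside the function means each call starts counting from zero.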
Upvotes: 1
Reputation: 55
Can't you just add \s? Like this:
some_string.replace(/((?:\.[^\.]*){2})\.\s/g, '$1\.<br/><br/>')
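One thing worth checking with this approach: only the third dot is required to be followed by whitespace, while the first two dots can still be decimal points. The sketch below uses a made-up sentence (not from the question) where the decimal point comes early, and compares the result with the lookahead pattern from the other answer.

```javascript
// Hypothetical input with a decimal point in the first sentence.
const text = 'pi is 3.14 here. second one. third one. fourth one.';

// The \s variant: the dot in "3.14" is counted as the first of three dots.
const withWhitespace = text.replace(/((?:\.[^\.]*){2})\.\s/g, '$1.<br/><br/>');

// Lookahead pattern from the other answer: only ". " counts as a sentence end.
const withLookaheadPattern = text.replace(/(?:(?:[^.]|\.(?! ))*\. ){3}/g, '$&<br/><br/>');

console.log(withWhitespace);
// pi is 3.14 here. second one.<br/><br/>third one. fourth one.
console.log(withLookaheadPattern);
// pi is 3.14 here. second one. third one. <br/><br/>fourth one.
```

For the sample string in the question the \s variant does produce the requested output, so whether this matters depends on where decimals can appear in the real text.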
Upvotes: 0
Reputation: 10927
These are all patterns for dots and spaces; I keep them as snippets just in case:
/^(\s{0,1}\.{0,1}[a-zA-Z]+)+$/.test('space ..hello space')
false
/^(\s{0,1}\.{0,1}[a-zA-Z]+)+$/.test('space .hello space')
true
v2:
/^(\s?\.?[a-zA-Z]+)+$/.test('space .hello space')
true
/^(\s?\.?[a-zA-Z]+)+$/.test('space ..hello space')
false
v3: if you need something like one space or dot in between
/^([\s\.]?[a-zA-Z]+)+$/.test('space hello space')
true
/^([\s\.]?[a-zA-Z]+)+$/.test('space.hello space')
true
/^([\s\.]?[a-zA-Z]+)+$/.test('space .hello space')
false
v4:
/^([ \.]?[a-zA-Z]+)+$/.test('space hello space')
true
/^([ \.]?[a-zA-Z]+)+$/.test('space.hello space')
true
/^([ \.]?[a-zA-Z]+)+$/.test('space .hello space')
false
/^([ ]?\.?[a-zA-Z]+)+$/.test('space .hello space')
true
If you want to test them with your regex, I'd recommend Rubular
Upvotes: 0