Reputation: 693
I want to separate sentences by inserting a space between each period and letter but not between anything else like a dot and a bracket or a dot and a comma.
Consider this:
This is a text.With some dots.Between words.(how lovely).
This probably has a solution in Perl or PHP, but what I'm interested in is whether it can be done in a text editor that supports regex-based search/replace. The problem is that the pattern would match both the dot and the character, and the replacement would obliterate both. In other words, is there a way to match "nothing" between those two characters?
Upvotes: 11
Views: 19089
Reputation: 25865
You could use back references in the replace string. Typically it would look something like:
Search regex:
(\.)(\w)
Replacement pattern (notice the space):
$1 $2
The back references are stand-ins for the corresponding groups.
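As a quick way to check the capture-group approach outside an editor, here is a minimal Python sketch. Python is my choice for illustration and is not mentioned in the question; note that Python's `re` module writes back references as `\1` and `\2` rather than `$1` and `$2`. The variable names are made up for the example.

```python
import re

text = "This is a text.With some dots.Between words.(how lovely)."

# Group 1 captures the period, group 2 the word character right after it.
# The replacement writes both back with a single space between them.
spaced = re.sub(r"(\.)(\w)", r"\1 \2", text)

print(spaced)
```

The period before the bracket and the final period are left alone, because neither is followed by a word character.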
Alternatively, you could use lookarounds:
(?<=\.)(?=\w)
This doesn't "capture" any text; it only matches the position between the period and the letter/number (a zero-length string). Replacing that match with a space effectively inserts the space.
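To see that the lookaround pattern really matches "nothing", here is a small Python sketch (Python is an assumption on my part; any engine with lookbehind support should behave similarly). The names `gap` and `widths` are just illustrative.

```python
import re

text = "This is a text.With some dots.Between words.(how lovely)."

# Lookbehind requires a period just before the position,
# lookahead requires a word character just after it.
gap = re.compile(r"(?<=\.)(?=\w)")

# Substituting a space at each zero-length match inserts it without
# removing any of the surrounding characters.
inserted = gap.sub(" ", text)

# Width of every match; each one should be zero characters long.
widths = [m.end() - m.start() for m in gap.finditer(text)]

print(inserted)
print(widths)
```

Since nothing is consumed, the replacement string does not need any back references.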
Really, though, it depends on the capabilities of your text editor. Very few text editors have a "complete" regular expression engine built-in. I use TextPad which has its own flavor of regular expression which largely does not support lookarounds (forcing me to use the first approach).
Upvotes: 16
Reputation: 2662
In Perl:
$msg =~ s/\.([a-zA-Z])/. $1/g;
In vim (whole file):
:%s/\.\([a-zA-Z]\)/. \1/g
In Visual Studio (2012 and later, which use .NET regex syntax) it would be
\.([a-zA-Z])
in the "Find what:", and
. $1
in "Replace with:".
In general, most editors that support searching by regexes have capture groups that let you store part of the matched expression and use it in the replacement text. In the expressions above, everything inside the () is "captured", and I include it in the replacement with \1 (written $1 in some flavors).
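One thing worth checking is which character class goes inside the capture group. The sketch below, in Python (my choice, not something from the question), compares `[a-zA-Z]` with `\w` on a made-up sample string that contains a decimal number. Since `\w` also matches digits, it splits the number.

```python
import re

# Hypothetical sample containing a decimal number and a missing space.
sample = "Pi is 3.14.Roughly speaking."

# Letters only: the decimal point in 3.14 is left untouched.
letters_only = re.sub(r"\.([a-zA-Z])", r". \1", sample)

# \w also matches digits, so a space lands inside the number.
word_chars = re.sub(r"\.(\w)", r". \1", sample)

print(letters_only)
print(word_chars)
```

If the text may contain numbers, version strings, or file names, the letter-only class is probably the safer default.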
Upvotes: 2
Reputation: 480
This segment of code solves your problem:
echo preg_replace('/([a-zA-Z])\.([a-zA-Z])/', '$1. $2', 'This is a text.With some dots.Between words.(how lovely).');
The pattern matches a letter on each side of the dot, and the replacement puts both letters back with a space after the dot.
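A caveat with consuming a letter on both sides: two dots separated by a single letter cannot both match, because the middle letter is already used up by the first match. Here is a minimal Python sketch of that edge case (Python and the sample string are my own assumptions), along with a variant that only looks ahead at the second letter.

```python
import re

# Hypothetical edge case: single letters between the dots.
sample = "a.b.c"

# Both letters are consumed, so "b" is no longer available
# as the letter before the second dot.
consuming = re.sub(r"([a-zA-Z])\.([a-zA-Z])", r"\1. \2", sample)

# A lookahead checks the following letter without consuming it,
# so the next match can start on that letter.
lookahead = re.sub(r"([a-zA-Z])\.(?=[a-zA-Z])", r"\1. ", sample)

print(consuming)
print(lookahead)
```

For ordinary prose this rarely matters, but it can show up with initials or abbreviations.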
Upvotes: 1
Reputation: 16949
The language is not indicated, so I used PHP, but the expression is quite generic and can be reused in other environments:
<?php
$s = 'This is a text.With some dots.Between words.(how lovely).';
// Group 1: character before the dot, group 2: the dot, group 3: character after it.
$r = '~(\w)(\.)(\w)~';
// Keep the dot ($2) and insert a space after it.
echo preg_replace($r, '$1$2 $3', $s);
This code produces the following output:
This is a text. With some dots. Between words.(how lovely).
The three groups are referenced in the replacement string as $1, $2, and $3.
Upvotes: 2