Reputation: 145

Find ` character with regex and replace it

I have certain text within a command \grk{} that looks like this:

\grk{s`u e@i `o qrist`os <o u<i`ws to~u jeo`u `ao~u z~wntos}

I need to find all instances where white space is followed by ` and replace each with the white space followed by the word XLFY.

The result from the above should be:

\grk{s`u e@i XLFYo qrist`os <o u<i`ws to~u jeo`u XLFYao~u z~wntos}

and all other instances of white space followed by ` outside \grk{} should be ignored.

I got this far:

(?<=grk\{)(.*?)(?=\})

This finds and selects all the text within \grk{}

Any idea how I can just select the white space followed by the ` that is inside and replace it?

Upvotes: 1

Answers (2)

Jan

Reputation: 43169

You could do this fairly easily with the help of a programming language. Here is some PHP code to show the concept (other languages would work as well); it also takes care of reading and writing the files:

<?php
// process every .txt file in the current directory
foreach (glob("*.txt") as $filename) {
    // load the file content
    $content = file_get_contents($filename);
    // match each \grk{...} block (no nested braces)
    $regex = '#\\\\grk\{[^}]+\}#';

    $newContent = preg_replace_callback(
        $regex,
        function ($matches) {
            // inside the block: one horizontal whitespace followed by a backtick
            return preg_replace('#\h`#', ' XLFY', $matches[0]);
        },
        $content
    );

    // write it back to the original file
    file_put_contents($filename, $newContent);
}
?>

The idea is to grab each \grk{...} block in the first step, then to replace every occurrence of a whitespace character followed by "`" inside that block.
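For a quick check without PHP, the same two-step idea can be sketched in shell with awk: find each \grk{...} block on a line, replace only inside it, and copy everything else through unchanged. This is a minimal sketch, not part of the original answer; the function name replace_in_grk_awk is hypothetical, and it assumes a block neither spans several lines nor contains nested braces.

```shell
# Hypothetical helper: reads text on stdin, writes the result to stdout.
replace_in_grk_awk() {
    awk '{
        out = ""
        rest = $0
        # step 1: locate the next \grk{...} block in the remaining text
        while (match(rest, /\\grk[{][^}]+[}]/)) {
            block = substr(rest, RSTART, RLENGTH)
            # step 2: inside the block, replace space-or-tab + backtick
            gsub(/[ \t]`/, " XLFY", block)
            out = out substr(rest, 1, RSTART - 1) block
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }'
}

printf '%s\n' 'see `x \grk{e@i `o qrist`os} and `y' | replace_in_grk_awk
# prints: see `x \grk{e@i XLFYo qrist`os} and `y
```

Like the PHP version, this writes a plain space before XLFY, so a tab in front of the backtick would become a space.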

Upvotes: 1

kolejnik

Reputation: 136

If you have a file with many \grk{} sections (and others), probably the fastest way to achieve the goal is what @Jan suggested. @noob's regex is fine for a single \grk{}.

The problem with (?<=grk\{)(.*?)(?=\}) is that most regex engines only support fixed-length lookbehind, so you can't skip over an arbitrary amount of text between \grk{ and " `". Take a look at this post.
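One common way around the lookbehind limit is to capture the text in front of the backtick and put it back in the replacement, repeating until nothing is left to replace. The following is a minimal sketch of that idea with sed, not something taken from the linked post; the function name replace_in_grk_sed is hypothetical, and it assumes a \grk{...} block sits on a single line and contains no nested braces.

```shell
# Hypothetical helper: reads text on stdin, writes the result to stdout.
replace_in_grk_sed() {
    # Group 1 captures everything from \grk{ up to and including the
    # whitespace in front of a backtick; [^}]* keeps the match inside
    # the braces. Each pass replaces one backtick, and "t again" loops
    # while the last substitution succeeded.
    sed -E -e ':again' \
           -e 's/(\\grk\{[^}]*[[:space:]])`/\1XLFY/' \
           -e 't again'
}

printf '%s\n' 'see `x \grk{e@i `o qrist`os} and `y' | replace_in_grk_sed
# prints: see `x \grk{e@i XLFYo qrist`os} and `y
```

Because the whitespace is part of the captured group, it is preserved exactly as it was, and backticks outside the braces never match.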

You can also use a bash script:

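#!/bin/bash
file=$1
newFile="${file}_replaced"
# PCRE: a literal \grk{ ... } block, matched non-greedily
regex='\\grk\{(.*?)\}'

cp "$file" "$newFile"

grep -oP "$regex" "$file" | while IFS= read -r line; do
    replacement=$(printf '%s\n' "$line" | sed -r 's/(\s)`/\1XLFY/g')
    # escape sed metacharacters so both strings are taken literally
    pattern=$(printf '%s\n' "$line" | sed 's|[][\\.*^$/]|\\&|g')
    escaped=$(printf '%s\n' "$replacement" | sed 's|[\\/&]|\\&|g')
    sed -i "s/$pattern/$escaped/g" "$newFile"
done

cat "$newFile"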

It takes a file as its argument and creates a copy named <file>_replaced that meets your conditions.

EDIT: To run the script for each file in a directory:

for file in *; do ./replace.sh "$file"; done

Before that, change the script so that it overwrites the existing file:

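#!/bin/bash
file=$1
# PCRE: a literal \grk{ ... } block, matched non-greedily
regex='\\grk\{(.*?)\}'

grep -oP "$regex" "$file" | while IFS= read -r line; do
    replacement=$(printf '%s\n' "$line" | sed -r 's/(\s)`/\1XLFY/g')
    # escape sed metacharacters so both strings are taken literally
    pattern=$(printf '%s\n' "$line" | sed 's|[][\\.*^$/]|\\&|g')
    escaped=$(printf '%s\n' "$replacement" | sed 's|[\\/&]|\\&|g')
    sed -i "s/$pattern/$escaped/g" "$file"
done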

But if you don't use any VCS, please make a backup of your files!
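The backup step can be sketched as a small wrapper that copies each file before the script touches it. This is only an illustration: the helper name with_backup and the .bak suffix are assumptions, and ./replace.sh is the script name used in the loop in this answer.

```shell
# Hypothetical helper: copy FILE to FILE.bak, then run the given
# command with FILE appended as its last argument.
with_backup() {
    target=$1
    shift
    cp -p "$target" "$target.bak" && "$@" "$target"
}

# Assumed usage with the replace.sh script from this answer:
# for f in *.txt; do with_backup "$f" ./replace.sh; done
```

If the copy fails, the command is not run at all, so a file is never modified without a backup next to it.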

EDIT2: debug

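#!/bin/bash
file=$1
echo '--- file ---'
cat "$file"
# PCRE: a literal \grk{ ... } block, matched non-greedily
regex='\\grk\{(.*?)\}'
echo "regex: $regex"
grep -oP "$regex" "$file" | while IFS= read -r line; do
    echo "LINE:        $line"
    replacement=$(printf '%s\n' "$line" | sed -r 's/(\s)`/\1XLFY/g')
    echo "REPLACEMENT: $replacement"
    # escape sed metacharacters so both strings are taken literally
    pattern=$(printf '%s\n' "$line" | sed 's|[][\\.*^$/]|\\&|g')
    escaped=$(printf '%s\n' "$replacement" | sed 's|[\\/&]|\\&|g')
    sed -i "s/$pattern/$escaped/g" "$file"
done
echo '--- file after ---'
cat "$file"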

Upvotes: 1
