Kluther

Reputation: 23

Find text enclosed by a character multiple times

The problem:

Find pieces of text in a file enclosed by @

Input:

@abc@ abc @ABC@
cba @cba@ CBA

Output:

@abc@ @ABC@
@cba@

I've tried the following:

cat test.txt | perl -ne 'BEGIN { $/ = undef; } print $1 if(/(@.*@)/s)."\n"'

But this results in:

@abc@ abc @ABC@
cba @cba@

Additional: My question was incomplete. The goal of the above is to replace the characters between the @ signs with something else:

a should become chr(0x430)
b should become chr(0x431)
c should become chr(0x446)
A should become chr(0x410)
B should become chr(0x411)
C should become chr(0x426)

So with the above input in mind, it should result in:

абц abc АБЦ
cba цба CBA

Sorry for the incompleteness. Thanks, Kluther

Upvotes: 0

Views: 113

Answers (5)

kamituel

Reputation: 35960

Use this regex:

cat test.txt | perl -pe 's/(?:(@ )|^[^@]).*?(?: (@)|$)/$1$2/g'

Upvotes: 0

Krishnachandra Sharma

Reputation: 1342

Use a non-greedy match such as .+?, or a negated character class as in /(\@([^@]*)\@)/gsm.

cat test.txt | perl -ne 'BEGIN { $/ = undef; } print $1." " while(/(\@([^@]*)\@)/gsm); print "\n";'

Upvotes: 0

user1919238

Reputation:

The problem with (@.*@) is that * is greedy: it matches the largest amount possible. Thus it will match everything between the first @ in the string and the last one.

You can make it non-greedy with (@.*?@). However, a better approach is to match everything that is not @ in between:

 (@[^@]*@)

If you want to match every occurrence instead of the first one, you also need to use the /g modifier and modify your code to use a loop:

perl -ne 'BEGIN { $/ = undef; } print "$1 " while(/(\@[^@]*\@)/gs); print "\n"' test.txt

Upvotes: 1

Guru

Reputation: 16994

One way:

$ perl -lpe '@a=$_=~/@[^@]+@/g; $_="@a";' file
@abc@ @ABC@
@cba@

Upvotes: 0

Civa

Reputation: 2176

Use a pattern like this (it only matches when the text between the @ signs consists solely of ASCII letters):

@[a-zA-Z]+@

Upvotes: 0
