itzy

Reputation: 11755

Regex to identify a character preceded or followed by a number

I have text that occasionally substitutes an l or I for a 1 (it's from OCR). I want to convert these to a 1 when they are part of a number, but leave them alone if they stand alone. By "part of a number" I mean adjacent to another digit or l or I. So I want to change 1I3 to 113 but leave 1 I 3 alone.

Here's what I'm doing:

$var =~ s/[lI](?=[lI\d])/1/g;
$var =~ s/(?<=[lI\d])[lI]/1/g;

Is there a more elegant way to do this in one step? In other words, what regex will match [Il] that is either preceded by [lI\d] or followed by [lI\d]?

Upvotes: 2

Views: 270

Answers (2)

shibumi

Reputation: 378

Do you expect llla to be converted to 111a? Your regex performs that conversion too. The problem you are trying to solve is context-free in nature: a digit can be embedded next to, or inside, a run of [Il] characters, and only then should that run be converted to 1s. I would write a loop if I were you. Correct me if I missed something.
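
If the intent really is to convert only runs that contain at least one genuine digit (an assumption; the question as written counts a neighbouring l or I as enough), one possible sketch avoids an explicit loop by matching each whole run and rewriting it in the replacement. The helper name fix_ocr_digits is hypothetical, not something from the question.

```perl
use strict;
use warnings;

# Hypothetical helper (name and behaviour are assumptions, not the asker's code):
# convert l/I to 1 only inside a run of [lI\d] that contains at least one real digit.
sub fix_ocr_digits {
    my ($text) = @_;
    # Match a maximal run of l, I and digits that includes a digit,
    # then translate every l and I in that run to 1.
    $text =~ s{([lI\d]*\d[lI\d]*)}{ (my $run = $1) =~ tr/lI/11/; $run }ge;
    return $text;
}

for my $sample ('1I3', '1 I 3', 'llla', 'l5') {
    printf "%-6s -> %s\n", $sample, fix_ocr_digits($sample);
}
```

With this variant 1I3 becomes 113 and l5 becomes 15, while llla and 1 I 3 are left untouched, because neither contains an l or I in the same run as a digit.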

Upvotes: 5

Tim Pietzcker

Reputation: 336158

You can use the alternation metacharacter |:

$var =~ s/(?<=[lI\d])[lI]|[lI](?=[lI\d])/1/g;

Poor Kim Jong 11, though.
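
To see what the one-step substitution does in practice, here is a small sketch that wraps it in a sub and runs it over a few sample strings. The wrapper name fix_ocr_ones and the sample inputs are made up for illustration; only the regex itself comes from the answer above.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the one-step substitution, for demonstration only.
sub fix_ocr_ones {
    my ($text) = @_;
    # An l or I becomes 1 if it is preceded OR followed by l, I or a digit.
    $text =~ s/(?<=[lI\d])[lI]|[lI](?=[lI\d])/1/g;
    return $text;
}

for my $sample ('1I3', '1 I 3', 'llla', 'Kim Jong Il') {
    printf "%-12s -> %s\n", $sample, fix_ocr_ones($sample);
}
```

This turns 1I3 into 113 and leaves 1 I 3 alone, as requested. It also turns llla into 111a and Il into 11, since the lookarounds accept a neighbouring l or I as well as a digit.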

Upvotes: 6
