Wendell Blatt

Reputation: 177

How to delete everything before a variable number with Perl/Regex

I'm cleaning a file with Perl and I have one line that is a bit tough to work with.

It looks something like:

^L#$%@@$^%^3456 [rest of string]

but I need to get rid of everything before the 3456.

The issue is that the 3456 changes every single time, so I need a sed-style substitution that is not specific to one number. I should also add that the stuff before the 3456 will never contain digits.

Now, s/^.*$someString/$someString/ works when I'm working with strings, but the same line doesn't work when it's not a string.

anyway, please help!

Upvotes: 1

Views: 1699

Answers (3)

ikegami

Reputation: 385565

I need to get rid of everything before the 3456

(?:(?!STRING).)* is to STRING as [^CHAR]* is to CHAR, so

s/^(?:(?!3456).)*//s;

It can also be done using the non-greedy modifier (.*?), but I dislike using it.

s/^.*?3456/3456/s;
s/^.*?(3456)/$1/s;  # Without duplication.
s/^.*?(?=3456)//s;  # Without the performance penalty of captures.
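
A minimal sketch of how the patterns above might be applied when the number is held in a variable instead of being typed into the pattern. The names $number and $line are assumptions for illustration, and the sample line is modeled on the one in the question. \Q...\E is added only so that the interpolated value is matched literally.

```perl
use strict;
use warnings;

# Hypothetical inputs: $number stands in for the value that changes per line.
my $number = 3456;
my $line   = '^L#$%@@$^%^3456 [rest of string]';

# Variant 1: consume characters only while $number does not start at that position.
(my $tempered = $line) =~ s/^(?:(?!\Q$number\E).)*//s;

# Variant 2: non-greedy match up to a lookahead for $number.
(my $lazy = $line) =~ s/^.*?(?=\Q$number\E)//s;

print "$tempered\n";   # 3456 [rest of string]
print "$lazy\n";       # 3456 [rest of string]
```

Both variants leave the number itself in place because neither the negative lookahead nor the positive lookahead consumes it.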

Upvotes: 0

mpapec

Reputation: 50637

This will remove all leading non-digit characters from the beginning of the line:

s/^ \D+ //x;
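
As a rough illustration, here is that substitution run against a line modeled on the one in the question, plus a line that already starts with a digit. The sample lines and the @lines and @cleaned names are assumptions, not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical sample data: one noisy line, one line that needs no cleaning.
my @lines = (
    '^L#$%@@$^%^3456 [rest of string]',
    '789 already clean',
);

my @cleaned;
for my $line (@lines) {
    # /x lets the pattern contain whitespace for readability;
    # \D+ needs at least one non-digit, so clean lines are left untouched.
    (my $copy = $line) =~ s/^ \D+ //x;
    push @cleaned, $copy;
}

print "$_\n" for @cleaned;
# 3456 [rest of string]
# 789 already clean
```

This relies on the question's statement that nothing before the number is a digit; a digit inside the junk prefix would stop the match early.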

Upvotes: 1

amon

Reputation: 57590

You probably want a regular expression with a lookahead, plus non-greedy matching.

A lookahead is a pattern that would match at the current position, but doesn't consume characters:

my $str = "abc";
$str =~ s/a(?=b)//; # $str eq "bc"

Non-greedy matching modifies the * or + operator by appending a ?. It will now match as few characters as possible.

$str = "abab";
$str =~ s/.*(?=b)//; # $str eq "b"
$str = "abab";
$str =~ s/.*?(?=b)//; # $str eq "bab"

To interpolate a string that should never be treated as a pattern, protect it with \Q...\E:

my $re = "^foo.?";   # a literal string, not a pattern
$str = "abc^foo.?baz";
$str =~ s/^.*?(?=\Q$re\E)//; # $str eq "^foo.?baz" (the lookahead does not consume the match)
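
To see why the quoting matters, the following sketch compares the same substitution with and without \Q...\E. The variable names ($literal, $input, and so on) are made up for this example; quotemeta is Perl's built-in function form of \Q...\E.

```perl
use strict;
use warnings;

my $literal = "^foo.?";          # text to be found verbatim
my $input   = "abc^foo.?baz";

# quotemeta() backslash-escapes every non-word character.
my $escaped = quotemeta($literal);

# Without quoting, ^ . and ? are regex metacharacters, so nothing matches here.
(my $unquoted = $input) =~ s/^.*?(?=$literal)//;

# With \Q...\E the text is matched character for character.
(my $quoted = $input) =~ s/^.*?(?=\Q$literal\E)//;

print "$escaped\n";    # \^foo\.\?
print "$unquoted\n";   # abc^foo.?baz
print "$quoted\n";     # ^foo.?baz
```

In the unquoted case the ^ inside the lookahead anchors to the start of the string, where "foo" is not present, so the substitution leaves the input unchanged.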

Upvotes: 0
