Reputation: 49
I have text files with lines like this:
U_town/u_LN0_pk_LN3_bnb_LN155/DD0 U_DESIGN/u_LNxx_pk_LN99_bnb_LN151_LN11_/DD5
U_master/u_LN999_pk_LN767888_bnb_LN9772/Dnn111 u_LN999_pk_LN767888_bnb_LN9772_LN9999_LN11/DD
...
I am trying to delete every character except / while keeping any word that matches the pattern _LN\d+_, using a Perl one-liner.
So the edited version would look like:
/_LN0__LN3__LN155/ /_LN99__LN151_LN11_/
/_LN999__LN767888_/ _LN999__LN767888__LN9772_LN9999_/
I tried the following, which returned empty lines:
perl -pe 's/(?! _LN\d+_)[^\/].+//g' file
The following returned only '/':
perl -pe 's/(?! _LN\d+_)\w+//g' file
Is this perhaps not possible with a one-liner, and should I instead write code that parses character by character and checks whether a matching word _LN\d+_ or a / character is there?
Upvotes: 1
Views: 128
Reputation: 66964
To merely remove everything other than these patterns, you can simply match the patterns and join the matches back together:
perl -wnE'say join "", m{/ | _LN[0-9]+_ }gx' file
or perhaps, depending on details of the requirements
perl -wnE'say join "", m{/ | _LN[0-9]+(?=_) }gx' file
(See explanation in the last bullet below.)
Prints, for the first line (of the two) of the shown sample input
/_LN0__LN3_//_LN99__LN151_ ...
or, in the second version
/_LN0_LN3//_LN99_LN151_LN11/ ...
The _LN155 is not there because it is not followed by _. See below.
Questions:
Why are there spaces after some / in the "edited version" shown in the question?
The pattern to keep is shown as _LN\d+_ but _LN155 is shown to be kept even though it is not followed by a _ in the input (but by a /) ...?
Are underscores optional by any chance? If so, append ? to them in the pattern
perl -wnE'say join "", m{/ | _?LN[0-9]+_? }gx' file
with output
/_LN0__LN3__LN155//_LN99__LN151_LN11_/
(It's been clarified that the extra space in the shown desired output is a mistake.)
If the underscores "overlap," like in _LN155_LN11_, they won't both be matched by the _LN\d+_ pattern, since the first match "takes" the shared underscore.
But if such overlapping instances need to be kept, then replace the trailing _ with a lookahead for it, which doesn't consume it, so it's still there for the leading _ of the next pattern:
perl -wnE'say join "", m{/ | _LN[0-9]+(?=_) }gx' file
(If the underscores are optional and you use the _?LN\d+_? pattern, then this isn't needed.)
Upvotes: 1