Reputation: 1068
For example, I have test.txt with the following line:
L1~00~00~00~00~00~Test~122113~00~L2~This~Is~A~Sample~Data~L1~00~00~00~00~00~Test1~123456~00
I want to extract "Test" and "Test1", each of which follows L1~00~00~00~00~00~, and print them in the following format:
Test, Test1
I already have this line in my bash script:
grep -oP 'L1(?:.[\w\s]*){5}.(\K[\w\s]*)' < test.txt
But it returns a different format:
Test
Test1
How can I achieve this by adding sed to my script? I'm still a newbie. I hope somebody can help me. Thanks.
Upvotes: 2
Views: 77
Reputation: 204731
With GNU awk for multi-char RS and RT:
$ awk -v RS='L1~00~00~00~00~00~' -F~ 'NF{ORS=(RT?", ":"\n"); print $1}' file
Test, Test1
The above splits the input into records made of whatever lies between the L1~00~00~00~00~00~ strings, splits each record into fields on ~, and prints the first field of each record (the text between each L1~00~00~00~00~00~ and the next ~), followed by ", " if it is not the last record and by \n if it is. The NF test skips the empty record that precedes the first L1~00~00~00~00~00~.
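If GNU awk is not available, the same idea can be sketched with POSIX awk by searching for the marker with index() instead of relying on a multi-character RS. This is only a minimal sketch: the function name extract_after_marker is hypothetical, and it assumes the marker is the literal string from the question.

```shell
# Hypothetical helper (not from the answer above): prints, for each input
# line, every value that directly follows the marker, joined by ", ".
extract_after_marker() {
  awk -v marker='L1~00~00~00~00~00~' '
  {
    out = ""
    rest = $0
    # Find the next marker, then keep the text up to the following "~".
    while ((i = index(rest, marker)) > 0) {
      rest = substr(rest, i + length(marker))
      field = rest
      sub(/~.*/, "", field)
      out = (out == "" ? field : out ", " field)
    }
    # Lines without any marker produce no output.
    if (out != "") print out
  }' "$@"
}
```

It can be called as extract_after_marker test.txt, or fed from a pipe with no file argument.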
Upvotes: 1
Reputation: 4950
If you would rather not use Perl regex, you can stick with sed alone:
sed -rn 's#(L1.)((\w+.){5})(\w+)(.*\1\2)(\w+)(.*)#\4, \6#p' < test.txt
This captures the prefix (L1 plus five fields) once, then uses the back-references \1\2 to find the same prefix again later on the line, so it prints the value after the first occurrence and the value after the last one. A line with only one occurrence prints nothing.
Upvotes: 1
Reputation: 189958
Of course, if you are using Perl regex anyway, you might as well use Perl directly.
perl -nle '@m = m/L1(?:.[\w\s]*){5}.([\w\s]*)/g; print(join(", ", @m)) if @m' test.txt
This collects the matches into @m, then prints them joined by ", " if there are any matches in @m. The -l option is a convenience to supply the trailing newline on the print, and the -n option makes Perl loop over the input lines one at a time, like sed.
Upvotes: 2
Reputation: 786359
You can use:
grep -oP 'L1(?:.[\w\s]*){5}.(\K[\w\s]*)' test.txt | sed 'N;s/\n/, /'
Test, Test1
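Note that N;s/\n/, / joins the lines in pairs, so it fits the example, where grep prints exactly two matches. If there may be more than two, a loop that reads all lines into the pattern space before substituting is one option. The sketch below assumes the whole output fits in memory, and join_lines is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: joins every line of stdin into one line,
# separated by ", ".
join_lines() {
  # :a       defines a label
  # N        appends the next input line to the pattern space
  # $!ba     jumps back to the label unless this is the last line
  # s/\n/, /g then replaces every embedded newline at once
  sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/, /g'
}
```

It would be used in place of the sed command above, for example: grep -oP 'L1(?:.[\w\s]*){5}.(\K[\w\s]*)' test.txt | join_lines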
Upvotes: 1