Reputation: 47
I have a file whose lines have this kind of format:
INFO|NOT-CLONED|/folder/another-folder/another-folder|last-folder-name|
What I need is to read the file and get this output:
INFO|NOT-CLONED|last-folder-name
I have this so far:
cat clone_them.log | grep 'INFO|NOT-CLONED' | sed -E 's/INFO\|NOT-CLONED\|(.*)/g'
But it is not working as intended.
NOTE: the last "another-folder" and "last-folder-name" are the same.
Upvotes: 1
Views: 253
Reputation: 113924
If you want a sed solution:
$ sed -En 's/(INFO\|NOT-CLONED\|).*\|([^|]*)\|$/\1\2/p' file
INFO|NOT-CLONED|last-folder-name
How it works:
-E: Use extended regex.
-n: Don't print unless we explicitly tell it to.
s/(INFO\|NOT-CLONED\|).*\|([^|]*)\|$/\1\2/p: Look for lines that include INFO|NOT-CLONED| (saved in group 1), followed by anything (.*), followed by |, followed by any characters that are not | ([^|]*, saved in group 2), followed by | at the end of the line. The replacement text is group 1 followed by group 2.
The p option tells sed to print the line if the match succeeds. Since the substitution only succeeds for lines that contain INFO|NOT-CLONED|, this eliminates the need for an extra grep process.
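To see the filtering effect, here is a minimal sketch that runs the same command on a small made-up input. The temporary file and its second and third lines are assumptions for illustration; only the first line follows the format from the question.

```shell
# Hypothetical sample input: one matching line and two that should be skipped.
tmp=$(mktemp)
printf '%s\n' \
  'INFO|NOT-CLONED|/folder/another-folder/last-folder-name|last-folder-name|' \
  'INFO|CLONED|/folder/other|other|' \
  'WARN|something unrelated' > "$tmp"

# With -n and the p flag, only lines where the substitution succeeds are printed.
result=$(sed -En 's/(INFO\|NOT-CLONED\|).*\|([^|]*)\|$/\1\2/p' "$tmp")
echo "$result"
rm -f "$tmp"
```

Only the first line contains INFO|NOT-CLONED|, so it should be the only one printed, without any separate grep step.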
To get just the last-folder-name without the INFO|NOT-CLONED, we need only remove \1 from the replacement text:
$ sed -En 's/(INFO\|NOT-CLONED\|).*\|([^|]*)\|$/\2/p' file
last-folder-name
Since we no longer need the first capture group, we could simplify and remove the now unneeded parens so that the only capture group is the last folder name:
$ sed -En 's/INFO\|NOT-CLONED\|.*\|([^|]*)\|$/\1/p' file
last-folder-name
Upvotes: 1
Reputation: 18391
It's simpler in awk, as the input file is properly delimited by the | symbol. You need to tell awk that the input fields are separated by | and that the output should also remain separated by |, using FS and OFS respectively.
awk 'BEGIN{FS=OFS="|"}/INFO\|NOT-CLONED/{print $1,$2,$(NF-1)}' clone_them.log
INFO|NOT-CLONED|last-folder-name
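The $(NF-1) in that command matters because each line ends with a trailing |, which makes awk count an empty final field. A small sketch, using the sample line from the question (the variable names line and fields are just for illustration), that prints the field count and the last two fields:

```shell
line='INFO|NOT-CLONED|/folder/another-folder/another-folder|last-folder-name|'

# The trailing | creates an empty final field, so the folder name sits at NF-1.
fields=$(printf '%s\n' "$line" |
  awk -F'|' '{ printf "NF=%d last=[%s] wanted=[%s]\n", NF, $NF, $(NF-1) }')
echo "$fields"
```

If the log lines did not end with |, the folder name would be in $NF instead.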
Upvotes: 1