Reputation: 116

Remove all characters up to and including the first n occurrences of a specific character in each line in shell

Say I have a txt file with contents as follows:

   abcd|123|kds|Name|Place|Phone
   ldkdsd|323|jkds|Name1|Place1|Phone1

I want to remove everything on each line up to and including the third occurrence of the | character. I want my output to be:

   Name|Place|Phone
   Name1|Place1|Phone1

Could anyone help me figure this out? How can I achieve this using sed?

Upvotes: 0

Answers (5)

Reputation: 41460

You can print the last 3 fields, which gives the desired output when every line has exactly 6 fields:

awk '{print $(NF-2),$(NF-1),$NF}' FS=\| OFS=\| file
Name|Place|Phone
Name1|Place1|Phone1
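
If the number of fields varies from line to line, printing the last three fields no longer matches "drop the first three". A minimal sketch of an alternative, assuming a hypothetical helper named drop_fields (not part of the answer above) that loops from field n+1 to the end:

```shell
# Hypothetical helper: drop the first n '|'-separated fields of each line on stdin.
drop_fields() {
  awk -F'|' -v OFS='|' -v n="$1" '{
    out = ""
    # Join fields n+1..NF, putting the separator only between fields.
    for (i = n + 1; i <= NF; i++)
      out = out (i > n + 1 ? OFS : "") $i
    print out
  }'
}

# Sample input: one line with 6 fields, one with 7.
result=$(printf '%s\n' 'abcd|123|kds|Name|Place|Phone' 'a|b|c|d|e|f|g' | drop_fields 3)
printf '%s\n' "$result"
```

With the sample input this should print Name|Place|Phone followed by d|e|f|g, regardless of how many fields each line has.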

Upvotes: 0

Reputation: 58483

This might work for you (GNU sed):

sed 's/^\([^|]*|\)\{3\}//' file

or more readably:

sed -r 's/^([^|]*\|){3}//' file
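
As a quick check that the two spellings are equivalent, the sketch below runs both against an inline sample line instead of a file. It uses -E rather than -r on the assumption that your sed accepts it (GNU and BSD sed both do). It also shows that a line with fewer than three | characters is left untouched, because the pattern does not match.

```shell
input='abcd|123|kds|Name|Place|Phone'

# Basic regular expression: | is literal, grouping and intervals are backslashed.
bre=$(printf '%s\n' "$input" | sed 's/^\([^|]*|\)\{3\}//')

# Extended regular expression: | must be escaped, grouping and intervals are bare.
ere=$(printf '%s\n' "$input" | sed -E 's/^([^|]*\|){3}//')

# A line with only one | does not match, so sed prints it unchanged.
short=$(printf '%s\n' 'a|b' | sed -E 's/^([^|]*\|){3}//')

printf '%s\n' "$bre" "$ere" "$short"
```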

Upvotes: 1

Reputation: 10039

sed 's/\(\([^|]*|\)\{3\}\)//' YourFile

This is a POSIX version. In a POSIX basic regular expression an unescaped | is a literal character; GNU sed treats only \| as alternation ("OR"), so the command also works there without --posix.

Explanation

Delete the first 3 occurrences (\{3\}) of [ any run of characters other than |, followed by a | (\([^|]*|\)) ]. The empty replacement (//) substitutes nothing for the match.
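
To see what the interval count does, the sketch below runs the same expression with the count set to 1, 2 and 3 on an inline sample line. The loop and the variable names are only for illustration.

```shell
line='abcd|123|kds|Name|Place|Phone'

# Each iteration removes one more "text followed by |" group from the start.
steps=$(
  for n in 1 2 3; do
    printf '%s\n' "$line" | sed "s/\(\([^|]*|\)\{$n\}\)//"
  done
)
printf '%s\n' "$steps"
```

The three output lines should be 123|kds|Name|Place|Phone, then kds|Name|Place|Phone, then Name|Place|Phone.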

Upvotes: 0

Reputation: 174786

You could try the sed command below:

$ sed -r 's/^(\s*)[^|]*\|[^|]*\|[^|]*\|/\1/g' file
   Name|Place|Phone
   Name1|Place1|Phone1

^(\s*) captures any whitespace at the start of the line.
[^|]*\|[^|]*\|[^|]*\| matches up to and including the third |, so abcd|123|kds| will be matched.
The whole match is replaced by the contents of the first capture group, i.e. the leading whitespace.
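
The effect of the capture group is easiest to see side by side. This sketch assumes a sed that accepts -E and uses the POSIX class [[:space:]] in place of the GNU-specific \s; the brackets in the output only make the leading spaces visible.

```shell
line='   abcd|123|kds|Name|Place|Phone'

# Leading whitespace is captured and put back via \1.
kept=$(printf '%s\n' "$line" | sed -E 's/^([[:space:]]*)([^|]*\|){3}/\1/')

# Without the capture, [^|]* swallows the leading whitespace too.
dropped=$(printf '%s\n' "$line" | sed -E 's/^([^|]*\|){3}//')

printf '[%s]\n[%s]\n' "$kept" "$dropped"
```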

Upvotes: 1

Reputation: 195199

This would be a typical task for cut:

cut -d'|' -f4- file

output:

Name|Place|Phone
Name1|Place1|Phone1

The -f4- means you want everything from the fourth field to the end. Adjust the 4 if you have a different requirement.
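
One behaviour worth knowing: by default cut passes lines that contain no delimiter through unchanged, and the -s option suppresses them. A small sketch with an inline sample (the second line is made up for illustration):

```shell
sample=$(printf '%s\n' 'abcd|123|kds|Name|Place|Phone' 'no delimiter here')

# Default: the line without any | is printed as-is.
default_out=$(printf '%s\n' "$sample" | cut -d'|' -f4-)

# With -s: lines without the delimiter are dropped.
strict_out=$(printf '%s\n' "$sample" | cut -s -d'|' -f4-)

printf '%s\n' "$default_out" '---' "$strict_out"
```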

Upvotes: 4