harperville

Reputation: 7669

Remove slash from end of chained grep and awk output, echo to console

Here is my input file:

/home/sites/default/files/Maple%20board02%2019%2013%20.pdf
/home/sites/default/files/paintgrade/side-view.jpg
/home/sites/default/files/paintgrade/steps_2.jpg
/home/sites/default/files/Front%20sill-photo1.gif
/home/sites/default/files/Rear%20steps%20Feb.%209.2011.pdf

Here is my grep/awk statement:

grep /files/ 404s.txt | awk -F '/' '{print $6"/"$7}'

The output from that statement is:

Maple%20board02%2019%2013%20.pdf/
paintgrade/side-view.jpg
paintgrade/steps_2.jpg
Front%20sill-photo1.gif/
Rear%20steps%20Feb.%209.2011.pdf/

(See the trailing slash? Sometimes it's there, sometimes not. When there is no subdirectory, my awk statement prints a "/" nonetheless.)

I saw on another post how to remove a trailing slash, but I am not sure how to apply it here. The post said that ${1%/} will give you a string without a trailing slash. The post is here. The highest voted answer is: target=${1%/}
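For reference, here is a minimal sketch of what that expansion does on its own, assuming a POSIX shell. `${var%/}` is shell parameter expansion, so it works on shell variables, not directly on a pipe. The function name `strip_trailing_slashes` is hypothetical and only used for illustration.

```shell
# strip_trailing_slashes is a hypothetical helper, not part of the original post.
# It reads lines on stdin and removes one trailing "/" from each, if present.
strip_trailing_slashes() {
    while IFS= read -r line; do
        # ${line%/} expands to $line with a single trailing "/" removed.
        printf '%s\n' "${line%/}"
    done
}

# Two sample lines from the awk output above.
printf '%s\n' 'Front%20sill-photo1.gif/' 'paintgrade/side-view.jpg' |
    strip_trailing_slashes
```

In principle the same function could be appended to the original pipeline as a final `| strip_trailing_slashes` stage.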

I'd like to add something to my grep/awk statement to have the output be:

Maple%20board02%2019%2013%20.pdf
paintgrade/side-view.jpg
paintgrade/steps_2.jpg
Front%20sill-photo1.gif
Rear%20steps%20Feb.%209.2011.pdf

What can I add to my original statement to have the output be as above? Maybe cut could help or my awk could be adjusted not to print a trailing "/"?

Upvotes: 0

Views: 2423

Answers (2)

John1024

Reputation: 113934

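The problem is that when there is no $7 (the file sits directly under files/), $7 is empty, so print $6"/"$7 still emits the slash. Here is a quick solution that appends "/"$7 only when a seventh field exists: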

$ grep /files/ 404s.txt | awk -F/ '{s=$6} NF==7 {s=s"/"$7} {print s}'
Maple%20board02%2019%2013%20.pdf
paintgrade/side-view.jpg
paintgrade/steps_2.jpg
Front%20sill-photo1.gif
Rear%20steps%20Feb.%209.2011.pdf

More generally, if paths may contain directories nested deeper than paintgrade, use:

$ grep /files/ 404s.txt | awk -F/ '{s=$6; for (i=7;i<=NF;i++) {s=s"/"$i}} {print s}'
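As a quick check of the loop version, the sketch below feeds it two lines on stdin instead of reading 404s.txt. The second path, with the extra `detail` directory, is made up for illustration, and `join_from_sixth` is a hypothetical wrapper name.

```shell
# join_from_sixth is a hypothetical wrapper around the awk program shown above.
join_from_sixth() {
    # Start with field 6, then append every remaining field with a "/" in between,
    # so no slash is emitted when there is nothing after field 6.
    awk -F/ '{s=$6; for (i=7;i<=NF;i++) {s=s"/"$i}} {print s}'
}

# The first line is from the question; the second is an invented deeper path.
printf '%s\n' \
    '/home/sites/default/files/Front%20sill-photo1.gif' \
    '/home/sites/default/files/paintgrade/detail/close-up.jpg' |
    join_from_sixth
# prints:
#   Front%20sill-photo1.gif
#   paintgrade/detail/close-up.jpg
```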

Further, a separate grep process is unnecessary, since awk can skip the non-matching lines itself:

awk -F/ '!/\/files\//{next} {s=$6; for (i=7;i<=NF;i++) {s=s"/"$i}} {print s}' 404s.txt

A More General Yet Simpler Solution: use sed

This replaces both the grep and awk commands:

sed -n 's|.*/files/||p' 404s.txt
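How this works: -n suppresses sed's default printing, and the p flag prints only the lines where the substitution succeeded, which is what takes over grep's job. The | characters serve as the s command's delimiter so the slashes in the path need no escaping. Because .* is greedy, everything up to and including the last /files/ on the line is removed. A small sketch on stdin follows; `strip_prefix` is a hypothetical wrapper name and the third input line is invented to show the filtering.

```shell
# strip_prefix is a hypothetical wrapper around the sed command shown above.
strip_prefix() {
    sed -n 's|.*/files/||p'
}

# The first two lines are from the question; the third is a made-up
# non-matching line that should be dropped.
printf '%s\n' \
    '/home/sites/default/files/Maple%20board02%2019%2013%20.pdf' \
    '/home/sites/default/files/paintgrade/side-view.jpg' \
    '/home/sites/default/other/skip-me.txt' |
    strip_prefix
# prints:
#   Maple%20board02%2019%2013%20.pdf
#   paintgrade/side-view.jpg
```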

Upvotes: 3

Etan Reisner

Reputation: 81022

To remove just that prefix, something like this will work (substitute out the string, then use a truthy pattern to trigger the default print action):

awk '{sub("^/home/sites/default/files/", "")}7'

If you need to remove X fields from the start of the line, cut generally makes that simpler than awk:

cut -d/ -f6-
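A brief sketch of the cut variant on stdin, assuming every input line starts with /home/sites/default/files/ so that the wanted part begins at field 6 (field 1 is the empty string before the leading slash). Unlike the grep-based pipeline, cut does not filter lines, so it would still need grep in front of it if the file contains other paths. `tail_fields` is a hypothetical wrapper name.

```shell
# tail_fields is a hypothetical wrapper around the cut command shown above.
tail_fields() {
    # -d/ splits on "/", -f6- keeps field 6 through the last field,
    # rejoined with "/" and with no trailing delimiter added.
    cut -d/ -f6-
}

# Sample lines from the question.
printf '%s\n' \
    '/home/sites/default/files/Front%20sill-photo1.gif' \
    '/home/sites/default/files/paintgrade/steps_2.jpg' |
    tail_fields
# prints:
#   Front%20sill-photo1.gif
#   paintgrade/steps_2.jpg
```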

Upvotes: 2
