Reputation: 7669
Here is my input file:
/home/sites/default/files/Maple%20board02%2019%2013%20.pdf
/home/sites/default/files/paintgrade/side-view.jpg
/home/sites/default/files/paintgrade/steps_2.jpg
/home/sites/default/files/Front%20sill-photo1.gif
/home/sites/default/files/Rear%20steps%20Feb.%209.2011.pdf
Here is my grep/awk statement:
grep /files/ 404s.txt | awk -F '/' '{print $6"/"$7}'
The output from that statement is:
Maple%20board02%2019%2013%20.pdf/
paintgrade/side-view.jpg
paintgrade/steps_2.jpg
Front%20sill-photo1.gif/
Rear%20steps%20Feb.%209.2011.pdf/
(See the trailing slash? Sometimes it's there, sometimes not. When there isn't a subdirectory, my awk statement prints a "/" nonetheless.)
I saw on another post how to remove a trailing slash, but I am not sure how to apply it here. The post said that ${1%/} will give you a string without a trailing slash. The post is here. The highest voted answer is: target=${1%/}
I'd like to add something to my grep/awk statement to have the output be:
Maple%20board02%2019%2013%20.pdf
paintgrade/side-view.jpg
paintgrade/steps_2.jpg
Front%20sill-photo1.gif
Rear%20steps%20Feb.%209.2011.pdf
What can I add to my original statement to have the output be as above? Maybe cut could help, or my awk could be adjusted not to print a trailing "/"?
Upvotes: 0
Views: 2423
Reputation: 113934
The problem is that the original awk statement always prints the "/" separator, so if there is no $7, a trailing slash appears. Here is a quick solution that appends $7 only when it exists:
$ grep /files/ 404s.txt | awk -F/ '{s=$6} NF==7 {s=s"/"$7} {print s}'
Maple%20board02%2019%2013%20.pdf
paintgrade/side-view.jpg
paintgrade/steps_2.jpg
Front%20sill-photo1.gif
Rear%20steps%20Feb.%209.2011.pdf
More generally, if some paths may be nested more deeply than paintgrade, then use:
$ grep /files/ 404s.txt | awk -F/ '{s=$6; for (i=7;i<=NF;i++) {s=s"/"$i}} {print s}'
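As a quick check of the loop version above, here is a minimal sketch that feeds it a few sample paths on standard input. The function name strip_prefix_fields and the third, more deeply nested path are assumptions made up for illustration; they do not come from the original input file.

```shell
# strip_prefix_fields is a hypothetical helper name used only in this sketch.
# It keeps field 6 and everything after it, re-joined with "/".
strip_prefix_fields() {
    awk -F/ '{s=$6; for (i=7;i<=NF;i++) {s=s"/"$i}} {print s}'
}

# The last path is an invented example of a deeper directory.
printf '%s\n' \
    '/home/sites/default/files/Front%20sill-photo1.gif' \
    '/home/sites/default/files/paintgrade/steps_2.jpg' \
    '/home/sites/default/files/paintgrade/old/steps_2.jpg' |
    strip_prefix_fields
# Prints:
# Front%20sill-photo1.gif
# paintgrade/steps_2.jpg
# paintgrade/old/steps_2.jpg
```

Because the loop only adds a "/" in front of a field that actually exists, no trailing slash is produced at any depth.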
Further, a separate grep process is unnecessary:
awk -F/ '!/\/files\//{next} {s=$6; for (i=7;i<=NF;i++) {s=s"/"$i}} {print s}' 404s.txt
sed
This replaces both the grep and awk commands:
sed -n 's|.*/files/||p' 404s.txt
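If you would rather keep the original grep/awk statement unchanged, sed can also strip the stray slash afterwards. Note that ${1%/} from the linked post is shell parameter expansion, which works on a shell variable rather than on a stream of lines, so it does not drop directly into a pipeline. The following is only a sketch; the function name drop_trailing_slash is an assumption chosen for illustration.

```shell
# drop_trailing_slash is a hypothetical helper name used only in this sketch.
# It removes one "/" at the end of each input line, if present.
drop_trailing_slash() {
    sed 's|/$||'
}

printf '%s\n' 'Front%20sill-photo1.gif/' 'paintgrade/steps_2.jpg' |
    drop_trailing_slash
# Prints:
# Front%20sill-photo1.gif
# paintgrade/steps_2.jpg
```

With the original command this would be used as: grep /files/ 404s.txt | awk -F '/' '{print $6"/"$7}' | drop_trailing_slash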
Upvotes: 3
Reputation: 81022
To remove just that prefix, something like this will work: substitute out the string, then use a truthy pattern (here the constant 7) to trigger the default print action.
awk '{sub("^/home/sites/default/files/", "")}7'
If you need to remove X fields from the start of the line, then using cut generally makes that simpler than awk.
cut -d/ -f6-
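To see the cut variant on the question's data, here is a small sketch that pipes two of the sample lines through it. The function name strip_with_cut is an assumption for illustration only.

```shell
# strip_with_cut is a hypothetical helper name used only in this sketch.
# With "/" as the delimiter, field 1 is the empty string before the leading
# slash, so -f6- drops /home/sites/default/files/ and keeps the rest.
strip_with_cut() {
    cut -d/ -f6-
}

printf '%s\n' \
    '/home/sites/default/files/Maple%20board02%2019%2013%20.pdf' \
    '/home/sites/default/files/paintgrade/side-view.jpg' |
    strip_with_cut
# Prints:
# Maple%20board02%2019%2013%20.pdf
# paintgrade/side-view.jpg
```

This assumes every input line starts with the same five leading fields. cut does not filter on /files/, so keep the grep in front of it if 404s.txt contains other paths.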
Upvotes: 2