Reputation: 1196
I am trying to parse a filename from a modified Apache web access log entry that is tab-delimited:
/common/common/img/pictos/klArrowRight.gif /common/common/img/pictos/klArrowRight.gif 03/Dec/2012:00:00:00 127.0.0.1 03/Dec/2012:00:00:00 us 404
I would like it to come out like this:
klArrowRight.gif /common/common/img/pictos/klArrowRight.gif 03/Dec/2012:00:00:00 127.0.0.1 03/Dec/2012:00:00:00 us 404
I have tried something like this in sed:
's:.*/::'
However, it is too greedy, and it eats the rest of my line. I have been looking through posts, but so far no luck. Any hints?
Upvotes: 4
Views: 22118
Reputation: 81
None of the given answers is fully correct when all you want is to extract the filename from an absolute path, so here is a solution for that case. Suppose the variable filename holds the complete path, e.g., filename=/ABC/DEF/GHI. Then
echo "$filename" | awk 'BEGIN{FS="/"}{print $NF}'
will result in the filename GHI.
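When the path is already in a shell variable, as assumed above, POSIX parameter expansion can produce the same result without starting awk. This is only a sketch of an alternative, not part of the answer's awk approach; the variable name base is made up for illustration.

```shell
# Example path taken from the text above.
filename=/ABC/DEF/GHI

# "##*/" removes the longest prefix that ends in a slash,
# leaving only the last path component.
base=${filename##*/}
echo "$base"
```

Note that a path ending in a slash would leave base empty, since everything up to the final slash is removed.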
Upvotes: 8
Reputation: 54392
One way using GNU grep:
grep -oP "[^/]*\t.*" file
Results:
klArrowRight.gif /common/common/img/pictos/klArrowRight.gif 03/Dec/2012:00:00:00 127.0.0.1 03/Dec/2012:00:00:00 us 404
Upvotes: 0
Reputation: 12514
You can do this pretty easily with just sed, as long as you tell it not to be too greedy:
% echo '/img/pictos/klArrowRight.gif 03/Dec/2012' | sed 's,^[^ ]*/,,'
klArrowRight.gif 03/Dec/2012
%
(That is: "starting at the beginning of the line, match the longest possible run of non-space characters followed by a slash, and delete it.")
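One caveat: the question says the log is tab-delimited, and [^ ] also matches tabs, so on real tab-separated input the match could run past the first field. A possible variant, assuming a sed that supports POSIX character classes (GNU and BSD sed both do), excludes all whitespace instead. The variable names line and result are illustrative only.

```shell
# Sample record with real tab separators, built from the question's first fields.
line=$(printf '/common/common/img/pictos/klArrowRight.gif\t/common/common/img/pictos/klArrowRight.gif\t03/Dec/2012:00:00:00')

# [^[:space:]] rejects tabs as well as spaces, so the match
# cannot extend beyond the first field.
result=$(printf '%s\n' "$line" | sed 's,^[^[:space:]]*/,,')
printf '%s\n' "$result"
```

Only the directory part of the first field is removed; the second field keeps its full path and the tabs are left untouched.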
Upvotes: 4
Reputation: 8059
Using a Perl regexp and basename (I don't think you are stuck with sed/awk):
perl -p -e 'use File::Basename;s/([^\s]+\s+)[^\s]+\s+/$1/;print basename($1)'
example:
echo "/common/common/img/pictos/klArrowRight.gif /common/common/img/pictos/klArrowRight.gif 03/Dec/2012:00:00:00 127.0.0.1 03/Dec/2012:00:00:00 us 404" |
perl -p -e 'use File::Basename;s/([^\s]+\s+)[^\s]+\s+/$1/;print basename($1)'
klArrowRight.gif /common/common/img/pictos/klArrowRight.gif 03/Dec/2012:00:00:00 127.0.0.1 03/Dec/2012:00:00:00 us 404
Upvotes: 2
Reputation: 195049
The input/output in your question is not well formatted. Is this what you need?
awk '{gsub(/\/.*\//,"",$1); print}' file
Test:
kent$ echo "/common/common/img/pictos/klArrowRight.gif /common/common/img/pictos/klArrowRight.gif 03/Dec/2012:00:00:00 127.0.0.1 03/Dec/2012:00:00:00 us 404"|awk '{gsub(/\/.*\//,"",$1); print}'
output:
klArrowRight.gif /common/common/img/pictos/klArrowRight.gif 03/Dec/2012:00:00:00 127.0.0.1 03/Dec/2012:00:00:00 us 404
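Because the command above assigns to $1 with the default separators, awk rebuilds the record with single spaces between fields. If the tabs need to survive, one option is to set both FS and OFS to a tab. This is a sketch under the assumption that the input really is tab-delimited, as the question states; the variable names line and trimmed are illustrative.

```shell
# Full sample record from the question, with explicit tab separators.
line=$(printf '/common/common/img/pictos/klArrowRight.gif\t/common/common/img/pictos/klArrowRight.gif\t03/Dec/2012:00:00:00\t127.0.0.1\t03/Dec/2012:00:00:00\tus\t404')

# FS and OFS are both a tab, so fields are split on tabs and the record
# is rebuilt with tabs after $1 is modified.
# sub() strips everything up to the last slash in the first field only.
trimmed=$(printf '%s\n' "$line" | awk 'BEGIN { FS = OFS = "\t" } { sub(/.*\//, "", $1); print }')
printf '%s\n' "$trimmed"
```

Restricting the substitution to $1 sidesteps the greediness problem from the question, since the pattern never sees the slashes in the later fields.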
Upvotes: 2