Reputation: 4729
I have an input stream of many lines which look like this:
path/to/file: example: 'extract_me.proto'
path/to/other-file: example: 'me_too.proto'
path/to/something/else: example: 'and_me_2.proto'
...
I'd like to extract just the *.proto filenames from these lines, and I have tried:
[INPUT] | sed 's/^.*\([a-zA-Z0-9_]+\.proto\).*$/\1/'
I know that part of my problem is that .* is greedy and I'm going to get things like e.proto, o.proto and 2.proto, but I can't even get that far: it just outputs the same lines as the input. Any help would be greatly appreciated.
Upvotes: 0
Views: 83
Reputation: 246807
Since you tagged your question with linux, I'll assume you have GNU grep. Pick one of:
grep -oP '\w+\.proto' file
grep -oE "[^']+\\.proto" file
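If -P is not available in your grep build, the same idea can be sketched with -E and the character class from the question. The variable names input and result are only for illustration, and the sample lines are copied from the question:

```shell
# Sample input copied from the question.
input="path/to/file: example: 'extract_me.proto'
path/to/other-file: example: 'me_too.proto'
path/to/something/else: example: 'and_me_2.proto'"

# -o prints only the matched text; -E enables extended regex,
# so an unescaped + means "one or more of the preceding item".
result=$(printf '%s\n' "$input" | grep -oE '[a-zA-Z0-9_]+\.proto')

printf '%s\n' "$result"
```

This prints one filename per input line. Note that the class would have to be widened if your filenames can contain other characters such as hyphens.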
Upvotes: 2
Reputation: 14949
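Use this sed command:
sed "s/^.*'\([a-zA-Z0-9_]\+\.proto\).*$/\1/"
+ is an extended-regex operator. sed uses basic regex by default, where a plain + is just a literal character, so you need to escape it as \+ (supported by GNU sed) to give it its special meaning: the preceding item will be matched one or more times.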
Another way:
sed "s/^.*'\([^']\+\.proto\)'.*$/\1/"
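To see why the original attempt echoed the input back, here is a small comparison on one of the sample lines. It assumes GNU sed (for \+); the variable names are made up for the example:

```shell
line="path/to/file: example: 'extract_me.proto'"

# Basic regex: an unescaped + is a literal plus sign. The line contains
# no "+", so the pattern never matches and sed prints the line unchanged.
unescaped=$(printf '%s\n' "$line" | sed 's/^.*\([a-zA-Z0-9_]+\.proto\).*$/\1/')

# Escaped as \+ it means "one or more", and anchoring on the opening
# quote makes the group start at the beginning of the filename.
escaped=$(printf '%s\n' "$line" | sed "s/^.*'\([a-zA-Z0-9_]\+\.proto\).*$/\1/")

printf '%s\n' "$unescaped" "$escaped"
```

The first command leaves the line as it was, which is the behaviour described in the question; the second one reduces it to the filename.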
Upvotes: 1
Reputation: 1369
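I find it helpful to use extended regex for this purpose (-r), in which case you need not escape your parentheses or the +.
sed -r 's/^.*[^a-zA-Z0-9_]([a-zA-Z0-9_]+\.proto).*$/\1/'
The addition of [^a-zA-Z0-9_] does not actually make .* non-greedy. It requires a character that cannot be part of the filename immediately before the captured group, so the greedy .* can no longer swallow the beginning of the filename.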
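A quick way to convince yourself of the effect is to run both variants on the same line. This is only a sketch: it uses -E, which GNU sed treats the same as -r, and the variable names are hypothetical:

```shell
line="path/to/file: example: 'extract_me.proto'"

# No boundary: the greedy .* takes everything it can and leaves the
# group a single character in front of ".proto".
greedy=$(printf '%s\n' "$line" | sed -E 's/^.*([a-zA-Z0-9_]+\.proto).*$/\1/')

# With the boundary class, the group must begin right after a character
# that cannot appear in the filename (here the opening quote).
bounded=$(printf '%s\n' "$line" | sed -E 's/^.*[^a-zA-Z0-9_]([a-zA-Z0-9_]+\.proto).*$/\1/')

printf '%s\n' "$greedy" "$bounded"
```

The first result is the truncated e.proto that the question anticipated, the second is the full extract_me.proto.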
Upvotes: 2
Reputation: 140168
One way to do it:
sed 's/^.*[^a-zA-Z0-9_]\([a-zA-Z0-9_]\+\.proto\).*$/\1/'
Note that the + character has to be escaped as \+.
Another way: use the single quotes as delimiters; after all, that is what they are there for:
sed "s/^.*'\([a-zA-Z0-9_]\+\.proto\)'.*\$/\1/"
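If the stream might also contain lines without a quoted .proto name, sed will pass those through untouched. A possible variation, assuming GNU sed and a made-up non-matching line in the sample input, is to combine -n with the p flag so that only successful substitutions are printed:

```shell
# Two lines from the question plus one invented line that should be dropped.
input="path/to/file: example: 'extract_me.proto'
a line without a quoted proto name
path/to/other-file: example: 'me_too.proto'"

# -n turns off automatic printing; the trailing p prints a line only
# when the substitution on it succeeded.
names=$(printf '%s\n' "$input" | sed -n "s/^.*'\([a-zA-Z0-9_]\+\.proto\)'.*\$/\1/p")

printf '%s\n' "$names"
```

Only the two filenames come out; the middle line is silently skipped.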
Upvotes: 1