Reputation: 141
I have the following Linux command:
grep -o file.txt "\"uri\":\".{1,}\""
The text I have is the following:
"some characters here","uri":"some_URI*Here.^%$#!", "again a set of irrelevant characters"
The output I want is:
"uri":"some_URI*Here.^%$#!"
Why don't I get the correct output? Is it because the " required by grep gets mixed up with the " in my text? How can I fix it?
Upvotes: 2
Views: 11371
Reputation: 17041
Either
grep -oE "\"uri\":\"[^\"]{1,}\"" file.txt
or
grep -o "\"uri\":\"[^\"]\\{1,\\}\"" file.txt
will leave out the trailing irrelevant characters.
Explanation:
- Your grep command listed file.txt before the pattern, but grep requires the pattern first, then the files.
- In the pattern, you need [^\"] to match the characters between the quotes. That is because . will match a " itself, so .{1,} matches right through the intervening double quotes ("greedy matching").

The two options are:
- With -E, grep uses extended regular expressions, in which {} are automatically range operators.
- Without -E, you need to use backslashes to mark the {} as range operators instead of literal characters. \{1,\} is the regex syntax. Since you are in a shell double-quoted string, you need to escape the backslashes, whence \\{1,\\}.

To test shell quoting, an easy way is to use echo. For example, in bash:
$ echo grep -o "\"uri\":\"[^\"]\\{1,\\}\"" file.txt
grep -o "uri":"[^"]\{1,\}" file.txt
That shows, for example, that each \\ in the pattern has been collapsed to a single \ before grep sees it.
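The greedy-matching point above can be checked directly. The following is a minimal sketch, assuming GNU grep and mktemp are available; the scratch file and the variable names (tmp, greedy, exact) are only for illustration. It uses single quotes around the pattern so that no backslash escaping is needed.

```shell
# Recreate the sample line from the question in a scratch file.
tmp=$(mktemp)
printf '%s\n' '"some characters here","uri":"some_URI*Here.^%$#!", "again a set of irrelevant characters"' > "$tmp"

# Greedy: .{1,} also matches double quotes, so the match runs on
# to the last double quote on the line.
greedy=$(grep -oE '"uri":".{1,}"' "$tmp")

# Negated class: [^"]{1,} cannot cross a double quote, so the match
# stops at the quote that closes the URI value.
exact=$(grep -oE '"uri":"[^"]{1,}"' "$tmp")

printf 'greedy: %s\n' "$greedy"
printf 'exact:  %s\n' "$exact"
rm -f "$tmp"
```

The "greedy" line should include the trailing irrelevant characters, while the "exact" line should contain only the uri key and its value.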
Upvotes: 1
Reputation: 7627
You could use the following regex:
grep -oE '"uri":"[^"]+"' inputFile
The original poster's command is almost correct but has some flaws. Below are that version and a corrected one:
grep -o inputFile "\"uri\":\".{1,}\"" # wrong
grep -oE '"uri":"[^"]{1,}"' inputFile # correct
The problems with the first use of grep are:
- it is missing the -E flag, which {1,} needs in order to work
- it should use the [^"] character class instead of .
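To see why the -E flag matters for {1,}, here is a small sketch comparing the three spellings on the sample text from the question. It assumes GNU grep; the variable names (line, bre_bare, bre_esc, ere) are made up for the example.

```shell
line='"some characters here","uri":"some_URI*Here.^%$#!", "again a set of irrelevant characters"'

# Basic regex, bare braces: { and } are literal characters here,
# so nothing in the sample matches (|| true hides grep's exit status 1).
bre_bare=$(printf '%s\n' "$line" | grep -o '"uri":"[^"]{1,}"' || true)

# Basic regex, backslashed braces: \{1,\} means "one or more".
bre_esc=$(printf '%s\n' "$line" | grep -o '"uri":"[^"]\{1,\}"')

# Extended regex (-E): bare braces mean "one or more".
ere=$(printf '%s\n' "$line" | grep -oE '"uri":"[^"]{1,}"')

printf 'BRE, bare braces:        [%s]\n' "$bre_bare"
printf 'BRE, backslashed braces: [%s]\n' "$bre_esc"
printf 'ERE, bare braces:        [%s]\n' "$ere"
```

The first result should be empty and the other two identical. Single-quoting the pattern, as in the corrected command above, keeps the double quotes and backslashes away from the shell entirely.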
Upvotes: 3