kaileena

Reputation: 141

grep command in linux using " in regex

I have the following Linux command:

grep -o file.txt "\"uri\":\".{1,}\""

The text I have is the following:

"some characters here","uri":"some_URI*Here.^%$#!", "again a set of irrelevant characters"

The output I want is:

"uri":"some_URI*Here.^%$#!"

Why don't I get the correct output? Is it because the " required by grep gets mixed up with the " in my text? How can I fix it?

Upvotes: 2

Views: 11371

Answers (2)

cxw

Reputation: 17041

Either

grep -oE "\"uri\":\"[^\"]{1,}\"" file.txt

or

grep -o "\"uri\":\"[^\"]\\{1,\\}\"" file.txt

will leave out the trailing irrelevant characters.

Explanation:

  • Your grep command was listing file.txt before the pattern, but grep requires pattern first, then files.
  • Instead of ., you need [^\"] to match the characters between the quotes. That is because . will match a " itself, so .{1,} matches right through the intervening double quotes ("greedy matching").
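
To make the difference between . and [^"] concrete, here is a minimal sketch using the sample line from the question. It assumes a grep that supports -o and -E (GNU or BSD grep); the temp file and the variable names greedy and exact are made up for illustration.

```shell
# Write the sample line from the question to a temporary file.
tmp=$(mktemp)
printf '%s\n' '"some characters here","uri":"some_URI*Here.^%$#!", "again a set of irrelevant characters"' > "$tmp"

# Greedy: . also matches ", so the match runs to the last " on the line.
greedy=$(grep -oE '"uri":".{1,}"' "$tmp")

# Negated class: [^"] cannot match ", so the match ends at the first closing quote.
exact=$(grep -oE '"uri":"[^"]{1,}"' "$tmp")

printf 'greedy: %s\n' "$greedy"
printf 'exact:  %s\n' "$exact"
rm -f "$tmp"
```

The greedy line should include the trailing , "again a set of irrelevant characters" part, while the exact line should stop right after the URI value.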

The two options are:

  • with -E, grep uses extended regular expressions, in which {} are automatically range operators.
  • without -E, grep uses basic regular expressions, so you need backslashes to mark the {} as range operators instead of literal characters: \{1,\} is the basic-regex syntax. Since you are in a shell double-quoted string, you need to escape the backslashes, whence \\{1,\\}.
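
As a rough check of the two options, the sketch below runs both forms against a shortened, made-up input line and also shows that unescaped braces are taken literally without -E. It assumes a grep that supports -o (GNU or BSD grep); the variable names are illustrative only.

```shell
# A shortened, hypothetical input line in the same shape as the question's text.
line='"a","uri":"x.y", "b"'

# Extended regex: {1,} is a repetition operator as written.
ere=$(printf '%s\n' "$line" | grep -oE '"uri":"[^"]{1,}"')

# Basic regex: the braces must be backslash-escaped to act as a repetition operator.
bre=$(printf '%s\n' "$line" | grep -o '"uri":"[^"]\{1,\}"')

# Basic regex with unescaped braces: they match the literal text {1,}.
literal=$(printf '%s\n' 'a{1,}' | grep -o 'a{1,}')

printf '%s\n' "$ere" "$bre" "$literal"
```

Single quotes are used here so that the backslashes reach grep unchanged; inside shell double quotes they would need to be doubled, as described above.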

To test shell quoting, an easy way is to use echo. For example, in bash:

$ echo grep -o "\"uri\":\"[^\"]\\{1,\\}\"" file.txt
grep -o "uri":"[^"]\{1,\}" file.txt

That shows, for example, that each \\ in the pattern has been collapsed to a single \, and that the \" escapes have become plain ".

Upvotes: 1

builder-7000

Reputation: 7627

You could use the following regex:

grep -oE '"uri":"[^"]+"' inputFile

The original poster provided a regex that is almost correct but has some flaws. Below are the original version and a corrected one:

grep -o inputFile "\"uri\":\".{1,}\""   # wrong
grep -oE '"uri":"[^"]{1,}"' inputFile   # correct

The problems with the first use of grep are:

  • inputFile should come after the regex, not before
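  • It needs the -E flag for {1,} to work as a repetition operator (without -E it would have to be written \{1,\})
  • It is better to use single quotes on the outside so that the double quotes need not be escaped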
  • Need to use [^"] character class instead of .
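
To illustrate the quoting point, this small sketch compares the double-quoted and single-quoted spellings of the same pattern. The variable names dq and sq are made up for illustration; any POSIX-style shell should behave the same way.

```shell
# Double-quoted: every inner " must be escaped as \".
dq="\"uri\":\"[^\"]{1,}\""

# Single-quoted: the pattern is written exactly as grep should see it.
sq='"uri":"[^"]{1,}"'

# Both spellings produce the same string after the shell removes the quoting.
if [ "$dq" = "$sq" ]; then
    echo "same pattern: $sq"
fi
```

Both forms hand grep an identical pattern; the single-quoted one is simply easier to read and harder to get wrong.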

Upvotes: 3
