Santhosh

Reputation: 901

Regex to extract/output quoted strings from a file

I wrote a simple regular expression to output the quoted strings from a file. This is what I have so far:

cat mobydick.txt |  while read line; do echo -n "$line "; done | grep -oP '[^"]*"\K[^"]*'

When I run this one-liner on the file mobydick.txt, I get the output on a single line instead of as newline-separated strings.

Could someone help me with my script?

Expected output when the above script is run on mobydick.txt:
"From my twenty-fifth year I date my life."
"Call me Ishmael."

The input file above can be downloaded from this URL

Upvotes: 0

Views: 85

Answers (1)

lcd047

Reputation: 5861

Using GNU grep(1) (other incarnations of grep(1) don't have -P):

tr '\n' ' ' <mobydick.txt | grep -P -o '(?<=\s)"[^"]+"(?=\s)'

More accurate, using pcregrep(1):

pcregrep -M -o '(?<=^|\s)"[^"]+"(?=$|\s)' mobydick.txt
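To try the tr/grep pipeline without the full text, here is a minimal sketch that runs it on a small made-up sample. The sample lines and the variable names (sample, result) are assumptions for illustration, not taken from mobydick.txt, and GNU grep is assumed for -P:

```shell
# Hypothetical sample standing in for mobydick.txt;
# the second quotation deliberately spans two lines.
sample=$(mktemp)
printf '%s\n' \
  'He said "Call me Ishmael." and paused.' \
  'Then: "From my twenty-fifth year' \
  'I date my life." he added.' > "$sample"

# Join all lines with spaces, then let grep -o print each
# whitespace-delimited quoted string on its own line.
result=$(tr '\n' ' ' < "$sample" | grep -oP '(?<=\s)"[^"]+"(?=\s)')
printf '%s\n' "$result"

rm -f "$sample"
```

With this sample, each quoted string should come out on its own line, including the one that was split across two input lines. Note that a quotation at the very start of the file has no whitespace before it, so this pattern would skip it.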

Upvotes: 1
