user487257

Reputation: 1141

How to get all the quoted strings in a file?

I need to extract all the strings enclosed in double quotes in a file. For instance, if a file contains the following line:

"Julius" was not "Ceaser"

It should output

Julius
Ceaser

I want to do this in bash with sed or awk. With awk I can extract one occurrence, but how do I get all of the strings?

Upvotes: 1

Views: 1086

Answers (4)

ghoti

Reputation: 46876

If you don't mind your output including the quotes, a simple grep -o might work:

$ egrep -o '"[[:alnum:]]+"'  <<<'"Julius" was not "Ceaser"'
"Julius"
"Ceaser"

And if you want no quotes, grep -P (mostly on Linux) or pcregrep (FreeBSD, macOS and other BSDs) might work, using a positive lookbehind and lookahead:

$ pcregrep -o '(?<=")[[:alnum:]]+(?=")'  <<<'"Julius" was not "Ceaser"'
Julius
Ceaser
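
Where neither grep -P nor pcregrep is available, one possible workaround is to keep the quotes in the match and strip them afterwards. This is only a sketch, assuming a grep that supports -o (GNU and BSD grep do); the function name quoted_words is made up for illustration:

```shell
# quoted_words: hypothetical helper, reads text on stdin.
quoted_words() {
  # -o prints each match on its own line; [^"]* also allows spaces
  # (and nothing at all) between the quotes, unlike [[:alnum:]]+.
  # tr then deletes every double quote from the matches.
  grep -o '"[^"]*"' | tr -d '"'
}

printf '%s\n' '"Julius" was not "Ceaser"' | quoted_words
```

For the sample line this should print Julius and Ceaser on separate lines. Note that an empty pair of quotes produces an empty output line with this pattern.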

Upvotes: 0

stack0114106

Reputation: 8721

If you want to print all the double-quoted strings from the same line together, try this Perl one-liner:

perl -ne ' while(/("\S+")/g) { print "$1 " } print "\n" '

With the given input:

$ cat  doubleq.txt
"Julius" was not "Ceaser"
"request" map url
"Ceaser"


$ perl -ne ' while(/("\S+")/g) { print "$1 " } print "\n" ' doubleq.txt
"Julius" "Ceaser"
"request"
"Ceaser"

$

Upvotes: 0

codeforester

Reputation: 43039

grep -Eo '"[a-zA-Z]+"' file

would print the matching strings on separate lines, even if they were on the same line in the original file. If you want to fold the matches, you could do this:

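grep -nEo '"[a-zA-Z]+"' file | awk -F: '
NR == 1 { p = $1 }                  # start from the first line that has a match
      {
         gsub("\"", "", $2)         # strip the surrounding quotes
         n = $1                     # line number reported by grep -n
         if (p != n) {              # new input line: flush the previous one
           print s; s = $2; p = n
         } else {
           if (s) { s = s" "$2 } else { s = $2 }
         }
      }
END   {
         if (NR) print s            # flush the last line, if anything matched
      }'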
  • grep -nEo extracts only the matched parts, with the line number prefixed
  • awk parses grep's output and produces the desired result
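
The same folding can probably be done in awk alone, which also speaks to the "one occurrence" problem in the question: call match() in a loop and cut off the part of the line already scanned. A minimal sketch, assuming the same letters-only pattern as above; fold_quoted is a hypothetical name:

```shell
# fold_quoted: hypothetical helper; prints the quoted words of each input
# line on one output line, separated by spaces.
fold_quoted() {
  awk '{
    out = ""
    rest = $0
    # match() sets RSTART and RLENGTH for the leftmost quoted string
    while (match(rest, /"[a-zA-Z]+"/)) {
      word = substr(rest, RSTART + 1, RLENGTH - 2)   # drop both quotes
      out = (out == "" ? word : out " " word)
      rest = substr(rest, RSTART + RLENGTH)          # continue after the match
    }
    if (out != "") print out                         # skip lines without matches
  }' "$@"
}

printf '%s\n' '"Julius" was not "Ceaser"' '"request" map url' | fold_quoted
```

Called with a file name (fold_quoted file) it reads that file instead of standard input.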

Upvotes: 0

karakfa

Reputation: 67537

awk to the rescue!

$ awk -v RS='"' '!(NR%2)' file

Julius
Ceaser

using this content:

$ cat file

I need to extract all the strings surrounded with single quotes in a file. For instance, if a file contains the following line: "Julius" was not "Ceaser" It should output Julius Ceaser

This assumes there are no escaped quotes.
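
To spell out why the one-liner works: with RS set to a double quote, awk splits the input at every quote, so the odd-numbered records are the text outside the quotes and the even-numbered records are the quoted strings. !(NR%2) is true only for the even ones. A small sketch of the same idea, with between_quotes as a made-up wrapper name:

```shell
# between_quotes: hypothetical wrapper around the one-liner above.
between_quotes() {
  # Record 1 is the text before the first quote, record 2 the first
  # quoted string, record 3 the text between the two pairs, and so on.
  # NR % 2 == 0 is the same test as !(NR%2), written out.
  awk -v RS='"' 'NR % 2 == 0' "$@"
}

printf '%s\n' 'He said "Julius" was not "Ceaser"' | between_quotes
```

Besides escaped quotes, an unbalanced quote would also throw this off: with an odd number of quotes, the text after the last one lands in an even-numbered record and is printed as if it were quoted.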

Upvotes: 7
