Reputation: 1141
I need to extract all the strings surrounded by double quotes in a file. For instance, if a file contains the following line:
"Julius" was not "Ceaser"
It should output
Julius
Ceaser
I want to do it using bash (Sed/Awk). Using Awk I can extract one occurrence but how do I get all the strings?
Upvotes: 1
Views: 1086
Reputation: 46876
If you don't mind your output including the quotes, a simple grep -o might work:
$ grep -Eo '"[[:alnum:]]+"' <<<'"Julius" was not "Ceaser"'
"Julius"
"Ceaser"
And if you want no quotes, grep -P (mostly on Linux) or pcregrep (FreeBSD, macOS and other BSDs) might work, using a positive lookbehind and lookahead:
$ pcregrep -o '(?<=")[[:alnum:]]+(?=")' <<<'"Julius" was not "Ceaser"'
Julius
Ceaser
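The [[:alnum:]]+ patterns above only match quoted strings made of letters and digits. Where PCRE is not available, or the quoted text may contain spaces or punctuation, something along these lines might work instead. This is a minimal sketch: extract_quoted is a hypothetical helper name, the second input line is made up for illustration, and it assumes a grep that supports -o and -h (GNU and BSD grep do, but neither flag is required by POSIX) and no escaped quotes in the input.

```shell
# Hypothetical helper: print each double-quoted string, one per line,
# with the surrounding quotes removed. Reads stdin or the files given.
extract_quoted() {
    # -o prints every match on its own line, -h suppresses file names;
    # "[^"]*" matches from one quote up to the next one.
    # sed then strips the leading and trailing quote from each match.
    grep -oh '"[^"]*"' "$@" | sed 's/^"//; s/"$//'
}

printf '%s\n' '"Julius" was not "Ceaser"' 'say "hello world" twice' | extract_quoted
```

With that input it prints Julius, Ceaser and hello world on separate lines.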
Upvotes: 0
Reputation: 8721
If you want to print all the double-quoted strings from each input line together on one output line, try this Perl one-liner:
perl -ne ' while(/("\S+")/g) { print "$1 " } print "\n" '
With the given input:
$ cat doubleq.txt
"Julius" was not "Ceaser"
"request" map url
"Ceaser"
$ perl -ne ' while(/("\S+")/g) { print "$1 " } print "\n" ' doubleq.txt
"Julius" "Ceaser"
"request"
"Ceaser"
$
Upvotes: 0
Reputation: 43039
grep -Eo '"[a-zA-Z]+"' file
would print the matching strings on separate lines, even if they were on the same line in the original file. If you want to fold the matches, you could do this:
grep -nEo '"[a-zA-Z]+"' file | awk -F: '
BEGIN { p=1 }
{
    gsub("\"", "", $2)
    n=$1;
    if (p != n) {
        print s; s = $2; p=n
    } else {
        if(s) { s = s" "$2 } else { s=$2 }
    }
}
END {
    print s
}'
grep -nEo extracts only the matched parts, with the line number prefixed.
Upvotes: 0
Reputation: 67537
awk to the rescue!
$ awk -v RS='"' '!(NR%2)' file
Julius
Ceaser
using this content:
$ cat file
I need to extract all the strings surrounded with single quotes in a file. For instance, if a file contains the following line: "Julius" was not "Ceaser" It should output Julius Ceaser
This assumes there are no escaped quotes.
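Because RS='"' counts quotes across the whole file, a single unmatched quote shifts the odd/even parity for everything after it. If that is a concern, a per-line loop with match() may be more forgiving. This is only a sketch: extract_quoted_awk is a hypothetical name, the test input is made up, and escaped quotes are still not handled.

```shell
# Hypothetical helper: print every double-quoted string, one per line,
# quotes removed, scanning each input line independently.
extract_quoted_awk() {
    awk '{
        line = $0
        # match() sets RSTART and RLENGTH for the leftmost match
        while (match(line, /"[^"]*"/)) {
            print substr(line, RSTART + 1, RLENGTH - 2)  # drop both quotes
            line = substr(line, RSTART + RLENGTH)         # resume after this match
        }
    }' "$@"
}

printf '%s\n' '"Julius" was not "Ceaser"' 'a stray " here' '"request" map url' | extract_quoted_awk
```

This prints Julius, Ceaser and request. The line with the stray quote produces nothing and does not affect the lines after it.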
Upvotes: 7