Reputation: 6018
I receive the following strings as input:
new Field("count").del("query_then_fetch");
new Field("scan").del("query_then_fetch sorting on `_doc`");
new Field("compress").del("no replacement, implemented at the codec level");
new Field("compress_threshold").del("no replacement");
new Field("filter").del("query");
I run the following script on the command line, where the regex matches the strings that are in double quotes:
awk -F '.del' '{match($1, "\".*\"", a); match($2, "\".*\"", b)}END{print a[0]; print b[0]}'
expecting this kind of output:
"count" "query_then_fetch"
"scan" "query_then_fetch sorting on `_doc`"
"compress" "no replacement, implemented at the codec level"
"compress_threshold" "no replacement"
"filter" "query"
but instead I get this output:
"filter"
"query"
How can I resolve this issue?
Upvotes: 0
Views: 228
Reputation: 80951
Your awk script prints only once, in the END block, after all the input has been processed. By then a and b hold only the matches from the last line, and you print a[0] and b[0] on separate lines (because you are using two print statements).
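As a minimal sketch of that behavior (the helper name last_seen and the sample input are made up for illustration): the main action runs once per input line, while END runs once after the last line, so it can only see whatever the final line left behind.

```shell
# Hypothetical helper: remembers every line, but prints only in END.
last_seen() {
  awk '{ seen = $0 } END { print "END sees:", seen }'
}

# Three input lines go in; END reports only the last one.
printf '%s\n' one two three | last_seen
```

This prints "END sees: three", which mirrors why only "filter" and "query" show up in the question.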
What you want, with your current awk script, is to print a[0] and b[0] in a single printf statement while processing each line:
awk -F '.del' '{match($1, "\".*\"", a); match($2, "\".*\"", b); printf "%s %s\n",a[0], b[0]}' sample.csv
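Note that the three-argument form of match() is a gawk extension. If gawk is not available, a portable sketch could use RSTART and RLENGTH instead. The function name extract_pairs is hypothetical, the separator is written as [.]del so the dot is literal, and the pattern "[^"]*" assumes the quoted strings contain no embedded double quotes.

```shell
# Hypothetical helper: prints the first quoted string found on each side
# of ".del", one pair per input line. Uses only POSIX awk features.
extract_pairs() {
  awk -F '[.]del' '{
    first = ""; second = ""
    # match() sets RSTART and RLENGTH; substr() pulls out the matched text.
    if (match($1, /"[^"]*"/)) first = substr($1, RSTART, RLENGTH)
    if (match($2, /"[^"]*"/)) second = substr($2, RSTART, RLENGTH)
    print first, second
  }' "$@"
}
```

It can be called as extract_pairs sample.csv, or fed from a pipe with no file argument.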
Alternatively, you could use the much simpler awk script below, which splits the input on the ( and ) characters (this assumes the quoted strings themselves contain no parentheses):
awk -F '[()]' '{print $2,$4}' sample.csv
Upvotes: 1
Reputation: 103884
Given:
$ echo "$tgt"
new Field("count").del("query_then_fetch");
new Field("scan").del("query_then_fetch sorting on `_doc`");
new Field("compress").del("no replacement, implemented at the codec level");
new Field("compress_threshold").del("no replacement");
new Field("filter").del("query");
You can do:
$ echo "$tgt" | awk '{split($0, a, "\""); print a[2]"\t"a[4]}'
count query_then_fetch
scan query_then_fetch sorting on `_doc`
compress no replacement, implemented at the codec level
compress_threshold no replacement
filter query
Add quotes around the fields as needed.
Or, you can do:
$ echo "$tgt" | awk '{split($0, a, /[()]/); print a[2],a[4]}'
"count" "query_then_fetch"
"scan" "query_then_fetch sorting on `_doc`"
"compress" "no replacement, implemented at the codec level"
"compress_threshold" "no replacement"
"filter" "query"
Upvotes: 1
Reputation: 31895
cat sample.csv
new Field("count").del("query_then_fetch");
new Field("scan").del("query_then_fetch sorting on `_doc`");
new Field("compress").del("no replacement, implemented at the codec level");
new Field("compress_threshold").del("no replacement");
new Field("filter").del("query");
awk -F'"' -v q="\"" '{print q $2 q,q $4 q}' sample.csv
"count" "query_then_fetch"
"scan" "query_then_fetch sorting on `_doc`"
"compress" "no replacement, implemented at the codec level"
"compress_threshold" "no replacement"
"filter" "query"
I am using the double quote as the field separator and printing the 2nd and 4th fields, with the quotes added back via the q variable.
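To see why the 2nd and 4th fields are the right ones, here is a small sketch that lists every field awk produces when the double quote is the separator. The helper name show_fields is made up for illustration.

```shell
# Hypothetical helper: prints each field with its index, splitting on '"'.
show_fields() {
  awk -F'"' '{ for (i = 1; i <= NF; i++) printf "%d: [%s]\n", i, $i }'
}

printf '%s\n' 'new Field("count").del("query_then_fetch");' | show_fields
# 1: [new Field(]
# 2: [count]
# 3: [).del(]
# 4: [query_then_fetch]
# 5: [);]
```

The even-numbered fields are the text between the quotes, which is why the quotes have to be printed back explicitly.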
Upvotes: 1