Reputation: 2336
I'm writing a bash script to get the first value from a line of comma-separated strings. I have a file that looks like this:
name
things: "water bottle","40","new phone cover",10
place
I just need to return the value in the first pair of double quotes:
water bottle
The value in the first double quotes can be one word or two words. That is, water bottle can sometimes be replaced with pen.
I tried -
awk '/:/ {print $2}'
But this just gives
water
I wanted to split the line on commas, but there's a colon (:) after things, so I'm not sure how to separate it.
How do I get the value in the first double quotes?
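For context, a minimal sketch of why the attempt above stops at the first word: with no -F option, awk splits each record on runs of whitespace, so the quoted value is cut at its space. The variable names here are illustrative only. Strictly speaking, the second field also keeps the leading double quote.

```shell
# Sample line copied from the question
line='things: "water bottle","40","new phone cover",10'

# List every field produced by awk's default whitespace splitting
default_fields=$(printf '%s\n' "$line" |
  awk '{ for (i = 1; i <= NF; i++) print i ": " $i }')
printf '%s\n' "$default_fields"
# => 1: things:
# => 2: "water
# => 3: bottle","40","new
# => 4: phone
# => 5: cover",10
```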
EDIT:
SOLUTION: I used the below code since I particularly wanted to use awk -
awk '/:/' test.txt | cut -d\" -f2
Upvotes: 5
Views: 1035
Reputation: 133538
With your shown samples, please try the following awk code.
awk '/^things:/ && match($0,/"[^"]*/){print substr($0,RSTART+1,RLENGTH-1)}' Input_file
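Explanation: The awk program checks whether the line starts with things: and uses the match function to match the first " together with everything up to (but not including) the second ". It then uses RSTART and RLENGTH with substr to print that match without its leading quote.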
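To make the role of RSTART and RLENGTH concrete, here is a small sketch that prints both values next to the extracted substring for the sample line. The variable names are illustrative and not part of the answer.

```shell
# Sample input copied from the question
input='name
things: "water bottle","40","new phone cover",10
place'

# For the matching line, show where the match starts, how long it is,
# and the substring left after skipping the leading quote
match_info=$(printf '%s\n' "$input" |
  awk '/^things:/ && match($0, /"[^"]*/) {
         print RSTART, RLENGTH, substr($0, RSTART + 1, RLENGTH - 1)
       }')
printf '%s\n' "$match_info"
# => 9 13 water bottle
```

The match covers the opening quote plus the 12 characters of water bottle, which is why the substring starts at RSTART+1 and has length RLENGTH-1.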
Upvotes: 4
Reputation: 626920
Solution 1: awk
You can use a single awk command:
awk -F\" 'index($1, ":"){print $2}' test.txt > outfile
See the online demo.
The -F\" option sets the field separator to a " char, the index($1, ":") condition makes sure Field 1 contains a : char (no regex needed), and then {print $2} prints the second field value.
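As a quick illustration of how the line breaks apart once " is the field separator, the sketch below prints the first three fields of the sample line. The variable names are illustrative.

```shell
# Sample line copied from the question
line='things: "water bottle","40","new phone cover",10'

# Show the first three fields when the separator is a double quote
fields=$(printf '%s\n' "$line" |
  awk -F'"' '{ for (i = 1; i <= 3; i++) printf "$%d=[%s]\n", i, $i }')
printf '%s\n' "$fields"
# => $1=[things: ]
# => $2=[water bottle]
# => $3=[,]
```

Field 1 is everything before the first quote, which is why testing it for a colon selects the things: line.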
Solution 2: awk + cut
You can use awk + cut:
awk '/:/' test.txt | cut -d\" -f2 > outfile
With awk '/:/' test.txt, you extract the line(s) containing a : char, and then the piped cut -d\" -f2 command splits the string with " as a separator and returns the second item. See the online demo.
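The two stages of the pipeline can be inspected separately. This is a sketch that feeds the sample from a variable instead of test.txt; the variable names are illustrative.

```shell
# Sample input copied from the question
input='name
things: "water bottle","40","new phone cover",10
place'

# Stage 1: keep only lines that contain a colon
filtered=$(printf '%s\n' "$input" | awk '/:/')

# Stage 2: split on double quotes and take the second piece
first_value=$(printf '%s\n' "$filtered" | cut -d'"' -f2)

printf '%s\n' "$filtered" "$first_value"
# => things: "water bottle","40","new phone cover",10
# => water bottle
```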
Solution 3: sed
Alternatively, you can use sed:
sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' file > outfile
See the online demo:
#!/bin/bash
s='name
things: "water bottle","40","new phone cover",10
place'
sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' <<< "$s"
# => water bottle
The command means:
-n - the option suppresses the default line output
^[^"]*"\([^"]*\)".* - a POSIX BRE regex pattern that matches
  ^ - start of string
  [^"]* - zero or more chars other than "
  " - a " char
  \([^"]*\) - Group 1 (\1 refers to this value): any zero or more chars other than "
  ".* - a " char and the rest of the string
\1 - replaces the match with the Group 1 value
p - only prints the result of a successful substitution.
Upvotes: 1
Reputation: 163362
Using GNU awk you could make use of a capture group, and use a negated character class to not cross the , as that is the field delimiter.
awk 'match($0, /^[^",:]*:[^",]*"([^"]*)"/, a) {print a[1]}' file
Output
water bottle
The pattern matches
^ - Start of string
[^",:]*: - Optionally match any chars except " and , and :, then match :
[^",]* - Optionally match any chars except " and ,
"([^"]*)" - Capture in group 1 the value between double quotes
If the value is always between double quotes, a shorter option to get the desired result is to set the field separator to " and check whether field 1 contains a colon, although technically you can also get water bottle if there is only a leading double quote and no closing one.
awk -F'"' '$1 ~ /:/ {print $2}' file
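The caveat about a missing closing quote can be checked with a small sketch. The second test line is a hypothetical malformed input, not part of the question's data.

```shell
# A well-formed line and a hypothetical line with no closing quote
complete='things: "water bottle","40"'
unclosed='things: "water bottle'

# The field-separator approach gives the same result for both inputs
out_complete=$(printf '%s\n' "$complete" | awk -F'"' '$1 ~ /:/ {print $2}')
out_unclosed=$(printf '%s\n' "$unclosed" | awk -F'"' '$1 ~ /:/ {print $2}')

printf '%s\n' "$out_complete" "$out_unclosed"
# => water bottle
# => water bottle
```

If unbalanced quotes should be rejected, the capture-group pattern is the stricter choice because it requires both quotes to be present.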
Upvotes: 4
Reputation: 10123
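A solution using only the cut utility could be
cut -s -d\" -f2 infile > outfile
The -s option suppresses lines that contain no " delimiter (name and place in the sample); without it, cut prints those lines unchanged.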
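One detail worth checking with cut alone is how it treats lines that contain no delimiter. The sketch below runs the sample input with and without the standard -s option; the variable names are illustrative.

```shell
# Sample input copied from the question
input='name
things: "water bottle","40","new phone cover",10
place'

# Without -s, lines lacking the delimiter are passed through unchanged
without_s=$(printf '%s\n' "$input" | cut -d'"' -f2)

# With -s, lines lacking the delimiter are suppressed
with_s=$(printf '%s\n' "$input" | cut -s -d'"' -f2)

printf '%s\n' "$without_s"
# => name
# => water bottle
# => place
printf '%s\n' "$with_s"
# => water bottle
```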
Upvotes: 4