user1835630

Reputation: 261

Extracting text between two words in a text file and discarding the rest, in a shell script

I have a file of the form:

    blablabla var="value_var1" blabla
    blablabla var="value_var2" blabla

and so on. I would like to obtain a text file like:

    value_var1
    value_var2
    ...

Any ideas?

Thanks in advance!

Upvotes: 0

Views: 142

Answers (4)

Cyrus

Reputation: 88999

You could try this cut command:

    cut -d \" -f 2 filename

or:

    grep -oP '"\K[^" ]*' filename
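
One difference between the two commands is how they treat lines that contain no double quote: cut prints such lines unchanged unless -s is given, while grep -o prints nothing for them. A minimal sketch, assuming a throwaway sample file whose contents are made up for illustration:

```shell
# Build a small sample file (contents are illustrative, not from the question).
tmp=$(mktemp)
printf '%s\n' \
  'blablabla var="value_var1" blabla' \
  'blablabla var="value_var2" blabla' \
  'a line without quotes' > "$tmp"

# -s tells cut to skip lines that contain no delimiter at all.
result=$(cut -s -d '"' -f 2 "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Without -s, the third line would be copied to the output as is.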

Upvotes: 2

Michaël Le Barbier

Reputation: 6478

With sed you can remove the text up to the first " and everything from the second " onward:

    sed -e 's/[^"]*"//;s/".*//' < infile > outfile

This is a bit more complicated than the cut version, but it may be easier to adjust if it handles certain lines incorrectly.
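
The choice between .*" and [^"]*" matters here, because .* is greedy and runs to the last quote on the line rather than the first. A small sketch comparing the two on one sample line (the variable names are only for illustration):

```shell
line='blablabla var="value_var1" blabla'

# Greedy .* runs to the LAST quote, so only the trailing text survives.
greedy=$(printf '%s\n' "$line" | sed -e 's/.*"//;s/".*//')

# [^"]* cannot cross a quote, so it stops at the FIRST one.
first=$(printf '%s\n' "$line" | sed -e 's/[^"]*"//;s/".*//')

printf 'greedy: [%s]\nfirst:  [%s]\n' "$greedy" "$first"
```

The brackets in the printed output make the leading space left by the greedy form visible.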

Upvotes: 0

clt60

Reputation: 63974

The perl variant

  • will match only var="something" and not var2="other"
  • will match multiple occurrences in the line

    perl -nE 'say $1 while m/\bvar\s*=\s*"(.*?)"/g'

From the following input

    blabl somevar="some" abla var="value_var1" blabla var = "value2" blabal
    blablabla var="value_var2" blabla

it produces

    value_var1
    value2
    value_var2

To get the value from any something="value" pair, the following grep will work:

    grep -oP '=\s*"\K(.*?)(?=")'

For the same input it prints

    some
    value_var1
    value2
    value_var2
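
The restriction to var alone can also be sketched with grep, assuming GNU grep built with PCRE support (-P). The pattern below is an adaptation for illustration, not part of the commands above:

```shell
input='blabl somevar="some" abla var="value_var1" blabla var = "value2" blabal
blablabla var="value_var2" blabla'

# \bvar rejects somevar; \K drops everything matched so far from the output,
# and [^"]* stops at the closing quote.
values=$(printf '%s\n' "$input" | grep -oP '\bvar\s*=\s*"\K[^"]*')
printf '%s\n' "$values"
```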

Upvotes: 0

Avinash Raj

Reputation: 174874

You could try the sed command below:

    sed 's/.*"\(.*\)".*/\1/' infile > outfile

If you also want to keep the leading spaces of each line, use this regex instead:

    sed 's/^\( *\).*"\(.*\)".*/\1\2/' infile > outfile
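
Two edge cases may be worth checking: with a greedy .*, a line holding several quoted strings yields the last one, and lines without quotes pass through unchanged. A hedged variant, assuming you want the first quoted value and want non-matching lines dropped (the sample input is made up):

```shell
input='a x="one" b y="two" c
no quotes on this line'

# Greedy form: keeps the LAST quoted value and echoes unmatched lines as is.
last=$(printf '%s\n' "$input" | sed 's/.*"\(.*\)".*/\1/')

# -n with the p flag prints only lines where the substitution succeeded;
# [^"]* keeps each part from crossing a quote, so the FIRST value is taken.
first=$(printf '%s\n' "$input" | sed -n 's/[^"]*"\([^"]*\)".*/\1/p')

printf 'last:\n%s\nfirst:\n%s\n' "$last" "$first"
```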

Upvotes: 0
