Mihir

Reputation: 557

grep a string inside double quotes

I'm trying to grep a path string: everything inside the double quotes of the path attribute. A for loop goes through test.txt and searches new1.xml for each line. If a match is found, it should print the string inside path.

Expected output

abc/test/test
abc/test/test
cd/test1/test2
cdf

test.txt

test/test
test/test
test1/test2
test1

new1.xml

<abc name="test/test" path="abc/test/test"  />
<abc name="test/test1" path="abc/test/test1"  />
<abc name="test1/test2" path="cd/test1/test2" />
<path="cdf" name="test1" />

Script

for f in test.txt
    do
    echo "Processing $f"
        paste $f | while read lines; do
            results=`cat new1.xml | grep -o "name.*$lines.*" | grep -o 'path.*' | sed 's/[^"]*"\([^"]*\)".*/\1/'`
        done
    done

Actual output

abc/test/test
abc/test/test1

Upvotes: 1

Views: 1268

Answers (2)

George Vasiliou

Reputation: 6345

In your code, if you change the last part to:

....|grep -o 'path=\".*\"' |sed 's/[^"]*"\([^"]*\)".*/\1/' 

it should work. I didn't test your whole script, only the grep + sed part.

I can also see some backticks around the sed command. If they are really there, they need to be removed.

In my test this worked:

echo '<abc name="test/test" path="abc/test/test"  />' |grep -o 'path=\".*\"' |sed 's/[^"]*"\([^"]*\)".*/\1/'
abc/test/test
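
To see why narrowing the match to the path attribute matters, here is a minimal sketch. The sed expression returns the first quoted value on whatever text it receives, so on the whole line it returns the name value. The variable names are made up for illustration, and the grep pattern uses [^"]* instead of .* as a small variation, so that it stops at the closing quote even if more attributes follow path.

```shell
# Sample line taken from new1.xml in the question.
line='<abc name="test/test" path="abc/test/test"  />'

# On the whole line, sed captures the first quoted value, which is the name.
first_quoted=$(printf '%s\n' "$line" | sed 's/[^"]*"\([^"]*\)".*/\1/')

# Isolating path="..." first means the first quoted value is the path.
path_value=$(printf '%s\n' "$line" | grep -o 'path="[^"]*"' | sed 's/[^"]*"\([^"]*\)".*/\1/')

printf 'first quoted value: %s\n' "$first_quoted"
printf 'path value: %s\n' "$path_value"
```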

An alternative way to isolate what you need, without a loop and with a single command, would be:

grep -F -f test.txt new1.xml |grep -o 'path=\".*\"' |sed 's/[^"]*"\([^"]*\)".*/\1/' #or a simpler sed like |sed 's/path=//; s/\"//g'

grep -F : treat the patterns as fixed strings, not regular expressions
-f : read the patterns from a file, one per line
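
One thing to keep in mind with -F -f is that each pattern is matched as a substring anywhere in the line. The sketch below uses a trimmed copy of the sample data in a scratch directory (the last XML line is an invented non-matching entry) to show that the pattern test/test also selects the test/test1 line.

```shell
# Work in a scratch directory so the sample files do not overwrite anything.
tmpdir=$(mktemp -d)

cat > "$tmpdir/test.txt" <<'EOF'
test/test
test1/test2
EOF

cat > "$tmpdir/new1.xml" <<'EOF'
<abc name="test/test" path="abc/test/test"  />
<abc name="test/test1" path="abc/test/test1"  />
<abc name="test1/test2" path="cd/test1/test2" />
<abc name="other" path="zzz/other" />
EOF

# -F: patterns are literal strings; -f: read them from test.txt.
# A line is selected if ANY pattern occurs ANYWHERE in it.
matched=$(grep -F -f "$tmpdir/test.txt" "$tmpdir/new1.xml")
printf '%s\n' "$matched"

rm -rf "$tmpdir"
```

The first three lines are printed, including the test/test1 entry, because test/test is a substring of it. If only exact name matches are wanted, the pattern has to include the surrounding quotes, for example name="test/test".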

Another alternative:

echo '<abc name="test/test" path="abc/test/test"  />' |sed -e 's/^.*path=\"//; s/\" .*$//g'
#in your case:
grep -F -f test.txt new1.xml |sed -e 's/^.*path=\"//; s/\" .*$//'

Update: Testing with one-liner:

$ cat file3
test/test
test/test
test1/test2
test1

$ cat file4
<abc name="test/test" path="abc/test/test"  />
<abc name="test/testsdk" path="abc/test/testsdk" />
<abc name="test/test" path="abc2/test/test"  />
<abc name="test1/test2" path="ggg/test1/test2"  />
<abc name="test2/test2" path="vvv/test2/test2"  />
<path="cdf" name="test1" />

$ grep -F -f file3 file4 |sed 's/^.*path=//; s/\"//g; s/ .*$//g'
abc/test/test
abc/test/testsdk
abc2/test/test
ggg/test1/test2
cdf

Upvotes: 1

codeforester

Reputation: 43039

You can write your loop a little more efficiently and use sed instead of multiple greps to get what you want:

for f in test.txt; do
  echo "Processing $f"
  while IFS= read -r line; do
    grep "name=\"$line\"" new1.xml
  done < "$f" | sed -E 's/.+path="([^"]+)".+/\1/'
done

For your example, the above script gives this output:

Processing test.txt
abc/test/test

If you are just processing one file, you don't need the outer loop:

  while IFS= read -r line; do
    grep "name=\"$line\"" new1.xml
  done < "test.txt" | sed -E 's/.+path="([^"]+)".+/\1/'
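
The loop above runs grep once per line of test.txt. As a rough sketch of a single-pass alternative, the same exact-name lookup can be done in awk. This is not the approach from the answer; the function names lookup_paths and attr_value are invented for illustration, and it assumes each XML line carries at most one name and one path attribute, in either order.

```shell
# lookup_paths NAMES_FILE XML_FILE
# Prints the path value for every name listed in NAMES_FILE, in file order.
lookup_paths() {
  awk '
    # Return the value of attribute attr on this line, or "" if it is absent.
    function attr_value(line, attr,    re, s) {
      re = attr "=\"[^\"]*\""
      if (!match(line, re)) return ""
      s = substr(line, RSTART, RLENGTH)          # e.g. name="test/test"
      return substr(s, length(attr) + 3, length(s) - length(attr) - 3)
    }
    # First file: remember the wanted names in their original order.
    NR == FNR { wanted[NR] = $0; n = NR; next }
    # Second file: map each name to its path, keeping the first occurrence.
    {
      name = attr_value($0, "name")
      if (name != "" && !(name in path)) path[name] = attr_value($0, "path")
    }
    END {
      for (i = 1; i <= n; i++)
        if (wanted[i] in path) print path[wanted[i]]
    }
  ' "$1" "$2"
}
```

Called as lookup_paths test.txt new1.xml, it compares whole name values, so test/test does not pick up the test/test1 entry, and a name repeated in test.txt is printed once per occurrence.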

Upvotes: 1
