pankaj ghadge

Reputation: 945

sed regular expression doesn't consider double quote or white space

The following sed command does not work as intended: I want to remove only the host attribute, but it also removes the attribute that follows it (slices).

sed command

sed -i 's+^\(.*SERVER.*\)\(host=.*\)[[:blank:]]\(.*/>.*\)$+\1\3+' /tmp/file_tmp.xml

/tmp/file_tmp.xml

  <SERVER port="2001" buildg="group1" host="host1" slices="1" search="st0"/>
  <SERVER port="2003" buildg="group2" host="" slices="1" search="st1"/>

expected output:

  <SERVER port="2001" buildg="group1" slices="1" search="st0"/>
  <SERVER port="2003" buildg="group2" slices="1" search="st1"/>

actual output:

  <SERVER port="2001" buildg="group1" search="st0"/>
  <SERVER port="2003" buildg="group2" search="st1"/>

Upvotes: 1

Views: 102

Answers (2)

Dudi Boy

Reputation: 4900

Here is a simple sed solution.

  1. Select the lines of interest with the address pattern `/^[[:space:]]*<SERVER/`

  2. Remove the string matching the host regular expression from each selected line.

    sed -i '/^[[:space:]]*<SERVER/s| host="[^"]*"||' input.txt
    

Explanation

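/^[[:space:]]*<SERVER/ selects only lines that start with optional whitespace followed by <SERVER.

s| host="[^"]*"|| replaces the first match of  host="[^"]*" (including its leading space) with an empty string.

host="[^"]*" matches host=, an opening double quote, any run of non-quote characters (possibly empty), and the closing double quote.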
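As a quick check, here is a minimal sketch that runs the same substitution without -i on the two sample lines from the question, so nothing on disk is changed. The variable names input and output are only illustrative.

```shell
# Sample input copied from the question; no file is modified.
input='  <SERVER port="2001" buildg="group1" host="host1" slices="1" search="st0"/>
  <SERVER port="2003" buildg="group2" host="" slices="1" search="st1"/>'

# Same command as above, minus -i, reading from stdin instead of a file.
output=$(printf '%s\n' "$input" | sed '/^[[:space:]]*<SERVER/s| host="[^"]*"||')

printf '%s\n' "$output"
```

Because the pattern is anchored on the quotes rather than on whitespace, it should handle both host="host1" and the empty host="" the same way.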

Upvotes: 0

Quasímodo

Reputation: 4004

.* is greedy (it matches the longest possible string), so in host=.*[[:blank:]] it runs up to the last blank on the line, the one just before search, and takes slices="1" along with it. Restrict the match to non-space characters instead:

sed 's+^\(.*SERVER.*\)\(host=[^ ]*\)[[:blank:]]\(.*/>.*\)$+\1\3+'
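To see the difference, the sketch below feeds the first sample line from the question through both patterns. The variable names line, greedy and fixed are only illustrative.

```shell
line='  <SERVER port="2001" buildg="group1" host="host1" slices="1" search="st0"/>'

# Original pattern: host=.* extends to the LAST blank, so slices="1" is swallowed too.
greedy=$(printf '%s\n' "$line" | sed 's+^\(.*SERVER.*\)\(host=.*\)[[:blank:]]\(.*/>.*\)$+\1\3+')

# Fixed pattern: host=[^ ]* stops at the first space after the host value.
fixed=$(printf '%s\n' "$line" | sed 's+^\(.*SERVER.*\)\(host=[^ ]*\)[[:blank:]]\(.*/>.*\)$+\1\3+')

printf 'greedy: %s\nfixed:  %s\n' "$greedy" "$fixed"
```

Note that [^ ]* assumes the host value itself contains no spaces. If it might, matching on the quotes with host="[^"]*" would be the safer choice.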

Upvotes: 3
