Parse xml and a text file to remove wildcards in shell

Question

I have an XML file with input like this. I am trying to write a shell script to remove the wildcards in the host attribute.

I have a text file that has hostnames for each of the groupnames as below.

aM
hostname1
hostname2

ESB
hostname3
hostname4

Omega
hostname5
hostname6
hostname7
hostname8
hostname1

I am trying to go through the text file and change the XML file to remove the wildcards. So, the result I am trying to get is

I tried with sed and awk, as in the example below:

sed '/GroupSubjectEntry host="\*"/p' omegatest.xml|sed '0,/\*/s//host/'

but that only changes the first line.

I thought of running this through a for loop and using sed's p option, but there are too many variables involved. I am basically trying to remove the wildcards in the XML and add the appropriate hostnames. Can someone please help?

RavinderSingh13 · Accepted Answer

Could you please try the following, written and tested with GNU awk. Fair warning: dedicated tools such as xmlstarlet are recommended for dealing with XML. Since the OP doesn't have those and can't use them, I am going with awk, but there is no guarantee that this will work with all kinds of XML; it has been written strictly for the shown samples only.

1st solution: As per OP's expected output:

awk '
!NF{  next  }
FNR==NR{
  if($0 ~ /GroupEntry groupname="/){
     match($0,/"[^"]*/)
     val=substr($0,RSTART+1,RLENGTH-1)
     match($0,/^ +/)
     spaces[val]=substr($0,RSTART,RLENGTH)
     namesVal[val]=$0
     next
  }
  if($0 ~ // || $0~//){
    rest[++count1]=$0
  }
  next
}
!/hostname/{
  if($0 in names){
    nameVal=namesVal[$0]
    check=$0
    if(FNR==1){ print rest[++count2];found="" }
    print namesVal[$0]
    num=split(names[$0],arr,"\n")
  }
  if(found){ print rest[++count2];found="" }
}
/^hostname/{
  found=1
  for(i=1;i<=num;i++){
    print spaces[check] ""
  }
  next
}
END{
  print rest[count2]
}
'  Input_file groupnames
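
The match()/RSTART/RLENGTH step used above to pull out the groupname can be tried in isolation. This is only a minimal sketch: the function name first_quoted_value is hypothetical, and the sample line is an assumed shape based on the GroupEntry groupname=" pattern that the script tests for, so adjust it to the real file.

```shell
# Extract the first double-quoted value from a line using match() in awk.
# The sample line passed in at the bottom is an assumed shape, not the real XML.
first_quoted_value() {
  printf '%s\n' "$1" | awk '
    match($0, /"[^"]*/) {
      # RSTART points at the opening quote, so skip it and shorten the length by one.
      print substr($0, RSTART + 1, RLENGTH - 1)
    }'
}

first_quoted_value '  <GroupEntry groupname="aM">'
```

With that sample line the function should print aM. Note that the regex stops at the first closing quote, so only the first attribute value on the line is returned.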

2nd solution: If the OP is not concerned about the order of the names in the actual Input_file, then one could try the following.

awk '
FNR==NR{
  if(!NF){ next }
  if($0!~/^hostname/){ val=$0 }
  else               { arr[val]=(arr[val]?arr[val] ORS:"")$0 }
  next
}
/"
  }
  next
}
1' groupnames  Input_file

Also, this gives output in the order of the hostnames under their respective groupname entry; I hope the OP is OK with it.
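
To check how the groupnames file is read before touching the XML, the first pass of the 2nd solution can be run on its own. The following is a rough sketch of that idea, not the exact code above: the function name group_hosts and the "group: hosts" output format are my own assumptions, and it relies on every host line starting with "hostname", as in the shown sample.

```shell
# Build a group -> hostnames map from the group file and print one line per group.
# Assumes every host line starts with "hostname", as in the sample data.
group_hosts() {
  awk '
    !NF { next }                      # skip blank separator lines
    !/^hostname/ {                    # any other line starts a new group
      group = $0
      if (!(group in seen)) { seen[group] = 1; order[++n] = group }
      next
    }
    # Append this host to the current group, space separated.
    { hosts[group] = (hosts[group] ? hosts[group] " " : "") $0 }
    END { for (i = 1; i <= n; i++) print order[i] ": " hosts[order[i]] }
  ' "$1"
}

# Usage with a small sample in the same layout as the groupnames file.
tmp=$(mktemp)
printf '%s\n' aM hostname1 hostname2 '' ESB hostname3 hostname4 > "$tmp"
group_hosts "$tmp"
rm -f "$tmp"
```

The order array is only there to keep the groups in file order, because awk's for (key in array) loop does not guarantee any particular order.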
