Reputation: 8136
I am parsing XML with a regex. The format is well known, so there is no need to worry about escaping or proper XML parsing.
grep returns multi-line matches, and I want to store each match in its own file.
However, I either get each line between my tags as a separate array element with array=( $list ), or I get the whole output as a single element with array=( "$list" ).
How can I loop over each match from grep?
My script currently looks like this:
#!/bin/bash
list=$(cat result.xml|grep -ozP '(?s)<tagname.*?tagname>')
array=( "$list" )
arraySize=${#array[@]}
for ((i = 0; i <= $arraySize; i += 1)); do
match="${array[$i]}"
echo "$match" > "$i".xml
done
Upvotes: 0
Views: 1304
Reputation: 4504
You could use grep to grab all the matches first, and then use awk to save each match to a separate file (e.g. file1.xml, file2.xml, etc.):
cat result.xml | grep -Pzo '(?s)(.)<tagname.*?tagname>(.)' | awk '{ print $0 > "file" NR ".xml" }' RS='\n\n'
Upvotes: 0
Reputation: 241851
According to this answer, the upcoming version of grep will change the meaning of the -z flag so that both input and output are NUL-terminated. That will automatically do what you want, but today it is only available by downloading and building grep from the git repository.
Meanwhile, a rather hackish alternative is to use the -Z flag, which terminates the file name with a NUL character. That means you need to print a "filename", which you can do with -H --label=. That prints an empty filename followed by a NUL before each match, which is not quite ideal since you really want the NUL after each match. However, the following should work:
grep -ozZPH --label= '(?s)<tagname.*?tagname>' < result.xml | {
i=0
while IFS= read -rd '' chunk || [[ $chunk ]]; do
if ((i)); then
echo "$chunk" > $i.xml
fi
((++i))
done
}
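Once a grep with the NUL-terminating -z behaviour is installed, the empty-filename workaround above should no longer be needed, because every match is then followed by a NUL. A minimal sketch under that assumption; save_matches is a hypothetical helper name, and the N.xml naming simply follows the question:

```shell
#!/bin/bash
# Read NUL-terminated records from stdin and write each one to N.xml
# (1.xml, 2.xml, ...). save_matches is a made-up name for illustration.
save_matches() {
    local i=0 chunk
    # "|| [[ -n $chunk ]]" keeps a final record that lacks a trailing NUL.
    while IFS= read -r -d '' chunk || [[ -n $chunk ]]; do
        i=$((i + 1))
        printf '%s\n' "$chunk" > "$i.xml"
    done
}

# Assumption: this GNU grep NUL-terminates -o output when -z is given.
# The guard only skips the pipeline when result.xml is absent.
if [[ -f result.xml ]]; then
    grep -ozP '(?s)<tagname.*?tagname>' result.xml | save_matches
fi
```

Because read -d '' splits only on NUL, newlines inside a match stay intact, which is the part the array=( $list ) approach loses.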
Upvotes: 1
Reputation: 876
Pipe your lines directly into a while loop:
my_splitting_command | grep something | while IFS= read -r line
do
    # Append with >>; a plain > would overwrite the file on every iteration
    echo "$line" >> myoutputfile.txt
done
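If every match fits on a single line, the same kind of loop can write each line to its own numbered file instead of one shared output file. A rough sketch; save_lines and the line-N.txt naming are invented here for illustration, and it will not keep multi-line matches together:

```shell
#!/bin/bash
# Write each input line to its own file: line-1.txt, line-2.txt, ...
# save_lines is a hypothetical helper name, not part of any tool.
save_lines() {
    local n=0 line
    # IFS= and -r preserve leading whitespace and backslashes;
    # "|| [[ -n $line ]]" keeps a last line with no trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        n=$((n + 1))
        printf '%s\n' "$line" > "line-$n.txt"
    done
}

# Example: three input lines produce three files in the current directory.
printf '%s\n' one two three | save_lines
```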
Upvotes: 0