Reputation: 45646
I have a file with a format like this:
[PATTERN]
line1
line2
line3
.
.
.
line
[PATTERN]
line1
line2
line3
.
.
.
line
[PATTERN]
line1
line2
line3
.
.
.
line
I want to extract blocks like the following from the file above:
[PATTERN]
line1
line2
line3
.
.
.
line
Note: The number of lines between two [PATTERN] markers may vary, so I can't rely on a fixed line count.
Basically, I want to store each pattern and the lines following it in a database, so I will have to iterate over all such blocks in my file.
How can I do this with shell scripting?
Upvotes: 0
Views: 3889
Reputation: 44331
This assumes you are using bash as your shell. For other shells, the actual solution can be different.
Assuming your data is in a file named data:
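i=0
# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r line ; do
    if [ "$line" = "[PATTERN]" ] ; then
        # A separator starts the next output file
        i=$((i + 1)) ; touch file.$i ; continue
    fi
    # Lines before the first separator end up in file.0
    printf '%s\n' "$line" >> file.$i
done < data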
Replace [PATTERN] with your actual separator pattern.
This will create files file.1, file.2, etc.
Edit: responding to request about an awk solution:
awk '/^\[PATTERN\]$/{close("file"f);f++;next}{print $0 > ("file" f)}' data
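The idea is to open a new file each time [PATTERN] is found (skipping that line with the next command) and to write all subsequent lines to that file. The file name is parenthesized because some awk implementations parse an unparenthesized concatenation after > differently. If you need to include [PATTERN] in your generated files, delete the next command.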
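As a rough sketch of the "keep the [PATTERN] line" variant described above, the awk program can be wrapped in a small shell function. The function name split_keep_header and its PREFIX argument are made up for this illustration and are not part of the original answer. Unlike the one-liner, this sketch also ignores any lines that appear before the first [PATTERN].

```shell
#!/bin/bash
# Hypothetical helper (name and arguments are assumptions):
#   split_keep_header INPUT PREFIX
# Writes PREFIX1, PREFIX2, ... and keeps the [PATTERN] line at the top of each file.
split_keep_header() {
    awk -v prefix="$2" '
        /^\[PATTERN\]$/ {            # a separator line starts a new block
            if (f) close(prefix f)   # close the previous output file, if any
            f++                      # no "next" here, so the line is still printed below
        }
        f { print > (prefix f) }     # write only once the first block has started
    ' "$1"
}
```

Calling split_keep_header data file. would then produce file.1, file.2, and so on, each beginning with its [PATTERN] line.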
Notice the escaping of [ and ], which have special meaning in regular expressions. If your pattern does not contain them, you do not need the escaping. The ^ and $ anchors are advisable, since they tie your pattern to the beginning and end of the line, which is usually what you want.
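If escaping becomes tedious, one possible alternative is to compare the whole line against the separator as a plain string instead of a regular expression. The sketch below assumes a hypothetical function name, split_literal, and passes the separator in with awk's -v option; note that -v interprets backslash escapes, so a separator containing backslashes would need extra care.

```shell
#!/bin/bash
# Hypothetical helper (name and arguments are assumptions):
#   split_literal INPUT SEPARATOR PREFIX
# Splits INPUT at lines that are exactly SEPARATOR; no regex escaping needed.
split_literal() {
    awk -v sep="$2" -v prefix="$3" '
        ($0 "") == sep {             # concatenating "" forces a string comparison
            if (f) close(prefix f)   # close the previous output file, if any
            f++
            next                     # skip the separator line itself
        }
        f { print > (prefix f) }     # lines before the first separator are ignored
    ' "$1"
}
```

For example, split_literal data '[PATTERN]' file. treats only lines that are exactly [PATTERN] as separators, so a line such as [PATTERN]x is written out as ordinary content.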
Upvotes: 1
Reputation: 3470
This can surely be improved, but if you want to store the lines in an array, here is something I did in the past:
#!/bin/bash
file=$1
gp_cnt=-1
i=-1
while IFS= read -r line
do
    # Match pattern
    if [[ "$line" == "[PATTERN]" ]]; then
        gp_cnt=$((gp_cnt + 1))
        # If this is not the first match, process the previous group
        if [[ $gp_cnt -gt 0 ]]; then
            # Process the group (quoted so [PATTERN] is not glob-expanded)
            echo "Processing group #$((gp_cnt - 1))"
            echo "${parsed[*]}"
        fi
        # Start new group
        echo "Pattern #$gp_cnt caught"
        i=0
        unset parsed
        parsed[$i]="$line"
    # Other lines (lines before first pattern are not processed)
    elif [[ $gp_cnt != -1 ]]; then
        i=$((i + 1))
        parsed[$i]="$line"
    fi
done < "$file"
# Process last group
echo "Processing group #$gp_cnt"
echo "${parsed[*]}"
I don't like that the last group has to be processed outside the loop...
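One way to make that less awkward, sketched here under my own assumptions, is to move the per-group work into a function so the flush after the loop is a single call rather than duplicated code. The names process_group and each_group are hypothetical, groups are numbered from 1 here, and the echo stands in for whatever the real processing (for example a database insert) would be.

```shell
#!/bin/bash
# Hypothetical handler: replace the echo with the real per-group processing.
process_group() {
    local number=$1
    shift
    echo "Group #$number: $*"
}

# Hypothetical wrapper: each_group FILE
# Collects each [PATTERN] block into an array and hands it to process_group.
each_group() {
    local line count=0 parsed=()
    # "|| [[ -n $line ]]" also picks up a last line with no trailing newline
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line == "[PATTERN]" ]]; then
            # Flush the previous group before starting a new one
            if (( count > 0 )); then
                process_group "$count" "${parsed[@]}"
            fi
            count=$((count + 1))
            parsed=("$line")
        elif (( count > 0 )); then
            # Lines before the first pattern are ignored
            parsed+=("$line")
        fi
    done < "$1"
    # The final group has no separator after it, so flush it here
    if (( count > 0 )); then
        process_group "$count" "${parsed[@]}"
    fi
}
```

The call after the loop is still needed, but it is now one line, and an input file without any [PATTERN] line produces no output at all.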
Upvotes: 0