Reputation: 51
I have a raw file of IP ranges (xx.xx.xx.xx-yy.yy.yy.yy), and I want to create a new list with each range converted into single IP addresses. (All ranges are within a 1-255 range.)
Conditions:
(1) If the difference between the fourth octets of the two addresses on a line is less than or equal to the max variable (say 5), the script loops and reports each iteration as a single /32 address.
(2) A range whose difference exceeds the max variable is reported as one address with /24.
The following bash script works fine, but it is slow on files of 50,000 lines. Any help would be appreciated. It's part of a script that performs other functions, so I need to stay in bash.
for i in $data; do
    A=$(echo $i | sed 's/-.*//'); B=$(echo $i | sed 's/^.*-//')
    A1=$(echo $A | cut -d '.' -f 4); B1=$(echo $B | cut -d '.' -f 4)
    diff=`expr $B1 - $A1`
    if [ "$diff" == "0" ]; then
        echo $A >> $outfile
    elif [ "$diff" -gt "0" -a "$diff" -le $max ]; then
        echo $A >> $outfile
        for a in $(jot "$diff"); do
            count=`expr $A1 + $a`
            echo $A | sed "s/\.[0-9]*$/.$count/" >> $outfile
        done
    else
        echo $A | sed 's/\.[0-9]*$/.0\/24/' >> $outfile
    fi
done
Upvotes: 2
Views: 2732
Reputation: 15996
The likely reason your script is so slow on 50,000 lines is that you are having bash call a lot of external programs (sed, cut, jot, expr) several times in each iteration of your inner and outer loops. Forking external processes adds significant time overhead, which compounds over many iterations.
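To get a rough feel for that overhead, a comparison along these lines can be timed with bash's time keyword. This is only a sketch: the helper names last_octet_fork and last_octet_builtin, the sample address, and the iteration count are made up for illustration, and the actual timings will vary by machine.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; neither is part of the original script.

# Extracts the fourth octet by piping into an external cut process.
last_octet_fork() {
    echo "$1" | cut -d '.' -f 4
}

# Extracts the fourth octet with a parameter expansion; nothing external is run.
last_octet_builtin() {
    echo "${1##*.}"
}

ip="192.0.2.17"   # sample address, chosen arbitrarily

# Every iteration forks for the command substitution and again for cut.
time for ((n = 0; n < 500; n++)); do x=$(last_octet_fork "$ip"); done

# Here the expansion is inlined, so the loop body forks nothing at all.
time for ((n = 0; n < 500; n++)); do x="${ip##*.}"; done
```

Both loops compute the same value; only the second avoids creating processes, which is where the difference in elapsed time should come from.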
If you want to do this in bash, and improve performance, you'll need to make use of the equivalent features that are built into bash. I took a stab at this for your script and came up with this. I have tried to keep the functionality the same:
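for i in $data; do
    # Split the range at the dash: A is the start address, B the end address
    A="${i%-*}"; B="${i#*-}"
    # Keep only the fourth octet of each address
    A1="${A##*.}"; B1="${B##*.}"
    diff=$(($B1 - $A1))
    if [ "$diff" == "0" ]; then
        echo $A >> $outfile
    elif [ "$diff" -gt "0" -a "$diff" -le $max ]; then
        echo $A >> $outfile
        for ((a=1; a<=$diff; a++)); do
            count=$(($A1 + $a))
            # Replace the fourth octet of A with the incremented value
            echo "${A%.*}.$count" >> $outfile
        done
    else
        # Range wider than max: report the containing /24 instead
        echo "${A%.*}.0/24" >> $outfile
    fi
done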
In particular I've made a lot of use of parameter expansions and arithmetic expansions. I'd be interested to see what kind of speedup (if any) this has over the original. I think it should be significantly faster.
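For anyone unfamiliar with these expansions, here is a small sketch of what each one yields, wrapped in a function that prints to standard output. The function name expand_range, the sample ranges, and the max value of 5 are assumptions for illustration; the logic is meant to mirror the loop above, not replace it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: expands one "a.b.c.d-a.b.c.e" range per call.
# Arguments: $1 = range, $2 = largest fourth-octet difference to enumerate.
expand_range() {
    local range=$1 max=$2
    local A="${range%-*}"    # drop shortest suffix matching "-*" -> start address
    local B="${range#*-}"    # drop shortest prefix matching "*-" -> end address
    local A1="${A##*.}"      # drop longest prefix matching "*."  -> fourth octet of A
    local B1="${B##*.}"      # same for the end address
    local prefix="${A%.*}"   # drop shortest suffix matching ".*" -> first three octets
    local diff=$((B1 - A1))  # arithmetic expansion, no expr needed
    local a

    if ((diff >= 0 && diff <= max)); then
        # Small range: print every address from start to end inclusive
        for ((a = A1; a <= B1; a++)); do
            echo "$prefix.$a"
        done
    else
        # Difference larger than max (or negative): report the /24
        echo "$prefix.0/24"
    fi
}

expand_range "10.1.2.3-10.1.2.5" 5      # prints 10.1.2.3, 10.1.2.4, 10.1.2.5
expand_range "10.9.8.1-10.9.8.200" 5    # prints 10.9.8.0/24
```

Because the function writes to standard output, the calling loop can redirect once at its end (done >> "$outfile") rather than reopening the file on every echo, which may save a little more time. Note that an octet written with a leading zero (such as 08) would be read as octal by the arithmetic expansion, so this sketch assumes plain decimal octets.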
Upvotes: 1
Reputation: 3121
If you are okay with using Python, install the ipaddr library from https://pypi.python.org/pypi/ipaddr (download, extract, and run sudo python setup.py install), then write something like this:
import ipaddr
# Iterating over a network yields each address in it
for ip in ipaddr.IPv4Network('192.0.2.0/24'):
    print(ip)
Upvotes: 0