Reputation: 435
I need to remove subdomains from a file:
.domain.com
.sub.domain.com -- this must be removed
.domain.com.uk
.sub2.domain.com.uk -- this must be removed
So I used sed:
sed '/\.domain.com$/d' file
sed '/\.domain.com.uk$/d' file
That part was simple, but when I try to do it in a loop, problems appear:
while read line
do
sed '/\$line$/d' filename > filename
done < filename
I suppose the problem is the "." and the $. I have tried escaping them in many ways, but I am out of ideas now.
Upvotes: 0
Views: 119
Reputation: 7571
A solution inspired by NeronLeVelu's idea:
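#!/bin/bash
#set -x
# Reverse every name so a subdomain starts with its parent, then sort
# so that each parent comes before its subdomains.
domains=($(rev domains | sort))
n=${#domains[@]}
for ((i = 0; i < n; i++)); do
  domain=${domains[i]}
  [ -z "$domain" ] && continue
  for ((j = i + 1; j < n; j++)); do
    # Blank out any longer entry that begins with the current reversed domain.
    # Quoting "$domain" makes the dots match literally.
    [[ ${domains[j]} == "$domain"?* ]] && domains[j]=
  done
done
# Reverse the surviving entries back and write them out in one go.
for ((i = 0; i < n; i++)); do
  [ -n "${domains[i]}" ] && echo "${domains[i]}" | rev
done > result.txt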
Given this input (cat domains):
.domain.com
.sub.domain.com
.domain.co.uk
.sub2.domain.co.uk
sub.domain.co.uk
abc.yahoo.com
post.yahoo.com
yahoo.com
You get this output (cat result.txt):
.domain.co.uk
.domain.com
yahoo.com
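The script above depends on one observation: once every name is reversed, a subdomain begins with its parent, so sorting places each parent directly before its subdomains. Here is a minimal sketch of that step alone, assuming rev is installed. The function name reversed_sorted is made up for illustration, and LC_ALL=C is my addition to keep the ordering independent of the locale.

```shell
#!/bin/bash
# reversed_sorted is a hypothetical helper, not part of the answer's script.
reversed_sorted() {
  # Reverse each argument character by character, then sort in plain byte
  # order (LC_ALL=C) so the result does not depend on the current locale.
  printf '%s\n' "$@" | rev | LC_ALL=C sort
}

# Sample names taken from the example input above.
reversed_sorted .sub.domain.com yahoo.com .domain.com abc.yahoo.com
```

Each reversed parent (for example moc.niamod.) sorts immediately ahead of the reversed names that extend it, which is why the script only needs to look forward in the array.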
Upvotes: 2
Reputation: 7571
Your loop is a bit confusing because you're trying to use sed
to delete patterns from a file, but you take the patterns from that same file.
If you really want to remove subdomains from filename,
then I suppose you need something more like the following:
#!/bin/bash
set -x
cp domains domains.tmp
while read domain
do
sed -r -e "/[[:alnum:]]+${domain//./\\.}$/d" domains.tmp > domains.tmp2
cp domains.tmp2 domains.tmp
done < dom.txt
Where cat domains is:
.domain.com
.sub.domain.com
.domain.co.uk
.sub2.domain.co.uk
sub.domain.co.uk
abc.yahoo.com
post.yahoo.com
and cat dom.txt is:
.domain.com
.domain.co.uk
.yahoo.com
Running the script on these inputs results in:
$ cat domains.tmp
.domain.com
.domain.co.uk
Each iteration removes the subdomains of the domain currently read from dom.txt and stores the result in a temporary file, whose contents are used in the next iteration for further filtering.
It's good to try your scripts with set -x; you'll see some of the substitutions, etc.
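The temp-file loop described above could also be wrapped in a function, which may make it easier to reuse. This is only a sketch under a few assumptions: the name remove_subdomains is hypothetical, mktemp is used instead of fixed file names, and sed -E (accepted by GNU and BSD sed) stands in for -r. The escaping and the pattern are the same as in the script above.

```shell
#!/bin/bash
# remove_subdomains LIST PARENTS (hypothetical name)
#   LIST:    file with one domain per line, to be filtered
#   PARENTS: file with one parent domain per line
# Prints the filtered list on stdout.
remove_subdomains() {
  local list=$1 parents=$2 work next domain
  work=$(mktemp) && next=$(mktemp) || return 1
  cp "$list" "$work"
  while IFS= read -r domain; do
    [ -z "$domain" ] && continue
    # ${domain//./\\.} turns every "." into "\." so sed matches a literal dot.
    # Double quotes let the shell expand the variable; \$ keeps the
    # end-of-line anchor for sed.
    sed -E "/[[:alnum:]]+${domain//./\\.}\$/d" "$work" > "$next"
    # Never redirect sed's output onto its own input; swap files instead.
    mv "$next" "$work"
  done < "$parents"
  cat "$work"
  rm -f "$work" "$next"
}

# Usage (file names as in the answer above):
#   remove_subdomains domains dom.txt > domains.filtered
```

The two points that bite in the original loop are both visible here: the pattern is in double quotes so the variable is expanded, and the output goes to a second file rather than back onto the file sed is still reading.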
Upvotes: 0
Reputation: 10039
sed -n 's/.*/²&³/;H
$ {x;s/$/\
/
: again
s|\(\n\)²\([^³]*\)³\(.*\)\1²[^³]*\2³|\1\2\3|
t again
s/[²³]//g;s/.\(.*\)./\1/
p
}' YourFile
Load the file into the working buffer, then iteratively remove any line that ends with an earlier one, and finally print the result. Using temporary edge delimiters is easier to manage than \n in the pattern.
Use --posix -e for GNU sed (tested on AIX).
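For readers who find the sed program hard to follow, the rule it describes can be sketched in plain bash. This is an illustration, not a translation of the sed script: the function name drop_suffixed_lines is hypothetical, and unlike the sed version it compares every line against every other line, so it does not depend on the parent appearing earlier in the file.

```shell
#!/bin/bash
# drop_suffixed_lines (hypothetical name): reads lines from stdin and prints
# each one unless some other, different line is a suffix of it.
drop_suffixed_lines() {
  local -a lines=()
  local line other keep
  while IFS= read -r line; do
    [ -n "$line" ] && lines+=("$line")
  done
  for line in "${lines[@]}"; do
    keep=1
    for other in "${lines[@]}"; do
      # "$other" is quoted, so it is matched literally, not as a glob.
      if [[ $line != "$other" && $line == *"$other" ]]; then
        keep=0
        break
      fi
    done
    (( keep )) && printf '%s\n' "$line"
  done
  return 0
}

# Usage: drop_suffixed_lines < YourFile
```

The nested loop is quadratic in the number of lines, which is fine for small lists; for large files the reverse-and-sort approach from the other answer should scale better.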
Upvotes: 2