Reputation: 12228
I'm creating a graph in Graphviz and I need every connection to be displayed only once. How can I transform this input using Linux commands?
INPUT
aa -- bb[label=xyz]
ab -- bb[label=yzx]
aa -- bb[label=zxy]
ac -- ab[label=xyz]
bb -- aa[label=xzy]
DESIRED OUTPUT:
aa -- bb[label=xyz]
ab -- bb[label=yzx]
ac -- ab[label=xyz]
So aa -- bb is equal to bb -- aa, and the duplicate needs to be removed.
I tried sort -k1,2 -u -t[ but it didn't work with [ as the delimiter, and I don't know how to check for "reverse" entries ("xx -- yy" = "yy -- xx").
Upvotes: 2
Views: 107
Reputation: 62389
Here's one idea (not tested, but should be close):
sed -e 's/[[].*//' -e 's/-- //' input.txt |
awk '{ # $1 and $2 are the two node names left over after sed
       if ((e[$1,$2] != 1) && (e[$2,$1] != 1))
       { print $1, $2
         # mark the pair as seen in both orders
         e[$1,$2] = e[$2,$1] = 1
       }
     }'
The sed bit strips out the -- and the [label...] portions, since you don't seem to care about them; awk then keeps track of which pairs have been seen in either order and prints a pair only if it hasn't been seen yet.
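If the labels do need to survive, as in the desired output in the question, one possible variant is to build the lookup key from a stripped copy of the line and print the original line. This is only a sketch: dedupe_edges is a made-up name, and it assumes node names contain no spaces.

```shell
# dedupe_edges (hypothetical name): read edge lines on stdin,
# print each undirected edge once, labels included.
dedupe_edges() {
  awk '{
    line = $0              # keep the original text for printing
    sub(/\[.*/, "")        # drop the [label=...] part; awk re-splits the fields
    # now $1 is the left node, $2 is "--", $3 is the right node
    if (!(($1, $3) in seen) && !(($3, $1) in seen)) {
      print line
      seen[$1, $3] = 1
    }
  }'
}

# sample input from the question
printf '%s\n' \
  'aa -- bb[label=xyz]' \
  'ab -- bb[label=yzx]' \
  'aa -- bb[label=zxy]' \
  'ac -- ab[label=xyz]' \
  'bb -- aa[label=xzy]' | dedupe_edges
```

The first occurrence of each pair wins, so the label that gets printed is whichever one appears first in the input.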
Upvotes: 0
Reputation: 85795
Here is a method using awk:
$ awk -F'[[]| -- ' '!a[$1,$2]++&&!a[$2,$1]' file
aa -- bb[label=xyz]
ab -- bb[label=yzx]
ac -- ab[label=xyz]
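A quick way to see what the one-liner works with is to print the fields that the -F separator produces for a single line. This is just a small illustration; show_fields is a name made up for it.

```shell
# show_fields (hypothetical name): print the first three fields awk sees
# when lines are split on "[" or on " -- ".
show_fields() {
  awk -F'[[]| -- ' '{ print "1=" $1; print "2=" $2; print "3=" $3 }'
}

printf '%s\n' 'aa -- bb[label=xyz]' | show_fields
# 1=aa
# 2=bb
# 3=label=xyz]
```

With the two node names in $1 and $2, !a[$1,$2]++ is true only the first time a pair shows up in that order, and !a[$2,$1] is true only if the reversed pair has not been counted yet. awk prints the line when both conditions hold.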
Upvotes: 4
Reputation: 19093
You can specify [ as the delimiter this way:
sort -k2 -u -t'['
Does that give you what you need?
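On its own, sort has no way of knowing that bb -- aa is the same edge as aa -- bb. One possible workaround is to put the two node names into a fixed order first and then let sort -u with the [ delimiter drop the repeats. A rough sketch, assuming node names contain no spaces or [ characters; sort_unique_edges is a made-up name. Note that it rewrites reversed edges (ac -- ab becomes ab -- ac), which should be harmless for undirected -- edges, and that the output comes out sorted rather than in input order.

```shell
# sort_unique_edges (hypothetical name): normalise the node order,
# then keep one line per node pair.
sort_unique_edges() {
  awk -F'[[]| -- ' '{
    a = $1 ""; b = $2 ""                 # appending "" forces string comparison
    label = $0
    sub(/^[^[]*/, "", label)             # keep only the "[label=...]" part
    if (b < a) { t = a; a = b; b = t }   # put the two names in a fixed order
    print a " -- " b label
  }' | LC_ALL=C sort -t'[' -k1,1 -u      # compare only the text before "["
}

printf '%s\n' \
  'aa -- bb[label=xyz]' \
  'ab -- bb[label=yzx]' \
  'aa -- bb[label=zxy]' \
  'ac -- ab[label=xyz]' \
  'bb -- aa[label=xzy]' | sort_unique_edges
```

Which label is kept for a repeated pair is up to sort, so if a particular label matters, the awk-only approaches are the safer choice.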
Upvotes: 0