Reputation: 21
I have a file on Unix like the following:
">hello"
"hello"
"newuser"
"<newuser"
"newone"
Now I want to find the unique entries in the file (ignoring the < or > characters only while comparing), with output like this:
">hello"
"<newuser"
"newone"
Upvotes: 2
Views: 503
Reputation: 369164
$ awk '{ w = $1; sub(/[<>]/, "", w) } word[w] == 0 { word[w]++; print $1 }' file1
">hello"
"newuser"
"newone"
Upvotes: 2
Reputation: 369164
#!/usr/bin/env python
import sys

seen = set()
for line in sys.stdin:
    # Compare lines with any '<' or '>' removed.
    word = line.strip().replace('>', '').replace('<', '')
    if word not in seen:
        seen.add(word)
        sys.stdout.write(line)
$ ./uniq.py < file1
">hello"
"newuser"
"newone"
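The expected output in the question lists "<newuser" rather than "newuser", which suggests the marked form should win when a word appears both with and without a marker. That reading is an assumption, not something the question states. A minimal sketch under that assumption follows; the function name `unique_prefer_marked` and the `sample` list are hypothetical and only for illustration.

```python
def unique_prefer_marked(lines):
    """Return one entry per word, ignoring '<' and '>' when comparing.

    Assumption: if a word appears both with and without a marker,
    the first marked form is kept; output order follows each word's
    first appearance.
    """
    chosen = {}  # bare word -> entry to output (dicts keep insertion order, Python 3.7+)
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        key = entry.replace('<', '').replace('>', '')
        # Keep the first entry seen, but let a marked entry replace an unmarked one.
        if key not in chosen or (chosen[key] == key and entry != key):
            chosen[key] = entry
    return list(chosen.values())


sample = ['">hello"', '"hello"', '"newuser"', '"<newuser"', '"newone"']
for entry in unique_prefer_marked(sample):
    print(entry)
```

To run it on a file instead of the sample list, an open file object or `sys.stdin` can be passed in place of `sample`, since the function only iterates over lines.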
Upvotes: 3
Reputation: 61
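Here's that associative array idea in Ruby. Note that later entries overwrite earlier ones, so this keeps the last occurrence of each word rather than the first.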
2.0.0p195 :005 > entries= [">hello", "hello", "newuser", "<newuser", "newone"]
=> [">hello", "hello", "newuser", "<newuser", "newone"]
2.0.0p195 :006 > entries.reduce({}) { |hash, entry| hash[entry.sub(/[<>]/,'')]=entry; hash}.values
=> ["hello", "<newuser", "newone"]
Upvotes: 0