user1757703

Reputation: 3015

Unix Shell Script: Remove duplicate lines while ignoring blank lines

How can I "remove" the duplicate lines from a txt file while ignoring blank lines? Rather than deleting the duplicates, I want to prefix each one with // (comment it out).

I found through a search that awk '!x[$0]++' removes all duplicate lines from a file, but it also treats blank lines as duplicates and removes every blank line after the first. A modification of that awk command would be great, if possible.

Original Input:

foo
bar
cat

dog
turtle
cat
bar
lion
bear

bird
fish
cat

Output:

foo
bar
cat

dog
turtle
// cat
// bar
lion
bear

bird
fish
// cat

Just need to ignore blank lines.

Upvotes: 2

Views: 731

Answers (1)

jaypal singh

Reputation: 77075

Here is one way using awk:

$ awk 'NF{x[$0]++; print (x[$0]>1?"//"$0:$0); next}1' file
foo
bar
cat

dog
turtle
//cat
//bar
lion
bear

bird
fish
//cat

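NF (the number of fields) is zero on blank lines, so the action runs only on non-blank lines. We increment the array x, which uses each line as its key. If the count is greater than 1 we print the line with a // prefix; otherwise we print the line as is. next then skips the trailing 1 for those lines. The 1 pattern is always true, so it prints the remaining (blank) lines unchanged, which is how they are retained.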
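If you want the space after // shown in the question's desired output, a more compact variant is sketched below. The function name comment_dups and the array name seen are just illustrative choices, not part of the original command. It relies on the same idea: seen[$0]++ is 0 (false) the first time a line appears and non-zero afterwards, and the NF test short-circuits so blank lines are never counted.

```shell
# Hypothetical wrapper (the name comment_dups is an assumption for illustration).
# Reads the given files, or stdin if none are given.
comment_dups() {
  # NF            : skip blank lines (they have zero fields)
  # seen[$0]++    : 0 on first occurrence, >0 on every repeat
  # $0 = "// " $0 : rewrite repeated lines with a comment prefix
  # 1             : print every line, modified or not
  awk 'NF && seen[$0]++ { $0 = "// " $0 } 1' "$@"
}

# Example: repeated lines get commented out, both blank lines are kept.
printf 'foo\nbar\n\nfoo\n\nbar\n' | comment_dups
```

Like the one-liner above, this treats lines containing only whitespace as blank, because NF is 0 for them as well.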

Upvotes: 8
