Unix Shell Script: Mark duplicate lines while ignoring blank lines

Question

How can I "remove" the duplicate lines from a txt file while ignoring blank lines? Rather than actually removing them, I want to comment them out by prefixing each duplicate line with //.

I found from a search that the command awk '!x[$0]++' removes all duplicate lines from a file, but it also removes the repeated blank lines. A modification of that awk command would be great, if possible.

Original Input:

foo
bar
cat

dog
turtle
cat
bar
lion
bear

bird
fish
cat

Output:

foo
bar
cat

dog
turtle
// cat
// bar
lion
bear

bird
fish
// cat

Just need to ignore blank newlines and

jaypal singh · Accepted Answer

Here is one way using awk:

$ awk 'NF{x[$0]++; print (x[$0]>1?"//"$0:$0); next}1' file
foo
bar
cat

dog
turtle
//cat
//bar
lion
bear

bird
fish
//cat

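NF is the number of fields on the current line, so using it as a pattern runs the action only on non-blank lines (a blank or whitespace-only line has NF equal to 0). Inside the action we increment the array x, which uses each line as its key, and then print the line with a // prefix if its count is greater than 1, or unchanged otherwise. next skips the remaining rules for that line. The trailing 1 is an always-true pattern with the default print action, so it is reached only by blank lines and prints them as is.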
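
The same logic can also be written as one rule per case, which some readers may find easier to follow. This is only a sketch, not the answer's original command: the function name comment_dups and the array name seen are made up for illustration, and the prefix is "// " with a trailing space so the result matches the output requested in the question.

```shell
# comment_dups is a hypothetical helper name, not part of the original answer.
# It reads the files given as arguments, or standard input if there are none.
comment_dups() {
  awk '
    NF == 0    { print; next }            # blank line: print unchanged, never counted
    seen[$0]++ { print "// " $0; next }   # line was seen before: comment it out
               { print }                  # first occurrence: print as is
  ' "$@"
}
```

It can be called as comment_dups file, or at the end of a pipeline such as printf 'a\n\na\n' | comment_dups. The pattern seen[$0]++ relies on post-increment returning the old count, so it is false (0) the first time a line appears and true on every later occurrence.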