Tenah

Reputation: 63

How to remove partial duplicates from text file?

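How can I remove partial duplicates from a text file in bash using awk, grep, or sort? A line counts as a duplicate when its second value has already appeared on an earlier line; the first occurrence should be kept.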

Input:

"3","6"
"3","7"
"4","9"
"5","6"
"26","48"
"543","7"

Expected Output:

"3","6"
"3","7"
"4","9"
"26","48"

Upvotes: 2

Views: 126

Answers (1)

RavinderSingh13

Reputation: 133518

Could you please try the following and let me know if it helps.

awk -F'[",]' '!a[$5]++' Input_file

For the sample input, the output is as follows.

"3","6"
"3","7"
"4","9"
"26","48"
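
To reproduce this without creating a file first, here is a minimal sketch that pipes the sample lines from the question into the same awk program. The function name dedupe_second_value is only an illustrative assumption, not something from the answer above.

```shell
# dedupe_second_value is a hypothetical helper name used for illustration.
# It reads lines from stdin and prints each line whose second quoted value
# has not been seen before, preserving the input order.
dedupe_second_value() {
  awk -F'[",]' '!a[$5]++'
}

# Feed the six sample lines from the question through the helper.
printf '%s\n' '"3","6"' '"3","7"' '"4","9"' '"5","6"' '"26","48"' '"543","7"' |
  dedupe_second_value
```

Because awk reads stdin when no file is named, the same helper should also work as dedupe_second_value < Input_file.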

EDIT: Adding an explanation here as well.

awk -F'[",]' '   ## Set the field separator to " or , for every line of Input_file.
!a[$5]++         ## Index array a by $5, the fifth field (the second quoted value).
                 ## a[$5] is 0 the first time a value appears, so !a[$5] is true;
                 ## the ++ then marks it as seen and later duplicates are skipped.
                 ## No action is given, so awk prints the matching line by default.
' Input_file     ## The input file to read.
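
It may not be obvious why the second value ends up in $5 rather than $2. The sketch below prints every field of one sample line with its number, using the same separator as the command above. The helper name show_fields is a hypothetical one chosen for this illustration.

```shell
# show_fields is a hypothetical helper: it prints each field of every
# input line together with its field number, splitting on " or , exactly
# as the answer does.
show_fields() {
  awk -F'[",]' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }'
}

# Inspect how one sample line from the question is split.
printf '%s\n' '"3","6"' | show_fields
```

Each quote and the comma is a separate delimiter, so the line splits into six fields: $2 holds 3, $5 holds 6, and the rest are empty. If the empty fields are confusing, splitting on the comma alone and keying on $2 (quotes included) should give the same result for this input: awk -F, '!a[$2]++' Input_file.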

Upvotes: 2
