Reputation: 4750
I have a text file like the one below.
1 1223 abc
2 4234 weroi
0 3234 omsder
1 1111 abc
2 6666 weroi
I want to keep only one line per unique value in column 3, so the result should be the file below.
1 1223 abc
2 4234 weroi
0 3234 omsder
Can I do this with basic Linux commands, without resorting to Java or something similar?
Upvotes: 0
Views: 121
Reputation: 5062
You could do this with some awk scripting. Here is a piece of code I came up with to address your problem:
awk 'BEGIN {col=3; sep=" "; forbidden=sep} {if (match(forbidden, sep $col sep) == 0) {forbidden=forbidden $col sep; print $0}}' input.file
The BEGIN block initializes the forbidden string, which keeps track of the values already seen in the 3rd column. Then, for each line, the match function checks whether forbidden already contains the 3rd column of the current line. If it does not, the script appends that value to forbidden and prints the whole line.
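Spelled out over several lines, the same logic can be sketched as follows. The sample rows and the name input.file come from the question; the shell variable result is only there for illustration.

```shell
# Recreate the sample input from the question.
cat > input.file <<'EOF'
1 1223 abc
2 4234 weroi
0 3234 omsder
1 1111 abc
2 6666 weroi
EOF

result=$(awk '
BEGIN {
    col = 3            # column to deduplicate on
    sep = " "          # delimiter placed around every stored value
    forbidden = sep    # values seen so far, e.g. " abc weroi "
}
{
    # match() returns 0 when " value " does not occur in forbidden
    if (match(forbidden, sep $col sep) == 0) {
        forbidden = forbidden $col sep   # remember the value
        print $0                         # first occurrence: keep the line
    }
}' input.file)

printf '%s\n' "$result"
```

On the sample data this should keep only the first line for each value of column 3.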
Here, sep=" " sets the separator. We put sep between the forbidden values so that values stored next to one another cannot run together and form other words. For instance:
1 1111 ta
2 2222 to
3 3333 t
4 4444 tato
In this case, without a separator, the stored string would be tato after the first two lines, so both t and tato would wrongly be treated as forbidden values. We use " " as the separator because it is the default column delimiter, so a column value cannot contain a space.
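The effect of the separator can be checked with a small experiment. This is only a sketch: the file name sep_demo.txt and the shell variables without_sep and with_sep are made up for the demonstration.

```shell
# The four example lines from above.
cat > sep_demo.txt <<'EOF'
1 1111 ta
2 2222 to
3 3333 t
4 4444 tato
EOF

# No separator: the stored values run together into "tato",
# so the lines with t and tato are dropped by mistake.
without_sep=$(awk '{
    if (match(forbidden, $3) == 0) { forbidden = forbidden $3; print $0 }
}' sep_demo.txt)

# With a separator: forbidden grows as " ta to t tato ",
# and every distinct value is kept.
with_sep=$(awk 'BEGIN { sep = " "; forbidden = sep } {
    if (match(forbidden, sep $3 sep) == 0) { forbidden = forbidden $3 sep; print $0 }
}' sep_demo.txt)

printf 'without separator:\n%s\n' "$without_sep"
printf 'with separator:\n%s\n' "$with_sep"
```

The first run should print only the ta and to lines, while the second prints all four.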
Note that if you want to remove duplicates based on a different column, just replace the 3 in col=3 with the number of the column you need (0 for the whole line, 1 for the first column, 2 for the second, ...).
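As an alternative sketch, and a different technique from the string matching above, the column number can be passed in with -v and the seen values kept in an awk associative array. The array name seen and the shell variables by_col3 and by_col2 are illustrative. One reason to consider this variant is that match() treats its second argument as a regular expression, so column values containing characters such as . or * could match more than intended, whereas an array lookup compares the value literally.

```shell
# Recreate the sample input from the question.
cat > input.file <<'EOF'
1 1223 abc
2 4234 weroi
0 3234 omsder
1 1111 abc
2 6666 weroi
EOF

# seen[$col]++ evaluates to 0 the first time a value appears, so the
# negated expression is true only for first occurrences; awk prints
# the line by default whenever the pattern is true.
by_col3=$(awk -v col=3 '!seen[$col]++' input.file)

# Same command, deduplicating on column 2 instead.
by_col2=$(awk -v col=2 '!seen[$col]++' input.file)

printf 'column 3:\n%s\n' "$by_col3"
printf 'column 2:\n%s\n' "$by_col2"
```

With the sample data, column 3 yields the three lines the question asks for, and column 2 keeps all five lines because its values are already unique.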
Upvotes: 1