Reputation: 3343
I have a file where there are name and time. I want to keep the entry only with the latest time. How do I do it?
for example:
>cat user.txt
"a","03-May-13
"b","13-May-13
"a","13-Aug-13
"a","13-May-13
I am using command sort -u user.txt
. It is giving the following output:
"a","11-May-13
"a","13-Aug-13
"a","13-May-13
"b","13-May-13
but I want the following output.
"a","13-Aug-13
"b","13-May-13
Can someone help?
Thanks.
Upvotes: 0
Views: 1662
Reputation: 40398
How about this?
grep `cut -d'"' -f4 user.txt | sort -t- -k 3 -k 2M -k 1n | tail -1` user.txt
Explaining: using sort as you have done, get the latest entry with tail -1, extract that date (second column when cutting with a comma delimiter) and then sort and grep on that.
edit: fixed to sort via month.
Upvotes: 0
Reputation: 12138
Try this:
sort -t, -k2 user.txt | awk -F, '{a[$1]=$2}END{for(e in a){print e, a[e]}}' OFS=","
Explanation:
Sort the entries by the date field in ascending order, pipe the sorted result to awk, which simply uses the first field as a key, so only the last entry of the entries with an identical key will be kept and finally output.
EDIT
Okay, so I can't sort the entries lexicographically. the date need to be converted to timestamp so it can be compared numerically, use the following:
awk -F",\"" '{ cmd=" date --date " $2 " +%s "; cmd | getline ts; close(cmd); print ts, $0, $2}' user.txt | sort -k1 | awk -F"[, ]" '{a[$2]=$3}END{for(e in a){print e, a[e]}}' OFS=","
If you are using MacOS, use gdate
instead:
awk -F",\"" '{ cmd=" gdate --date " $2 " +%s "; cmd | getline ts; close(cmd); print ts, $0, $2}' user.txt | sort -k1 | awk -F"[, ]" '{a[$2]=$3}END{for(e in a){print e, a[e]}}' OFS=","
Upvotes: 3
Reputation: 660
I think you need to sort year, month and day.
Can you try this
awk -F"\"" '{print $2"-"$4}' data.txt | sort -t- -k4 -k3M -k2 | awk -F- '{kv[$1]=$2"-"$3"-"$4}END{for(k in kv){print k,kv[k]}}'
Upvotes: 1
Reputation: 3343
For me this is doing the job. I am sorting on the Month and then applying the logic that @neevek used. Till now I am unable to find a case that fails this. But I am not sure if this is a full proof solution.
sort -t- -k2 -M user1.txt | awk -F, '{a[$1]=$2}END{for(e in a){print e, a[e]}}' OFS=","
Can someone tell me if this solution has any issues?
Upvotes: 0