Rachit Agrawal

Reputation: 3343

Unix: Get the latest entry from the file

I have a file containing names and dates. For each name, I want to keep only the entry with the latest date. How do I do that?

for example:

>cat user.txt
"a","03-May-13
"b","13-May-13
"a","13-Aug-13
"a","13-May-13

I am using the command sort -u user.txt, which gives the following output:

"a","03-May-13
"a","13-Aug-13
"a","13-May-13
"b","13-May-13

But I want the following output:

"a","13-Aug-13
"b","13-May-13

Can someone help?

Thanks.

Upvotes: 0

Views: 1662

Answers (4)

vikingsteve

Reputation: 40398

How about this?

grep "$(cut -d'"' -f4 user.txt | sort -t- -k3,3n -k2,2M -k1,1n | tail -1)" user.txt

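Explanation: cut extracts the date (the fourth field when splitting on double quotes), sort orders the dates by year, then month, then day, and tail -1 picks the latest one. grep then prints every line of user.txt that contains that date. Note that this returns the entries carrying the single latest date in the file, not the latest date per name.

Edit: fixed to sort by month.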

Upvotes: 0

neevek

Reputation: 12138

Try this:

sort -t, -k2 user.txt | awk -F, '{a[$1]=$2}END{for(e in a){print e, a[e]}}' OFS=","

Explanation:

Sort the entries by the date field in ascending order and pipe the result to awk, which uses the first field as an array key. Each later entry with the same key overwrites the earlier one, so only the last entry for each key is kept and printed.

EDIT

Okay, so the entries can't be sorted lexicographically. The dates need to be converted to timestamps so they can be compared numerically. Use the following:

awk -F",\"" '{ cmd=" date --date " $2 " +%s "; cmd | getline ts; close(cmd); print ts, $0, $2}' user.txt | sort -n -k1,1 | awk -F"[, ]" '{a[$2]=$3}END{for(e in a){print e, a[e]}}' OFS=","

On macOS, use gdate (GNU date, installed with GNU coreutils) instead:

awk -F",\"" '{ cmd=" gdate --date " $2 " +%s "; cmd | getline ts; close(cmd); print ts, $0, $2}' user.txt | sort -n -k1,1 | awk -F"[, ]" '{a[$2]=$3}END{for(e in a){print e, a[e]}}' OFS=","
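
The timestamp idea can also be sketched without spawning date for every line, by building a sortable yymmdd key inside awk. This is only a sketch under some assumptions: the function name latest_per_name is made up for illustration, the lines look like the sample in the question, the month abbreviations are English, and all two-digit years fall in the same century.

```shell
# Hypothetical helper (not from the thread): latest_per_name FILE
# Assumes lines such as "a","03-May-13, English month abbreviations,
# and two-digit years that all belong to the same century.
latest_per_name() {
  awk -F'"' '
    BEGIN {
      # Map month abbreviations to their numbers (Jan -> 1, ..., Dec -> 12).
      n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
      for (i = 1; i <= n; i++) month[names[i]] = i
    }
    {
      # $2 is the name and $4 is the date when splitting on double quotes.
      split($4, d, "-")                       # d[1]=day, d[2]=month, d[3]=year
      key = sprintf("%02d%02d%02d", d[3], month[d[2]], d[1])
      if (!($2 in best) || key > best[$2]) {  # keep the newest date per name
        best[$2] = key
        line[$2] = $0
      }
    }
    END { for (name in line) print line[name] }
  ' "$1" | sort   # awk array order is unspecified, so sort the result by name
}
```

Calling latest_per_name user.txt on the sample data should print one line per name, each with that name's latest date.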

Upvotes: 3

leoleozhu

Reputation: 660

I think you need to sort by year, then month, then day.

Can you try this?

awk -F"\"" '{print $2"-"$4}' data.txt | sort -t- -k4 -k3M -k2 | awk -F- '{kv[$1]=$2"-"$3"-"$4}END{for(k in kv){print k,kv[k]}}'
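
A variant of the same idea that keeps the original line format (quotes and comma) might look like the sketch below. The function name latest_by_sort is hypothetical, and the sketch assumes a sort that supports the -M month option (GNU and BSD sort do; it is not in POSIX), English month abbreviations, and two-digit years within one century.

```shell
# Hypothetical helper (not from the thread): latest_by_sort FILE
# 1. Prefix each line with its date so sort can address day, month, year.
# 2. Sort by year, then month, then day.
# 3. Strip the prefix again and let awk keep the last line seen per name.
latest_by_sort() {
  awk -F'"' '{ print $4 "-" $0 }' "$1" |
    LC_ALL=C sort -t- -k3,3n -k2,2M -k1,1n |
    cut -d- -f4- |
    awk -F'"' '{ last[$2] = $0 } END { for (name in last) print last[name] }' |
    LC_ALL=C sort
}
```

LC_ALL=C is set so that -M recognises the English month names regardless of the user's locale.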

Upvotes: 1

Rachit Agrawal

Reputation: 3343

This is doing the job for me. I sort on the month and then apply the logic that @neevek used. So far I have not found a case where it fails, but I am not sure whether it is a foolproof solution.

sort -t- -k2 -M user1.txt | awk -F, '{a[$1]=$2}END{for(e in a){print e, a[e]}}' OFS=","

Can someone tell me if this solution has any issues?
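
One case that appears to break a month-only sort is data spanning more than one year, because the year never takes part in the comparison. The sketch below wraps the same pipeline in a function (the name month_only_latest is made up for illustration, OFS is passed with -v, and LC_ALL=C is assumed so that -M sees English month names).

```shell
# Hypothetical wrapper around the month-only pipeline, for experimenting.
# sort compares only the month name; the year is ignored as a sort key.
month_only_latest() {
  LC_ALL=C sort -t- -k2 -M "$1" |
    awk -F, -v OFS=',' '{ a[$1] = $2 } END { for (e in a) print e, a[e] }'
}
```

For a file containing only the two lines "a","13-Aug-13 and "a","13-Jan-14, Jan sorts before Aug, so the function reports 13-Aug-13 even though 13-Jan-14 is the later date.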

Upvotes: 0
