Reputation: 451
I have the following file called st.txt:
Item Type Amount Date
Petrol expense -160 2020-01-23
Electricity expense -200 2020-03-24
Electricity expense -200 2020-04-24
Trim line expense -50 2020-05-30
Martha Burns income 150 2021-03-11
Highbury shops income 300 2021-03-14
I want to sort the data by date and print all data except the first line. The following command works:
awk -F '\t' 'NR>1{print $4"\t"$1"\t"$2"\t"$3}' st.txt | sort -t"-" -n -k1 -k2 -k3
The output then is:
2020-01-23 Petrol expense -160
2020-03-24 Electricity expense -200
2020-04-24 Electricity expense -200
2020-05-30 Trim line expense -50
2021-03-11 Martha Burns income 150
2021-03-14 Highbury shops income 300
How can I write this command without rearranging the columns, so that the date field stays at $4? I tried the following, but it does not work:
awk -F '\t' 'NR>1{print $0}' st.txt | sort -t"-" -n -k 4,1 -k 4,2 -k 4,3
The dates are not sorted with this command.
The output should be:
Petrol expense -160 2020-01-23
Electricity expense -200 2020-03-24
Electricity expense -200 2020-04-24
Trim line expense -50 2020-05-30
Martha Burns income 150 2021-03-11
Highbury shops income 300 2021-03-14
Upvotes: 1
Views: 1259
Reputation: 104024
Given:
$ awk '{gsub(/\t/,"\\t")} 1' file
Item\tType\tAmount\tDate
Petrol\texpense\t-160\t2020-01-23
Electricity\texpense\t-200\t2020-03-24
Electricity\texpense\t-200\t2020-04-24
Trim line\texpense\t-50\t2020-05-30
Martha Burns\tincome\t150\t2021-03-11
Highbury shops\tincome\t300\t2021-03-14
You can either use the Decorate / Sort / Undecorate pattern with POSIX awk:
awk 'BEGIN{FS=OFS="\t"} FNR>1{print $4, $0}' file | sort | cut -f 2-
Or use a proper CSV parser configured to use a tab (\t) instead of a comma as the column separator. Ruby is the easiest:
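ruby -r csv -e '
# Parse tab-separated input; the header row is kept as the first row.
options={:col_sep=>"\t", :headers=>true, :return_headers=>true}
data=CSV.parse($<.read, **options).to_a
# Remove the header row so it is neither sorted nor printed.
header=data.shift.to_csv(**options)
# Sort by the fourth column (ISO dates sort correctly as strings).
data.sort_by{|r| r[3]}.each{|r| puts r.to_csv(**options)}
' file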
Either prints:
Petrol expense -160 2020-01-23
Electricity expense -200 2020-03-24
Electricity expense -200 2020-04-24
Trim line expense -50 2020-05-30
Martha Burns income 150 2021-03-11
Highbury shops income 300 2021-03-14
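To see what each stage of the Decorate / Sort / Undecorate pipeline does, here is a minimal sketch that runs the stages one at a time. The file name sample.tsv and the variable names are assumptions made for illustration, and the sample holds only three of the rows from the question:

```shell
# Build a small tab-separated sample in a temporary directory
# (sample.tsv is a hypothetical name used only for this sketch).
tmp=$(mktemp -d)
{
  printf 'Item\tType\tAmount\tDate\n'
  printf 'Highbury shops\tincome\t300\t2021-03-14\n'
  printf 'Petrol\texpense\t-160\t2020-01-23\n'
  printf 'Trim line\texpense\t-50\t2020-05-30\n'
} > "$tmp/sample.tsv"

# Decorate: skip the header and prepend the date (field 4) to every row.
decorated=$(awk 'BEGIN{FS=OFS="\t"} FNR>1{print $4, $0}' "$tmp/sample.tsv")

# Sort: the date is now the first thing on each line, so a plain sort orders by it.
sorted=$(printf '%s\n' "$decorated" | sort)

# Undecorate: drop the prepended date, leaving the original four columns.
result=$(printf '%s\n' "$sorted" | cut -f 2-)

printf '%s\n' "$result"
rm -rf "$tmp"
```

The date column stays at $4 in the final output because the prepended copy is only used as a temporary sort key.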
Upvotes: 0
Reputation: 88766
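With GNU awk (the date is used as the array key, so this assumes every date is unique; rows sharing a date would overwrite each other):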
awk -F '\t' 'NR>1{a[$4]=$0} END{PROCINFO["sorted_in"] = "@ind_str_asc"; for(i in a){print a[i]}}' file
Output:
Petrol expense -160 2020-01-23
Electricity expense -200 2020-03-24
Electricity expense -200 2020-04-24
Trim line expense -50 2020-05-30
Martha Burns income 150 2021-03-11
Highbury shops income 300 2021-03-14
Upvotes: 3
Reputation: 203995
Assuming the fields in your input file are tab-separated as your code suggests they are:
$ tail -n +2 file | sort -t$'\t' -k4
Petrol expense -160 2020-01-23
Electricity expense -200 2020-03-24
Electricity expense -200 2020-04-24
Trim line expense -50 2020-05-30
Martha Burns income 150 2021-03-11
Highbury shops income 300 2021-03-14
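This works because dates in YYYY-MM-DD form sort chronologically when compared as plain strings, so there is no need for -n or for splitting on "-". With -t"-", as in the question's attempt, sort splits each line on hyphens, so field 4 no longer refers to the date column. The sketch below is a variant, not the command above: it limits the key to the fourth field with -k4,4 and builds the tab with printf so it does not rely on the $'\t' quoting. The file name sample.tsv is an assumption made for illustration:

```shell
# Build a small tab-separated sample in a temporary directory
# (sample.tsv is a hypothetical name used only for this sketch).
tmp=$(mktemp -d)
{
  printf 'Item\tType\tAmount\tDate\n'
  printf 'Highbury shops\tincome\t300\t2021-03-14\n'
  printf 'Petrol\texpense\t-160\t2020-01-23\n'
  printf 'Trim line\texpense\t-50\t2020-05-30\n'
} > "$tmp/sample.tsv"

# A literal tab character, for shells without $'\t' quoting.
tab=$(printf '\t')

# Skip the header, then sort on the fourth tab-separated field only.
# -k4,4 ends the key at field 4; plain -k4 runs from field 4 to the end of the line.
sorted=$(tail -n +2 "$tmp/sample.tsv" | sort -t "$tab" -k4,4)

printf '%s\n' "$sorted"
rm -rf "$tmp"
```

For this input the two key forms give the same order, since the date is the last field on every line.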
Upvotes: 3