Reputation: 100
I'm trying to sort a text file by date. My file format is:
...
[15/08/2019 - 01:58:49] some text here
[15/08/2019 - 02:21:23] more text here
[15/08/2019 - 02:56:11] blah blah blah
...
I've tried multiple different methods with the sort command.
One attempt: "sort -b --key=1n --debug Final_out.txt"
sort: using ‘en_US.UTF-8’ sorting rules
sort: key 1 is numeric and spans multiple fields
sort: option '-b' is ignored
^ no match for key
^ no match for key
...
__
.?
^ no match for key
__
.?
^ no match for key
__
sort: write failed: 'standard output': Input/output error
sort: write error
Second attempt: "sort -n -b --key=10,11 --debug Final_out.txt" produced the same output as above.
I'm just about to tear my hair out. This has to be possible, it's Linux! Could someone kindly give me some pointers?
Upvotes: 2
Views: 338
Reputation: 65
I have the same issue with my shell history, where HISTTIMEFORMAT="%d/%m/%y %T ".
Here is the unsorted output, followed by the sort options I used to order it by year, month and day, then by time:
history | awk '/0[78]\/06/{print" "$1" "$2" "$3" command number "NR}'|head -20
1921 07/06/22 09:21:05 command number 925
1922 07/06/22 13:23:31 command number 926
1923 07/06/22 13:24:16 command number 927
1924 07/06/22 13:23:31 command number 928
1925 07/06/22 13:24:16 command number 929
1926 08/06/22 10:59:12 command number 930
1927 08/06/22 10:59:21 command number 931
1928 08/06/22 10:59:26 command number 932
1929 08/06/22 10:59:27 command number 933
1930 08/06/22 10:59:34 command number 934
1931 08/06/22 10:59:44 command number 935
1932 08/06/22 11:01:47 command number 936
1933 08/06/22 11:03:35 command number 937
1934 08/06/22 11:03:44 command number 938
1935 08/06/22 11:03:48 command number 939
1936 08/06/22 11:04:02 command number 940
1937 08/06/22 11:12:17 command number 941
1938 07/06/22 13:24:16 command number 942
1939 08/06/22 09:22:10 command number 943
1940 08/06/22 09:29:41 command number 944
history | awk '/0[78]\/06/{print" "$1" "$2" "$3" command number "NR}'|head -20|sort -bn -k2.7,2.8 -k2.4,2.5 -k2.1,2.2 -k3.1,3.2 -k3.4,3.5 -k3.7,3.8 -k1
1921 07/06/22 09:21:05 command number 925
1922 07/06/22 13:23:31 command number 926
1924 07/06/22 13:23:31 command number 928
1923 07/06/22 13:24:16 command number 927
1925 07/06/22 13:24:16 command number 929
1938 07/06/22 13:24:16 command number 942
1939 08/06/22 09:22:10 command number 943
1940 08/06/22 09:29:41 command number 944
1926 08/06/22 10:59:12 command number 930
1927 08/06/22 10:59:21 command number 931
1928 08/06/22 10:59:26 command number 932
1929 08/06/22 10:59:27 command number 933
1930 08/06/22 10:59:34 command number 934
1931 08/06/22 10:59:44 command number 935
1932 08/06/22 11:01:47 command number 936
1933 08/06/22 11:03:35 command number 937
1934 08/06/22 11:03:44 command number 938
1935 08/06/22 11:03:48 command number 939
1936 08/06/22 11:04:02 command number 940
1937 08/06/22 11:12:17 command number 941
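Explanation of the sort -bn -k2.7,2.8 -k2.4,2.5 -k2.1,2.2 -k3.1,3.2 -k3.4,3.5 -k3.7,3.8 -k1 command:
Each -kF.C,F.C key selects a range of characters inside field F. Field 2 is the date, so 2.7,2.8 is the year, 2.4,2.5 the month and 2.1,2.2 the day. Field 3 is the time, so 3.1,3.2, 3.4,3.5 and 3.7,3.8 are the hours, minutes and seconds. The final -k1 breaks any remaining ties on the history number. -b ignores leading blanks when counting characters, and -n compares each key numerically.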
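And for @Ventus, whose lines start with [15/08/2019 - 01:58:49], the solution can be sort -b -n -k1.8,1.11 -k1.5,1.6 -k1.2,1.3 -k3.1,3.2 -k3.4,3.5 -k3.7,3.8 (the year occupies characters 8 to 11 of the first field, after the opening bracket, and -b is needed so that the blank in front of the time field is not counted).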
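As a quick check of the character offsets for the bracketed format in the question, here is a small sketch run on the sample lines. Counting from the `[`, the day sits at characters 2-3 of field 1, the month at 5-6 and the year at 8-11; the time is field 3, and `-b` is there so the blank in front of it is not counted. The function name `sort_by_date` is made up for this illustration, and GNU sort is assumed.

```shell
#!/bin/sh
# sort_by_date: hypothetical helper (not part of any tool); reads lines like
# "[15/08/2019 - 01:58:49] text" on stdin and sorts them chronologically.
sort_by_date() {
    # -b: skip leading blanks when counting characters inside a field
    # -n: compare each key numerically
    # keys: year, month, day (field 1), then hour, minute, second (field 3)
    LC_ALL=C sort -b -n \
        -k1.8,1.11 -k1.5,1.6 -k1.2,1.3 \
        -k3.1,3.2 -k3.4,3.5 -k3.7,3.8
}

sort_by_date <<'EOF'
[10/01/2020 - 01:23:45] lorem ipsum
[15/08/2019 - 02:21:23] more text here
[15/08/2019 - 02:56:11] blah blah blah
[15/08/2019 - 01:58:49] some text here
[14/08/2019 - 12:34:56] dolor sit amet
EOF
```

On this sample the 14/08/2019 line should come out first and the 10/01/2020 line last. If the log ever gains leading whitespace or a different bracket layout, the offsets have to be recounted.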
Upvotes: 0
Reputation: 785531
Here is a shorter alternative using GNU awk:
cat file
[10/01/2020 - 01:23:45] lorem ipsum
[15/08/2019 - 02:21:23] more text here
[15/08/2019 - 02:56:11] blah blah blah
[15/08/2019 - 01:58:49] some text here
[14/08/2019 - 12:34:56] dolor sit amet
Use this awk:
awk -v FPAT='[0-9:]+' '{ map[$3,$2,$1,$4] = $0 }
END { PROCINFO["sorted_in"]="@ind_str_asc"; for (k in map) print map[k] }' file
[14/08/2019 - 12:34:56] dolor sit amet
[15/08/2019 - 01:58:49] some text here
[15/08/2019 - 02:21:23] more text here
[15/08/2019 - 02:56:11] blah blah blah
[10/01/2020 - 01:23:45] lorem ipsum
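One thing to keep in mind with the array approach: lines that share exactly the same timestamp map to the same index, so only the last of them is printed. If that matters, a decorate-sort-undecorate pipeline is one way around it. The sketch below is a variant under stated assumptions, not the method above: it relies on the timestamp starting at the first character of each line in exactly the layout shown, uses plain awk `substr` and `split` instead of `FPAT`, uses the `-s` (stable) option found in GNU and BSD sort, and the function name `sort_keep_dups` is invented for the example.

```shell
#!/bin/sh
# sort_keep_dups: hypothetical helper; sorts "[DD/MM/YYYY - HH:MM:SS] text"
# lines from stdin by timestamp, keeping lines with identical timestamps.
sort_keep_dups() {
    tab=$(printf '\t')
    # Decorate: prepend a key "YYYYMMDDHH:MM:SS" and a tab to every line.
    awk '{
        split(substr($0, 2, 10), d, "/")   # d[1]=day, d[2]=month, d[3]=year
        t = substr($0, 15, 8)              # HH:MM:SS
        print d[3] d[2] d[1] t "\t" $0
    }' |
    # Sort on the key only; -s keeps equal keys in their input order.
    LC_ALL=C sort -s -t "$tab" -k1,1 |
    # Undecorate: drop the key again.
    cut -f2-
}

sort_keep_dups <<'EOF'
[10/01/2020 - 01:23:45] lorem ipsum
[15/08/2019 - 02:21:23] more text here
[15/08/2019 - 02:21:23] same second, second entry
[14/08/2019 - 12:34:56] dolor sit amet
EOF
```

Both 02:21:23 lines survive here and stay in their original relative order, which the associative-array version would not guarantee.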
Upvotes: 2
Reputation: 22032
As Shawnn suggests, how about a bash solution:
#!/bin/bash
pat='^\[([0-9]{2})/([0-9]{2})/([0-9]{4})[[:blank:]]+-[[:blank:]]+([0-9]{2}:[0-9]{2}:[0-9]{2})\]'
while IFS= read -r line; do
    if [[ $line =~ $pat ]]; then
        m=( "${BASH_REMATCH[@]}" ) # make a copy just to shorten the variable name
        # prepend a sortable key (YYYYMMDD_HH:MM:SS) and a tab;
        # printf, unlike echo -e, leaves any backslashes in $line untouched
        printf '%s\t%s\n' "${m[3]}${m[2]}${m[1]}_${m[4]}" "$line"
    fi
done < file.txt | sort -t $'\t' -k1,1 | cut -f2-
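The variable pat is a regular expression that matches the date and time field and fills the bash array BASH_REMATCH[@] with the day, month, year and time, in order (indices 1 to 4).
Each matching line is then printed with a new key built from those parts (year, month, day, time) in front of it, and the result is piped to sort, keyed on the 1st field.
Finally the 1st field is cut off.
The input file file.txt: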
[10/01/2020 - 01:23:45] lorem ipsum
[15/08/2019 - 02:21:23] more text here
[15/08/2019 - 02:56:11] blah blah blah
[15/08/2019 - 01:58:49] some text here
[14/08/2019 - 12:34:56] dolor sit amet
Output:
[14/08/2019 - 12:34:56] dolor sit amet
[15/08/2019 - 01:58:49] some text here
[15/08/2019 - 02:21:23] more text here
[15/08/2019 - 02:56:11] blah blah blah
[10/01/2020 - 01:23:45] lorem ipsum
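To make the capture groups concrete, here is a small sketch that only performs the match and prints what ends up in BASH_REMATCH for one sample line. It reuses the pat from the script; the function name `show_parts` is made up for this illustration, and bash is required since `[[ ... =~ ... ]]` is not available in plain POSIX sh.

```shell
#!/bin/bash
pat='^\[([0-9]{2})/([0-9]{2})/([0-9]{4})[[:blank:]]+-[[:blank:]]+([0-9]{2}:[0-9]{2}:[0-9]{2})\]'

# show_parts: hypothetical helper; prints the captured pieces of one line.
show_parts() {
    if [[ $1 =~ $pat ]]; then
        # Index 0 is the whole match; 1..4 are the parenthesised groups.
        printf 'day=%s month=%s year=%s time=%s\n' \
            "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" \
            "${BASH_REMATCH[3]}" "${BASH_REMATCH[4]}"
    else
        printf 'no match\n'
    fi
}

show_parts '[15/08/2019 - 01:58:49] some text here'
# prints: day=15 month=08 year=2019 time=01:58:49
```

Note that in the full script, lines that do not match pat are silently dropped from the output, so it may be worth checking for such lines before relying on the sorted result.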
Upvotes: 3