mynameisJEFF

Reputation: 4239

Bash: sort rows within a file by timestamp

I am new to Bash scripting and have written a script that matches lines against a regex and writes them to a file.

Each line contains multiple columns, one of which is a timestamp in the form YYYYMMDDHHMMSSTTT (millisecond precision), as shown below.

20180301050630663,ABC,,,,,,,,,,
20180301050630664,ABC,,,,,,,,,,
20180301050630665,ABC,,,,,,,,,,
20180301050630666,ABC,,,,,,,,,,
20180301050630667,ABC,,,,,,,,,,
20180301050630668,ABC,,,,,,,,,,
20180301050630663,ABC,,,,,,,,,,
20180301050630665,ABC,,,,,,,,,,
20180301050630661,ABC,,,,,,,,,,
20180301050630662,ABC,,,,,,,,,,

My code is written as follows:

awk -F "," -v OFS="," '{if($2=="ABC"){print}}' < $i >> "$filename"

How can I modify my code such that it can sort the rows by timestamp (YYYYMMDDHHMMSSTTT) in ascending order before printing to file?

Upvotes: 2

Views: 1834

Answers (3)

dawg

Reputation: 104014

If you are using gawk you can do:

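$ awk -F "," -v OFS="," '$2=="ABC"{                  # Filter lines that have "ABC"
                             # append instead of overwrite, so duplicate timestamps are kept
                             a[$1] = ($1 in a) ? a[$1] ORS $0 : $0
                         }
                         END{ # set the sort method; string order is exact for fixed-width
                              # timestamps (17 digits exceed double precision)
                             PROCINFO["sorted_in"] = "@ind_str_asc"
                             for (e in a) print a[e] # traverse the array of lines
                         }' file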

An alternative is to use sed and sort:

sed -n '/^[0-9]*,ABC,/p' file | sort -t, -k1,1n

Keep in mind that neither of these methods depends on the shell used. Bash just executes the tools (sed, awk, sort, etc.), which are separate programs that are otherwise part of the OS.

Bash itself could do the sort in pure Bash but it would be long and slow.

Upvotes: 1

tripleee

Reputation: 189628

Just add a pipeline.

awk -F "," '$2=="ABC"' < "$i" |
sort -n >> "$filename"

In the general case, to sort on column 234, try sort -t, -k234,234n

Notice also the quoting around "$i", like you already have around "$filename", and the simplification of the Awk script.
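
To make the general case concrete, here is a minimal sketch that wraps the filter and the sort in one function. The name filter_sorted and its argument order are hypothetical, not part of the answer above. It assumes the sort column holds fixed-width values such as the 17-digit timestamps, so a plain byte-wise comparison under LC_ALL=C gives the right order; for variable-width numbers you would append n to the key instead.

```shell
# filter_sorted TAG COLUMN FILE  (hypothetical helper for illustration)
# Print the rows of FILE whose 2nd comma-separated field equals TAG,
# ordered by the 1-based COLUMN given as the second argument.
filter_sorted() {
    awk -F ',' -v tag="$1" '$2 == tag' "$3" |
        LC_ALL=C sort -t, -k"$2","$2"   # key is limited to that one column
}

# Example call, appending to the output file from the question:
#   filter_sorted ABC 1 "$i" >> "$filename"
```

Passing the column as an argument keeps the -kN,N form from the general case in one place, so switching the sort column does not require touching the Awk filter.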

Upvotes: 1

David C. Rankin

Reputation: 84579

You can use a very simple sort command, e.g.

sort yourfile

If you want to ensure sort uses only the datestamp as its key, you can restrict the key to the first comma-separated field, e.g.

sort -t, -k1,1 yourfile

Example Use/Output

With your data saved in a file named log, you could do:

$ sort -t, -k1,1 log
20180301050630661,ABC,,,,,,,,,,
20180301050630662,ABC,,,,,,,,,,
20180301050630663,ABC,,,,,,,,,,
20180301050630663,ABC,,,,,,,,,,
20180301050630664,ABC,,,,,,,,,,
20180301050630665,ABC,,,,,,,,,,
20180301050630665,ABC,,,,,,,,,,
20180301050630666,ABC,,,,,,,,,,
20180301050630667,ABC,,,,,,,,,,
20180301050630668,ABC,,,,,,,,,,
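
How the key is written matters when rows share a timestamp. The sketch below uses made-up three-row data and hypothetical function names to compare a key that runs from field 1 to the end of the line (-k1) with a key limited to field 1 (-k1,1). It assumes a sort that supports the stable flag -s, as GNU and BSD sort do, so that tied rows keep their input order.

```shell
# Three sample rows; the two rows with key 1 arrive in the order z, a.
sample() { printf '%s\n' '2,b' '1,z' '1,a'; }

# Key runs from field 1 to the end of the line, so "1,a" sorts before "1,z".
whole_line_key() { sample | LC_ALL=C sort -s -t, -k1; }

# Key is field 1 only; -s keeps tied rows in input order (z before a).
first_field_key() { sample | LC_ALL=C sort -s -t, -k1,1; }

whole_line_key    # prints 1,a then 1,z then 2,b
first_field_key   # prints 1,z then 1,a then 2,b
```

For the log data above both forms give the same result, because the rows with equal timestamps are identical.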

Let me know if you have any problems.

Upvotes: 5
