Reputation: 105
I have a record file that stores the statuses of our systems by date. The script that generates it runs via cron, so the file keeps getting longer. I wrote a script that iterates over every line to process it, and this takes a very long time. I've heard that awk is much faster at processing large text files, but I've never used it. Is it possible to use awk to get all entries within a date range? The dates are all in seconds, as they were produced with date +%s.
Here is an example of the data I would like to search quickly. For example, how could I get all lines where the first column is between 1344279903 and 1344280204?
1344279903 | 0 | 0 | node | 1
1344279904 | 0 | 0 | node | 2
1344279905 | 0 | 0 | node | 3
1344280202 | 0 | 0 | node | 1
1344280203 | 0 | 0 | node | 2
1344280204 | 99 | 0 | node | 3
Upvotes: 3
Views: 1713
Reputation: 46856
Here's my take on this:
#!/usr/bin/awk -f
BEGIN {
start=ARGV[1]; ARGV[1]="";
end=ARGV[2]; ARGV[2]="";
}
$1 < start { next }
$1 > end { exit }
1
How does this work?
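Awk uses a series of "condition { command }" blocks that are applied to each line of input. The BEGIN block is a "magic" one that runs before input starts. (There's a similar END block for the end of input, but we're not using it here.) In it, we copy the first two arguments into start and end, then blank them out in ARGV so that awk doesn't try to open them as input files.
With next, we tell awk to read a new line of input and start processing its conditions all over again. With exit, we tell it to stop reading input altogether, which is safe as long as the file is in chronological order. The bare 1 on the last line is a condition that is always true and has no command, so awk falls back to its default action and prints the line.
Here it is in action, on your sample data: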
ghoti@pc$ ./awkdate 1344279905 1344280203 data.txt
1344279905 | 0 | 0 | node | 3
1344280202 | 0 | 0 | node | 1
1344280203 | 0 | 0 | node | 2
ghoti@pc$
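The same logic also fits on the command line without a separate script file. The following is only a minimal sketch, assuming the log is sorted by timestamp; it passes the bounds with awk's standard -v option instead of editing ARGV. The scratch file and the variable names log, start, end and result are placeholders chosen for this example:

```shell
# Recreate the sample data from the question in a scratch file.
log=$(mktemp)
cat > "$log" <<'EOF'
1344279903 | 0 | 0 | node | 1
1344279904 | 0 | 0 | node | 2
1344279905 | 0 | 0 | node | 3
1344280202 | 0 | 0 | node | 1
1344280203 | 0 | 0 | node | 2
1344280204 | 99 | 0 | node | 3
EOF

# -v assigns awk variables before any input is read.
# Skip lines before the range, stop reading once past it, print the rest.
result=$(awk -v start=1344279905 -v end=1344280203 '
    $1 < start { next }
    $1 > end   { exit }
    { print }
' "$log")

printf '%s\n' "$result"
rm -f "$log"
```

On the sample data this should print the same three lines as the script above. If the file is not in chronological order, the exit rule has to go, since later lines could still fall inside the range.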
Upvotes: 2
Reputation: 77127
With awk?
awk -F'|' '1344279903 <= $1 && $1 <= 1344280204' file
With sed?
sed -n '/1344279903/,/1344280204/p' file
You can make the awk expression even more efficient by exiting as soon as a line falls past the end of the range (this assumes the file is sorted by timestamp):
awk -F'|' '1344279903 <= $1 && $1 <= 1344280204 { print $0 } $1 > 1344280204 { exit }' file
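One caveat about the sed version: its range only opens when a line actually matches the first pattern, so a start time that has no entry in the log selects nothing, whereas a numeric test in awk still works. A small sketch of the difference on the sample data, using a start time (1344279900) that was picked for this illustration because it does not occur in the file; the awk command here relies on the default whitespace field splitting, and the variable names are placeholders:

```shell
# Recreate the sample data from the question in a scratch file.
log=$(mktemp)
cat > "$log" <<'EOF'
1344279903 | 0 | 0 | node | 1
1344279904 | 0 | 0 | node | 2
1344279905 | 0 | 0 | node | 3
1344280202 | 0 | 0 | node | 1
1344280203 | 0 | 0 | node | 2
1344280204 | 99 | 0 | node | 3
EOF

# 1344279900 is earlier than every entry but is not itself in the file.
sed_out=$(sed -n '/1344279900/,/1344280204/p' "$log")
awk_out=$(awk '1344279900 <= $1 && $1 <= 1344280204' "$log")

# Count the non-empty lines each command selected.
sed_count=$(printf '%s\n' "$sed_out" | awk 'NF { n++ } END { print n+0 }')
awk_count=$(printf '%s\n' "$awk_out" | awk 'NF { n++ } END { print n+0 }')

echo "sed selected $sed_count lines, awk selected $awk_count lines"
rm -f "$log"
```

So the sed form is only a safe choice when both boundary timestamps are known to appear in the file.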
Upvotes: 3
Reputation: 51904
You can use a conditional expression like so:
awk '$1 >= 1344279903 && $1 <= 1344280204 { print $0 }' data.txt
Upvotes: 4