Reputation: 546
Given a regex, I want to print first occurrence of each unique match with its line number using bash.
For example, lets say the regex is .*Exception
, I want to print,
$./script.sh file.log
6255:2016-09-07 10:05:37,886 ERROR some text java.lang.IllegalMonitorStateException
6714:2016-09-07 10:12:09,514 ERROR some text java.lang.NullPointerException
7013:2016-09-07 10:19:19,950 ERROR some text java.lang.IllegalStateException
I came up with a version, but it is very slow :( (on git-bash). Any pointers on how to increase performance is appreciated.
FILE_NAME=$1
while read line
do
grep "$line" "$FILE_NAME" -m1 -n
done < <(grep '\b[^ ]*Exception\b' "$FILE_NAME" | sort -u) | sort -n
Update (adding sample data):
2016-09-07 23:58:55,674 ERROR [STDERR] (pool-18-thread-1) Continuing ...
2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.InstantiationException: java.sql.Timestamp
2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) Continuing ...
2016-09-07 23:56:16,273 WARN [com.arjuna.ats.jta.logging.loggerI18N] (Thread-12) [com.arjuna.ats.internal.jta.recovery.xarecovery1] Local XARecoveryModule.xaRecovery got XA exception javax.transaction.xa.XAException, XAException.XAER_RMERR
2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();
2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) Continuing ...
2016-09-07 23:58:26,304 WARN [com.arjuna.ats.jta.logging.loggerI18N] (Thread-12) [com.arjuna.ats.internal.jta.recovery.xarecovery1] Local XARecoveryModule.xaRecovery got XA exception javax.transaction.xa.XAException, XAException.XAER_RMERR
Above should produce:
2:2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.InstantiationException: java.sql.Timestamp
4:2016-09-07 23:56:16,273 WARN [com.arjuna.ats.jta.logging.loggerI18N] (Thread-12) [com.arjuna.ats.internal.jta.recovery.xarecovery1] Local XARecoveryModule.xaRecovery got XA exception javax.transaction.xa.XAException, XAException.XAER_RMERR
5:2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();
Upvotes: 1
Views: 118
Reputation: 37394
In Gnu awk:
$ awk '/Exception/ && !seen[gensub(/^([^ ]* ){2}/,"","g")]++ {print NR,$0}' file.log
2 2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.InstantiationException: java.sql.Timestamp
4 2016-09-07 23:56:16,273 WARN [com.arjuna.ats.jta.logging.loggerI18N] (Thread-12) [com.arjuna.ats.internal.jta.recovery.xarecovery1] Local XARecoveryModule.xaRecovery got XA exception javax.transaction.xa.XAException, XAException.XAER_RMERR
5 2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();
Print record if:
/Exception/
matches &&
and !seen[...]++
key hasn't been seen earliergensub(/^([^ ]* ){2}/,"","g")
key created by removing from start ^
to second space print NR,$0
print current record number and recordUpvotes: 1
Reputation: 23667
$ cat ip.txt
2016-09-07 23:58:55,674 ERROR [STDERR] (pool-18-thread-1) Continuing ...
2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.InstantiationException: java.sql.Timestamp
2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) Continuing ...
2016-09-07 23:56:16,273 WARN [com.arjuna.ats.jta.logging.loggerI18N] (Thread-12) [com.arjuna.ats.internal.jta.recovery.xarecovery1] Local XARecoveryModule.xaRecovery got XA exception javax.transaction.xa.XAException, XAException.XAER_RMERR
2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();
2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) Continuing ...
2016-09-07 23:58:26,304 WARN [com.arjuna.ats.jta.logging.loggerI18N] (Thread-12) [com.arjuna.ats.internal.jta.recovery.xarecovery1] Local XARecoveryModule.xaRecovery got XA exception javax.transaction.xa.XAException, XAException.XAER_RMERR
$ perl -ne '($e)=/(\w+Exception)/; print "$.:$_" if !$seen{$e}++ && /Exception/' ip.txt
2:2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.InstantiationException: java.sql.Timestamp
4:2016-09-07 23:56:16,273 WARN [com.arjuna.ats.jta.logging.loggerI18N] (Thread-12) [com.arjuna.ats.internal.jta.recovery.xarecovery1] Local XARecoveryModule.xaRecovery got XA exception javax.transaction.xa.XAException, XAException.XAER_RMERR
5:2016-09-07 23:58:55,675 ERROR [STDERR] (pool-18-thread-1) java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();
($e)=/(\w+Exception)/
saves the type of exception in $e
variable!$seen{$e}++
makes sure only first line matching the exception is printed&& /Exception/
to print only lines containing Exception
print "$.:$_"
print line number, :
and the input line
Edit:
This should work too and faster...
perl -ne 'if(/(\w+Exception)/){print "$.:$_" if !$seen{$1}++}' ip.txt
Upvotes: 1