Reputation: 905
I have a log file (file.log) with multiple occurrences of ids, e.g. 82244956.
file.log was created using the command:
gzip -cd /opt/log.gz | grep "JBOSS1-1" >> ~/file.log
Example:
2012-04-10 09:01:18,196 LOG (7ysdhsdjfhsdhjkwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244956
2012-04-10 09:02:18,196 LOG (24343sdjjkidgyuwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244956
2012-04-10 09:03:18,196 LOG (6744443jfhsdgyuwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244957
2012-04-10 09:04:18,196 LOG (7ysdhsd5677dgyuwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244957
Likewise, we have 10000 rows with different ids, each id repeating 2-3 times. In the example above, the first two rows share id 82244956 and the last two share id 82244957. We need a result set with one row per UNIQUE id (any row from the matching ones), i.e.:
2012-04-10 09:01:18,196 LOG (7ysdhsdjfhsdhjkwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244956
2012-04-10 09:03:18,196 LOG (6744443jfhsdgyuwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244957
I tried an awk program on Linux, but it did not work:
awk ' { arr[$1]=$0 } END { for ( key in arr ) { print arr[key] } } ' file.log >> final-report.log
Or a better way would be to create file.log with distinct ids only.
Please advise how I can modify it.
Upvotes: 0
Views: 1372
Reputation: 11
The following script gives the desired result. It keeps the first record for each id by checking, in the main processing block, whether the id has already been seen before storing the line.
awk '{
    split($0, arr, "id:")    # arr[2] holds the text after "id:"
    id_num = arr[2]
    if (!(id_num in line)) line[id_num] = $0    # keep only the first row per id
}
END { for (i in line) print line[i] }' file.log > result.log
Upvotes: 0
Reputation: 753545
$1 is the first field, the date. The id is the last field, $NF in awk parlance. So:
awk '{arr[$NF] = $0} END { for (key in arr) { print arr[key] } }' file.log >> final-report.log
This keeps the last record with the given key. To keep the first record, you'd have to do a conditional assignment in the main processing part of the script.
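As a minimal sketch of that conditional assignment, assuming the id is always the last whitespace-separated field: the file names sample.log, first-by-id.log and first-in-order.log are hypothetical scratch names, and the sample rows are copied from the question.

```shell
# Sample rows from the question, written to a scratch file.
cat > sample.log <<'EOF'
2012-04-10 09:01:18,196 LOG (7ysdhsdjfhsdhjkwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244956
2012-04-10 09:02:18,196 LOG (24343sdjjkidgyuwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244956
2012-04-10 09:03:18,196 LOG (6744443jfhsdgyuwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244957
2012-04-10 09:04:18,196 LOG (7ysdhsd5677dgyuwe:IN) JBOSS1-1 (RP-yedgdh5567) [PayPalWeb] Fetch data with id: 82244957
EOF

# Conditional assignment: store a line only if its id ($NF) is not yet a key,
# so the first record for each id is the one that survives.
awk '!($NF in arr) { arr[$NF] = $0 } END { for (key in arr) print arr[key] }' sample.log > first-by-id.log

# Alternative idiom: seen[$NF]++ evaluates to 0 (false) only the first time
# an id appears, so each first record is printed at once, in input order.
awk '!seen[$NF]++' sample.log > first-in-order.log
```

The for (key in arr) loop visits keys in an unspecified order, so the first variant may not print rows chronologically; the second variant preserves input order. The same filter could probably be appended to the pipeline from the question (gzip -cd /opt/log.gz | grep "JBOSS1-1" | awk '!seen[$NF]++' > ~/file.log) so that file.log holds distinct ids from the start.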
Upvotes: 3