Iqmal

Reputation: 55

How to extract data from log_file.txt using bash

I have a master_log_file.txt, and its contents look like this:

TransferDate|DeptID|FolderID             |DocID |AFPFileName|NoOfAcct| 
20181024    |1     |LRREM1.20181015.CGLOG|test  |xxxx       |12
20181024    |2     |LRREM2.20181013.CGLOG|home  |XyyX       |2
20181024    |3     |LRREM3.20181013.CGLOG|office|xy         |5
20181024    |4     |LRREM4.20181013.CGLOG|store |yy         |10

I want to create a bash script that separates all the log data according to the FolderID and DeptID into separate log files. Can someone give me an example of how to do this? I'm new to this batch/bash scripting. Thanks in advance. Below is my bash file, following mjuarez's suggestion:

echo off
for folder in `grep -v TransferDate log_test.txt | cut -d "|" -f3 | sort | uniq`; do 
   grep ${folder} separated.txt > F:/Work/FLP Code/test/folder_${folder}.txt; 
done
pause

Am I missing something?

Upvotes: 1

Views: 87

Answers (3)

Paul Hodges

Reputation: 15438

First, cf. this link on showing what you've tried and generally making it worth other people's time, so they feel you've done your due diligence.

Second: Is that format consistent? It's obviously formatted, so I'm going to assume it is.

cut -c 14-41 logFile | grep -v DeptID | sort -u |
  while read key
  do IFS="$IFS|" read dept folder <<< "$key";
     grep "$key" < logFile > $folder.$dept;
  done

14-41 is the range of the keys you mention, which I pull with cut.

Strip the header with grep -v and sort -u them to get a unique set of combinations defining each output file. Pipe that to a while read loop.
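
For reference, on the sample rows above, that first pipeline stage (cut, grep -v, sort -u) yields these keys, assuming the columns really are fixed-width as shown:

1     |LRREM1.20181015.CGLOG
2     |LRREM2.20181013.CGLOG
3     |LRREM3.20181013.CGLOG
4     |LRREM4.20181013.CGLOG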

Adding the pipe character to a temporary assignment of $IFS lets read split the department and folder into variables, which I use to build distinct filenames for your outputs; then I grep each key's combination into its own file.
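
To see that temporary assignment in isolation, here is a quick sketch using the first key from the sample data:

IFS="$IFS|" read dept folder <<< "1     |LRREM1.20181015.CGLOG"
echo "$folder.$dept"
# prints: LRREM1.20181015.CGLOG.1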

Does that do what you need?

I see someone beat me to it, but I didn't assume the folder values were always consistent, since you mentioned dept separately.

Upvotes: 1

mjuarez

Reputation: 16854

You can basically iterate over the unique values you want to categorize by (I used the FolderID column in this case) and, using grep, send only the matching records to their own files.

for folder in `grep -v TransferDate file.txt | cut -d "|" -f3 | sort | uniq`; do 
   grep ${folder} file.txt > /tmp/folder_${folder}.txt; 
done

That creates the following files:

folder_LRREM1.20181015.CGLOG.txt  folder_LRREM3.20181013.CGLOG.txt
folder_LRREM2.20181013.CGLOG.txt  folder_LRREM4.20181013.CGLOG.txt

You can change the initial grep in the loop to use exactly the unique field or combination of fields you want.
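
For example, a variation keyed on DeptID (column 2) instead might look like this (a sketch, assuming the same file.txt):

for dept in `grep -v TransferDate file.txt | cut -d "|" -f2 | sort | uniq`; do
   # word splitting in the for loop strips the column padding from ${dept},
   # so match the field with optional trailing spaces before the next pipe
   grep -E "\|${dept} *\|" file.txt > /tmp/dept_${dept}.txt
done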

Updated:

This is the finalized script, keying on the two fields (DeptID and FolderID) and creating a separate file for each combination:

for key in `cat file.txt | grep -v FolderID | awk 'BEGIN { FS="|" } { gsub(/[ \t]/, ""); print $2 "_" $3 }' | sort | uniq` ; do
   value1=`echo $key | cut -d_ -f1`   # DeptID
   value2=`echo $key | cut -d_ -f2`   # FolderID
   grep -E "\|${value1} *\|${value2}" file.txt > /tmp/key_${key}.txt
done

It works a bit differently from the first one: it has to match by both keys, so inside the loop it builds a small regexp that looks for lines containing both values, and sends them all to a file that carries the full key in its name.
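
On the sample data, that would produce files such as:

key_1_LRREM1.20181015.CGLOG.txt  key_3_LRREM3.20181013.CGLOG.txt
key_2_LRREM2.20181013.CGLOG.txt  key_4_LRREM4.20181013.CGLOG.txt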

Upvotes: 0

Bsquare ℬℬ

Reputation: 4487

Since you asked to separate all the log data according to FolderID and DeptID, you can process the input file (let's call it /tmp/log_file.txt) this way:

#!/bin/bash

# Build the unique DeptID_FolderID keys, skipping the header row and stripping column padding.
for key in $( cat /tmp/log_file.txt |grep -v TransferDate |sed -e 's/[ \t]//g;' |awk -F '|' '{print $2"_"$3}' |sort -u ); do
  fileName="$key"
  # Turn the key back into a pattern matching "DeptID<padding>|FolderID" in the raw file.
  filter=$( echo "$key" |sed -e "s/\([^_]*\)_\(.*\)$/\1[ \t]*|\2/" )
  grep -re "$filter" /tmp/log_file.txt > "/tmp/$fileName"
done
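
For illustration, the filter built for the first data row looks like this:

echo "1_LRREM1.20181015.CGLOG" | sed -e "s/\([^_]*\)_\(.*\)$/\1[ \t]*|\2/"
# prints: 1[ \t]*|LRREM1.20181015.CGLOG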

Don't hesitate to ask if you need further explanation.

Upvotes: 1
