Symonds

Reputation: 194

Shell Script to prepare unformatted data

I have a text file, TEST.txt, which contains the unformatted data below:

0411 14:30:00 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Executing cron report Freigabe 14:30 for cron job Freigabe 14:30 for [email protected], [email protected]
0411 14:30:02 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Freigaben had no results
0411 14:30:02 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Freigabe 14:30 NOT sent to [email protected], [email protected] since all reports were empty and empty reports should not be send
0411 17:03:14 INF[baag.reporting.db.DataSourceMapFactory] Datasource [itraderdbint] has been added to datasource map
0411 17:03:14 INF[baag.reporting.db.DataSourceMapFactory] Datasource [otc_sv2599] has been added to datasource map
0411 17:03:14 INF[baag.reporting.db.DataSourceMapFactory] Datasource [qlp_devp] has been added to datasource map
0411 17:03:15 INF[baag.reporting.main.Logss.QuarzLogsManager] Added Trigger for QUARTZ that fires next on Tue Apr 13 08:00:00 CEST 2021 for Logs Compliance MAR Crossingprüfung/Frontrunning DI-FR
0411 17:03:15 INF[baag.reporting.main.Logss.QuarzLogsManager] Added Trigger for QUARTZ that fires next on Tue Apr 13 08:20:00 CEST 2021 for Logs Compliance OR Umsatzstatistik DI-FR
0411 17:03:15 INF[baag.reporting.main.Logss.QuarzLogsManager] Added Trigger for QUARTZ that fires next on Mon Apr 12 08:20:00 CEST 2021 for Logs Compliance OR Umsatzstatistik MO

Now I want to create a shell script that converts this unformatted data into the format below and writes it to a file, for example PreparedFile.txt. Every field should be separated by a pipe character (|). The first field is the date and time, which I want to keep together as one string. The second field always starts with INF[ and ends with ]; in other words, it is the whole space-free token beginning with INF[. The third field is the rest of the line. I also want to add a header line so it is clear what each field means:

DATE_FORMAT|ROW_EXECUTE|ROW_VALUE
0411 14:30:00|INF[baag.reporting.main.Logss.ExecuteLogsRunnable]|Executing cron report Freigabe 14:30 for cron job Freigabe 14:30 for [email protected], [email protected]
0411 14:30:02|INF[baag.reporting.main.Logss.ExecuteLogsRunnable]|Freigaben had no results
0411 14:30:02|INF[baag.reporting.main.Logss.ExecuteLogsRunnable]|Freigabe 14:30 NOT sent to [email protected], [email protected] since all reports were empty and empty reports should not be send
0411 17:03:14|INF[baag.reporting.db.DataSourceMapFactory]|Datasource [itraderdbint] has been added to datasource map
0411 17:03:14|INF[baag.reporting.db.DataSourceMapFactory]|Datasource [otc_sv2599] has been added to datasource map
0411 17:03:14|INF[baag.reporting.db.DataSourceMapFactory]|Datasource [qlp_devp] has been added to datasource map
0411 17:03:15|INF[baag.reporting.main.Logss.QuarzLogsManager]|Added Trigger for QUARTZ that fires next on Tue Apr 13 08:00:00 CEST 2021 for Logs Compliance MAR Crossingprüfung/Frontrunning DI-FR
0411 17:03:15|INF[baag.reporting.main.Logss.QuarzLogsManager]|Added Trigger for QUARTZ that fires next on Tue Apr 13 08:20:00 CEST 2021 for Logs Compliance OR Umsatzstatistik DI-FR
0411 17:03:15|INF[baag.reporting.main.Logss.QuarzLogsManager]|Added Trigger for QUARTZ that fires next on Mon Apr 12 08:20:00 CEST 2021 for Logs Compliance OR Umsatzstatistik MO

I am very new to shell scripting and don't know whether this is possible with a shell script.

Upvotes: 0

Views: 93

Answers (2)

Jagan N

Reputation: 2065

@Symonds

This response addresses your comment asking for a header line and further explanation.

To add the header, use echo with > to create PreparedFile.txt first, then use the >> operator to append the converted lines to it. You can copy the complete code into a file named Script.sh and run it with bash Script.sh:

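#!/bin/bash
# Create (or overwrite) the output file with the header line.
echo "DATE_FORMAT|ROW_EXECUTE|ROW_VALUE" > PreparedFile.txt
# Convert each line and append the result below the header.
cat TEST.txt | sed 's/ /|/2' | sed 's/] /]|/1' >> PreparedFile.txt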

As for the explanation you asked for: the pipe symbol | chains commands, so the output of one command becomes the input of the next. The sed command substitutes matches of a regular expression with a replacement. In the first sed after the cat command, s/ /|/2 replaces the second occurrence of a space on each line with |. In the second sed, s/] /]|/1 replaces the first occurrence of "] " with "]|". You can read more about this in the sed documentation.
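To make the two substitutions easier to follow, here is a small sketch that applies them one at a time to a single line. The sample line is taken from the question; the variable names line, step1 and step2 are only for illustration.

```shell
#!/bin/bash
# One sample line in the same shape as the lines in TEST.txt.
line='0411 14:30:02 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Freigaben had no results'

# Step 1: replace only the second space, which ends the date/time part.
step1=$(printf '%s\n' "$line" | sed 's/ /|/2')
printf '%s\n' "$step1"

# Step 2: replace the first "] " with "]|", which ends the INF[...] part.
step2=$(printf '%s\n' "$step1" | sed 's/] /]|/1')
printf '%s\n' "$step2"
```

After step 1 only the date/time is split off; after step 2 the line has all three pipe-separated fields. Later spaces and brackets in the message are left alone because each substitution touches a single occurrence.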

Upvotes: 1

Jagan N

Reputation: 2065

You can try the shell command below. It pipes the file through two sed commands: the first replaces the second space on each line with |, and the second replaces the space after the first closing square bracket with |.

cat TEST.txt | sed 's/ /|/2' | sed 's/] /]|/1' > PreparedFile.txt
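One thing to be aware of: the command above changes every line, including lines that do not start with a date and time (a continuation line, for example). If that matters, a possible variant is a single sed with an anchored pattern, which leaves non-matching lines untouched. This is only a sketch: it assumes a sed that supports -E (GNU and BSD sed do), and the sample input, including the third line and the temporary directory, is made up for illustration.

```shell
#!/bin/bash
# Hypothetical sample input: two well-formed lines plus one line that does not fit the format.
tmpdir=$(mktemp -d)
cat > "$tmpdir/TEST.txt" <<'EOF'
0411 14:30:02 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Freigaben had no results
0411 17:03:14 INF[baag.reporting.db.DataSourceMapFactory] Datasource [qlp_devp] has been added to datasource map
some continuation line without a timestamp
EOF

# Anchored pattern: date and time, then LEVEL[...], then the rest of the line.
# Lines that do not match the pattern are passed through unchanged.
sed -E 's/^([0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}) ([A-Z]+\[[^]]*\]) /\1|\2|/' \
    "$tmpdir/TEST.txt" > "$tmpdir/PreparedFile.txt"

cat "$tmpdir/PreparedFile.txt"
```

The [A-Z]+ part accepts any upper-case level name, not just INF; whether other levels occur in the real log is an assumption.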

Upvotes: 1
