Parsing a file with bash script

Question

I have a file with multiple lines, structured as shown below:

MSH|^~\&|Xatidok|V10.0.2.000|OSestra|x-tention|201203060855||ADT^A03|2914|P|2.3^AA&BB
EVN|A03|201203060855|201203060855|01|Fidani
PID|||00019380|2012049008^120005548^302830|PATIDOK-person^InRid^|Rudi|19111111|F|||Rose |A|Pens.
NK1||IRergrun^RROSlf^||Rose ^^Wels^^4600^A|07242123123|||||||||||||||||||||||||||||||
PV1||I|1212^G442^G442-||0|||||||||||2012049008|General|||||||||||||||||||12|||||201202060927|||||||

So basically there are rows of data separated with pipes (|), and I want to parse them by writing a bash script.

Briefly, this is the structure:

Segment > rows
Field > cells between | field |
Component > each field may (or may not) contain several components separated with ^
Sub component > separated with &

The idea of running the script is: ./script.sh filename command

command should look like: MSH.2.3.4 or shorter

Meaning: access the segment (row) which starts with MSH, field number 2, component number 3, sub component 4
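
As an illustration of the command format described above, here is a minimal sketch that splits a selector such as MSH.2.3.4 on dots. The helper name parse_selector is hypothetical and not part of the original post; parts that are omitted from a shorter command are simply left empty.

```shell
#!/bin/bash
# Split a selector such as "MSH.2.3.4" on dots into its parts.
# parse_selector is a hypothetical helper name used only for this sketch.
parse_selector() {
    local -a parts
    # IFS is set only for this read; -r keeps backslashes literal.
    IFS='.' read -r -a parts <<< "$1"
    echo "segment=${parts[0]} field=${parts[1]:-} component=${parts[2]:-} subcomponent=${parts[3]:-}"
}

parse_selector "MSH.2.3.4"
parse_selector "PID.5"
```

A shorter command such as PID.5 leaves the component and sub component parts empty, so a script can decide how deep to descend by checking how many parts were supplied.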

So my logic of parsing is as follows: I want to create an array which stores every row (segment) from the file as follows:

#!/bin/bash

file_to_be_parsed=$1
command=$2
counter=0

# Read the file line by line into an array called segments.
# segments[] holds every line (segment) of the file, indexed from 0 to X.
# -r keeps backslashes (as in ^~\&) literal.

while IFS= read -r segment; do
     segments[counter]=$segment
     counter=$((counter+1))
done < "$file_to_be_parsed"

Second, I want to separate each array member one step further based on the delimiter, which I think I can do with:

IFS="|" read -r field <<< (here i can't figure out)

But I can't actually create a 2D array in bash, even though I searched a lot. With one, I could then access the specific fields ...

Can someone help me further separate these array members into fields ...

dash-o · Accepted Answer

For a pure bash-only solution, you can use bash arrays to split the line into fields, components, and sub components. Provided that you do not have to run the code on large data sets, this should be OK.

Consider switching to a more powerful engine (awk, python, perl) for large problems.

#!/bin/bash
file=$1
command=$2

# Split the command on dots so that the parts are k[0], k[1], ...
IFS="." read -r -a k <<<"$command"

# Look for the first line whose segment name matches k[0].
# -r keeps backslashes (as in ^~\&) literal.
while IFS='|' read -r -a fa ; do
  # Skip to the next row if there is no match.
  [ "${fa[0]}" = "${k[0]}" ] || continue
  # Field
  v=${fa[${k[1]}-1]}
  # Component
  if [ "${#k[@]}" -gt 2 ] ; then
      IFS="^" read -r -a fb <<<"$v"
      v=${fb[${k[2]}-1]}
  fi
  # Sub component
  if [ "${#k[@]}" -gt 3 ] ; then
      IFS="&" read -r -a fc <<<"$v"
      v=${fc[${k[3]}-1]}
  fi
  echo "V=$v"
  break
done <"$file"
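
The question also asks about a 2D array. Bash has no native 2D arrays, but an associative array keyed by "row,column" can stand in for one. The following is only a sketch under stated assumptions: it needs bash 4.0 or later, and the names cells and load_cells are hypothetical, not part of the script above. Columns here are counted from 0, with column 0 holding the segment name.

```shell
#!/bin/bash
# Sketch: emulate a 2D array with an associative array keyed by "row,column".
# Requires bash 4.0 or later; cells and load_cells are hypothetical names.
declare -A cells

# Read pipe-separated lines from stdin and store every cell as cells[row,col].
load_cells() {
    local row=0 col line
    local -a fields
    while IFS= read -r line || [ -n "$line" ]; do
        # Split the current line on pipes; empty cells are preserved.
        IFS='|' read -r -a fields <<< "$line"
        for col in "${!fields[@]}"; do
            cells["$row,$col"]=${fields[col]}
        done
        row=$((row + 1))
    done
}

load_cells <<'EOF'
EVN|A03|201203060855|201203060855|01|Fidani
PV1||I|1212^G442^G442-
EOF

echo "${cells[0,5]}"   # row 0 (EVN), column 5
echo "${cells[1,3]}"   # row 1 (PV1), column 3
```

For a single lookup per run, the loop in the answer is simpler; storing every cell like this mainly pays off when many lookups are made against the same file.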
