RATNESH TIWARI

Reputation: 87

Printing content when processing two files in awk

I am referring to this link: https://stackoverflow.com/a/54767231/11084572. I have a config file whose 2nd column is a feature and whose 3rd column is an action. I have another, larger file; I need to match its 1st column against the 1st column of the config file and perform the action according to the feature.

Assumption: in File.txt the columns are named Min (3rd column), Median (4th) and Max (5th).

Config.txt

Apple  All  Max
Car    abc  Median
Car    xyz  Min
Book   cvb  Median
Book   pqr  Max

File.txt

Apple  first   10  20  30
Apple  second  20  30  40
Car    abc     10  20  30
Car    xyz     20  30  40
Car    wxyz    10  20  30
Book   cvb     60  70  80
Book   pqr     80  90  100

Expected Output:

Apple  first   30
Apple  second  40
Car    abc     20
Car    xyz     20
Car    wxyz    10
Book   cvb     70
Book   pqr     100

The above output is generated using the following approach:

1) Since File.txt is large, if the feature (2nd col) of the config file is All, every row whose 1st column matches performs the action given in the 3rd col of the config file.

2) Otherwise the action is performed only if the 2nd col of the config file matches as a **substring** of the 2nd col of File.txt.

Here is what I have tried:

awk 'BEGIN {m["Min"]=3;m["Median"]=4;m["Max"]=5}

        NR==FNR{ arr[$1]=$2;brr[$1]=$3;next}
                ($1 in arr && arr[$1]=="All") {print $1,$2,$m[brr[$1]]}
                ($1 in arr && $2==arr[$1] ) {print $1 ,$2,$m[brr[$1]]}
' Config.txt File.txt

Code output:

Apple  first   30
Apple  second  40
Book   pqr     100
Car    xyz     20 

The above code prints only one row per matched 1st column (for example, Book cvb 70 is not printed). Also, how can I match the config string as an ending substring (e.g. xyz defined in Config.txt should match both xyz and wxyz in File.txt)?

Please help me solve the above problem. Thanks!

Upvotes: 0

Views: 67

Answers (1)

RavinderSingh13

Reputation: 133518

Your expected sample output does not match your sample input (e.g. Car abc 200, where there is no 200 in file.txt). If I have understood it correctly, could you please try the following.

awk '
BEGIN{
  b["min"]=3
  b["max"]=5
  b["median"]=4
}
FNR==NR{
  c[$1]
  ++d[$1]
  a[$1 d[$1]]=tolower($NF)
  next
}
($1 in c){
  if(e[$1]<d[$1]){
      ++e[$1]
  }
  else{
      e[$1]!=""?e[$1]:++e[$1]
  }
  print $1,$2,$b[a[$1 e[$1]]]
}' config.txt file.txt

Output will be as follows.

Apple first 30
Apple second 40
Car abc 20
Car xyz 20
Car wxyz 10
Book cvb 70
Book pqr 100
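
On why the attempt in the question drops rows such as Book cvb: its arrays are keyed on $1 alone, so a later config line with the same 1st column overwrites the earlier one. A minimal sketch to see this, assuming the Config.txt from the question (the function name last_row_per_key is made up for illustration):

```shell
# Hypothetical helper: print the feature that survives for each key
# when the array is indexed by the 1st column only.
last_row_per_key() {
  awk '{ arr[$1] = $2 }                       # later lines overwrite earlier ones
       END { for (k in arr) print k, arr[k] }' "$1" | sort
}
```

Running last_row_per_key Config.txt should list only one feature per key (the last one read), which is why only Car xyz and Book pqr can ever match in that attempt.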

Explanation: a line-by-line explanation of the above code.

awk '                                       ##Starting awk program here.
BEGIN{                                      ##Mentioning BEGIN section here which will be executed once and before reading Input_file only.
  b["min"]=3                                ##Creating an array named b whose index is string min and value is 3.
  b["max"]=5                                ##Creating an array named b whose index is string max and value is 5.
  b["median"]=4                             ##Creating an array named b whose index is string median and value is 4.
}                                           ##Closing BEGIN section here.
FNR==NR{                                    ##Checking condition FNR==NR which will be executed when 1st Input_file named config.txt is being read.
  c[$1]                                     ##Creating an array named c whose index is $1.
  ++d[$1]                                   ##Incrementing d[$1], which counts how many config lines have been seen so far for this $1.
  a[$1 d[$1]]=tolower($NF)                  ##Creating an array named a whose index is $1 joined with d[$1] and whose value is the lower-case $NF (last column) of the current line.
  next                                      ##Using next keyword of awk to skip all further statements from here.
}
($1 in c){                                  ##Checking conditions if $1 of current line is present of array c then do following.
  if(e[$1]<d[$1]){                          ##Checking condition if value of e[$1] is lesser than d[$1] then do following.
      ++e[$1]                               ##Creating array named e whose index is $1 and incrementing its value with 1 here.
  }
  else{                                     ##Using else for above if condition here.
      e[$1]!=""?e[$1]:++e[$1]               ##If e[$1] is empty then increment it with 1, otherwise leave it as it is.
  }
  print $1,$2,$b[a[$1 e[$1]]]               ##Printing 1st, 2nd fields value along with field value of array b whose index is value of array a with index of $1 e[$1] here.
}' config.txt file.txt                      ##Mentioning Input_files here.
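
Note that the code above pairs config lines with file lines by their position within each key; it never compares the 2nd columns. If the rows can arrive in a different order, a variant that really tests the feature as an ending string may be safer. The following is only a sketch under assumptions: the function name match_features and the array names are made up, "All" is compared case-insensitively, and when several config features match, the longest one wins.

```shell
# Hypothetical helper, not the code from the answer above.
# Usage: match_features Config.txt File.txt
match_features() {
  awk '
  BEGIN { col["min"] = 3; col["median"] = 4; col["max"] = 5 }
  FNR == NR {                                   # first file: the config
    act = tolower($3)
    if (tolower($2) == "all") {
      all[$1] = act                             # action for every row with this key
    } else {
      n[$1]++                                   # number of features stored for this key
      feat[$1, n[$1]] = $2
      action[$1, n[$1]] = act
    }
    next
  }
  {                                             # second file: the data
    chosen = ""
    if ($1 in all) {
      chosen = all[$1]
    } else {
      best = 0
      for (i = 1; i <= n[$1]; i++) {
        f = feat[$1, i]
        len = length(f)
        # suffix test: the last len characters of $2 equal the config feature
        if (length($2) >= len && substr($2, length($2) - len + 1) == f && len > best) {
          best = len
          chosen = action[$1, i]
        }
      }
    }
    if (chosen in col) print $1, $2, $(col[chosen])
  }' "$1" "$2"
}
```

With the sample files from the question this should print the same seven lines as shown above; rows whose key or feature has no match in the config are skipped.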

Upvotes: 1
