LiLi Liu

Reputation: 37

Search two files, output the common lines, and print the following line (Unix scripting)

I have this code, but it gives me an error:

awk '
    FNR == NR {
     # reading get_ids_only.txt
     values[$1] = ""
     next
    }
BEGIN {
  # reading default.txt
  for (elem in values){
    if ($0 ~ elem){
      if (values[elem] == ""){
        values[elem] = "\"" $0 "\""
        getline;  
        values[elem] = "\n"" $0 ""\n"
        }
      else{
        values[elem] = values[elem] ", \"" $0 "\""
         getline; 
         values[elem] = values[elem] "\n"" $0 ""\n"
         }
    }
 }
END {
  for (elem in values)
    print elem " [" values[elem] "]"
    }
' get_ids_only.txt default.txt

The error says

awk: syntax error at source line 23
 context is
    >>>  END <<<  {
awk: illegal statement at source line 24
awk: illegal statement at source line 24
    missing }

This is where my END { } block starts.

What I'm trying to do is compare the strings in file 1 against file 2. If a string is found in file 2, print that string and the line after it, then leave a blank line.

input1:

 message id "hello"
 message id "good bye"
 message id "what is cookin"

input2:

 message id "hello"
 message value "greetings"

 message id "good bye"
 message value "limiting"

 message id "what is there"
 message value "looking for me"

 message id "what is cooking"
 message value "breakfast plate"

output:

 It should print every line of input1 together with its message value from input2.

Can anyone explain why this error is occurring?

I'm using the terminal on my Mac.

Upvotes: 1

Views: 124

Answers (1)

Thor

Reputation: 47099

Here's your BEGIN block with recommended indentation and comments. Can you see the problem?

BEGIN {
  # reading default.txt
  for (elem in values){
    if ($0 ~ elem){
      if (values[elem] == ""){
        values[elem] = "\"" $0 "\""
        getline;  
        values[elem] = "\n"" $0 ""\n"
      }
      else{
        values[elem] = values[elem] ", \"" $0 "\""
        getline; 
        values[elem] = values[elem] "\n"" $0 ""\n"
      } # End inner if
    } # End outer if
  } # End for loop

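You're missing a closing brace: the inner if, the outer if and the for loop are all closed, but nothing closes the BEGIN block itself. Note also that in the final concatenation, "\n"" $0 ""\n", the $0 sits inside a string literal, so awk treats it as the text " $0 " rather than the current line.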
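To see the quoting issue in isolation, here is a small sketch. The sample record abc and the variable names quoted and unquoted are made up for illustration; the point is only that awk does not expand $0 inside double quotes.

```shell
# Feed awk a single record, "abc", and build two strings from it.
result=$(printf 'abc\n' | awk '{
  quoted   = "[" " $0 " "]"   # $0 sits inside a string literal, so it stays as text
  unquoted = "[" $0 "]"       # $0 is outside the quotes, so it is the current record
  print quoted
  print unquoted
}')
printf '%s\n' "$result"
```

The first line printed is [ $0 ] and the second is [abc].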

There are some other issues with this. For one, a BEGIN block runs before any input is read, so $0 is empty there and the values array has not been filled yet. I'm not sure exactly what you are trying to do, but it seems a very un-awky approach. If you find yourself overusing getline, you should usually think about splitting the code into separate blocks with appropriate conditions. See this article on the uses and misuses of getline for more.
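As a rough illustration of that advice (not a solution to the full task), the sketch below prints every line that matches a pattern plus the line after it, using a flag instead of getline. The pattern /^id/, the sample input and the flag name grab are placeholders chosen for this example.

```shell
# Print each line starting with "id" and the line that follows it.
result=$(printf 'id 1\nvalue a\nother\nid 2\nvalue b\n' | awk '
  /^id/ { print; grab = 1; next }   # matching line: print it and request the next line
  grab  { print; grab = 0 }         # requested line: print it and reset the flag
')
printf '%s\n' "$result"
```

Each rule handles one kind of line, and awk's normal read loop supplies the next line, so there is no need to call getline by hand.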

A more awky way to solve it

If I understand you correctly, this is the way I would solve this task:

extract.awk

FNR==NR  { id[$0]; next }  # Collect id lines in the `id' array
$0 in id { f=1 }           # Use the `f' as a printing flag 
f                          # Print when `f' is 1
NF==0    { f=0 }           # Stop printing after an empty line

Run it like this:

awk -f extract.awk input1 input2

Output:

message id "hello"
message value "greetings"

message id "good bye"
message value "limiting"

Upvotes: 1
