Marjer

Reputation: 1403

Compare two files and add zeros if there are no values in the 2nd file

I have two text files zero.txt and value.txt.

zero.txt:

hour Value1  value2  
0        0       0
1        0       0
2        0       0
3        0       0
4        0       0

and so on, up to 24.

and value.txt:

hour Value1  value2  
0        1       1
2        2       2
4        3       4 

I want to compare the first column of both files (the first column is the hour, 0-24). If value.txt has a row for a given hour, that row should be printed to output.txt; if it does not, the hour should be printed with zeros instead. The expected result looks like this:

output.txt:

hour Value1  value2  
0        1       1
1        0       0
2        2       2
3        0       0
4        3       4 

How can I achieve this in Unix?

Upvotes: 1

Views: 484

Answers (4)

jaypal singh

Reputation: 77155

You can use the join command:

join -o 1.1,2.2,2.3 -a 1 -e 0 zero.txt value.txt
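
One caveat worth noting: join expects both inputs to be sorted on the join field using the same ordering it compares with, and hours 0-24 in numeric order are not in that order (10 sorts before 2 as text), so on the full files join may warn or pair rows incorrectly. A minimal sketch that works around this, assuming the header is the first line of each file and using the hypothetical temporary files zero.sorted and value.sorted:

```shell
# Sample data in the layout from the question (hours 0-4 only).
printf 'hour Value1 value2\n0 0 0\n1 0 0\n2 0 0\n3 0 0\n4 0 0\n' > zero.txt
printf 'hour Value1 value2\n0 1 1\n2 2 2\n4 3 4\n' > value.txt

{
    # Pass the header through untouched.
    head -n 1 zero.txt
    # Sort the data rows as text, which is the order join expects.
    tail -n +2 zero.txt  | sort -k1,1 > zero.sorted
    tail -n +2 value.txt | sort -k1,1 > value.sorted
    # Join, fill missing fields with 0, then restore numeric hour order.
    join -o 1.1,2.2,2.3 -a 1 -e 0 zero.sorted value.sorted | sort -n -k1,1
} > output.txt

cat output.txt
```

The output columns are separated by single spaces rather than aligned as in the question.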

Upvotes: 4

unxnut

Reputation: 8839

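If we ignore the header and use a bash script (you will need to adjust the column spacing):

# For each row of zero.txt, print the matching row from value.txt if
# one exists; otherwise print the zero row itself.
# \> is a word-boundary extension (supported by GNU grep), so hour 1
# does not also match 10-19.
while read -r x1 x2 x3
do
    if grep -q "^$x1\>" value.txt
    then
        grep "^$x1\>" value.txt
    else
        echo "$x1        $x2       $x3"
    fi
done < zero.txt > output.txt   # write once, so reruns do not append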

Upvotes: -2

Birei

Reputation: 36282

One solution using vim.

Content of script.vim:

set backup
buffer 2
2,$ yank 
buffer 1
2 put!
2,$ ! sort -sun -k1,1
saveas! output.txt
qa!

Run it like:

vim -u NONE -N -S script.vim zero.txt value.txt

It will create a file named output.txt with the content:

hour Value1  value2  
0        1       1
1        0       0
2        2       2
3        0       0
4        3       4

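How does it work? It copies the data rows of value.txt, pastes them just after the header of zero.txt, and then sorts the data rows numerically by the first column, deleting duplicates. Because the sort is stable and the value.txt rows come first, the row kept for each hour is the one from value.txt when there is one.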
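
The same idea can be sketched without vim. This is only an illustration of the technique, assuming a sort that supports the -s (stable) option, such as GNU sort, and a one-line header in each file:

```shell
# Sample data in the layout from the question (hours 0-4 only).
printf 'hour Value1 value2\n0 0 0\n1 0 0\n2 0 0\n3 0 0\n4 0 0\n' > zero.txt
printf 'hour Value1 value2\n0 1 1\n2 2 2\n4 3 4\n' > value.txt

{
    # Keep the header from zero.txt.
    head -n 1 zero.txt
    # Put the value.txt rows ahead of the zero.txt rows, then do a stable,
    # unique, numeric sort on the hour: for each hour the first row seen
    # (the value.txt one, if present) is the one that survives.
    { tail -n +2 value.txt; tail -n +2 zero.txt; } | sort -s -u -n -k1,1
} > output.txt

cat output.txt
```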

Upvotes: 1

Chris Seymour

Reputation: 85875

This is what you want:

$ awk 'NR==FNR{a[$1]=$0;next}($1 in a){print a[$1];next}{print $0}' value zero
hour Value1  value2
0        1       1
1        0       0
2        2       2
3        0       0
4        3       4

Explanation:

An awk script is a series of conditionals and blocks of the form conditional{block}. The script is executed for every record read from the input, and if the conditional evaluates to true, the code in the block is executed. A simple example is awk '/hour/{print $0}' value, where the input is the file value and the script /hour/{print $0} is executed on every line in the file. The conditional here is a regexp match for the string hour; since only the first line of the file matches, it is the only line printed.

  • NR is a special awk variable that is incremented for every record read. By default, records are split on newlines. FNR is almost the same but is reset every time a new file is read, so the condition NR==FNR is only true while we are reading the first file, value.
  • a[$1]=$0 builds a line look-up using the first field as the key.
  • next skips the remaining blocks and moves on to the next line of input.
  • While the second file is being read, we check whether the first field is a key in the look-up created from the first file ($1 in a). If it is, we print the stored line and move on to the next line.
  • If the first field isn't a key in the array, we print the current line from the file we are reading, zero.
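
The steps above can be laid out as a commented, multi-line version of the same script. This is a sketch using the file names from the question (zero.txt and value.txt) and a shortened sample with hours 0-4 only:

```shell
# Sample data in the layout from the question (hours 0-4 only).
printf 'hour Value1 value2\n0 0 0\n1 0 0\n2 0 0\n3 0 0\n4 0 0\n' > zero.txt
printf 'hour Value1 value2\n0 1 1\n2 2 2\n4 3 4\n' > value.txt

awk '
    # First file (value.txt): NR equals FNR only while it is being read.
    # Store each line in the look-up, keyed by the hour, and skip ahead.
    NR == FNR { a[$1] = $0; next }

    # Second file (zero.txt): the hour is in the look-up, so print the
    # stored value.txt line instead of the zero row.
    ($1 in a) { print a[$1]; next }

    # Otherwise the hour has no values: print the zero row unchanged.
    { print $0 }
' value.txt zero.txt > output.txt

cat output.txt
```

The order of the file arguments matters: value.txt has to come first so the look-up is complete before zero.txt is read.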

Using some awk idioms we can shorten the script, but it's probably best to be verbose while starting out:

awk 'NR==FNR{a[$1]=$0;next}$1 in a{$0=a[$1]}1' value zero

You should really start by reading Effective AWK Programming.

Upvotes: 4
